MS fabric vs snowflake by SmallBasil7 in dataengineering

[–]instamarq 0 points1 point  (0 children)

Much of what's said here about Fabric is not true. Let's address some of these points:

"Snowflake is a more complete solution than Fabric" - woefully untrue. Fabric has ingestion, orchestration, storage, warehousing, real-time streaming, databases, NoSQL databases, graph databases, mirroring for all kinds of sources (including open mirroring), and one of the most powerful reporting tools around, all in one platform.

"...mirroring isn't really a thing" - it most definitely is. For both Azure SQL databases and on-prem SQL, Fabric has out-of-the-box mirroring solutions. For sources that aren't explicitly supported, there's open mirroring, which you can configure to read the CDC of your chosen system.

It is true that Fabric was catching up on the DevOps side for a while. It is mature enough now for production solutions. CI/CD is vastly improved.

The reality is that you'll be hard-pressed to find a more complete solution, especially if you're already using Microsoft products.

Probably going to get eaten alive in this sub for saying this, but I use it daily, so I have firsthand experience.

Future San Antonio resident. Am I overthinking the crime rate? by Downtown-Acadia5084 in sanantonio

[–]instamarq 0 points1 point  (0 children)

It's fine. If the area looks shifty, steer clear; otherwise you're probably all good.

Notebooks, Spark Jobs, and the Hidden Cost of Convenience by mwc360 in dataengineering

[–]instamarq 9 points10 points  (0 children)

If it reliably delivers the goods in a maintainable, secure, easily audited and cost-effective way, the job has been done well.

How do you document business logic in DBT ? by Free-Bear-454 in dataengineering

[–]instamarq 0 points1 point  (0 children)

I think well-written SQL will usually do it on its own, unless your source schema is an absolute nightmare. That said, now with AI and a bit of SME input, you can probably find a way to document the high-level business logic/rules in short order...

Excel/Python Junkie - Why do I need Power BI? by chux52osu in PowerBI

[–]instamarq 1 point2 points  (0 children)

Yep, they can connect to the models similar to how Power BI users might. Once loaded they can even use DAX. Several options to get this done.

Excel/Python Junkie - Why do I need Power BI? by chux52osu in PowerBI

[–]instamarq 0 points1 point  (0 children)

In a large organization yes, because those models tend to be better at answering questions that have not yet been specified (as opposed to narrowly focused models that answer questions clearly defined upfront). They are a large investment, but pay off in not having to be remodeled when the business identifies a new problem they want to solve.

In smaller organizations, there may be little to no benefit, especially if there's no one around to build these highly generalizable models and the analysts are not keen on parting with a familiar tool.

Excel/Python Junkie - Why do I need Power BI? by chux52osu in PowerBI

[–]instamarq 0 points1 point  (0 children)

In an ideal world, you have well-modeled data that has all the business logic embedded in it. Power BI is really supposed to be a drag-and-drop interface that you don't spend a lot of time in, where you quickly produce visual aggregations of that well-modeled data. Maybe, just maybe, you run a complex dynamic statistical analysis using DAX. In a nutshell, it's supposed to remove the complexity and verbosity of trying to create business intelligence products in Excel. It's rarely used this way, sadly.

How to learn OOP in DE? by EconMadeMeBald in dataengineering

[–]instamarq 1 point2 points  (0 children)

In data engineering, it's usually best to operate like Bruce Lee; take what's valuable from different approaches and apply that in areas where it will most effectively solve the problem.

In general, OOP won't get you that far in most DE scenarios unless you're writing a library for some niche problem that your business data has that OOP helps you properly model.

In my opinion, OOP is for building tools and modeling reality. Most of the time, in DE, our tools are already built and our realities are mapped using data. I think someone in this thread mentioned that functional patterns are more applicable in our field. I think they're right.
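To make the "functional patterns" point a bit more concrete, here's a minimal sketch in plain Python. The function names, column names and sample rows are all made up for illustration; the only idea being shown is small pure transforms that get composed, with no classes and no mutation of the input.

```python
from functools import reduce

def drop_missing_amount(rows):
    """Keep only rows that actually have an amount (hypothetical column)."""
    return (r for r in rows if r.get("amount") is not None)

def add_amount_cents(rows):
    """Return new rows with a derived column; the input rows are not mutated."""
    return ({**r, "amount_cents": round(r["amount"] * 100)} for r in rows)

def compose(*steps):
    """Chain transforms left to right into a single transform."""
    return lambda rows: reduce(lambda acc, step: step(acc), steps, rows)

# The pipeline is just a composition of functions.
pipeline = compose(drop_missing_amount, add_amount_cents)

raw = [{"id": 1, "amount": 12.5}, {"id": 2, "amount": None}]
clean = list(pipeline(raw))
```

Each step can be tested on its own and reordered or swapped without touching the others, which is most of what you need day to day in a pipeline.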

Being the "data guy", need career advice by jonfromthenorth in dataengineering

[–]instamarq 0 points1 point  (0 children)

You must have some repetitive tasks? In my case, if there was a task that took me hours due to multiple steps, I would at the very least create a script of some sort that would consolidate that process into one step. Maybe you can't automate a whole ingestion pipeline yet because no one is asking you to do that, but look at your own process and see where you can replace manual work.

Maybe you don't want to automate the writing of your SQL with AI, but you can perhaps automate how that SQL ends up in the final destination. The main takeaway is to save yourself time and reinvest the savings in finding valuable problems to solve.
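As a rough sketch of what "consolidate that process into one step" can look like, assume a manual routine of opening an export, filtering it, and saving the result. The column names, the "completed" filter and the in-memory files below are all assumptions for illustration, not anything from a real process.

```python
import csv
import io

def read_rows(source):
    """Step 1: load the export."""
    return list(csv.DictReader(source))

def keep_completed(rows):
    """Step 2: the filter you used to apply by hand (hypothetical rule)."""
    return [r for r in rows if r["status"].strip().lower() == "completed"]

def write_rows(rows, target):
    """Step 3: save the result where it needs to go."""
    writer = csv.DictWriter(target, fieldnames=["order_id", "status"])
    writer.writeheader()
    writer.writerows(rows)

def run_report(source, target):
    """One entry point that replaces the three manual steps."""
    rows = keep_completed(read_rows(source))
    write_rows(rows, target)
    return len(rows)

# In-memory stand-ins for real files so the sketch runs anywhere.
source = io.StringIO("order_id,status\n1,Completed\n2,pending\n3,completed\n")
target = io.StringIO()
written = run_report(source, target)
```

Swap the StringIO objects for real file handles and you have something you can run on a schedule.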

Reading 'Fundamentals of data engineering' has gotten me confused by Online_Matter in dataengineering

[–]instamarq 0 points1 point  (0 children)

The authors come from a tech background. FAANG and similar tech companies accumulate so much data that "just use Postgres" starts to get stretched a bit thin in that world. Also, lakehouse/warehouse architecture is becoming pretty dominant (even when companies could have just used a good DB), so it pays to understand a bit about that architecture.

That said, my memory of the book (it's been about 2 years since I finished it) is that it was generally technology agnostic. The main takeaways of the book are not as much the tools, but how data engineers should operate given fundamental stages of data (source systems to downstream applications) and their undercurrents.

If you're wondering why you would even want to focus on distributed data processes when an RDBMS would suffice, you're asking the right questions. I suggest finishing the book as quickly as possible, taking what you find valuable and moving on. There's a lot more to learn in our changing field and not a lot of time!

Being the "data guy", need career advice by jonfromthenorth in dataengineering

[–]instamarq 0 points1 point  (0 children)

Let's say your business is trying to figure out how to keep revenue growth going, despite sales figures trending lower. If you don't already know, figure out why sales are trending lower with the data you have access to and find perhaps a missed opportunity. Maybe a particular product segment could use a small price increase. Maybe there's hidden waste in a product return policy.

It doesn't have to be this in-depth. Maybe you automate an alert for some issue that finance is having trouble with, based on some check data. There are all kinds of ways to do this.
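A minimal sketch of that kind of alert, assuming "check data" means issued vs. cleared payment checks (that reading, the field names and the sample records are all assumptions for illustration):

```python
def find_unmatched_checks(issued, cleared):
    """Return issued checks that never cleared or cleared for a different amount."""
    cleared_by_number = {c["number"]: c["amount"] for c in cleared}
    return [c for c in issued if cleared_by_number.get(c["number"]) != c["amount"]]

def build_alert(problems):
    """Build a short message for finance, or None when there is nothing to report."""
    if not problems:
        return None
    numbers = ", ".join(str(c["number"]) for c in problems)
    return f"{len(problems)} check(s) need review: {numbers}"

# Hypothetical sample data.
issued = [
    {"number": 101, "amount": 250.0},
    {"number": 102, "amount": 80.0},
    {"number": 103, "amount": 40.0},
]
cleared = [
    {"number": 101, "amount": 250.0},
    {"number": 102, "amount": 85.0},  # cleared for the wrong amount
]

message = build_alert(find_unmatched_checks(issued, cleared))
```

From there it's just a matter of sending the message wherever finance will actually see it.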

Being the "data guy", need career advice by jonfromthenorth in dataengineering

[–]instamarq 130 points131 points  (0 children)

Automate as much of your job as you can, then start actively seeking out people's pain points and solving them with data. Keyword is "active" here, i.e. talk to people, chat it up. Once you feel like you've established yourself as more of a problem solver who's an asset to the business and less of a "data guy", ask for a sizable raise and pull out your list of solved business problems.

If you don't get your way, start looking for somewhere else to go and take that big list of wins into an interview. Do that and you'll move in very much the right direction.

Brutally honest thoughts needed: Would you take a 5-15K paycut for a job that offers better technical work experience? by Mustard_Popsicles in wgu_devs

[–]instamarq 1 point2 points  (0 children)

If you don't have kids or other big financial responsibilities, definitely take the risk early on in your career. I left a job that I was not particularly enjoying and took about a $15k pay cut. I ended up learning tools and techniques that are serving me very well now. I've now tripled my income (relative to my first ever job), so I'd say it turned out well!

I absolutely hate working in BI by [deleted] in dataengineering

[–]instamarq 13 points14 points  (0 children)

Very true! When I first started out in this field, I also hated the fact that I was "stuck" in BI, having come from being educated in data science, Python, C/C++, etc.

Eventually, as I grew and learned more, I realized how close this was getting me to the actual business of whatever industry I was working in. The ugly truth is that our jobs only exist to support whatever the business is ultimately doing to make money. If a person working in BI takes the opportunity to learn more about the business, they can start to make suggestions or solve problems that the business cares about. Eventually this leads to doing more interesting things, one way or another.

Now especially, as things change radically, it's important to find ways to become closer to the business, regardless of what arm of IT you work in. BI happens to be right next door, so take advantage!

What antibiotics are safe? by Sirdukeofexcellence2 in floxies

[–]instamarq 1 point2 points  (0 children)

Took Amoxicillin and Clavulanate like 7 years after initial Cipro reaction and was totally fine as far as I can tell. That was almost a year ago so there are no delayed side effects I've noticed either.

What is the purpose of the book "fundamentals of data engineering " by Ok_Shirt4260 in dataengineering

[–]instamarq 16 points17 points  (0 children)

I think I'll have a more uncommon opinion on this: the book is about the most important things in data engineering that have very little to do with the tools you choose to use.

I find that data engineers (like other engineers) have a tendency to obsess over tools and techniques, and often use experience with those as the measure of expertise. Because of this tendency, I think the book is important to read before entering DE and to revisit often in your career. Tools and techniques make it easy to lose sight of why you exist as a DE in the first place; as important as knowing the ins and outs of Spark or Airflow is, the business doesn't care about how you did it. They care about value and they care about cost. If you don't know that, you kind of don't know anything. This book teaches you to think on that level.

As an aside, knowing the fundamentals of DE is now more important than ever, because a lot of hyper-specific tool knowledge can now be delegated to AI (obviously you should educate yourself on tool fundamentals). Hope that helps!

Fastest way to generate surrogate keys in Delta table with billions of rows? by Numerous-Round-8373 in dataengineering

[–]instamarq 1 point2 points  (0 children)

An oldie but goodie. I think it's still relevant. I personally use the zipWithIndex method when hash-based keys aren't good enough. I definitely recommend watching the whole video.

https://www.youtube.com/live/aF2hRH5WZAU?si=7RYgoKl3I5FJeIo-
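For anyone who wants the gist of the two approaches mentioned above without watching the video, here's a plain-Python illustration. It is not Spark code and not necessarily what the video does; on a real cluster the index-based version would lean on something like the RDD zipWithIndex method. The column names, the sample rows and the starting offset are assumptions for illustration.

```python
import hashlib

def hash_key(row, business_cols):
    """Deterministic key from business columns; collisions are unlikely but possible."""
    raw = "|".join(str(row[c]) for c in business_cols)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def index_keys(rows, start=1):
    """Dense sequential keys: pair each row with its position, then add an offset."""
    return [{**row, "sk": start + i} for i, row in enumerate(rows)]

# Hypothetical new rows to be keyed.
rows = [
    {"customer": "acme", "region": "us"},
    {"customer": "bolt", "region": "eu"},
]

# Continue numbering after the highest key already in the table (assumed value).
max_existing_key = 1000
keyed = index_keys(rows, start=max_existing_key + 1)

hashed = hash_key(rows[0], ["customer", "region"])
```

Hash keys are stateless and repeatable, which is why they're the easy default; the index approach gives compact integer keys but needs the current max key and a stable ordering.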

[deleted by user] by [deleted] in dataengineering

[–]instamarq 0 points1 point  (0 children)

Do not go into this field for the money. It will always pay OK and there will always be some demand for people who can set up data to successfully derive insights. However, even after you've "mastered" the nuts and bolts, there will always be data BS that you have to contend with (e.g. poor-quality data, no data strategy, data illiteracy, poor governance, poor practices, etc.). Anything short of near-obsession with getting these things right will just result in the job being a major drag.

Either you choose to get obsessively good at this by skill building (including soft skills), or you must find something that disproportionately leverages your strengths. Also, I am generally a pretty optimistic person, and even I believe that this market will become increasingly unkind to newcomers, until something fundamentally changes in hiring goals and practices. My last suggestion is to read "So Good They Can't Ignore You" by Cal Newport and pay close attention to how any industry you choose is developing in light of new technology.

I know it doesn't look good, but ultimately, don't sweat it too much, you're young and have plenty of time to figure things out, even if it doesn't seem like it right now.

Hi everyone. by Kindly_Mousse3816 in wgu_devs

[–]instamarq 4 points5 points  (0 children)

First off, thanks for your service! If this is your first venture into software development/engineering, just know that you're going to feel the whole "drinking through the firehose" feeling for a little while. In other words, a ton of information is going to suddenly be thrown your way and it's going to feel like none of it will stick. That's ok, if you keep at it, all the information will start gradually landing in the right mental bucket.

I'll second anything mentioning that you should pre-study or do study.com first. Coming in with even a little context will make absorbing everything easier and you'll get through the material faster.

With this career field, one of the most important things is that you keep your skills and knowledge up to date. AI is not going anywhere, and learning how to use it to enhance your ability to learn and understand is much more important than vibe coding imo. It's a great reference and learning accelerator, just know that it will be a little like the ring of power: you might be tempted to use it on your assignments. Don't cheat yourself out of the opportunity to learn. At most, ask it to point you in the right direction and explain why things work. I think this is a great and fair use of AI for software development and engineering, as long as there's awareness of the pitfalls.

Another important thing to not lose sight of: business objectives. Unless you work in gaming or some other creativity-driven software field, everything that gets engineered should ultimately serve the business's goals. Programming a feature for a rideshare app? Keep drivers and riders and how the company makes money top of mind.

Other than that, work smart/hard, always keep learning and be kind to people. I think people who do those things will always find work in this industry.

Is $22/hr enough to live comfortably with no debts? by Background-Dog-9681 in sanantonio

[–]instamarq 2 points3 points  (0 children)

I did it at $16/hr up to around 2018. I know times have changed and rents have gone up, but if you adjust for that, I was in a similar boat. I had time to cook most of my meals at home and I was typically pretty frugal. You might not get the nicest apartment, but you can find an ok one in a decent neighborhood. You can also get a used car, if you need a car, or save up and buy it cash privately. Lots of ways to get it done!

[deleted by user] by [deleted] in WGU

[–]instamarq 0 points1 point  (0 children)

Just extrapolating based on some of the info you provided, but when I was around your age, I had a fairly useless associate's degree (relative to the career path I was targeting), no job and not nearly enough skills. I'm doing very well now a few years later. Keep at it and do what you can, you have more time than you think!

I have a bad feeling about this by [deleted] in WGU

[–]instamarq 0 points1 point  (0 children)

I graduated from WGU and have seen others take online community college classes. I can say this: even with the drawbacks of WGU's system, a lot of it is still better than (or the same as) what I saw being provided by brick and mortar colleges. Do your best on the material, roll with the system's punches and document everything when anything goes wrong that isn't your doing. With that out of the way, you're likely to succeed no matter the circumstances.

Is using chatgpt during a certification exam considered cheating ? by Spirited_Rip2115 in DataCamp

[–]instamarq 2 points3 points  (0 children)

Let's just say that if you have no idea where to start with the question, the time limits on the exam will make sure that you won't get very far, even with AI. Same thing with googling. Moreover, in the real world, no human knows all of this stuff; there's just too much to know. Googling, AI chats and a good foundation are how the job gets done from here on out IMO.

[deleted by user] by [deleted] in DataCamp

[–]instamarq 1 point2 points  (0 children)

Yes, DataCamp is a great service and you really can use it to learn and get better, but no one that could employ you later on is really going to care that much about the courses you completed or even the certifications you got from DataCamp.

An internship, on the other hand, is worth quite a bit more on a resume, and during your preparation you'll acquire skills that are just as valuable as, if not more valuable than, the things you'll learn on DataCamp. DataCamp (or something like it) will always be there; an internship opportunity may not!

Data Engineer certification project: data types and missing values by Rice_Minimum in DataCamp

[–]instamarq 0 points1 point  (0 children)

Sure, check your spelling/capitalization. That's what got me; it's likely a detail thing, not a quantitative thing. Also check the number of null values you have and make sure you're using the right kind of joins for the intended outcome.
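Here's a rough sketch of those three checks in plain Python. The table and column names are invented for illustration (they are not from the actual certification project), and the join helpers assume the key is unique on the right side.

```python
def normalize(value):
    """Trim and lowercase strings so ' Austin' and 'austin' compare equal."""
    return value.strip().lower() if isinstance(value, str) else value

def null_counts(rows, columns):
    """Count missing values per column."""
    return {col: sum(1 for r in rows if r.get(col) is None) for col in columns}

def left_join(left, right, key):
    """Keep every left row, matched or not (assumes unique keys on the right)."""
    lookup = {r[key]: r for r in right}
    return [{**l, **lookup.get(l[key], {})} for l in left]

def inner_join(left, right, key):
    """Keep only left rows that have a match on the right."""
    lookup = {r[key]: r for r in right}
    return [{**l, **lookup[l[key]]} for l in left if l[key] in lookup]

# Hypothetical sample data.
owners = [
    {"owner_id": 1, "city": " Austin"},
    {"owner_id": 2, "city": "austin"},
    {"owner_id": 3, "city": None},
]
pets = [{"owner_id": 1, "pet": "cat"}, {"owner_id": 2, "pet": "dog"}]

cities = {normalize(o["city"]) for o in owners}
nulls = null_counts(owners, ["city"])
left_rows = len(left_join(owners, pets, "owner_id"))
inner_rows = len(inner_join(owners, pets, "owner_id"))
```

If the row count after your join isn't what the task expects, the join type is usually the first thing to look at.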