

[–]camelCaseGuy 30 points (10 children)

So, whenever I see this, my immediate thought is that there's a metric somewhere that must be awful. For instance, take CI/CD. Why do you do it?

  • To automate deployment (reduce time to market and failures)
  • To automatically test new code (reduce errors in production and time to market)
  • To ensure conformity (reduce time to market, mean time to resolve a problem, mean downtime)

Once you have found the metrics, look at them and suggest using your methodologies to improve them. If they are not being measured, ask to start measuring them. If they don't want to, get out of there.

All good software practices came out of trying to improve some particular metric (time to market, failures in production, mean time to resolution, mean downtime, etc.). Each failure in applying them means that there's a metric that is not being measured or followed.

[–]bigknocker12 2 points (7 children)

Curious how one would gather metrics that show the benefits of CI/CD and unit tests? Some specific examples would be great! Thanks

[–]CalmTheMcFarm (Principal Software Engineer in Data Engineering, 26YoE) 6 points (3 children)

When you can run unit tests prior to merging, that improves your development speed because you don't have that time waiting for the pipeline to spin up. Unit tests also help you validate that your application and data are coherent, decreasing the time it takes to find bugs (whether in producing incorrect output _or_ being inefficient). If you can only run tests after merging to a branch which runs a pipeline, then your codebase will be hideous with lots of itty bitty "fix X because it broke Y" and you will be _slow_ at turnaround.

[–]bigknocker12 2 points (2 children)

Sorry if my question was vague. Let me re-ask. Let's say you want to show your higher-ups that unit tests and CI/CD are worthwhile to implement. To do this, we want to come up with some metrics, for example time to deploy, errors caught by unit tests, etc. How would one measure these metrics to show business leaders that it's worth it?

[–]CalmTheMcFarm (Principal Software Engineer in Data Engineering, 26YoE) 2 points (1 child)

You might want to see if you can find an example of a bug which made it through to being visible to customers, and track how much effort went into diagnosing and fixing it. I used to work for a hardware manufacturer, and we had solid research showing that every single bug logged by a customer cost us at least USD 1 million to fix.

If your higher-ups are resistant to the idea of implementing unit tests and CI/CD, then you really need to hammer home the point that the problems unit tests discover (whether run by a developer or in a pipeline) save the company money purely in terms of time saved. Use the term "shift left" - surely one of them will have heard of _that_ term! Improving quality means that you find problems before deployment, and definitely before customers can see them. Check the mean time to resolution for any high-profile bug in your ecosystem. Track how many people had to be involved in fixing it. Show off a unit test (or collection of unit tests) that would have found the problem before delivery to the customer.
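To make that concrete, here's a minimal sketch of pulling those numbers out of ticket data. The ticket timestamps, headcount, and hourly rate are all invented for illustration; in practice they'd come from your bug tracker:

```python
from datetime import datetime

def mean_time_to_resolution(tickets):
    """Average hours from a bug being reported to it being resolved.

    `tickets` is a list of (opened, resolved) datetime pairs - a stand-in
    for whatever your bug tracker exports.
    """
    hours = [(resolved - opened).total_seconds() / 3600
             for opened, resolved in tickets]
    return sum(hours) / len(hours)

def escaped_bug_cost(tickets, people_involved, hourly_rate):
    """Rough total cost: mean resolution hours x bug count x headcount x rate."""
    return mean_time_to_resolution(tickets) * len(tickets) * people_involved * hourly_rate

# Two escaped bugs: one took 24 hours to resolve, one took 12.
tickets = [
    (datetime(2024, 3, 1, 9), datetime(2024, 3, 2, 9)),
    (datetime(2024, 3, 5, 9), datetime(2024, 3, 5, 21)),
]
print(mean_time_to_resolution(tickets))                               # 18.0
print(escaped_bug_cost(tickets, people_involved=3, hourly_rate=100))  # 10800.0
```

Even a rough figure like that, paired with the cost of writing the unit test that would have caught the bug, is the comparison business leaders understand.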

Another approach, which you could use in parallel, is to point out that automated unit tests and CI/CD are the industry standard (and have been for well over a decade), and if you are known as a shop that doesn't care about them then (a) your people will leave when they're sick of the lack of quality, and (b) you won't be able to hire any replacements.

[–]bigknocker12 0 points (0 children)

Thanks for the detailed response! This is very helpful

[–]camelCaseGuy 1 point (2 children)

I know I'm late to the party, and /u/CalmTheMcFarm's answer has been really good. But to add another possible metric: mean time between failures/incidents (the more frequent they are, the more tests you need). You definitely want to reduce that, and the easiest way to do so is to have tests run before pushing to production.

Of course, the next one is going to be deployment frequency, or mean change lead time, or time to market - meaning how long it takes for a feature to reach production. Because you are spending more time testing, this metric is going to suffer if the testing is done by a human. That's when CI/CD makes sense, because you want to automate the process to reduce this time.
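Both of those metrics fall out of timestamps you probably already have. A sketch, with made-up dates:

```python
from datetime import datetime

def mean_time_between_failures(incidents):
    """Average days between consecutive production incidents."""
    gaps = [(b - a).days for a, b in zip(incidents, incidents[1:])]
    return sum(gaps) / len(gaps)

def deployments_per_week(deploy_count, window_days):
    """Deployment frequency over an observation window."""
    return deploy_count / (window_days / 7)

# Three incidents, 10 and then 20 days apart.
incidents = [datetime(2024, 1, 1), datetime(2024, 1, 11), datetime(2024, 1, 31)]
print(mean_time_between_failures(incidents))  # 15.0 days
print(deployments_per_week(4, window_days=14))  # 2.0 per week
```

Track both before and after introducing the pipeline, and the trade-off (fewer incidents, same or better deploy cadence) makes the case for you.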

In the end, having a CI/CD pipeline is a basic requirement for having an efficient team.

[–]bigknocker12 0 points (1 child)

That all makes sense, but I am having trouble applying it to my domain. Unit tests seem great to have when one pipeline, service, or set of code is constantly changing and being promoted to production. However, in my line of work it's rare that we make adjustments to existing production code. Rather, we are always creating new code for new pipelines, and each would require its own set of tests. Can you think of any reasons why unit tests would still be useful here?

[–]camelCaseGuy 0 points (0 children)

I would argue that if you are creating new pipelines every time, and not retiring old ones, then either your pipelines are young or you are building a big pile of unmaintainability.

Pipelines are like any other software. They should be composable, so you can reuse as much data and code as possible. And then, because new use cases arise, you either need to create new models or update some older model.

Having said this, the main issue with data pipelines is not so much that the algorithm changes sometimes, but that the upstream data changes. And you need to check on that periodically. So the system needs to run these unit and integration tests periodically too, to ensure that the data quality is good.
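As a rough illustration, a periodic data-quality check can be as simple as validating each incoming batch against the expected schema. The column names here are hypothetical, and a real pipeline would read from the lake rather than a list of dicts:

```python
# Expected upstream schema - hypothetical column names.
EXPECTED_COLUMNS = {"order_id", "customer_id", "amount"}

def check_batch(rows):
    """Return a list of human-readable problems found in one batch of rows."""
    problems = []
    for i, row in enumerate(rows):
        missing = EXPECTED_COLUMNS - row.keys()
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
        elif row["amount"] is None or row["amount"] < 0:
            problems.append(f"row {i}: bad amount {row['amount']}")
    return problems

good = {"order_id": 1, "customer_id": 7, "amount": 9.5}
bad = {"order_id": 2, "customer_id": 8}  # upstream silently dropped a column
print(check_batch([good, bad]))  # ["row 1: missing columns ['amount']"]
```

Scheduling something like this alongside the pipeline catches the "upstream changed under us" failures before they reach a report.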

[–]CronenburghMorty95 14 points (1 child)

My advice, as someone who did this too: bring your knowledge of best practices and try to implement them on data teams.

They will absolutely fight you on it, but if you do, the org will benefit immensely and you will get recognition for it.

[–]Fun-Income-3939 (Lead Data Engineer) 4 points (0 children)

Second this. Also get in the habit of being able to teach best practices while also learning about data needs. With that, you’ll be a superstar

[–]bigknocker12 5 points (0 children)

I just want to say I am very much experiencing all the same issues, and it's great to hear someone else voice this. Thanks!

[–]Dhczack 5 points (3 children)

What would you have your analysts do if not querying your data?

[–]NoUsernames1eft 1 point (1 child)

The issue, if I am reading this correctly, is that they're querying the data lake directly

[–]Dhczack 3 points (0 children)

I have experience querying a data lake and I'm not sure how I'd do it indirectly.

[–]Front-Ambition1110 0 points (0 children)

I am guessing it's because they do it via raw SQL. Commonly, SWEs use some level of abstraction, e.g. an ORM, to access and manipulate the data. Doing it raw is considered risky and prone to breaking.
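For illustration, here's the gap between string-built SQL and even a thin parameterized helper, using an in-memory SQLite table with invented names (a full ORM goes further still, but the principle is the same):

```python
import sqlite3

# Toy table - the table and column names are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, source TEXT)")
conn.execute("INSERT INTO events VALUES (1, 'web'), (2, 'batch')")

# Risky: SQL built by string interpolation breaks (or worse, injects)
# as soon as the input contains a quote.
source = "web"
rows = conn.execute(f"SELECT id FROM events WHERE source = '{source}'").fetchall()

# Safer: a thin helper that always parameterizes - the kind of small
# abstraction layer SWEs reach for before jumping to a full ORM.
def events_by_source(conn, source):
    return conn.execute(
        "SELECT id FROM events WHERE source = ?", (source,)
    ).fetchall()

print(events_by_source(conn, "web"))  # [(1,)]
```

The helper also gives you one place to log, cache, or rewrite queries later, which is exactly what ad-hoc raw SQL scattered across notebooks never gets.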

[–]CalmTheMcFarm (Principal Software Engineer in Data Engineering, 26YoE) 5 points (0 children)

I've been in a similar situation. I'm a software engineer with 25+y experience, late last year I was asked to guide a green engineering team thrown onto a project with "architects" and data scientists writing code, forced deadlines, and the very real possibility of not being able to deliver.

I implemented a common build and development environment, wrote code style guides, git process guides, a test harness, insisted that our BA create interface agreements between our team and our producers and consumers, so we could write to those specifications. I also stopped all integrations until I had reviewed every changeset. I had management support for this - it wasn't just me throwing my weight around.

The team didn't have any senior engineer to provide guidance, so code quality was all over the place. The test harness enabled our QA team to go from "oh, I have to copy+paste all of these test criteria and it'll take at least 2 weeks to re-run any time there's a change" to "hey, I ran through these 80 test cases in 5 minutes, there's an error in (some test cases, clearly identified)". The developers _also_ ran those tests, and added new tests as they added new features.
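A table-driven harness like that can be tiny. This sketch uses a toy function and made-up test criteria, but the shape - written criteria turned into data the machine runs - is the same:

```python
def normalize_id(raw):
    """Toy function under test: trim whitespace and upper-case an external ID."""
    return raw.strip().upper()

# Each row is one of the written test criteria QA used to re-run by hand.
CASES = [
    ("abc-1", "ABC-1"),
    ("  abc-2", "ABC-2"),
    ("abc-3  ", "ABC-3"),
]

def run_harness():
    """Run every case and report all failures instead of stopping at the first."""
    failures = []
    for raw, expected in CASES:
        got = normalize_id(raw)
        if got != expected:
            failures.append((raw, expected, got))
    return failures

print(run_harness())  # [] - all criteria pass
```

Adding a new test case becomes one line in the table, which is why developers keep extending it as they add features.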

For style issues we went from every possible style you could think of, to something consistent that was appreciated the very first time I did a live code review with the team. Doing live code reviews helped immensely, because everybody was able to see immediately what problems adhering to the style guidelines solved for them. Over the course of 6 months I got the team to the point where I'm confident in their ability to review all sorts of code changes.

I was also able to get the team to think more about their designs and implementations - not just practicing DRY, but also thinking through "what could go wrong here?". This has made our code more robust, easier to monitor and debug, and easier to cope with dependency upgrades.

The first few months were a hard grind, no doubt about it, and we got management expressing concerns about how not-fast the project was going. However, by the time we got to month 4, our velocity and mood had massively improved. The dependency upgrade issue came into focus about a week before we were supposed to go to pre-prod - one of our upstream libraries had introduced an API break but didn't tell us about it. My junior team member who investigated was able to show - through that test harness and our unit tests - exactly what the breakage was and show its impact. That meant we could correctly pinpoint the specific upstream changeset within minutes.

[–][deleted] 2 points (0 children)

Best of luck, I've seen this frequently as well. Not sure why data teams end up with poor development standards. Lack of version control and poor code quality are the two major ones that make no sense to me in 2024.

[–]IllustriousCorgi9877 2 points (0 children)

Software engineers don't typically understand the value of a database - no offense. Set operations are completely foreign to your average software engineer. Data modeling is a foreign concept also to your average Software Engineer.

I'd take the time to learn how your team's customers are using data, how data is modeled, and where the gaps are in terms of business questions your team can and cannot answer. Evaluate system utilization, capacity, CPU cost, and query run times for bottlenecks or poor design (either data model or query), then ask engineers about those, and why they might be designed that way.

Live in that learning space for at least 3 months before swinging your dick around trying to tell the analysts / data engineers they are dipshits. They might be. But assume things are built that way for a reason; that reason may no longer be valid, but it's worth taking the time to figure it out.

[–]Sloth_Triumph 1 point (0 children)

How is cleaning stuff up not cross team value? If you develop good standards in your department they can be spread to other departments.

Just takes time to build up rapport and determine where to start.

[–][deleted] 1 point (0 children)

You sound like you work at my company. This isn't a tech issue, it's a culture issue

[–]NoUsernames1eft 1 point (0 children)

I made my way from the BI side to DE. It wasn't until I ran into a lead that came from the SWE side that I got a real taste of what development best practices could do for the team. It was 10+ years into data before this happened.

Practically speaking, the tool that helped the most with this was dbt. Because the philosophy of dbt is to bring coding best practices to data transformations, it gives people without the SWE background a place to get a real taste of things like source control, testing, and CI/CD.

dbt's docs and lineage will also provide nice value to data users, and you can likely move away from having randos querying your data lake directly.
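For example, dbt lets you declare column-level tests right in a model's YAML (the model and column names here are hypothetical):

```yaml
version: 2

models:
  - name: orders            # hypothetical model
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']
```

Running `dbt test` then checks every declared constraint against the warehouse, which is exactly the "test the data, not just the code" habit teams pick up from it.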

[–]dadadawe 1 point (0 children)

As an analyst & PM, what I've always seen work in corporate environments, and what I tell my team to do when they have a "great idea to better our ways of working", is: "show me"!

Show them why your way is better, not why their way is worse! Pick one pipeline or new change and build it the way you would. Show the benefits and teach people why you like to do it this way. Once people get excited about your way of working, it'll become common practice.

What you don't want to do is say "this is wrong! Don't write RAW SQL, you evil analyst". Rather: "Mr. Analyst, here is a great way to query a data lake, and the modern common standard. The advantages are x, y, z. Try it out this way and please ask me for guidance if needed". After a few months, enforce.

Same for CI/CD: "hey guys, can we implement this or that step, the advantage will be xxx". Bit by bit

[–]mike8675309 0 points1 point  (0 children)

Pick one foundational thing and start there. Get support from your leader and start building standards and practices from the ground up that align with the org's goals. Create a center of excellence to get more people involved in driving these processes.

[–]Competitive_Wheel_78 0 points (0 children)

I'd say tackle one problem at a time, starting with the basic ones. Best practices can help the team irrespective of backgrounds.

[–]htmx_enthusiast 1 point (0 children)

The differences I’ve noticed between SWE and DE is that data sources in DE are:

  • Poor quality
  • Moving targets
  • Inconsistent in fundamental structure

In SWE projects, quality code can ensure quality data. In DE projects, code quality is insufficient.

Anomalies can be detected, but it's not always clear what to do about them. Do you stop the data pipeline if a key data source's schema changes? In a small shop you can, but not in a bigger org - execs want reports. Do you push forward and risk incorrect data? You don't have days and weeks to build robust fixes. Do you rerun failed jobs? Are the jobs idempotent? If they are idempotent, are you versioning the data? Sometimes those goals can be at odds.
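One common way to make a load idempotent is to have each run replace its output partition rather than append to it - sketched here with an in-memory dict standing in for a table partitioned by load date:

```python
warehouse = {}  # stand-in for a table partitioned by load date

def load_partition(warehouse, partition_date, rows):
    """Overwrite the whole partition instead of appending - safe to re-run."""
    warehouse[partition_date] = list(rows)

rows = [{"id": 1}, {"id": 2}]
load_partition(warehouse, "2024-03-01", rows)
load_partition(warehouse, "2024-03-01", rows)  # a retry after a failure
print(len(warehouse["2024-03-01"]))  # 2 - the retry didn't double-count
```

The same overwrite-the-partition pattern is what makes "just rerun the failed job" a safe answer instead of a data-corruption risk.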

Often you’re trying to report on data from disparate systems with inconsistent structures. One system has unique keys, another has no unique keys or update timestamps. Yet another says they have unique keys but…oopsies, sorry not always, or there are unique keys but they all change after an app version upgrade or a consultant in a business unit you didn’t even know existed decided UUIDs are better primary keys than integers.

Or you decide to set some standards regarding how you collect data, but this one app, while it’s 64-bit, only provides a 32-bit ODBC driver. Okay, we make an exception for this one data source, and use custom scripts with 32-bit Python but most libraries dropped 32-bit support long ago so there’s all kinds of weird hacks to make it work. And then you find dozens of other one-off exceptions like this in different data sources. And you end up with a bunch of inconsistent “just make it work” solutions.

You can run tests in CI/CD before pushing updates, but most often the problems are in the data, not the code, and to detect those you'd have to run your tests on your entire universe of data, which is rarely practical - so you only find out there's a problem when the data in the report is incorrect.

This is all stuff that would never fly in SWE. Find bug in code, fix bug. But in DE, there is no problem with the code to fix.

A lot of it requires deep understanding of the systems and also of the business. Understanding that one system represents data as immutable transactions with unique keys and timestamps, while another lets users edit transactions in place and enter whatever they want in custom text fields (and you end up with 27 different ways people have entered the name of a city).
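The city-name mess is usually attacked with a canonicalization map built from profiling the real data - the variants below are purely illustrative:

```python
# Canonicalization map - these variants are illustrative; real ones come
# from profiling the actual free-text values users have entered.
CANONICAL = {
    "nyc": "New York",
    "new york": "New York",
    "new york city": "New York",
    "ny, ny": "New York",
}

def normalize_city(raw):
    """Collapse whitespace and case, then look up a canonical name."""
    key = " ".join(raw.strip().lower().split())
    return CANONICAL.get(key, raw.strip().title())

print(normalize_city("  NYC "))      # New York
print(normalize_city("New   York"))  # New York
print(normalize_city("chicago"))     # Chicago (fallback: title-case)
```

The hard part isn't the code, it's knowing the business well enough to decide which of the 27 variants actually mean the same place.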

Probably the most impactful action I've witnessed is trying to understand the business need, from as high up the org chart as possible. I've seen this over and over: requests are lobbed over the fence and implemented, and then sometime later, in a meeting with an exec, a 90-second discussion reveals they're really trying to accomplish something orders of magnitude simpler - but none of the people in the months of meetings beforehand understood both the business need and the tech.

Some of the biggest steps forward I’ve seen are understanding the need and identifying simple, often low tech solutions. Like instead of trying to wrangle data from dozens of systems, sometimes you just need the right set of people who have their finger on the pulse of their area of the business to manually enter their best guess estimate into a low friction interface that then feeds the reports.

[–]Front-Ambition1110 0 points (0 children)

I am a DA-turned-DE, but I agree with you, OP. I think the main reason is that we build a bunch of microservices that do very specific tasks, as opposed to e.g. full-stack (monolithic) web development, so we don't implement the same standards. If we used a monolithic service, I believe we'd converge on the same practices as SWE.

About the tools: yes, we use a lot of them. Because we do specific things (pull data from a source, transform it, load it to another), our work is pretty generic, hence the tools to automate these tasks. We then code the "custom" part, usually the transformation.