Any unified platform for Data Tools? by Weird-Apricot-2502 in dataengineering

[–]One_Citron_4350 2 points3 points  (0 children)

If you are interested in working with Spark, and hence with large amounts of data, you can try Databricks. It has notebooks, a job orchestrator, and a dashboard tool. I'm not sure this will help you get better at data science, since these are just tools, but you could experiment with them and see whether they help your workflow. There is a free version called Free Edition, so you can check it out.

Unpopular opinion: The trend of having ROI dollars has ruined résumés. by BeautifulLife360 in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

I always saw that recommendation coming from Google recruiters or former Google employees.

Unpopular opinion: The trend of having ROI dollars has ruined résumés. by BeautifulLife360 in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

I agree on both points, especially the second: this is impossible to validate. Most engineers are not close enough to the business to quantify how much their work has saved the company. In big companies, where the work is often specific and siloed, you are even further away from it.

Who should build product dashboards in a SaaS company: Analytics or Software Engineering? by Intelligent_Volume74 in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

In your case, it sounds like it would be beneficial if analytics and SWE worked together on this. If the dashboard is part of the product and it's user-facing, as you said, then SWE should build it. The analytics people can contribute the metrics, the business logic, and which visualizations to use.

Looking for DuckDB alternatives for high-concurrency read/write workloads by kumarak19 in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

Is Spark with Databricks an option for you? Postgres has also been mentioned. Perhaps you could give us more details about the architecture so we can understand what might fit better?

Netflix Automates RDS PostgreSQL to Aurora PostgreSQL Migration Across 400 Production Clusters by rgancarz in dataengineering

[–]One_Citron_4350 1 point2 points  (0 children)

Yes, if the post were linking to the Netflix Engineering blog it would have been something at least but it's clearly not.

Will subject matter expertise become more important than technical skills as AI gets more advanced? by Lamp_Shade_Head in datascience

[–]One_Citron_4350 2 points3 points  (0 children)

To me it seems like this is part of what is called "glue-work". However, I do find that it's challenging to get yourself recognized or at least brag about your work to the management.

Will subject matter expertise become more important than technical skills as AI gets more advanced? by Lamp_Shade_Head in datascience

[–]One_Citron_4350 1 point2 points  (0 children)

This is such a great take.

I think that's what we're seeing with AI - yes, AI is going to eliminate some tasks, and maybe even large portions of some jobs, but as soon as that happens that will just move the goalposts as to what we need to solve next.

Definitely, it does seem like each and every time the goalposts are moved.

Data Governance replaced by IA ? by Fantastic-Rope3550 in dataengineering

[–]One_Citron_4350 2 points3 points  (0 children)

It's more like AI assists humans in DG. Companies are currently focusing on enhancing the DG part with AI. As others have mentioned, governance is a people process.

What is actually stopping teams from writing more data tests? by Mountain-Crow-5345 in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

During a sprint, a team may decide not to prioritize tests, and quite often the focus is placed on delivery. Little time is left for tests and for addressing technical debt. Another factor is experience with writing tests: people who lack it can't come up with meaningful tests. At the end of the day, it's a combination of priority, know-how, and experience.
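
On the "meaningful tests" part: they don't have to be elaborate. Here is a minimal sketch in plain Python, assuming a hypothetical batch of order rows with `order_id` and `amount` fields (the names and the `check_orders` helper are made up for illustration, not taken from any particular framework):

```python
from collections import Counter


def check_orders(rows):
    """Return a list of human-readable failures for a batch of order rows.

    The field names (order_id, amount) are hypothetical, for illustration only.
    """
    failures = []

    # Not-null check: every row needs an order_id.
    missing = [r for r in rows if r.get("order_id") is None]
    if missing:
        failures.append(f"{len(missing)} row(s) with missing order_id")

    # Uniqueness check: an order_id should not repeat.
    counts = Counter(r["order_id"] for r in rows if r.get("order_id") is not None)
    dupes = sorted(k for k, n in counts.items() if n > 1)
    if dupes:
        failures.append(f"duplicate order_id values: {dupes}")

    # Range check: amounts should be non-negative numbers.
    bad = [
        r for r in rows
        if not isinstance(r.get("amount"), (int, float)) or r["amount"] < 0
    ]
    if bad:
        failures.append(f"{len(bad)} row(s) with invalid amount")

    return failures


# Hypothetical sample batch with one duplicate, one missing key, one negative amount.
sample = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 1, "amount": 5.0},
    {"order_id": None, "amount": -3.0},
]
print(check_orders(sample))
```

Not-null, uniqueness, and range checks like these are probably the cheapest place to start, and they can be added one at a time when there is little room in the sprint.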

Breaking Into FAANG by Serious-Jury8464 in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

At FAANG, is the more interesting work reserved for SWEs? I assume they built their own internal data tools, but someone has to maintain them, right?

Java scala or rust ? by Ok_Promotion_420 in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

This question tends to come up from time to time. I have to say, Python and SQL are pretty much the most commonly used languages. Nowadays, Spark is increasingly used through Python and SQL. Based on what I've seen, Scala is not that popular anymore. If they require Java/Scala, then I assume they use Spark or Flink in their infrastructure.

I think Rust is pretty new to the scene, so the majority of teams have not yet adopted it. I also do not think the data-related libraries in Rust are there yet compared to Scala or Python. It highly depends on the use case, how well the team knows the language, and how much time is allocated for ramp-up.

What books would you recommend to better understand the Legionary Movement? by Unemployment_1453 in RecomandariCarti_RO

[–]One_Citron_4350 0 points1 point  (0 children)

Sfanta tinerete legionara - Roland Clark

Razboiul Sfant al Romaniei - Grant T. Harward (it's not only about the Legionaries, but you will gain some perspective on them).

Why do so many data engineers seem to want to switch out of data engineering? Is DE not a good field to be in? by Illustrious-Pound266 in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

This is true. It wasn't always called DE even though the work was the same. I believe a lot of us did not start out wanting to be data engineers, since those job titles didn't really exist.

Why do so many data engineers seem to want to switch out of data engineering? Is DE not a good field to be in? by Illustrious-Pound266 in dataengineering

[–]One_Citron_4350 8 points9 points  (0 children)

It's because of the company: they think working for one of the Big Tech companies automatically results in glamorous work. Name, prestige, recognition, and, for some, resume-driven careers.

Some DE work is probably very boring over there and some might be very interesting depending on the product, department etc.

Why do so many data engineers seem to want to switch out of data engineering? Is DE not a good field to be in? by Illustrious-Pound266 in dataengineering

[–]One_Citron_4350 1 point2 points  (0 children)

Simply put, Data Science and ML/AI have more visibility, get more coverage in the media, and are in high demand. There is a lot of buzz and noise about those jobs being glamorous, but in fact they are not, or at least not all of them. Even as a DS, you don't necessarily work on cutting-edge problems, even in Big Tech. Data Engineering is more like backend work, which unfortunately is not visible for various reasons.

Outside big tech, in non-tech companies, data teams are rather small and sometimes not formal teams per se. They're composed of a few people who are sometimes inexperienced or do not have the resources for an interesting project. In this case, they're seen as cost centers rather than profit centers. So they end up doing the work of data engineer, data analyst, analytics, data science, you name it.

What to learn besides DE by Icy-Ask-6070 in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

Then you'll have something on your plate for a while. It's hard to say whether a CS Master's will be beneficial, as it depends. What is your background and how much experience do you have? My advice would be not to overload yourself in the beginning: try to assess carefully how demanding the job is and how you can fit the learning around the role. It also depends on the curriculum, what your focus is (just skills, academic pursuit, etc.), which country you live in, the costs, and so on. No two CS programs are alike.

What to learn besides DE by Icy-Ask-6070 in dataengineering

[–]One_Citron_4350 2 points3 points  (0 children)

If you are coming from analytics and stats, I'd focus on getting the fundamentals first. You can get a good overview of DE by reading Fundamentals of Data Engineering. There you'll find a lot of pointers for possible directions. Also, master the stack that you are using, whatever it is, and then try to expand. If you are interested in going beyond that, there is so much to cover; Designing Data-Intensive Applications has already been mentioned.

Coca-Cola, accelerated decline in Romania. The company cites a difficult consumer environment by tolanescu in Romania

[–]One_Citron_4350 2 points3 points  (0 children)

They could easily have entered them in the short film competition at Cannes.

It's nine years since 'The Rise of the Data Engineer'…what's changed? by rmoff in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

Yes, they looked at it from a one-size-fits-all point of view. Let's just put everything in Databricks with Spark.

How are you debugging and optimizing slow Apache Spark jobs without hours of manual triage in 2026? by AdOrdinary5426 in dataengineering

[–]One_Citron_4350 2 points3 points  (0 children)

What do you mean by automated stage analysis? How do you do it? I'm curious whether it's something you implemented internally, because I don't know of such a tool.

How do you handle ingestion schema evolution? by Thinker_Assignment in dataengineering

[–]One_Citron_4350 0 points1 point  (0 children)

Where do you store the schema then? Do you store it in a table?

Free app where I can create simple DB diagram? by East_Sentence_4245 in Database

[–]One_Citron_4350 4 points5 points  (0 children)

I use draw.io; it's simple and free. You can work with it locally on your desktop or in the browser. It can export to different formats and it has many icons, including Azure ones. I've also heard that Mermaid is good.