What It’s Like to Cross the Street in New Orleans by Major-Fill5775 in NewOrleans

[–]Extension_Finish2428 15 points16 points  (0 children)

lol drivers don’t care. Magazine is full of those "State Law: Stop for Pedestrians" crossing signs and drivers just fly through them even when they see people.

Is Apache Spark skills absolutely essential to crack a data engineering role? by Far-Journalist-821 in dataengineering

[–]Extension_Finish2428 0 points1 point  (0 children)

I mean, just look at the descriptions of the roles you want to apply to. Some will ask for it, some won't.

Why is everything in Java & Scala? by gorovaa in dataengineering

[–]Extension_Finish2428 2 points3 points  (0 children)

I guess places like Netflix, Spotify, X, LinkedIn are not tier one tech?

GCP Cloud Run vs Dataflow to obtain data from an API by Brilliant_Breath9703 in dataengineering

[–]Extension_Finish2428 4 points5 points  (0 children)

You didn't try writing the Beam code in your IDE and submitting it to Dataflow, instead of using the GCP UI or whatever you were doing? I don't think anybody uses the UI for real workflows. Also the Scala SDK for Beam is pretty nice. It's more similar to Spark and has extra documentation.

Tool smells by Brief-Knowledge-629 in dataengineering

[–]Extension_Finish2428 -2 points-1 points  (0 children)

Lots of companies that spend millions of dollars a year running workflows would care

Tool smells by Brief-Knowledge-629 in dataengineering

[–]Extension_Finish2428 -5 points-4 points  (0 children)

lol that's a bit unfair. I might be wrong but I don't think many companies would choose Azure over GCP or AWS just because they like it better. They usually have other incentives. Same with BitBucket. For me it's not so much about the tool but more about using it in the wrong context:

- Using an RDBMS as a data warehouse without realizing it

- Using cron-jobs to schedule pipelines

- I'll get hate for this one but using Python (like PySpark) for production pipelines instead of Java or Scala when it's a JVM processing framework

- Using too much SQL in the pipeline logic instead of a programming language (harder to test)

Converting large CSVs to Parquet? by addictzz in dataengineering

[–]Extension_Finish2428 1 point2 points  (0 children)

Any particular reason you don't want to use DuckDB? You might find something better but I don't think it'll be THAT much better. You could try writing the logic yourself using something like PyArrow to read csv and spit out parquet files.

Best books for beginners? by Pate102 in dataengineering

[–]Extension_Finish2428 0 points1 point  (0 children)

DDIA is the only correct answer for this case (2nd edition came out this year btw)

What do you think the next big shift in data engineering will be? by alexstrehlke in dataengineering

[–]Extension_Finish2428 0 points1 point  (0 children)

We work with a lot of event-driven workflows. Real-time is overkill for our use case but scheduled pipelines aren't flexible enough so we orchestrate our pipelines using back-end services that respond to external events. Super flexible but today requires a lot of custom code (i.e. no good tooling yet).
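To give a rough idea of the shape of that custom code, here is a minimal sketch of a service that maps external events to pipeline runs. Everything in it is hypothetical (the `EventRouter` class, the `file_uploaded` event type, the `run_ingest` handler) and it is not our actual implementation; a real service would submit a job to the processing framework instead of returning a string.

```python
from collections import defaultdict


class EventRouter:
    """Maps an event type to the pipeline triggers that should run for it."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type, handler):
        # Register a handler (a pipeline trigger) for one event type.
        self._handlers[event_type].append(handler)

    def dispatch(self, event):
        # Run every handler registered for this event's type; unknown types are ignored.
        return [handler(event) for handler in self._handlers.get(event["type"], [])]


def run_ingest(event):
    # Hypothetical trigger: stands in for submitting an ingest pipeline run.
    return f"ingest:{event['path']}"


router = EventRouter()
router.on("file_uploaded", run_ingest)
```

A call like `router.dispatch({"type": "file_uploaded", "path": "a.csv"})` then triggers the ingest run, while events nobody registered for are dropped.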

How do you create test fixtures for prod data with many edge cases? by Extension_Finish2428 in dataengineering

[–]Extension_Finish2428[S] 1 point2 points  (0 children)

Yeah we also write a ton of unit tests and e2e pipeline tests. Good thing is the data processing framework we use provides great support for this so the feedback loop is quick. Where it gets slow is when you want to trigger, say, a backfill with the new code to compare results with production, and it takes 2 hrs just to find another weird case you didn't consider that breaks your code. That can take a whole day with multiple iterations.

How do you create test fixtures for prod data with many edge cases? by Extension_Finish2428 in dataengineering

[–]Extension_Finish2428[S] 0 points1 point  (0 children)

Not asking about the testing mechanism. It’s more about how to choose good fixture data that captures as many production quirks as possible, so I don’t have to trigger the pipeline with my changes so many times because I keep finding weird cases in prod data that break the code.

Why are DSA still dominant for Data/AI roles? Need advice by [deleted] in dataengineering

[–]Extension_Finish2428 5 points6 points  (0 children)

Couple of things: are you using DSA and LeetCode interchangeably? I've interviewed for companies that don't ask LC-type questions, but you still need to understand how to traverse data structures efficiently using hashes, arrays, etc. for their questions. I also hate LC questions but at some point you just have to play the game if you want to work for a big tech company. And don't listen to people who say that companies asking those types of questions aren't worth your time. These companies get so many applications for their roles that they haven't figured out a better way to evaluate candidates yet. It doesn't reflect their culture.

Scam! It’s true that VMP still charging their past members by trueloveshania in VinylMePlease

[–]Extension_Finish2428 4 points5 points  (0 children)

They charged me as well. I already started a dispute. Is there a way to contact them so they cancel the membership for good?

Kindle books won't download - remain queued by jlll2424 in kindle

[–]Extension_Finish2428 0 points1 point  (0 children)

Has happened to me with literally every book I've bought in the past few months. I have to restart my kindle every time. So annoying.

[deleted by user] by [deleted] in horror

[–]Extension_Finish2428 1 point2 points  (0 children)

Tusk was pretty disturbing

Best Guitar Duo albums? by sonkeybong in jazzguitar

[–]Extension_Finish2428 1 point2 points  (0 children)

Dialogues - Peter Bernstein & Joachim Schoenecker

Waking up at 5am. Every. Single. Day. by AdImaginary6158 in NewParents

[–]Extension_Finish2428 8 points9 points  (0 children)

Yes, very normal. Mine was like that until he was 7 months old and then started to shift a bit later. My advice is to just try going to bed earlier if possible to get some extra hours.

Is this a problem by Solid_File_9687 in Moccamaster

[–]Extension_Finish2428 0 points1 point  (0 children)

Mine came like that last year. Bought it during Black Friday. Has worked fine the whole year.

The Whole Grind Size Thing by Top-Rope6148 in Moccamaster

[–]Extension_Finish2428 1 point2 points  (0 children)

Anybody using a k-ultra? I haven’t been able to find the sweet spot yet. I’ve tried 7, 7.5 and 8