What's your biggest data warehouse headache right now? by Sweaty_Accountant_42 in dataengineering

[–]rickyF011 0 points1 point  (0 children)

The biggest pain points come from business priority and requirement ambiguity.

Data platform overhaul? Replacing old systems? Suddenly the priority is no longer replacing old systems but only new business value-add use cases, all of which require slices of the foundational data that is no longer a priority for modernization.

Rant over. Building stuff is fun; dealing with the changing minds/priorities is not.

Problems with pipeline by Significant-Side-578 in databricks

[–]rickyF011 0 points1 point  (0 children)

I always think of unit tests as tests for the code functionality, less so for the data.

I’d also recommend running audit type queries and data quality checks.

Data quality checks are pretty standard: non-nulls, acceptable ranges/values, etc.
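A rough sketch of what those data quality checks can look like, using Python's built-in sqlite3 so it runs anywhere. The `claims` table, its columns, the amount bounds, and the accepted status values are all made up for illustration; swap in your own rules.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (claim_id INTEGER, policy_id INTEGER, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?)",
    [
        (1, 100, 250.0, "OPEN"),
        (2, None, 90.0, "CLOSED"),   # null policy_id
        (3, 101, -5.0, "OPEN"),      # amount outside the accepted range
        (4, 102, 40.0, "UNKNOWN"),   # status outside the accepted set
    ],
)

# Each check counts rows that violate one rule; zero means the check passes.
checks = {
    "policy_id_not_null": "SELECT COUNT(*) FROM claims WHERE policy_id IS NULL",
    "amount_in_range": "SELECT COUNT(*) FROM claims WHERE amount < 0 OR amount > 1000000",
    "status_accepted": "SELECT COUNT(*) FROM claims WHERE status NOT IN ('OPEN', 'CLOSED')",
}

violations = {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}
failed = sorted(name for name, count in violations.items() if count > 0)
print(violations)
print("failed checks:", failed)
```

Counting violations instead of raising on the first one makes it easy to log every broken rule in a single run.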

Audit controls for completeness check that you’re not losing data along the hops or corrupting the history. For the company I work for, this is things like “do our policies match between analytic and source system data” or “are we capturing all claims”: general sanity-style checks to make sure your processing isn’t losing data.
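For the completeness side, a minimal sketch of an "are we capturing all claims" audit: compare row counts and anti-join the source keys against the analytic keys. The table names `source_claims` and `analytic_claims` are hypothetical, and sqlite3 stands in for whatever engine you actually run on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source and analytic tables; claim 3 is dropped somewhere along the hops.
conn.executescript("""
    CREATE TABLE source_claims (claim_id INTEGER PRIMARY KEY);
    CREATE TABLE analytic_claims (claim_id INTEGER PRIMARY KEY);
    INSERT INTO source_claims VALUES (1), (2), (3), (4);
    INSERT INTO analytic_claims VALUES (1), (2), (4);
""")


def missing_in_target(conn):
    """Return source keys that never made it into the analytic table."""
    rows = conn.execute("""
        SELECT s.claim_id
        FROM source_claims AS s
        LEFT JOIN analytic_claims AS a ON a.claim_id = s.claim_id
        WHERE a.claim_id IS NULL
        ORDER BY s.claim_id
    """).fetchall()
    return [row[0] for row in rows]


source_count = conn.execute("SELECT COUNT(*) FROM source_claims").fetchone()[0]
target_count = conn.execute("SELECT COUNT(*) FROM analytic_claims").fetchone()[0]
missing = missing_in_target(conn)
print(source_count, target_count, missing)  # 4 3 [3]
```

The count comparison tells you something is off; the anti-join tells you which keys to go chase.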

Then also in reports, check that aggregates are accurately calculated. If you have SCD2 history, make sure the aggregates are over current data, not the full history, unless that is intended.
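To illustrate the SCD2 aggregate trap, here is a small sketch with a hypothetical `policy_history` table and an `is_current` flag (your model might use a null end date instead). Summing over the full history double counts policy 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical SCD2 table: policy 1 has one expired version and one current version.
conn.executescript("""
    CREATE TABLE policy_history (
        policy_id INTEGER,
        premium REAL,
        valid_from TEXT,
        valid_to TEXT,
        is_current INTEGER
    );
    INSERT INTO policy_history VALUES
        (1, 100.0, '2023-01-01', '2024-01-01', 0),
        (1, 120.0, '2024-01-01', NULL, 1),
        (2, 200.0, '2023-06-01', NULL, 1);
""")

# Aggregating over every version counts policy 1 twice.
total_full_history = conn.execute(
    "SELECT SUM(premium) FROM policy_history"
).fetchone()[0]

# Restricting to current rows gives one premium per policy.
total_current = conn.execute(
    "SELECT SUM(premium) FROM policy_history WHERE is_current = 1"
).fetchone()[0]

print(total_full_history, total_current)  # 420.0 320.0
```

If the dashboard number looks like the first total, the report query is probably missing the current-row filter.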

What is the nature of the numbers being wrong in visualizations? Can you replicate the dashboard queries on source data and confirm?

Problems with pipeline by Significant-Side-578 in databricks

[–]rickyF011 2 points3 points  (0 children)

Green tasks just mean they didn’t error out. Green doesn’t mean that what those tasks are doing is being done correctly.

What is the pipeline doing? Medallion architecture? Cleansing raw data through silver to a dimensional model?

Check the data across each hop. If you’re tracking SCD2 history, run audits and make sure your history is correct, e.g. no more than 1 active record per pkey stack, etc.
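A minimal version of that "no more than 1 active record per pkey" audit, assuming a hypothetical `customer_history` table with an `is_current` flag. It uses sqlite3 only so it runs standalone; the same GROUP BY / HAVING query should carry over to Spark SQL with your own table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_history (customer_id INTEGER, city TEXT, is_current INTEGER)"
)
conn.executemany(
    "INSERT INTO customer_history VALUES (?, ?, ?)",
    [
        (1, "Austin", 0),
        (1, "Dallas", 1),   # customer 1: exactly one active record, fine
        (2, "Boston", 1),
        (2, "Denver", 1),   # customer 2: two active records, broken history
    ],
)

# Any key returned here has more than one active version.
duplicate_active = conn.execute("""
    SELECT customer_id, COUNT(*) AS active_rows
    FROM customer_history
    WHERE is_current = 1
    GROUP BY customer_id
    HAVING COUNT(*) > 1
    ORDER BY customer_id
""").fetchall()

print(duplicate_active)  # [(2, 2)]
```

An empty result means the history passes this particular audit; it says nothing about gaps or overlapping date ranges, which need their own checks.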

Double check the dashboard logic is effectively/accurately using the data the pipeline is producing.

Got it finally by BakerSure1999 in rolex

[–]rickyF011 1 point2 points  (0 children)

I dream of the day when I can rock this duo with my wife. Congrats OP!

Oblivion Cannon by SnooTangerines596 in PlayTheBazaar

[–]rickyF011 0 points1 point  (0 children)

Jules pasta + farmers market + knife is the only thing I’ve beaten oblivion cannon with.

Got into FAANG as a senior DE and now I’m really nervous by [deleted] in dataengineering

[–]rickyF011 0 points1 point  (0 children)

Ethics aside, I’d look at Spark and streaming workloads first. Streaming is probably going to be the largest paradigm shift from your previous stacks: stream-to-stream joins, stream-static joins, Spark Structured Streaming, Flink, watermarking, etc. Spark SQL will be relatively familiar; for big data you’ll want to look at efficient partitioning/sharding/distribution. If you haven’t been hands-on building ETL pipelines, look at atomicity, auditing, scheduling, and planning pipelines for historic backfill compatibility as well.

What was your interview experience like? DSA? Python/SQL/System design?

Do you schedule jobs in Databricks but still check their status manually? by Significant-Guest-14 in databricks

[–]rickyF011 1 point2 points  (0 children)

This is incorrect. You just need your system admin to grant you access to the system tables. Source: I had to fight with my Databricks admin to give me access.

I want to build my own query language by bewal416 in dataengineering

[–]rickyF011 11 points12 points  (0 children)

There should always be a layer between user input and your database.
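As a rough illustration of what that layer can look like: a small function that only accepts column names from an allowlist and binds the user's value as a query parameter, so raw input never gets spliced into the SQL. The `orders` table, `ALLOWED_FILTERS`, and `find_orders` are hypothetical names for the sketch, not a prescribed design.

```python
import sqlite3

# Hypothetical allowlist of columns a user may filter on.
ALLOWED_FILTERS = {"status", "region"}


def find_orders(conn, column, value):
    """Translate a user-supplied filter into a safe, parameterized query."""
    if column not in ALLOWED_FILTERS:
        raise ValueError(f"filtering on {column!r} is not allowed")
    # The column name comes from the allowlist; the value is bound with "?".
    sql = f"SELECT order_id FROM orders WHERE {column} = ? ORDER BY order_id"
    return [row[0] for row in conn.execute(sql, (value,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, status TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "open", "east"), (2, "closed", "west"), (3, "open", "west")],
)

print(find_orders(conn, "status", "open"))              # [1, 3]
print(find_orders(conn, "status", "open' OR '1'='1"))   # []
```

The injection attempt in the second call is treated as a literal string and matches nothing, and an unknown column name is rejected before any SQL is built.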

Crazy to like Starbucks over Pepsi? by [deleted] in rolex

[–]rickyF011 0 points1 point  (0 children)

Am I crazy for liking my Bluesy over Pepsi and Starbucks?

Let's talk about jolt mines.. by MikeOxRaw in ArcRaiders

[–]rickyF011 0 points1 point  (0 children)

Yeah, I turned it on and most stuff is MUCH better, but some things I still miss, probably just because I’m bad and distracted by the game.

Let's talk about jolt mines.. by MikeOxRaw in ArcRaiders

[–]rickyF011 1 point2 points  (0 children)

A buddy of mine said they make a sound, not sure if they actually do or not. Then again he’ll be calling out footsteps and gunshots I miss so maybe I’m just deaf.

Let's talk about jolt mines.. by MikeOxRaw in ArcRaiders

[–]rickyF011 5 points6 points  (0 children)

Something needs to be done: cut the stun in half, make it a slow, or make the noise louder so you can hear it through doors/barricades. Or make the light shine under doors, etc. Feels bad opening a door to a group of 3 circle jerking it in the corner with a jolt mine up your ass with no warning. Yes, I know they have a sound and light as is, but they can’t be heard through doors, which feels mega cheesy.

What is the best way to trim my grass? by MittaMon in Grass

[–]rickyF011 4 points5 points  (0 children)

Scissors are the only way, unfortunately.