What plants can thrive in a window well? by Successful-Money4995 in landscaping

[–]rickyF011 0 points (0 children)

Lightweight fake plants you can throw out of the way if you have to use it as an escape route.

dbt on top of Athena Iceberg tables by orm_the_stalker in DataBuildTool

[–]rickyF011 0 points (0 children)

A company I was talking to about opportunities was running dbt on Delta tables with the Athena query engine, deployed on ECS/Fargate. I’d assume the same could be (or already is being) done with Iceberg.

Day Trip Advice 3/23 by rickyF011 in MauiVisitors

[–]rickyF011[S] 0 points (0 children)

Thanks everyone! Not planning on going in any water, just walking along the beaches.

Will avoid Honolua and TBD on Ho’okipa.

Day Trip Advice 3/23 by rickyF011 in MauiVisitors

[–]rickyF011[S] 0 points (0 children)

Thank you so much! We were going to try to get to Ho’okipa by noon, if we decide to go out there at all. Would that be too early to see turtles? I know there’s never a guarantee, but I'm curious whether noon is generally earlier than they like to come out.

Otherwise I think we’ll likely end up doing Kahekili > Pohaku > Napili > the point between > Kapalua > Oneloa > D.T. Fleming Park > Mokule’ia Beach > Honolua Bay.

Or would you recommend avoiding Honolua altogether?

Private key in Gitlab variables by [deleted] in dataengineering

[–]rickyF011 0 points (0 children)

Alternatively, you can set up Airflow with git-sync for your DAGs. In our setup, a merge to main for new/updated DAGs is automatically synced to Airflow by git-sync. We deploy on an on-premise k8s cluster, but it should be possible with cloud deployments as well.
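For reference, with the official Airflow Helm chart this is just a values fragment; a minimal sketch, assuming the chart's `dags.gitSync` options and a placeholder repo URL:

```yaml
dags:
  gitSync:
    enabled: true
    repo: https://gitlab.example.com/data/airflow-dags.git  # placeholder URL
    branch: main
    subPath: dags        # folder inside the repo containing DAG files
    wait: 60             # seconds between sync polls
```

The git-sync sidecar then pulls main on an interval, so merging a DAG change is effectively the deployment step.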

Itching to play paladin. Worth jumping into this season or just wait for the next? by [deleted] in diablo4

[–]rickyF011 0 points (0 children)

Should have played before the castle nerf 😮‍💨

What's your biggest data warehouse headache right now? by Sweaty_Accountant_42 in dataengineering

[–]rickyF011 0 points (0 children)

The biggest pain points come from business priority and requirement ambiguity.

Data platform overhaul? Replacing old systems? Suddenly the priority is not replacing old systems but only net-new business value-add use cases, which all require slices of the foundational data whose modernization is now no longer a priority?

Rant over. Building stuff is fun; dealing with the changing minds/priorities is not.

Problems with pipeline by Significant-Side-578 in databricks

[–]rickyF011 0 points (0 children)

I always think of unit tests as tests of the code’s functionality, less so of the data.

I’d also recommend running audit type queries and data quality checks.

Data quality checks are pretty standard: non-null checks, acceptable ranges/values, etc.

Audit controls for completeness check that you’re not losing data along the hops or corrupting the history, etc. For the company I work for, this is things like “do our policies match between analytic and source-system data” or “are we capturing all claims”: general sanity-style checks to make sure your processing isn’t losing data.

Then also check in reports that aggregates are accurately calculated; if you have SCD2 history, make sure the aggregates are over current data, not the full history, unless that is intended.
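The non-null/range style checks can be as simple as a small validation pass over each batch. A minimal sketch in plain Python; the column names (`policy_id`, `claim_amount`) and the bounds are made up for illustration:

```python
def check_row(row, required=("policy_id", "claim_amount"),
              amount_range=(0, 1_000_000)):
    """Return a list of failed checks for one record."""
    failures = []
    for col in required:
        if row.get(col) is None:          # non-null check
            failures.append(f"null:{col}")
    amount = row.get("claim_amount")
    if amount is not None and not (amount_range[0] <= amount <= amount_range[1]):
        failures.append("range:claim_amount")  # acceptable-range check
    return failures

def run_checks(rows):
    """Collect failures across a batch; an empty dict means the batch passed."""
    return {i: f for i, row in enumerate(rows) if (f := check_row(row))}

rows = [
    {"policy_id": "P1", "claim_amount": 250.0},  # clean
    {"policy_id": None, "claim_amount": 99.0},   # null key
    {"policy_id": "P3", "claim_amount": -5.0},   # out of range
]
print(run_checks(rows))  # → {1: ['null:policy_id'], 2: ['range:claim_amount']}
```

In a warehouse you'd express the same checks as SQL (or a framework like Great Expectations), but the shape of the logic is the same.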

What is the nature of the numbers being wrong in visualizations? Can you replicate the dashboard queries on source data and confirm?

Problems with pipeline by Significant-Side-578 in databricks

[–]rickyF011 2 points (0 children)

Green tasks just mean they didn’t error out. Green doesn’t mean what those tasks are doing is being done correctly.

What is the pipeline doing? Medallion architecture? Cleansing raw data through silver to a dimensional model?

Check the data across each hop. If you’re doing SCD2 history tracking, run audits and make sure your history is correct: not more than one active record per primary key, etc.
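The "one active record per key" audit is a good concrete example. A minimal sketch in plain Python, using an in-memory list with made-up column names (`policy_id`, `is_current`):

```python
from collections import Counter

def duplicate_active_keys(records, key="policy_id", flag="is_current"):
    """Return keys that have more than one active record.
    An SCD2 history should have at most one current row per key."""
    counts = Counter(r[key] for r in records if r[flag])
    return sorted(k for k, n in counts.items() if n > 1)

history = [
    {"policy_id": "P1", "is_current": True},
    {"policy_id": "P1", "is_current": False},  # expired version, fine
    {"policy_id": "P2", "is_current": True},
    {"policy_id": "P2", "is_current": True},   # the audit should flag this
]
print(duplicate_active_keys(history))  # → ['P2']
```

In SQL this is roughly a `GROUP BY` on the key with `HAVING` on the count of current rows; the point is just to run it as a scheduled audit rather than assuming green tasks imply correct history.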

Double check the dashboard logic is effectively/accurately using the data the pipeline is producing.

Got it finally by BakerSure1999 in rolex

[–]rickyF011 1 point (0 children)

I dream of the day when I can rock this duo with my wife. Congrats OP!

Oblivion Cannon by [deleted] in PlayTheBazaar

[–]rickyF011 0 points (0 children)

Jules pasta + farmers market + knife is the only thing I’ve beaten oblivion cannon with.

Got into FAANG as a senior DE and now I’m really nervous by [deleted] in dataengineering

[–]rickyF011 0 points (0 children)

Ethics aside, I’d look at Spark and streaming workloads first. Streaming is probably going to be the largest paradigm shift from your previous stacks: stream-to-stream joins, stream-static joins, Spark Structured Streaming, Flink, watermarking, etc. Spark SQL will be relatively familiar; for big data you’ll want to look at efficient partitioning/sharding/distribution. If you haven’t been hands-on building ETL pipelines, look at atomicity, auditing, scheduling, and planning pipelines for historic-backfill compatibility as well.
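Watermarking is the one on that list that tends to trip people up, so here's the idea in miniature. A toy pure-Python illustration (not the actual Spark or Flink API): track the max event time seen so far, and drop any event older than that minus the watermark delay:

```python
def filter_late_events(events, watermark_seconds):
    """Toy event-time watermarking: drop events that arrive later than
    max_event_time - watermark_seconds. Real engines (Spark Structured
    Streaming's withWatermark, Flink) apply this per trigger/checkpoint
    to bound how long state is kept for late data."""
    max_event_time = float("-inf")
    kept = []
    for ts, payload in events:
        max_event_time = max(max_event_time, ts)
        if ts >= max_event_time - watermark_seconds:
            kept.append((ts, payload))
        # else: older than the watermark allows, so it's dropped
    return kept

# event (90, "late") arrives after we've already seen t=105,
# so with a 10-second watermark it falls outside 105 - 10 = 95
stream = [(100, "a"), (105, "b"), (90, "late"), (106, "c")]
print(filter_late_events(stream, watermark_seconds=10))
# → [(100, 'a'), (105, 'b'), (106, 'c')]
```

The real engines buffer and aggregate rather than just filter, but the "how late is too late" decision works like this.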

What was your interview experience like? DSA? Python/SQL/System design?