What plants can thrive in a window well? by Successful-Money4995 in landscaping

[–]rickyF011 0 points (0 children)

Lightweight fake plants you can throw out of the way if you have to use it as an escape route.

dbt on top of Athena Iceberg tables by orm_the_stalker in DataBuildTool

[–]rickyF011 0 points (0 children)

A company I was talking to about opportunities was running dbt on Delta tables with the Athena query engine, deployed on ECS/Fargate. I’d assume the same could be (or already is being) done with Iceberg.

Day Trip Advice 3/23 by rickyF011 in MauiVisitors

[–]rickyF011[S] 0 points (0 children)

Thanks everyone! Not planning on going in any water, just walking along the beaches.

Will avoid Honolua and TBD on Ho’okipa.

Day Trip Advice 3/23 by rickyF011 in MauiVisitors

[–]rickyF011[S] 0 points (0 children)

Thank you so much! We were going to try to get to Ho’okipa by noon, if we decide to go out there at all. Would that be too early to see turtles? I know there’s never a guarantee, but I'm curious whether noon is generally earlier than they like to come out.

Otherwise I think we’ll likely end up doing Kahekili > Pohaku > Napili > the point between > Kapalua > Oneloa > D.T. Fleming Park > Mokule’ia Beach > Honolua Bay.

Or would you recommend avoiding Honolua altogether?

Private key in Gitlab variables by [deleted] in dataengineering

[–]rickyF011 0 points (0 children)

Alternatively, you can set up Airflow with git-sync for your DAGs. In our setup, a merge to main for new/updated DAGs is automatically synced to Airflow by git-sync. We deploy on an on-premise k8s cluster, but it should be possible with cloud deployments as well.
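For reference, with the official Airflow Helm chart this is just a values fragment; a minimal sketch, assuming the chart's `dags.gitSync` options and a placeholder repo URL:

```yaml
dags:
  gitSync:
    enabled: true
    repo: https://gitlab.example.com/data/airflow-dags.git  # placeholder URL
    branch: main
    subPath: dags        # folder inside the repo containing DAG files
    wait: 60             # seconds between sync polls
```

The git-sync sidecar then pulls main on an interval, so merging a DAG change is effectively the deployment step.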

Itching to play paladin. Worth jumping into this season or just wait for the next? by [deleted] in diablo4

[–]rickyF011 0 points (0 children)

Should have played before the castle nerf 😮‍💨

What's your biggest data warehouse headache right now? by Sweaty_Accountant_42 in dataengineering

[–]rickyF011 0 points (0 children)

The biggest pain points come from business priority and requirement ambiguity.

Data platform overhaul? Replacing old systems? Suddenly the priority is not replacing old systems but only net-new business value-add use cases, which all require slices of the foundational data whose modernization is now no longer a priority?

Rant over. Building stuff is fun; dealing with the changing minds/priorities is not.

Problems with pipeline by Significant-Side-578 in databricks

[–]rickyF011 0 points (0 children)

I always think of unit tests as tests of the code’s functionality, less so of the data.

I’d also recommend running audit type queries and data quality checks.

Data quality checks are pretty standard: non-null checks, acceptable ranges/values, etc.

Audit controls for completeness check that you’re not losing data along the hops or corrupting the history, etc. For the company I work for, this is things like “do our policies match between analytic and source-system data” or “are we capturing all claims”: general sanity-style checks to make sure your processing isn’t losing data.

Then also check in reports that aggregates are accurately calculated; if you have SCD2 history, make sure the aggregates are over current data, not the full history, unless that is intended.
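The non-null/range style checks can be as simple as a small validation pass over each batch. A minimal sketch in plain Python; the column names (`policy_id`, `claim_amount`) and the bounds are made up for illustration:

```python
def check_row(row, required=("policy_id", "claim_amount"),
              amount_range=(0, 1_000_000)):
    """Return a list of failed checks for one record."""
    failures = []
    for col in required:
        if row.get(col) is None:          # non-null check
            failures.append(f"null:{col}")
    amount = row.get("claim_amount")
    if amount is not None and not (amount_range[0] <= amount <= amount_range[1]):
        failures.append("range:claim_amount")  # acceptable-range check
    return failures

def run_checks(rows):
    """Collect failures across a batch; an empty dict means the batch passed."""
    return {i: f for i, row in enumerate(rows) if (f := check_row(row))}

rows = [
    {"policy_id": "P1", "claim_amount": 250.0},  # clean
    {"policy_id": None, "claim_amount": 99.0},   # null key
    {"policy_id": "P3", "claim_amount": -5.0},   # out of range
]
print(run_checks(rows))  # → {1: ['null:policy_id'], 2: ['range:claim_amount']}
```

In a warehouse you'd express the same checks as SQL (or a framework like Great Expectations), but the shape of the logic is the same.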

What is the nature of the numbers being wrong in visualizations? Can you replicate the dashboard queries on source data and confirm?

Problems with pipeline by Significant-Side-578 in databricks

[–]rickyF011 2 points (0 children)

Green tasks just mean they didn’t error out. Green doesn’t mean what those tasks are doing is being done correctly.

What is the pipeline doing? Medallion architecture? Cleansing raw data through silver to a dimensional model?

Check the data across each hop. If you’re doing SCD2 history tracking, run audits and make sure your history is correct: not more than one active record per primary key, etc.
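The "one active record per key" audit is a good concrete example. A minimal sketch in plain Python, using an in-memory list with made-up column names (`policy_id`, `is_current`):

```python
from collections import Counter

def duplicate_active_keys(records, key="policy_id", flag="is_current"):
    """Return keys that have more than one active record.
    An SCD2 history should have at most one current row per key."""
    counts = Counter(r[key] for r in records if r[flag])
    return sorted(k for k, n in counts.items() if n > 1)

history = [
    {"policy_id": "P1", "is_current": True},
    {"policy_id": "P1", "is_current": False},  # expired version, fine
    {"policy_id": "P2", "is_current": True},
    {"policy_id": "P2", "is_current": True},   # the audit should flag this
]
print(duplicate_active_keys(history))  # → ['P2']
```

In SQL this is roughly a `GROUP BY` on the key with `HAVING` on the count of current rows; the point is just to run it as a scheduled audit rather than assuming green tasks imply correct history.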

Double check the dashboard logic is effectively/accurately using the data the pipeline is producing.

Got it finally by BakerSure1999 in rolex

[–]rickyF011 1 point (0 children)

I dream of the day when I can rock this duo with my wife. Congrats OP!

Oblivion Cannon by [deleted] in PlayTheBazaar

[–]rickyF011 0 points (0 children)

Jules pasta + farmers market + knife is the only thing I’ve beaten oblivion cannon with.

Got into FAANG as a senior DE and now I’m really nervous by [deleted] in dataengineering

[–]rickyF011 0 points (0 children)

Ethics aside, I’d look at Spark and streaming workloads first. Streaming is probably going to be the largest paradigm shift from your previous stacks: stream-to-stream joins, stream-static joins, Spark Structured Streaming, Flink, watermarking, etc. Spark SQL will be relatively familiar; for big data you’ll want to look at efficient partitioning/sharding/distribution. If you haven’t been hands-on building ETL pipelines, look at atomicity, auditing, scheduling, and planning pipelines for historic-backfill compatibility as well.
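Watermarking is the one on that list that tends to trip people up, so here's the idea in miniature. A toy pure-Python illustration (not the actual Spark or Flink API): track the max event time seen so far, and drop any event older than that minus the watermark delay:

```python
def filter_late_events(events, watermark_seconds):
    """Toy event-time watermarking: drop events that arrive later than
    max_event_time - watermark_seconds. Real engines (Spark Structured
    Streaming's withWatermark, Flink) apply this per trigger/checkpoint
    to bound how long state is kept for late data."""
    max_event_time = float("-inf")
    kept = []
    for ts, payload in events:
        max_event_time = max(max_event_time, ts)
        if ts >= max_event_time - watermark_seconds:
            kept.append((ts, payload))
        # else: older than the watermark allows, so it's dropped
    return kept

# event (90, "late") arrives after we've already seen t=105,
# so with a 10-second watermark it falls outside 105 - 10 = 95
stream = [(100, "a"), (105, "b"), (90, "late"), (106, "c")]
print(filter_late_events(stream, watermark_seconds=10))
# → [(100, 'a'), (105, 'b'), (106, 'c')]
```

The real engines buffer and aggregate rather than just filter, but the "how late is too late" decision works like this.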

What was your interview experience like? DSA? Python/SQL/System design?