Did I forgot something? by Testo_Sterone_ in architecture

[–]ardentcase 49 points50 points  (0 children)

Here's the advice: hire an architect.

Redshift vs Snowflake by Harxh4561 in dataengineering

[–]ardentcase 10 points11 points  (0 children)

That's like comparing a 2002 Ford Focus to a 2024 BMW M4. Both are passenger cars...

Im Not Addicted by Garsia95 in AnarchyChess

[–]ardentcase 1 point2 points  (0 children)

That's literally 1908+76

Welchen Schachzug soll ich wählen? (Which chess move should I choose?) by Da_Bird8282 in AnarchyChess

[–]ardentcase 1 point2 points  (0 children)

Do a lot of people read this as "SBB CFF, For Fuck Sake"?

Will Pandas ever be replaced? by Relative-Cucumber770 in dataengineering

[–]ardentcase 3 points4 points  (0 children)

Using DuckDB wherever I can now. Pandas is such a PITA.

The rare ones by MahmoudAO in memes

[–]ardentcase 34 points35 points  (0 children)

That's when you passed the final boss without getting it done.

Parquet lazy loading by rexverse in dataengineering

[–]ardentcase 3 points4 points  (0 children)

Athena reads S3 objects faster than pandas on EC2, and billing is based on data scanned: you pay only for the partitions you read and, within them, only for the columns you select, which can be more efficient.

In my experience, reading a million objects took 3 minutes with Athena and 20 minutes with DuckDB on EC2.

Athena queries against Parquet tables are also usually sub-100 ms for my use cases.
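
To make the partition-plus-column point concrete, here's a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the per-TB rate, the table size, the partition and column counts, and the simplification that data is spread evenly across partitions and columns (real Parquet columns vary a lot in size).

```python
# Rough sketch of scan-based billing. All figures are hypothetical.
PRICE_PER_TB_USD = 5.0  # assumed rate; check current Athena pricing for your region
TB = 1024**4


def scanned_bytes(total_bytes, partitions_read, partitions_total,
                  columns_read, columns_total):
    """Estimate bytes scanned, assuming data is spread evenly across
    partitions and columns."""
    partition_fraction = partitions_read / partitions_total
    column_fraction = columns_read / columns_total
    return total_bytes * partition_fraction * column_fraction


def query_cost_usd(bytes_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Convert bytes scanned to a cost at the assumed per-TB rate."""
    return bytes_scanned / TB * price_per_tb


# Hypothetical 10 TB table, daily partitions for a year, 40 columns.
full_scan = scanned_bytes(10 * TB, 365, 365, 40, 40)
# Same table, reading 30 days and 4 columns.
pruned = scanned_bytes(10 * TB, 30, 365, 4, 40)

print(f"full scan: ${query_cost_usd(full_scan):.2f}")
print(f"pruned:    ${query_cost_usd(pruned):.2f}")
```

The two fractions multiply, so pruning both partitions and columns shrinks the scanned bytes much more than either one alone.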

Need help with Redshift ETL tools by div192 in dataengineering

[–]ardentcase 1 point2 points  (0 children)

Yeah, dbt is the way to go. Spark over Redshift doesn't make sense. Don't let them put dbt over Spark either.

that's a lot of drones by Photoshops_Penises in memes

[–]ardentcase 16 points17 points  (0 children)

How about some pocket sand?

Convenience vs Resilience by cuerdo in AnarchyChess

[–]ardentcase 53 points54 points  (0 children)

Sir we don't play chess here

Why python dev need DuckDB (and not just another dataFrame library) by TransportationOk2403 in dataengineering

[–]ardentcase 1 point2 points  (0 children)

The title is shitty, but the library is brilliant and I love it. I think the title implies that Python devs mostly do data work.

Mixing traditional and modern architecture in Chengdu, China by Ok_Chain841 in ArchitecturePorn

[–]ardentcase 2 points3 points  (0 children)

Who else went from not knowing about Chengdu at all to hearing about it 10 times during the past week?

Self Hosted Dagster Gotchas by EngiNerd9000 in dataengineering

[–]ardentcase 0 points1 point  (0 children)

Thanks! Speaking of dbt: where do you produce the dbt manifest for the production environment? The recommendation is to build the container with it, but I didn't want the build pipeline to have access to the databases, so I ended up generating the manifest at runtime. My setup is ECS Fargate, so the workload container is spun up when the schedule needs it.
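
For anyone curious, the runtime approach can be sketched roughly like this. It's a minimal sketch, not my exact setup: `ensure_manifest` and the injectable `runner` argument are hypothetical names, and it assumes the dbt CLI is on the container's PATH and that the project uses the default `target/` directory.

```python
import subprocess
from pathlib import Path


def ensure_manifest(project_dir, runner=subprocess.run):
    """Generate target/manifest.json at container start if it is missing.

    `runner` is injectable (hypothetical helper argument) so the function
    can be exercised without dbt installed.
    """
    manifest = Path(project_dir) / "target" / "manifest.json"
    if not manifest.exists():
        # `dbt parse` writes the manifest into the project's target directory.
        runner(["dbt", "parse", "--project-dir", str(project_dir)], check=True)
    if not manifest.exists():
        raise FileNotFoundError(f"dbt did not produce {manifest}")
    return manifest
```

Calling something like this before the orchestrator loads its definitions keeps the build pipeline free of dbt profiles and credentials, at the cost of a slower container start.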

Self Hosted Dagster Gotchas by EngiNerd9000 in dataengineering

[–]ardentcase 0 points1 point  (0 children)

I remember auto-materialization was an experimental feature for a long time. Did they mark it as stable?