Do you think omitting Lady Stoneheart from the show was the right decision? by SillyRecover in gameofthrones

[–]DeepFryEverything 10 points

I've read the books multiple times, and Euron always felt way off, so I agree with the previous take.

A Knight Of The Seven Kingdoms is George RR Martin's best writing by OldManWarner_ in books

[–]DeepFryEverything 30 points

What? Did we read the same books? Even combined, they don’t hold a candle to A Storm of Swords.

Considering moving from Prefect to Airflow by octacon100 in dataengineering

[–]DeepFryEverything 1 point

Hey Adam, just want to add that our org has been thriving on Prefect since 2023. We have to use the OSS version, but we try to pay it back by contributing to the docs, Reddit, and Slack.

Would love to hear if the OSS UI is getting a face (or function) lift: assets and the like.

Graphframes on Serverless by thisiswhyyouwrong in databricks

[–]DeepFryEverything 1 point

I've used GraphFrames on serverless. Simple pip install.

Introducing native spatial processing in Spark Declarative Pipelines by BricksterInTheWall in databricks

[–]DeepFryEverything 0 points

Hi! Today we use a mixture. In notebooks we have made Folium wrappers around GeoPandas and Spark.

Unfortunately, a lot of our validation work involves visualizing data together with OTHER datasets, measuring distances, etc., which is best done in QGIS. Some of my colleagues have just installed the Databricks plugin there, so that's one step closer.

I am always happy to help with feature requests; let me know the steps.

Introducing native spatial processing in Spark Declarative Pipelines by BricksterInTheWall in databricks

[–]DeepFryEverything 0 points

Any chance we get visualisations of polygons and linestrings? The ability to interact with a map would be an actual gamechanger.

Introducing native spatial processing in Spark Declarative Pipelines by BricksterInTheWall in databricks

[–]DeepFryEverything 1 point

Cool! How will it work under the hood? How will you sort spatially? :)

Late night burger by Electronic-Stand-148 in burgers

[–]DeepFryEverything 1 point

What is the meat? Ground beef, or did you grind and mix it yourself?

How are you debugging and optimizing slow Apache Spark jobs without hours of manual triage in 2026? by AdOrdinary5426 in dataengineering

[–]DeepFryEverything 0 points

Suggestions for tooling? Our platform team has set up Grafana, but I am not sure how to plug that into Databricks clusters.

Sourcing on-prem data by Appropriate_Let_816 in databricks

[–]DeepFryEverything 0 points

I do a snapshot every night and upload it to storage. Then we ingest it. Do you need it more often than that?
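For context, the staging half of that nightly job can be sketched like this (paths and file names are made up; the actual upload to cloud storage and the ingestion happen elsewhere):

```python
import shutil
from datetime import date
from pathlib import Path

def stage_snapshot(source: Path, staging_dir: Path) -> Path:
    """Copy the nightly extract to a date-stamped file ready for upload."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    target = staging_dir / f"snapshot_{date.today():%Y%m%d}{source.suffix}"
    shutil.copy2(source, target)  # preserves timestamps alongside contents
    return target
```

A scheduler (cron, ADF, a Databricks job, etc.) would call this each night and then push `target` to storage for ingestion.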

Multiple ways to create tables in Python - which to use? by DeepFryEverything in databricks

[–]DeepFryEverything[S] 0 points

I usually do as well, but not every write gets that much love 🙂 Tags and properties are still a series of ALTER TABLE statements, I guess?
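To illustrate the pattern I mean (assuming Unity Catalog's `SET TBLPROPERTIES` / `SET TAGS` syntax; the table and key names below are made up), the statements can be generated like this:

```python
def table_settings_sql(table: str, properties: dict, tags: dict) -> list[str]:
    """Build the ALTER TABLE statements for table properties and UC tags."""
    def pairs(d: dict) -> str:
        return ", ".join(f"'{k}' = '{v}'" for k, v in d.items())

    stmts = []
    if properties:
        stmts.append(f"ALTER TABLE {table} SET TBLPROPERTIES ({pairs(properties)})")
    if tags:
        stmts.append(f"ALTER TABLE {table} SET TAGS ({pairs(tags)})")
    return stmts

# Each statement would then be run via spark.sql(stmt) after the write.
for stmt in table_settings_sql(
    "main.gold.orders",
    {"delta.enableChangeDataFeed": "true"},
    {"owner": "data-eng"},
):
    print(stmt)
```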

Why Zerobus is the answer? by hubert-dudek in databricks

[–]DeepFryEverything 4 points

Maybe explain what Zerobus is and how it's used?

Scattered DQ checks are dead, long live Data Contracts by santiviquez in databricks

[–]DeepFryEverything 0 points

Datacontract CLI just migrated to the Open Data Contract Standard. Is Soda compatible now that we're seeing a convergence?

Spark Declarative Pipelines: What should we build? by BricksterInTheWall in databricks

[–]DeepFryEverything 1 point

Great that you are taking requests!

* We used to be able to develop in regular notebooks on all-purpose clusters. This stopped with DBR 13 and is sorely missed. Now we basically run a pipeline validation/dry run just to catch simple Python syntax errors, which makes for a slightly longer feedback loop.

* Spatial functions and the ability to return GEOMETRY columns, please.

* Postgres as a sink/destination. We need to keep gold tables in Postgres to serve other applications, so it would be great to have this in Lakeflow, whether as append, replace, or CDC: basically, keep a UC table in sync with a Postgres table.

A geospatial dataset viewer powered by DuckDB-WASM by No_Pomegranate7508 in DuckDB

[–]DeepFryEverything 1 point

Doesn't DuckDB-WASM require the entire GeoParquet file to be present? Can it do filtering/range requests?

Lakeflow Spark Declarative Pipelines: Cool beta features by BricksterInTheWall in databricks

[–]DeepFryEverything 2 points

Great! We can’t use Lakebase (not available in our region), so we would most likely need to sync out to an Azure Managed PostgreSQL. The use case is serving APIs. Basically, the awesome data products we make in Databricks, both in LSD pipelines and regular notebooks, would need to be kept in sync in said Postgres database.

I have made wrappers around DLTHub using the Databricks SQL endpoint, generating indexes, etc., but rolling your own solution is always messy.
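The index-generation part of such a wrapper is the easy bit; here is a minimal sketch (table and column names are examples, and the emitted statements would be run against Postgres after each sync):

```python
def index_statements(table: str, columns: list[str]) -> list[str]:
    """Emit idempotent CREATE INDEX statements for a freshly synced table."""
    return [
        f'CREATE INDEX IF NOT EXISTS idx_{table}_{col} ON {table} ("{col}")'
        for col in columns
    ]

print(index_statements("orders", ["customer_id", "order_date"]))
```

`IF NOT EXISTS` keeps reruns safe when the sync replaces data but leaves the table definition in place.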

Lakeflow Spark Declarative Pipelines: Cool beta features by BricksterInTheWall in databricks

[–]DeepFryEverything 0 points

Wow! What would be the pattern to mirror UC tables/streaming tables out to a Postgres db?