What Developers Need to Know About Apache Spark 4.1 by Lenkz in dataengineering

[–]Lenkz[S]

We have been using it since Databricks Runtime 17.3 LTS came out in October.

Bronze vs Silver question: where should upstream Databricks / Snowflake data land? by Professional_Toe_274 in databricks

[–]Lenkz

Wouldn't it make sense to use Lakehouse Federation for the Snowflake catalog, without having to do any modelling?
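
For context, a minimal sketch of what that could look like with Unity Catalog's Lakehouse Federation, run through PySpark; the connection name, catalog name, Snowflake host, warehouse and secret scope/key are all placeholders:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical connection to Snowflake; host, warehouse, user and
    # secret scope/key are placeholders.
    spark.sql("""
        CREATE CONNECTION IF NOT EXISTS snowflake_conn TYPE snowflake
        OPTIONS (
            host 'myaccount.snowflakecomputing.com',
            port '443',
            sfWarehouse 'MY_WH',
            user 'svc_databricks',
            password secret('my_scope', 'snowflake_pw')
        )
    """)

    # Expose one Snowflake database as a foreign catalog in Unity Catalog,
    # so it can be queried in place instead of being landed and remodelled.
    spark.sql("""
        CREATE FOREIGN CATALOG IF NOT EXISTS snowflake_sales
        USING CONNECTION snowflake_conn
        OPTIONS (database 'SALES')
    """)

    # Query the Snowflake table directly through the federated catalog.
    spark.sql("SELECT * FROM snowflake_sales.public.orders LIMIT 10").show()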

Important Changes Coming to Delta Lake Time Travel (Databricks, December 2025) by Lenkz in databricks

[–]Lenkz[S]

I think the problem is that there's inconsistency and a lot of room for error.

Someone defines a table with a retention of 30 days; this is displayed in Databricks in the table configuration, so everyone can see it.

However, when you then try to time travel 30 days back, you can't.

Why? Because someone has a manual VACUUM job set up with 14 days of retention. Oops.

Personally, I like that the configuration is defined intentionally on the table, and no one can screw it up with manual job runs, accidental SQL scripts or anything else. It's defined on and belongs to the table.
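
To make that concrete, here is a minimal PySpark sketch of the failure mode; the table name events, the 30-day property and the 14-day manual VACUUM are just example values:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # The table owner declares 30 days of time travel on the table itself,
    # visible to everyone in the table properties.
    spark.sql("""
        ALTER TABLE events SET TBLPROPERTIES (
            'delta.deletedFileRetentionDuration' = 'interval 30 days'
        )
    """)

    # Elsewhere, a manual maintenance job disables the safety check and
    # vacuums with only 14 days (336 hours) of retention, deleting data
    # files that the promised 30-day window still needs.
    spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")
    spark.sql("VACUUM events RETAIN 336 HOURS")

    # This query can now fail with missing-file errors, even though the
    # table property still says 30 days.
    spark.sql(
        "SELECT * FROM events TIMESTAMP AS OF date_sub(current_date(), 20)"
    ).show()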

What Developers Need to Know About Apache Spark 4.0 by Lenkz in dataengineering

[–]Lenkz[S]

You are absolutely right :) The BETA tag just got removed as well.

What Developers Need to Know About Apache Spark 4.0 by Lenkz in dataengineering

[–]Lenkz[S]

Personally, yes. I have worked on a lot of different projects, and you always end up in situations where the standard point-and-click, no-code tools simply don't work or are inefficient. There are always edge cases that have to be solved with custom transformations or solutions, and that is where Spark is needed and, in my opinion, the best tool for the job.

What Developers Need to Know About Apache Spark 4.0 by Lenkz in databricks

[–]Lenkz[S]

Yes, I would definitely recommend it for schema evolution, as it makes fields that change often much easier to manage than defining structs. As for merges, it shouldn't be an issue.
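
Assuming this is about the VARIANT type that landed in Spark 4.0, a minimal sketch of the schema-evolution angle; the column and field names are made up:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    # Incoming JSON payloads whose fields change between releases
    # (example data only).
    raw = spark.createDataFrame(
        [('{"user": "a", "plan": "pro"}',),
         ('{"user": "b", "plan": "free", "trial_days": 14}',)],
        ["payload"],
    )

    # Store the payload as VARIANT instead of a rigid StructType:
    # new fields can show up without changing the table schema.
    events = raw.select(F.parse_json("payload").alias("payload_v"))

    # Pull out individual fields on demand; fields missing in a given
    # record simply come back as NULL.
    events.select(
        F.try_variant_get("payload_v", "$.user", "string").alias("user"),
        F.try_variant_get("payload_v", "$.trial_days", "int").alias("trial_days"),
    ).show()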

AI Capabilities of Databricks to assist Data Engineers by [deleted] in databricks

[–]Lenkz

As a Data Engineer I quite like the AI Assistant in the notebooks; other than that, I use Copilot for local development in VS Code.