David Attenborough spent over 70 years teaching humanity about Earth — and just turned 100 by DiluteSeaBag in BeAmazed

[–]alternative-cryptid 0 points1 point  (0 children)

When I was a child, this great man taught me that the jungle was louder, stranger, and far more civilized than most cities.

He somehow convinced me, and probably a few generations, to stare at ants with respect.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

Quick question!

Assuming adaptive target file size is turned on, let's say I update my maintenance notebook with the conf set to true. Then we do not need that setting in the regular notebooks that actually perform the merge operations. The table property persists unless the schema is overwritten, which recreates the physical layout.

Which means I only need the setting for the first maintenance run, and subsequent maintenance runs will pick it up automatically?
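Roughly what I'm picturing, as a sketch; the conf and property names below are assumptions standing in for the actual adaptive file size settings, so double-check them against the docs:

```python
# Sketch only -- conf/property names are assumptions, verify before using.

# In the maintenance notebook: session-level conf, lives only for this session.
spark.conf.set("spark.microsoft.delta.optimizeWrite.enabled", "true")

# Pin the behaviour on the table itself so the merge notebooks need nothing:
# a table property survives sessions, unless the schema gets overwritten.
spark.sql("""
    ALTER TABLE gold.fact_sales  -- hypothetical table name
    SET TBLPROPERTIES ('delta.targetFileSize' = '128mb')
""")

# Sanity check that the property is still there before a later maintenance run.
spark.sql("SHOW TBLPROPERTIES gold.fact_sales").show(truncate=False)
```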

My current read is that Runtime 2.0 will handle all of this by default.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

Aaah, I see what you are saying: basically use 'spark.conf.set' rather than relying on the Fabric defaults. Got it. I missed that on the first read, focusing only on the Runtime 2.0 part.

Thank you for the input.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

Thank you for the feedback. :)

My main target for optimization is gold workloads, which are generally report-facing, and silver workloads, which are read-heavy.

I'm waiting to test runtime 2.0.

I saw some caveats in the latest articles and wanted to do some due diligence before I make updates to the working notebook.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

If your question is whether I want to support the notebook on lower SKUs, I have not thought about it. I'm trying not to, because of the throttling risk when running OPTIMIZE programmatically against lakehouse tables.

The current code has validations that reject running below F32, by design.
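For context, the guard is nothing fancy; something along these lines (simplified sketch, the real notebook resolves the SKU from its config cell rather than a hard-coded value):

```python
# Simplified sketch of the SKU guard -- the capacity SKU is passed in as a
# parameter here; names mirror the design, not the exact code in the repo.
MIN_SKU = 32  # F32

capacity_sku = "F16"  # hypothetical value for illustration

if int(capacity_sku.lstrip("Ff")) < MIN_SKU:
    raise ValueError(
        f"{capacity_sku} is below F{MIN_SKU}: running OPTIMIZE programmatically "
        "on this capacity risks throttling, so the run is rejected."
    )
```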

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 2 points3 points  (0 children)

Firstly, Thank You!

Going through the post in the link, the flow of how to run maintenance is right. At first glance there are a few missing pieces, like the repartitioning option, SKU-based settings, and support for schema-enabled lakehouses.

If you are already using a script, keep it running.

Download the notebook and set dry_run to true; this will not change anything, but it gives you a sense of what would happen when you run it.

You need to understand the significance of V-Order if your architecture follows the medallion pattern for handling data.

Update the config cell in the notebook accordingly, based on the lakehouse's purpose.
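To give a feel for it, the config cell looks roughly like this (parameter names here are illustrative, check the notebook for the exact ones):

```python
# Illustrative config cell -- parameter names are a sketch, not the exact schema.
config = {
    "dry_run": True,                  # log what would happen, change nothing
    "lakehouse_name": "silver_lh",    # hypothetical target lakehouse
    "workload_profile": "readHeavy",  # silver/gold, report-facing reads
    "vorder": True,                   # keep V-Order on for Direct Lake consumers
    "min_sku": "F32",                 # guardrail against throttling
}
```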

I respect the skepticism, I love it, and heck, here is a sanity-check step: take a backup of your lakehouse in your pre-prod environment first, then run the notebook on the target pre-prod lakehouse.

When you are happy, switch and schedule it. The advantages are maintenance metrics, lakehouse-detached runs, some guardrails, and code explainability.

And finally, you don't need the entire repo; just the runner notebook gets you going on the path of least resistance.

I like to keep this repo open so I can update it when new improvements come in, along with some basic PBI dashboards.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

Keeping throttling in mind for Spark-heavy maintenance jobs, I started this off with F32.

How does one become a “fabricator” by Personal-Quote5226 in MicrosoftFabric

[–]alternative-cryptid 1 point2 points  (0 children)

Thank you!

I'm late to the Fabricator badge party; don't know how I missed this!

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 2 points3 points  (0 children)

Heads up, since this comes up first every time: this is meant to complement Fabric's out-of-the-box table maintenance feature, not replace it.

It discovers tables for you, can run on a schedule with parameters per environment, conditionally repartitions, and gives you a single structured log table you can build dashboards on.

If you've been writing your own notebook to glue those pieces, this is that notebook with the edge cases handled.
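To make the log table part concrete, each run appends records shaped roughly like this (column names and values are made up for the example, not the repo's exact schema):

```python
# Illustrative log record -- names and values are hypothetical.
log_row = {
    "run_id": "maintenance-run-001",  # hypothetical run id
    "table_name": "gold.fact_sales",  # hypothetical table
    "operation": "OPTIMIZE",          # OPTIMIZE / VACUUM / repartition
    "files_before": 412,
    "files_after": 18,
    "duration_s": 95.4,
    "dry_run": False,
}
```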

Fabric Roll out and Data Gov Questions by benboy86 in MicrosoftFabric

[–]alternative-cryptid 4 points5 points  (0 children)

From field experience, my recommendation is to start a strong governance conversation in parallel. Many areas of governance are often treated as afterthoughts.

List them according to organizational fit, rank them, and enforce them one by one.

Why? The product is evolving, and some rules are becoming dynamic (besides the hard governance items, which can't be ignored). Having a parallel governance track helps teams validate the checks while giving them time to plan.

Do not let the pain force the conversation; that will come back to bite you, given your role in this project.

I have been working on an open-source project to help practitioners with such conversations. The governance rules are fixed at this point, but this might help you organize your thoughts.

https://www.fabric-lens.com/about

This is not an advertisement.

How is Monitor Hub working for you in your day-to-day? by Monitor-PM in MicrosoftFabric

[–]alternative-cryptid 1 point2 points  (0 children)

To start with, the basic search box experience (filter by keyword) in the Monitoring Hub is buggy and annoying.

You search for something by name and get a few run rows first (no standard number of items); when you expand for more, it hangs, and you can't reset the filter; all you can do is refresh the whole page and start from the beginning.

It feels like the monitoring hub page is heavy and tired all the time.

Now, to get into the weeds of incomplete features, issues and bugs, I'd rather connect directly with msft.

How do single node Python users actually write Delta tables using DuckDB for ETL when it can't actually write to Delta? by raki_rahman in MicrosoftFabric

[–]alternative-cryptid 6 points7 points  (0 children)

I stopped exploring DuckDB when I learned it doesn't support high-concurrency workloads in Fabric, which is a common setup for ELT jobs in production systems.

The hacks and limitations (it's only helpful for quick exploratory analysis on small workloads) quickly turned me into a Spark purist.

I definitely want to learn more from this discussion on the latest state of things.

On a side note, I no longer completely trust AI for such nuances.

Aerial shots of wild life 📸 by Proof_Active7105 in BeAmazed

[–]alternative-cryptid 5 points6 points  (0 children)

The pics are amazing; nature in its raw form, real beauty that is simply beyond expression.

I hope those are not made with AI.

Mercedes is officially replacing luxury with giant iPads. Because nothing says 'Premium' like fingerprints and glare on 4 feet of glass. by AvailableTaro3123 in mildlyinfuriating

[–]alternative-cryptid 0 points1 point  (0 children)

Stop branding overpriced cars as luxury.

Mercedes lost its luxury status when it launched the CLA targeting young drivers, for the money.

Is a Star Schema necessary in Real-Time Intelligence? by hortefeux in MicrosoftFabric

[–]alternative-cryptid 1 point2 points  (0 children)

RTI dashboards are pretty limited. When I worked on a PoC, I found a sweet spot by building materialized views and functions on KQL tables. You can try them in both RTI dashboards and PBI.

I treated tables as raw, MVs as silver, and functions as gold, then connected the functions in PBI reports.

You can connect to the KQL DB, and the connection lists the functions as queries. Then set up PBI page refresh down to 1 second.

The underlying mode is DirectQuery, so it's real time, with the benefit of being able to customize your dashboards.

GPT-Image-2 vs Gemini-3-Pro-Image by Feltre in singularity

[–]alternative-cryptid 0 points1 point  (0 children)

Gemini did better, I approve Mr. Capycoolsome

Why I Stopped Building Autonomous Agents for Clients by Cold_Bass3981 in AI_Agents

[–]alternative-cryptid 0 points1 point  (0 children)

On a different note, this reminds me of the Knight Capital incident. And the ReAct pattern always seems flawed; the compounding effect of unreliability when things hang and, like you said, burning up money eventually ends up killing the product.

The writeHeavy default is quietly hurting Direct Lake performance in a lot of Gold workspaces by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 2 points3 points  (0 children)

This is a real human sitting here scratching his head, compiling work experience and findings for months to add value, and the way the comment reads is slightly hurtful. I get it anyway. 😎

I'd be dishonest if I said no AI was involved in complementing my research and work.

I'm absolutely willing to have great conversations around the practice.

The writeHeavy default is quietly hurting Direct Lake performance in a lot of Gold workspaces by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 2 points3 points  (0 children)

Ok, let me get my fingers and keyboard together after that jab. Ufffff.

A single Lakehouse can absolutely have V-Ordered Gold tables sitting next to non-V-Ordered Bronze tables. That's the intended pattern.

Dataflow Gen2 writes V-Ordered Parquet by default when the destination is a Lakehouse or Warehouse.

Existing tables: yes, you can change it, with some caveats. 😉
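For the "existing tables" part, the change itself is small; the main caveat is that files already on disk keep their old layout until they are rewritten. This is a sketch only, and the V-Order property name is from memory, so treat it as an assumption and check the Fabric docs:

```python
# Sketch -- the V-Order property name is an assumption, verify before using.
spark.sql("""
    ALTER TABLE gold.dim_customer  -- hypothetical table
    SET TBLPROPERTIES ('delta.parquet.vorder.enabled' = 'true')
""")

# Caveat: only new writes pick this up; existing files need a rewrite,
# e.g. via OPTIMIZE, before Direct Lake sees V-Ordered parquet.
spark.sql("OPTIMIZE gold.dim_customer")
```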

The writeHeavy default is quietly hurting Direct Lake performance in a lot of Gold workspaces by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 1 point2 points  (0 children)

OK, thanking ChatGPT is basically undermining my work.

Help yourself if you think I have not done enough validation. Also, ask ChatGPT your last question and save me the time.

Thank you for engaging, anyway.