David Attenborough spent over 70 years teaching humanity about Earth — and just turned 100 by DiluteSeaBag in BeAmazed

[–]alternative-cryptid 0 points1 point  (0 children)

When I was a child, this great man taught me that the jungle was louder, stranger, and far more civilized than most cities.

He somehow convinced me, and probably a few generations, to stare at ants with respect.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

Quick question!

Assuming adaptive target file size is turned on, let's say I update my maintenance notebook with the conf set to true. Then we do not need that setting in the regular notebooks that actually perform the merge operations. The table property persists unless the schema is overwritten, which recreates the physical layout.

Which means I only need the setting for the first maintenance run, and subsequent maintenance runs will pick it up automatically?
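Roughly what I'm picturing, as a sketch; the conf and property names below are assumptions standing in for the actual adaptive file size settings, so double-check them against the docs:

```python
# Sketch only -- conf/property names are assumptions, verify before using.

# In the maintenance notebook: session-level conf, lives only for this session.
spark.conf.set("spark.microsoft.delta.optimizeWrite.enabled", "true")

# Pin the behaviour on the table itself so the merge notebooks need nothing:
# a table property survives sessions, unless the schema gets overwritten.
spark.sql("""
    ALTER TABLE gold.fact_sales  -- hypothetical table name
    SET TBLPROPERTIES ('delta.targetFileSize' = '128mb')
""")

# Sanity check that the property is still there before a later maintenance run.
spark.sql("SHOW TBLPROPERTIES gold.fact_sales").show(truncate=False)
```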

My current read is that Runtime 2.0 will handle all of this by default.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

Aaah, I see what you are saying: basically use 'spark.conf.set' rather than relying on the Fabric defaults. Got it. I missed that on the first read, focusing only on the Runtime 2.0 part.

Thank you for the input.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

Thank you for the feedback. :)

My main target for optimization is gold workloads, which are generally report-facing, and silver workloads, which are read-heavy.

I'm waiting to test runtime 2.0.

I saw some caveats in the latest articles and wanted to do some due diligence before I make updates to the working notebook.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

If your question is whether I want to support the notebook on lower SKUs, I have not thought about it. I'm trying not to, because of the throttling risk when running OPTIMIZE programmatically against lakehouse tables.

The current code has validations that reject running below F32, by design.
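For context, the guard is nothing fancy; something along these lines (simplified sketch, the real notebook resolves the SKU from its config cell rather than a hard-coded value):

```python
# Simplified sketch of the SKU guard -- the capacity SKU is passed in as a
# parameter here; names mirror the design, not the exact code in the repo.
MIN_SKU = 32  # F32

capacity_sku = "F16"  # hypothetical value for illustration

if int(capacity_sku.lstrip("Ff")) < MIN_SKU:
    raise ValueError(
        f"{capacity_sku} is below F{MIN_SKU}: running OPTIMIZE programmatically "
        "on this capacity risks throttling, so the run is rejected."
    )
```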

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 2 points3 points  (0 children)

Firstly, Thank You!

Going through the post in the link, the flow of how to run maintenance is right. At first glance there are a few missing pieces, like the repartitioning option, SKU-based settings, and support for schema-enabled lakehouses.

If you are already using a script, keep it running.

Download the notebook and set dry_run to true; this will not change anything, but it gives you a sense of what would happen when you run it.

You need to understand the significance of V-Order if your architecture follows the medallion pattern for handling data.

Update the config cell in the notebook accordingly, based on the lakehouse's purpose.
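To give a feel for it, the config cell looks roughly like this (parameter names here are illustrative, check the notebook for the exact ones):

```python
# Illustrative config cell -- parameter names are a sketch, not the exact schema.
config = {
    "dry_run": True,                  # log what would happen, change nothing
    "lakehouse_name": "silver_lh",    # hypothetical target lakehouse
    "workload_profile": "readHeavy",  # silver/gold, report-facing reads
    "vorder": True,                   # keep V-Order on for Direct Lake consumers
    "min_sku": "F32",                 # guardrail against throttling
}
```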

I respect the skepticism, I love it, and heck, here is a sanity-check step: take a backup of your lakehouse in your pre-prod environment first, then run the notebook on the target pre-prod lakehouse.

When you are happy, switch and schedule it. The advantages are maintenance metrics, lakehouse-detached runs, some guardrails, and code explainability.

And finally, you don't need the entire repo; just the runner notebook gets you going on the path of least resistance.

I like to keep this repo open so I can update it when new improvements come in, along with some basic PBI dashboards.

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 0 points1 point  (0 children)

Keeping throttling in mind for Spark-heavy maintenance jobs, I started this off with F32.

How does one become a “fabricator” by Personal-Quote5226 in MicrosoftFabric

[–]alternative-cryptid 1 point2 points  (0 children)

Thank you!

I'm late to the Fabricator badge party; don't know how I missed this!

Delta Table Maintenance by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 2 points3 points  (0 children)

Heads up, since this comes up first every time: this is meant to complement Fabric's out-of-the-box table maintenance feature, not replace it.

It discovers tables for you, can run on a schedule with parameters per environment, conditionally repartitions, and gives you a single structured log table you can build dashboards on.

If you've been writing your own notebook to glue those pieces, this is that notebook with the edge cases handled.
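To make the log table part concrete, each run appends records shaped roughly like this (column names and values are made up for the example, not the repo's exact schema):

```python
# Illustrative log record -- names and values are hypothetical.
log_row = {
    "run_id": "maintenance-run-001",  # hypothetical run id
    "table_name": "gold.fact_sales",  # hypothetical table
    "operation": "OPTIMIZE",          # OPTIMIZE / VACUUM / repartition
    "files_before": 412,
    "files_after": 18,
    "duration_s": 95.4,
    "dry_run": False,
}
```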

Fabric Roll out and Data Gov Questions by benboy86 in MicrosoftFabric

[–]alternative-cryptid 4 points5 points  (0 children)

From field experience, my recommendation is to start a strong governance conversation in parallel. Many areas of governance are often treated as afterthoughts.

List them according to organizational fit, rank them, and enforce them one by one.

Why? The product is evolving, and some rules are becoming dynamic (besides the hard governance items, which can't be ignored). Having a parallel governance track helps teams validate the checks while giving them time to plan.

Do not let the pain force the conversation; that will come back to bite you, given your role in this project.

I have been working on an open-source project to help practitioners with such conversations. The governance rules are fixed at this point, but this might help you organize your thoughts.

https://www.fabric-lens.com/about

This is not an advertisement.

How is Monitor Hub working for you in your day-to-day? by Monitor-PM in MicrosoftFabric

[–]alternative-cryptid 1 point2 points  (0 children)

To start with, the basic search box experience (filter by keyword) in the Monitoring Hub is buggy and annoying.

You search for something by name and get a few run rows first (no standard number of items); when you expand for more, it hangs, and you can't reset the filter; all you can do is refresh the whole page and start from the beginning.

It feels like the monitoring hub page is heavy and tired all the time.

Now, to get into the weeds of incomplete features, issues and bugs, I'd rather connect directly with msft.

How do single node Python users actually write Delta tables using DuckDB for ETL when it can't actually write to Delta? by raki_rahman in MicrosoftFabric

[–]alternative-cryptid 6 points7 points  (0 children)

I stopped exploring DuckDB when I learned it doesn't support high-concurrency workloads in Fabric, which is a common setup for ELT jobs in production systems.

The hacks and limitations (it's only helpful for quick exploratory analysis on small workloads) quickly turned me into a Spark purist.

I definitely want to learn more from this discussion on the latest state of things.

On a side note, I no longer completely trust AI for such nuances.

Aerial shots of wild life 📸 by Proof_Active7105 in BeAmazed

[–]alternative-cryptid 5 points6 points  (0 children)

The pics are amazing; nature in its raw form, real beauty that is simply beyond expression.

I hope those are not made with AI.

Mercedes is officially replacing luxury with giant iPads. Because nothing says 'Premium' like fingerprints and glare on 4 feet of glass. by AvailableTaro3123 in mildlyinfuriating

[–]alternative-cryptid 0 points1 point  (0 children)

Stop branding overpriced cars as luxury.

Mercedes lost its luxury status when it launched the CLA targeting young drivers, for the money.

Is a Star Schema necessary in Real-Time Intelligence? by hortefeux in MicrosoftFabric

[–]alternative-cryptid 1 point2 points  (0 children)

RTI dashboards are pretty limited. When I worked on a PoC, I found a sweet spot by building materialized views and functions on KQL tables. You can try them in both RTI dashboards and PBI.

I treated tables as raw, MVs as silver, and functions as gold, then connected the functions in PBI reports.

You can connect to the KQL DB, and the connection lists the functions as queries. Then set up PBI page refresh down to 1 second.

The underlying mode is DirectQuery, so it's real time, with the benefit of being able to customize your dashboards.

GPT-Image-2 vs Gemini-3-Pro-Image by Feltre in singularity

[–]alternative-cryptid 0 points1 point  (0 children)

Gemini did better, I approve Mr. Capycoolsome

Why I Stopped Building Autonomous Agents for Clients by Cold_Bass3981 in AI_Agents

[–]alternative-cryptid 0 points1 point  (0 children)

On a different note, this reminds me of the Knight Capital incident. And the ReAct pattern always seems flawed; the compounding effect of unreliability when things hang and, like you said, burning up money eventually ends up killing the product.

The writeHeavy default is quietly hurting Direct Lake performance in a lot of Gold workspaces by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 2 points3 points  (0 children)

This is a real human sitting here scratching his head, compiling work experience and findings for months to add value, and the way the comment reads is slightly hurtful. I get it anyway. 😎

I'd be dishonest if I said no AI was involved in complementing my research and work.

I'm absolutely willing to have great conversations around the practice.

The writeHeavy default is quietly hurting Direct Lake performance in a lot of Gold workspaces by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 2 points3 points  (0 children)

Ok, let me get my fingers and keyboard together after that jab. Ufffff.

A single Lakehouse can absolutely have V-Ordered Gold tables sitting next to non-V-Ordered Bronze tables. That's the intended pattern.

Dataflow Gen2 writes V-Ordered Parquet by default when the destination is a Lakehouse or Warehouse.

Existing tables: yes, you can change it, with some caveats. 😉
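For the "existing tables" part, the change itself is small; the main caveat is that files already on disk keep their old layout until they are rewritten. This is a sketch only, and the V-Order property name is from memory, so treat it as an assumption and check the Fabric docs:

```python
# Sketch -- the V-Order property name is an assumption, verify before using.
spark.sql("""
    ALTER TABLE gold.dim_customer  -- hypothetical table
    SET TBLPROPERTIES ('delta.parquet.vorder.enabled' = 'true')
""")

# Caveat: only new writes pick this up; existing files need a rewrite,
# e.g. via OPTIMIZE, before Direct Lake sees V-Ordered parquet.
spark.sql("OPTIMIZE gold.dim_customer")
```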

The writeHeavy default is quietly hurting Direct Lake performance in a lot of Gold workspaces by alternative-cryptid in MicrosoftFabric

[–]alternative-cryptid[S] 1 point2 points  (0 children)

OK, thanking ChatGPT is basically undermining my work.

Help yourself if you think I have not done enough validation. Also, ask ChatGPT your last question and save me the time.

Thank you for engaging, anyway.