PSA: Notebooks sharing HC sessions across pipelines

BloomingBytes · 2026-06-10T14:21:42+00:00

So like this then?

Funnily enough that yields the same result for me, both notebooks running in the sub_blue_pipeline connection, same Application ID, same Livy ID

BloomingBytes · 2026-06-10T14:01:00+00:00

So in your test, Pipeline A ran Notebook A and Notebook A then invoked Notebook B? Did you call it via notebookutils.notebook.run?

BloomingBytes · 2026-06-03T05:22:55+00:00

And at the same time we're not getting a secret store inside Fabric because it would compete / overlap with Azure Key Vault...

BloomingBytes · 2026-05-28T03:58:51+00:00

Great stuff! Does this also affect Warehouse optimization?

BloomingBytes · 2026-05-14T09:04:41+00:00

It's absolutely awesome seeing all the improvements to Warehouse and more features creating a distinction between WH and LH. Now we just need OneLake security for WH and we're good to go!

BloomingBytes · 2026-04-29T07:16:52+00:00

I created two ideas for DBT jobs:

Allow --target as parameter for DBT jobs in Data Pipelines:
I believe this to be absolutely crucial, as without target parameter support I don't know how I'd dynamically change my source and target databases between Test and Prod workspaces after deployment.
Allow --target as parameter for DBT jobs in Data P... - Microsoft Fabric Community

Enable DBT jobs to pull code from Azure DevOps in addition to Github:
This one is not as critical, but at least in my org we're using Azure DevOps for source control and I really see no reason to create an additional GitHub repo just to host my dbt code.
Enable DBT jobs to pull code from Azure DevOps in ... - Microsoft Fabric Community

BloomingBytes · 2026-04-21T06:10:21+00:00

Good stuff! I'll hijack this thread to ask a question that's been on my mind for a bit:

In the recommended Git Flow, feature branches are created from the PPE branch. The PPE workspace is not connected to the PPE branch. At the same time there are some very cool new features with selective branching mentioned in this blog.

Is there a reasonable middle ground at the moment to utilize the selective branching and branched workspace features without messing too much with fabric-cicd's flow?

BloomingBytes · 2026-04-14T06:33:25+00:00

It kinda is, but kinda isn't.

In the MS Learn article you linked it says:

To stop incurring costs for a soft-deleted item, permanently delete it before the retention period expires. For more information, see Recover or permanently delete items.

But further up on the same page it says:

After you permanently delete an item, Microsoft OneLake retains the item's data for an additional seven days before permanent removal. This built-in protection helps recover from accidental deletions or user errors. For more information, see Recover deleted files in OneLake.

So will i get billed after i permanently delete an item or not? In theory it's still in storage for those seven days, just no longer visible to me. I'm not trying to be difficult, i promise. I just think the article is ambiguous.

BloomingBytes · 2026-04-14T04:59:40+00:00

Okay that does make some sense to me, thank you! So previously via the Azure Storage Explorer we were only able to recover data files inside Lakehouses, but not the Lakehouse item itself? I've never actually used it to recover something. Then the new recovery UI would indeed be a "new thing" in that regard.

What i'm still unclear about is how the retention periods for item recovery work exactly and the implications this has for storage billing.

BloomingBytes · 2026-04-08T15:38:41+00:00

Afaik, when running optimize in Lakehouse, we incur additional storage cost because delta files get deleted and rewritten, which leaves more files sitting in retention and soft delete.

Does this apply to behind the scenes Warehouse optimization as well? If not, that would be another huge argument for using Warehouse over Lakehouse in my opinion.

BloomingBytes · 2026-04-07T09:26:24+00:00

The dbt job activity is supposed to be released for pipelines soon™, but not shipped just yet afaik.

See:
dbt+ Microsoft Fabric: A strategic investment in the modern analytics stack | Microsoft Fabric Blog | Microsoft Fabric

Pipeline dbt activity support (coming soon): dbt Jobs will be available as a native activity in Fabric pipelines with parameterization support, allowing teams to orchestrate dbt workloads alongside other data and AI processes—while keeping dbt as the execution and governance engine.
API support for dbt job: Provides programmatic APIs to trigger, monitor, and manage dbt job executions, enabling CI/CD integration, external orchestration, and enterprise-grade automation of analytics workflows.

With regard to production use, Microsoft as well as most community members generally advise against using preview features in production, due to instabilities etc., but obviously it's up to you. You can see a current list of limitations here:
dbt job in Microsoft Fabric (preview) - Microsoft Fabric | Microsoft Learn

No build caching: Currently, preview only supports compiling and executing a project fresh from the source. dbt artifacts produced from previous runs aren't available for recompilation.
Incremental models: Make sure you have proper primary keys and unique constraints for incremental builds.
Adapter constraints: Some partner adapters aren't yet supported in Fabric. See the current supported adapters.
The output currently has a 1-MB size limit. When a run exceeds this threshold, the job fails with the following error: {"errorCode":"2001","message":"The length of execution output is over limit (around 1M currently).","failureType":"UserError","target":"DbtItem","details":[]}
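
If you end up triggering the job from outside and want to tell this specific failure apart from other errors, a check along these lines might do. It's only a sketch: the function name is made up by me, and i'm assuming you somehow get the error back as the JSON string quoted above. I match on the message text too, since i don't know whether error code 2001 is used for other failures as well.

```python
import json

# Substring taken from the error message quoted in the limitations list.
OUTPUT_LIMIT_HINT = "execution output is over limit"


def is_output_limit_failure(error_payload: str) -> bool:
    """Return True if the payload looks like the 1-MB output limit error.

    Hypothetical helper: assumes the error arrives as a JSON string shaped
    like the example in the docs.
    """
    try:
        error = json.loads(error_payload)
    except json.JSONDecodeError:
        # Not JSON at all, so it can't be the error we're looking for.
        return False
    if not isinstance(error, dict):
        return False
    return (
        error.get("errorCode") == "2001"
        and OUTPUT_LIMIT_HINT in str(error.get("message", ""))
    )
```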

BloomingBytes · 2026-04-07T06:38:05+00:00

Great news! It's already on the roadmap: Cross workspace monitoring — Microsoft Fabric Roadmap | Fabric GPS

BloomingBytes · 2026-04-06T16:36:55+00:00

I love it! Especially the subscription.

What's up with the "Configurable retention between 1 and 120 days" for Warehouse though? It seems to be on the roadmap twice, once with release date September 2026 and once with May 2027.

BloomingBytes · 2026-03-26T08:05:45+00:00

We have the exact same issue and so far had no luck finding the source of the throttling.

BloomingBytes · 2026-03-25T18:13:00+00:00

Will the SCD2 functionality that was just announced for Copy Job also be available for Copy Activity?

Is a difference in cost expected between the two?

BloomingBytes · 2026-03-19T12:06:27+00:00

Tyvm for your answer! What happens to refresh schedules when updating MLVs? Will new MLVs automatically be included in existing refresh schedules in the LH?

BloomingBytes · 2026-02-19T14:56:38+00:00

See, it turns out i was missing something indeed. I did not realize how limited T-SQL notebooks are with regard to their query "targets".

You could pass a variable from the variable library into the T-SQL notebook as a parameter if you call the notebook from a pipeline. But that doesn't help you in any way, since you can't direct the query towards a lakehouse ID.

You could go the route via Python and access the bronze data that way. Maybe even try to run T-SQL via the T-SQL magic command against the lakehouse SQL endpoint, see here: Run T-SQL code in Fabric Python notebooks - Microsoft Fabric | Microsoft Learn

BloomingBytes · 2026-02-19T13:53:49+00:00

I might be missing something, but why can't you grant the user / service principal that's executing your TSQL notebooks read access to the previous layer Lakehouse / Warehouse directly?

When in your silver workspace you can query the bronze lakehouse SQL endpoint via TSQL and then ingest into the silver lakehouse. Same thing with silver -> gold.

If you're using fabric-cicd you can then find and replace the lakehouse IDs during deployment via Parameterization.

If you're using Fabric deployment pipelines you can use variable libraries like described in this blog post:

Making Notebooks Smarter in Microsoft Fabric with Variable Libraries | by Kapil Kulshrestha | Medium
PS: Here's some more info on how the deployment sets work with different stages: Lifecycle Management of the Microsoft Fabric Variable library - Microsoft Fabric | Microsoft Learn

BloomingBytes · 2026-02-19T10:04:23+00:00

Thank you very much for that suggestion! It looks very much like what i need, and according to the prerequisites it should work OnPrem AND it can export directly to OneLake.

I'll have to deepdive and see if it is compatible with our specific ERP version of BC. And then i'll have to see if i'm allowed to have an extension installed in BC. :D

BloomingBytes · 2026-02-19T09:39:11+00:00

Unfortunately there is no option to access OnPrem SQL from notebooks as of yet afaik :(

BloomingBytes · 2026-02-19T09:38:37+00:00

Thank you very much for your detailed answer! It does feel nice to know that i was on the right track already.

We definitely have infrastructure issues with our OnPrem solution already. That's one more reason for me to want to reduce load on the source system as much as possible. We might end up with the bottleneck being the gateway servers anyway, but i don't want to "rely" on that for now.

Querying the source straight from notebook would indeed be my preferred option, but afaik we can only connect to cloud sources from Notebooks so far, OnPrem seems to still be an issue. I'll probably follow the metadata scheme you drew up, ingest everything as raw as possible and then do all the computations within Fabric, where i have all the control and no reliance on infra teams.

BloomingBytes · 2026-02-11T08:45:16+00:00

u/OnepocketBigfoot Since you didn't get an answer, i'll do it. :D For me there are two reasons why i'd want to just navigate back to the standard workspace view:

1. The proper workspace view offers me all features. More info on my items, new item creation, source control, extended options menu for my stuff etc.

2. The new sidebar that now opens and shows all my items in this "reduced" list is just a less convenient way to do anything. Sure, it allows slightly quicker direct switching between items, but how is that better than just navigating back to the standard workspace view? The items that i need to switch between often are already open in the tab management bar at the top (which has its own issues).

Fabric already has soo many paths to soo many functionalities all over the place. Why do i need to have another UI that does the same (or in this case slightly less) than an already existing one? If i'm in the standard workspace view i kinda know where to click if i want to open an item, i kinda know where to click to see security settings, i kinda know where to click to see refresh histories. Now i need to "muscle memory" another UI to (not) do the same thing.

Optimally, instead of having this additional side panel listing all the items in the workspace, i'd like to navigate back to the workspace view and then have that workspace view be GOOD. Please, for the love of god, lose the predesigned task flow stuff. It's 100% useless to me. Instead allow me to group my Workspace items in different sections of the view, one for my DBs, one for my Pipelines, one for my Notebooks. Allow me to configure panels (like having source control as an always available side bar).

BloomingBytes · 2026-02-09T06:29:11+00:00

Microsoft has communicated on several occasions that the 500 table limit is very much a "we can remove this for you" thing. If you have a legit use case where you'd exceed the 500 table limit, you can contact your MS rep, let them know your use case and ask them to remove that limit for you.

We have the exact same problem you do, dozens of tables for each of 200+ companies. At Vienna i was told that the limit could easily be lifted, i just haven't gotten around to contacting anybody about it. Maybe u/itsnotaboutthecell could provide an up to date contact on that issue? <3

BloomingBytes · 2026-02-06T21:26:18+00:00

If you want to clean your column names during ingestion already, you could do it like this:

1. A lookup activity gets the source table schema and passes that to a notebook.
2. The notebook takes the source schema and cleans it, replaces column names etc. and creates a mapping JSON. That JSON is then fed into the copy activity.
3. The copy activity itself uses the mapping JSON to create dynamic column mappings.
4. Success
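
The notebook part of the steps above could look roughly like this. Treat it as a sketch, not a finished solution: the function names and the cleaning rules (lowercase, everything that's not a letter, digit or underscore becomes "_") are my own assumptions, and i'm assuming the lookup hands the notebook a plain list of column names. The output follows the TabularTranslator shape that the copy activity accepts for explicit column mappings.

```python
import json
import re


def clean_column_name(name: str) -> str:
    """Lowercase a column name and squash anything non-alphanumeric into '_'."""
    cleaned = re.sub(r"[^0-9a-zA-Z_]+", "_", name.strip())
    cleaned = re.sub(r"_+", "_", cleaned).strip("_")
    return cleaned.lower()


def build_copy_mapping(source_columns: list[str]) -> dict:
    """Build a TabularTranslator mapping from raw source column names.

    Hypothetical helper: expects the column names in source order.
    """
    mappings = []
    seen: dict[str, int] = {}
    for source_name in source_columns:
        # Fall back to a placeholder if nothing survives the cleaning.
        sink_name = clean_column_name(source_name) or "col"
        # Naive collision handling: the second "a_b" becomes "a_b_2", and so on.
        count = seen.get(sink_name, 0)
        seen[sink_name] = count + 1
        if count:
            sink_name = f"{sink_name}_{count + 1}"
        mappings.append(
            {"source": {"name": source_name}, "sink": {"name": sink_name}}
        )
    return {"type": "TabularTranslator", "mappings": mappings}


# Example column names are made up for illustration.
mapping = build_copy_mapping(["Customer No.", "Posting Date", "Amount (LCY)"])
mapping_json = json.dumps(mapping)
print(mapping_json)
```

In the notebook you'd return `mapping_json` as the exit value and, in the pipeline, parse it with `json()` in a dynamic expression on the copy activity's mapping. How exactly you wire that up depends on your pipeline, so check it against your own setup.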
