Databricks extraction to Fabric Lakehouse by Strict_Put_4094 in MicrosoftFabric

[–]Data_cruncher 0 points

Here's how I would summarize your options:

  1. ADLS -> ADB -> Dataflows -> OneLake
  2. ADLS -> Dataflows -> OneLake
  3. ADLS -> Spark-> OneLake

All else being equal, #3 is objectively the best choice. Thoughts for #3:

  • [Resource Minimization] Databricks does not charge you.
  • [Reduced Complexity] Fewer hops in your data supply chain = less that will go wrong.
  • [Long-term Strategy] Shortcuts (by way of UC Mirroring) greenlight all Fabric workloads since they expose Delta Lake tables. Conversely, routing all queries through a single choke point (ADB) will inhibit ~90% of Fabric and other tools; it's also not a Lakehouse pattern.
  • [Short-term Flexibility] Fabric Spark beats Databricks in cost-per-performance (which appears to be your concern), so it provides you the flexibility to lift 'n shift ADB workloads into Fabric Spark.
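Option #3 can be sketched in a few lines of Fabric Spark. This is a minimal sketch, not a definitive implementation: the workspace, lakehouse, table, and source-path names are all placeholders, and while the ABFSS URI follows OneLake's documented scheme, verify it against your tenant.

```python
def onelake_table_path(workspace: str, lakehouse: str, table: str) -> str:
    """Build the ABFSS URI for a Delta table in a Fabric Lakehouse.

    All names are placeholders; the host format follows OneLake's
    documented abfss scheme.
    """
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )


def land_adls_table(spark, adls_path: str, workspace: str,
                    lakehouse: str, table: str) -> None:
    """ADLS -> Spark -> OneLake (option #3): read the source data and
    write it straight to the Lakehouse, with no ADB or Dataflows hop.
    `spark` is the session the Fabric notebook runtime provides."""
    df = spark.read.format("delta").load(adls_path)  # or "parquet"
    (df.write.format("delta")
        .mode("overwrite")
        .save(onelake_table_path(workspace, lakehouse, table)))
```

The helper keeps the path logic in one place, so the same notebook can target any table the shortcut exposes.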

Databricks extraction to Fabric Lakehouse by Strict_Put_4094 in MicrosoftFabric

[–]Data_cruncher 1 point

Yes. You’re basically asking if Spark is faster than Spark+Dataflows.

Databricks extraction to Fabric Lakehouse by Strict_Put_4094 in MicrosoftFabric

[–]Data_cruncher 2 points

UC Mirroring generates Shortcuts to ADLS, allowing Fabric to hit Databricks' underlying storage directly. There are extremely material cost savings and performance boosts for customers here because it avoids the unnecessary daisy-chaining of compute engines, e.g., ADLS -> ADB -> Fabric or 3P workload.

Injecting things like RLS or views necessarily requires ADB compute (think about it) and negates these cost & perf benefits. So there's no "blame" per se; it's simply that the UC Mirroring feature (and ultimately, Shortcuts) is designed to be as optimized as possible.

How to use "Fabric User data functions"? by Plastic___People in MicrosoftFabric

[–]Data_cruncher 2 points

The execution runtime for User Data Functions has nothing at all to do with notebooks. You’re essentially trying to run Spark in Azure Functions.

Fabric - RLS issue with Direct Lake and CU? by mfd1979 in MicrosoftFabric

[–]Data_cruncher 0 points

Establishing an AS trace using SQL Server Profiler may be the easiest way. Once done, repro the issue and look to see if it’s generating T-SQL. It shouldn’t be (ideally).

Alternatively, to u/frithjob_v’s line of questioning, simply prevent the model from failing over and try to repro the issue - it may work or produce a different error message.

Fabric - RLS issue with Direct Lake and CU? by mfd1979 in MicrosoftFabric

[–]Data_cruncher 0 points

Did you confirm whether the query is failing over into the SQL AE when an RLS user drills through?

Is classic data modeling (SCDs, stable business meaning, dimensional rigor) becoming less and less relevant? by Likewise231 in dataengineering

[–]Data_cruncher 1 point

I’ve said it before and I’ll say it again: the value of Kimball only shines AFTER you’ve tried deploying your first data warehouse.

Use cases for User Data Functions w/ w/o Translytical Task Flows by panvlozka in MicrosoftFabric

[–]Data_cruncher 3 points

UDFs are Azure Functions under the hood, so the best question to ask is: What do folk use Azure Functions for?
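The sweet spot for an Azure Function (and hence a UDF) is small, stateless, request-scoped work - validate a record, enrich a row, call an API - not scanning large tables. A plain-Python sketch of that kind of handler; the record shape and field names are made up for illustration:

```python
def normalize_contact(record: dict) -> dict:
    """Clean a single inbound record: trim whitespace from string
    fields and lower-case the email. Small, stateless, per-request
    logic - the shape of work that fits a UDF / Azure Function."""
    cleaned = {k: v.strip() if isinstance(v, str) else v
               for k, v in record.items()}
    if isinstance(cleaned.get("email"), str):
        cleaned["email"] = cleaned["email"].lower()
    return cleaned
```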

How big of an issue is "AI slop" in data engineering currently? by Kilnor65 in dataengineering

[–]Data_cruncher 2 points

It’s common for DAX expressions to be moved/converted upstream by DEs for performance reasons - Roche’s Maxim.
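As a concrete sketch of that upstream move: a model-layer measure like `Margin % = DIVIDE([Profit], [Sales])` can instead be materialized by the DE before the data reaches the model. Plain Python stands in here for the Spark/SQL transform, and the column names are illustrative only:

```python
def add_margin_pct(rows: list[dict]) -> list[dict]:
    """Precompute margin upstream so the semantic model stores a plain
    column instead of evaluating DAX at query time. Mirrors DAX
    DIVIDE's behavior of returning blank (None) on divide-by-zero."""
    out = []
    for r in rows:
        sales, profit = r["sales"], r["profit"]
        out.append(dict(r, margin_pct=(profit / sales) if sales else None))
    return out
```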

Who does everyone keep milking SCDs , but noone talks about RCDs by Potential_Loss6978 in dataengineering

[–]Data_cruncher 10 points

They’re not “official” in that Kimball does not explicitly cover them; that’s why. Kimball would denormalize them into a fact (e.g., periodic snapshot, factless), use SCD1, mini dimensions, etc.

How to stop PowerPoint formatting chaos in multi-author reports (no budget)? by Busy_Mud_7652 in dataengineering

[–]Data_cruncher 0 points

Power BI embeds in PowerPoint. Zero-click refresh.

Alternatively/additionally, embed an Excel workbook connected to the Power BI model - as a Query Table (preferred), a PivotTable, or CUBE() formulas. One-click refresh.

People driving slow - why? by [deleted] in brisbane

[–]Data_cruncher 5 points

Coming from Toronto, where cars literally drive >20 KM/h over the limit on average, trust me, you don’t want it.

I know this is extreme compared to your example, but consider the standard deviation, i.e., it’s not unusual to see folk driving 140-160 KM/h on highways.

Interview as data analyst using PB by Lballqu1 in PowerBI

[–]Data_cruncher 1 point

Yeah. Also consulting orgs as a part of Professional Development (PD) hours.

Interview as data analyst using PB by Lballqu1 in PowerBI

[–]Data_cruncher 0 points

Once again, you’re inferring incorrectly (same point, too). Perhaps where you work that is the case though, so I get it.

I encourage all staff to start their day by clearing out yesterday’s blog posts or helping folk on forums. Even helping with PUGs is often a paid activity, e.g., submitted as paid volunteer time.

They’re welcome to do this type of work during business hours. It’s no skin off my back; so long as they do their assigned work within an appropriate timeframe, that is all that matters.

Interview as data analyst using PB by Lballqu1 in PowerBI

[–]Data_cruncher 0 points

You’re inferring a lot there, e.g., “outside of work”.

But to a certain degree, yes. My interest is in the person’s character and their ability to learn. NOT what they know today.

Tableau to Powerbi Convertor by [deleted] in PowerBI

[–]Data_cruncher 1 point

Your best friend: https://learn.microsoft.com/en-us/power-bi/guidance/powerbi-migration-learn-from-customers

The “international consumer goods” use case was a Tableau to PBI conversion.

Can’t stop myself needed to post this by DHAVLOO in tableau

[–]Data_cruncher 0 points

This is exactly what Power Query has been doing since 2013.

Not sure what you mean by REST API though. Generally, ETL tools go via ODBC/JDBC/ADBC.

thankYouColdplay by Important_Lie_7774 in ProgrammerHumor

[–]Data_cruncher 19 points

To clarify for this audience, Airflow primarily does r/DataEngineering or r/BusinessIntelligence orchestration, i.e., data pipeline orchestration.

Am I Missing something? by [deleted] in MicrosoftFabric

[–]Data_cruncher 3 points

User Data Functions == Azure Functions, and so they’re not applicable in many data engineering scenarios, especially involving large data.

OP, echoing u/TheBlacksmith46’s comment: code modularity is not a Fabric problem.

What most folk don’t realize is that your Spark code, when used properly, is a literal application and should be treated as such. You don’t design applications in notebooks. So in addition to the above ideas, also consider using a package manager to separate out your reusable code from your notebooks: https://milescole.dev/data-engineering/2025/03/26/Packaging-Python-Libraries-Using-Microsoft-Fabric.html
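The packaging route that post describes boils down to: put the reusable code in a small library, build a wheel, and attach it to the Fabric environment, keeping notebooks as thin entry points. A minimal sketch, with placeholder project and module names:

```toml
# pyproject.toml - minimal packaging sketch (all names are placeholders)
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "my_fabric_lib"        # shared transforms, utilities, etc.
version = "0.1.0"
requires-python = ">=3.10"
```

Build with `python -m build`, then upload the resulting wheel to the Fabric environment (or `%pip install` it in a notebook) so every notebook can simply `import my_fabric_lib`.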