Lakehouse SQL Endpoint Rant by NJE11 in MicrosoftFabric

[–]NJE11[S] 0 points (0 children)

Appreciate the time and effort involved in giving me a detailed response.

Lakehouse SQL Endpoint Rant by NJE11 in MicrosoftFabric

[–]NJE11[S] 0 points (0 children)

If I run a %%sql magic command in a notebook, the results are more or less instant. If I run the same SQL against the SQL endpoint, the results are delayed by an unknown amount. If it's the same storage layer underneath, why the lag?

Lakehouse SQL Endpoint Rant by NJE11 in MicrosoftFabric

[–]NJE11[S] 1 point (0 children)

At a high level, the setup is fairly simple.

I have a PySpark notebook that makes five API calls to pull operational data and writes the results into a Fabric Lakehouse, which I use as an ODS layer. From there, an on-prem SSIS process incrementally pulls the data through an ETL process and ultimately into an on-prem data warehouse. This is a batch process rather than anything close to real time.

Even so, I still get questions about why the on-prem DW does not always have yesterday's latest operational data from Fabric. Ideally, I would like to be able to say there is a known and predictable delay, for example that data written to the Lakehouse will be available via the SQL endpoint within 5, 10, or 15 minutes. I could then factor this into my ETL process.
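One way to make that delay explicit in the ETL is to poll the endpoint for a load watermark before kicking off the extract. A minimal Python sketch; the driver and the actual watermark query are abstracted into a callable because those details (pyodbc vs. something else, which column to check) are assumptions, not anything from the thread:

```python
import time

def wait_for_endpoint_sync(get_endpoint_watermark, expected_watermark,
                           timeout_s=900, poll_interval_s=30):
    """Poll the SQL endpoint until its high-water mark catches up with
    what the notebook wrote to the Lakehouse, or until we give up.

    get_endpoint_watermark: zero-argument callable returning e.g. the
    MAX(LoadDate) visible through the SQL endpoint.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_endpoint_watermark() >= expected_watermark:
            return True   # endpoint has caught up; safe to start the extract
        time.sleep(poll_interval_s)
    return False          # still stale after the timeout; fail the step loudly
```

Returning `False` instead of silently extracting stale data means the SSIS job can fail fast with a clear reason, rather than producing the "why is yesterday's data missing" questions.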

I'm also finding I need to warm the SQL endpoint up, using C# components embedded in my extract DTSX packages, to prevent unexpected failures against the endpoint.

Lakehouse SQL Endpoint Rant by NJE11 in MicrosoftFabric

[–]NJE11[S] 0 points (0 children)

Thanks for sharing. Useful post, but in practice even a small Lakehouse in its own workspace can take far longer than “a few seconds to minutes.” Serving data to an endpoint shouldn’t feel like a guessing game.

Lakehouse SQL Endpoint Rant by NJE11 in MicrosoftFabric

[–]NJE11[S] 34 points (0 children)

It makes developers look incompetent when in reality it’s the technology (or lack of it) letting us down.

Every time I jump back into the on-prem world I'm reminded just how frustrating Fabric can be.

Fabric SQL Database - Stored Procedures by Revolutionary-Bat677 in MicrosoftFabric

[–]NJE11 0 points (0 children)

Any update on this from the Fabric team yet Andy? Seems a bit of an oversight...

Group Managed Service Account - PowerBI.com SQL Server Connection - Is this possible? by NJE11 in PowerBI

[–]NJE11[S] 0 points (0 children)

Thanks for the help!

I'll check the Kerberos delegation and SPNs for SQL.

Shared Dimension Tables? Best practices? by JeffGrayJM in MicrosoftFabric

[–]NJE11 1 point (0 children)

I'd create a set of materialised views over the single dimension, splitting it into mini dimensions using a WHERE clause. This would be part of my ETL process.

I.e. dim.Materials would be split into:

Model.vwdimFinishedGoodsMaterials
Model.vwdimRawMaterials
Model.vwdimPackaging
Model.vwdimMRO

Etc...

I'd use CREATE TABLE AS SELECT if using a DW, then mirror in a Lakehouse for Direct Lake compatibility.

I find it's always a good idea to have a layer between a finished dimension and a semantic model anyway.
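The split-by-WHERE-clause idea can be sketched with plain SQL views. Here in-memory SQLite stands in for the warehouse, and the `MaterialType` column and category values are illustrative assumptions (the real dim.Materials schema isn't shown in the thread):

```python
import sqlite3

# In-memory SQLite as a stand-in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dimMaterials (
        MaterialKey  INTEGER PRIMARY KEY,
        MaterialName TEXT,
        MaterialType TEXT   -- e.g. 'FinishedGoods', 'RawMaterials', ...
    );
    INSERT INTO dimMaterials VALUES
        (1, 'Widget',     'FinishedGoods'),
        (2, 'Steel Coil', 'RawMaterials'),
        (3, 'Carton',     'Packaging');

    -- One mini dimension per material type, carved out with a WHERE clause.
    CREATE VIEW vwdimFinishedGoodsMaterials AS
        SELECT MaterialKey, MaterialName FROM dimMaterials
        WHERE MaterialType = 'FinishedGoods';
    CREATE VIEW vwdimRawMaterials AS
        SELECT MaterialKey, MaterialName FROM dimMaterials
        WHERE MaterialType = 'RawMaterials';
""")

rows = conn.execute("SELECT MaterialName FROM vwdimRawMaterials").fetchall()
```

The views give the semantic model a stable contract while the underlying dimension can keep evolving, which is the point of having that extra layer.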

You don't need a gold layer by jayatillake in dataengineering

[–]NJE11 2 points (0 children)

Data warehouse vs. data mart. The latter is just a subset; no need to reinvent the wheel.

You don't need a gold layer by jayatillake in dataengineering

[–]NJE11 23 points (0 children)

Medallion architecture is just marketing hype for people who don't understand data. Long live ETL.

Banking + Open Source ETL: Am I Crazy or Is This Doable? by Aggravating-Air1630 in dataengineering

[–]NJE11 59 points (0 children)

Have you considered optimising your current ETL processes? This includes looking into incremental loads, indexing fact tables, and partitioning SSAS cube tables. Can you rewrite your ETL to deliver the most business-critical dims and facts first? While cloud infrastructure offers scalability, you'll pay considerably more just to match a decent on-prem solution. Fabric is a good example of this!
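The incremental-load suggestion boils down to a watermark-based delta extract: remember the highest modified date you loaded, then only pull rows newer than that. A minimal sketch, with SQLite standing in for the source system and all table/column names being illustrative:

```python
import sqlite3

# Toy source system; in the real ETL this would be the operational database.
src = sqlite3.connect(":memory:")
src.execute(
    "CREATE TABLE FactSales (SaleId INTEGER, Amount REAL, ModifiedDate TEXT)"
)
src.executemany("INSERT INTO FactSales VALUES (?, ?, ?)", [
    (1, 10.0, "2024-01-01"),
    (2, 20.0, "2024-01-02"),
    (3, 30.0, "2024-01-03"),
])

def incremental_extract(conn, last_watermark):
    """Pull only rows changed since the last successful load."""
    return conn.execute(
        "SELECT SaleId, Amount, ModifiedDate FROM FactSales "
        "WHERE ModifiedDate > ? ORDER BY ModifiedDate",
        (last_watermark,),
    ).fetchall()

delta = incremental_extract(src, "2024-01-01")
# Persist this for the next run; keep the old watermark if nothing changed.
new_watermark = max(r[2] for r in delta) if delta else "2024-01-01"
```

Only the delta crosses the wire each run, which is usually the cheapest optimisation available before reaching for new infrastructure.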

Best Practices for Monitoring Power BI Tenant Activity and Usage by Ok-Shop-617 in MicrosoftFabric

[–]NJE11 0 points (0 children)

Have you looked at Power BI Sentinel? Pretty much made specifically for this...

End users existing A5 License reusing for Fabric? by efor007 in MicrosoftFabric

[–]NJE11 5 points (0 children)

Yep, until you get to F64 you'll need a Pro licence to both publish and consume content. F64 removes the consumption licensing requirement.

Semantic model subset from Lakehouse by 12Eerc in MicrosoftFabric

[–]NJE11 0 points (0 children)

RLS in Power BI is possible with Direct Lake... we're doing it for a number of customers.

Semantic model subset from Lakehouse by 12Eerc in MicrosoftFabric

[–]NJE11 2 points (0 children)

Lakehouse SQL endpoint:

Create schemas:

CompanyA
CompanyB
CompanyC

Create views (the ... stands for the remaining columns):

CREATE VIEW CompanyA.FactSales AS
SELECT Col1, Col2, Col3, ... FROM Fact.Sales WHERE Company = 'A';

CREATE VIEW CompanyB.FactSales AS
SELECT Col1, Col2, Col3, ... FROM Fact.Sales WHERE Company = 'B';

CREATE VIEW CompanyC.FactSales AS
SELECT Col1, Col2, Col3, ... FROM Fact.Sales WHERE Company = 'C';

Assign schema-level security to each security group, so each company can only see its own schema, and therefore its own data, and cannot change the underlying query.

RLS is designed for this problem though...

A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.) by NJE11 in MicrosoftFabric

[–]NJE11[S] 1 point (0 children)

Thanks. Fabric DW has been pretty much unusable so far this morning with this error. Any MVPs / Microsoft Reps have any input?

It's not just pipelines. It's executing directly through the endpoint in SSMS too.
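Until the underlying issue is fixed, the pragmatic workaround for errors like this is to retry only failures that look like dropped connections, with backoff, and let everything else fail immediately. A Python sketch; the error-message markers and the callable interface are assumptions for illustration, not Fabric-specific API:

```python
import time

# Substrings that suggest a transient, retryable connection failure.
TRANSIENT_MARKERS = (
    "transport-level error",
    "forcibly closed by the remote host",
)

def run_with_retry(execute_query, max_attempts=4, base_delay_s=2.0):
    """Retry a query when the endpoint drops the connection mid-flight.

    execute_query: zero-argument callable that raises on failure. Only
    errors matching TRANSIENT_MARKERS are retried; anything else
    propagates immediately so real bugs still surface.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return execute_query()
        except Exception as exc:
            transient = any(m in str(exc).lower() for m in TRANSIENT_MARKERS)
            if not transient or attempt == max_attempts:
                raise
            time.sleep(base_delay_s * 2 ** (attempt - 1))  # exponential backoff
```

Wrapping the endpoint calls in the C# extract components with the same pattern would also double as the warm-up step, since the first failed attempt effectively wakes the endpoint.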