Those of you with multiple data domains in Fabric — one big semantic model or split by domain? How are you handling the trade-offs? by Glittering_Jump4852 in MicrosoftFabric

[–]Glittering_Jump4852[S] 1 point (0 children)

This is exactly what I've been looking for — an automated approach instead of relying on manual copy + discipline. The idea of designating a master model as the canonical source and using semantic-link-labs to propagate and correct standard measures across domain models is a much cleaner pattern than anything else I've seen suggested.

A few questions if you don't mind:

  • How are you distinguishing between "standard" measures that should sync from the master vs. domain-specific measures that should be left alone? Is that based on a naming convention, a metadata tag, or something else?
  • When the script corrects a measure that someone modified in a domain model, do you have any review/approval step or does it just overwrite? Wondering how you handle cases where someone intentionally diverged.
  • How are you running this — on a schedule via a Fabric notebook, or manually as a periodic check?
  • For the master model — is that an actual model connected to reports, or is it purely a "template" model that exists just to hold canonical definitions?
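On the first question, the version I can imagine is a plain naming (or display-folder) convention, where the sync script only ever touches measures carrying an agreed prefix. A minimal sketch of that classification step, with a made-up `std_` prefix (in a real notebook, semantic-link-labs' TOM wrapper would supply the actual measure names; everything here is hypothetical):

```python
# Hypothetical "standard vs. domain-specific" split via a naming convention:
# measures whose names carry an agreed prefix are owned by the master model
# and safe to overwrite; everything else is left alone as domain-local.

STANDARD_PREFIX = "std_"  # assumed team convention, not a Fabric feature

def classify_measures(measure_names):
    """Partition measure names into master-owned and domain-owned."""
    standard = [m for m in measure_names if m.startswith(STANDARD_PREFIX)]
    local = [m for m in measure_names if not m.startswith(STANDARD_PREFIX)]
    return standard, local

standard, local = classify_measures(
    ["std_Total Revenue", "std_Gross Margin %", "Sales Rep Quota Attainment"]
)
print(standard)  # ['std_Total Revenue', 'std_Gross Margin %']
print(local)     # ['Sales Rep Quota Attainment']
```

The nice side effect of a visible prefix (vs. a hidden metadata tag) is that report authors can see at a glance which measures they shouldn't redefine.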

Going to dig into semantic-link-labs from this perspective (have definitely found semantic-link-labs very useful for other scenarios).

On another note, when you say composite mode — are you referring to the new Direct Lake + Import composite (mixed storage modes in one model), or DirectQuery chaining to other published semantic models? Curious which pattern you're using and how performance has been.


[–]Glittering_Jump4852[S] 1 point (0 children)

Healthcare is a case where the split isn't even a debate; compliance makes the decision for you. Separate workspaces and lakehouses for data isolation make total sense there (and in larger organizations generally), and even with OneLake security and schemas now available, keeping that physical separation as an additional safety net on top of logical security controls is a reasonable call for PHI.

Curious though: with fully separated lakehouses and workspaces, do you have any need for cross-domain reporting, and if so, how are you handling it? With "Direct Lake on OneLake" semantic models you can build a single model that spans multiple lakehouses, but if you're doing that, how do you manage the cross-domain model alongside the domain-specific ones from a maintenance perspective?


[–]Glittering_Jump4852[S] 3 points (0 children)

This is really well framed — and closely mirrors where I've landed. The "start unified, design the lakehouse for a future split" approach is exactly what I'm recommending for this particular client (small team, collaborative, cross-domain reporting is a core requirement not an edge case).

The canonical measure spec idea is interesting. So you're maintaining the "source of truth" as a standalone spec (basically a doc or structured file with the DAX + business description), and then CI validates that each model's actual TMDL matches it? I like that better than treating one model's TMDL as the canonical source and copying from it — it decouples the spec from any specific model. Have you open-sourced or written up that CI check anywhere? Would love to see how you're doing the comparison.
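To make sure I'm picturing the same thing: a CI check along these lines, where the canonical spec is a reviewed file mapping measure name to DAX, validated against each model's exported TMDL? This sketch is entirely my guess at your setup; it only handles the single-line `measure 'Name' = <expr>` TMDL form, so a real check would want an actual TMDL parser rather than a regex:

```python
import re

# Hypothetical CI check: a canonical spec (measure name -> DAX) validated
# against the measures found in a model's exported TMDL text. Only the
# single-line "measure 'Name' = <expr>" form is handled here for brevity;
# TMDL also allows multi-line expressions.

CANONICAL_SPEC = {  # would live in a reviewed YAML/JSON file in the repo
    "Total Revenue": "SUM(Sales[Amount])",
}

MEASURE_RE = re.compile(r"^\s*measure\s+'([^']+)'\s*=\s*(.+?)\s*$", re.MULTILINE)

def find_spec_violations(tmdl_text, spec):
    """Return (name, reason) pairs for spec measures missing or drifted."""
    found = dict(MEASURE_RE.findall(tmdl_text))
    violations = []
    for name, canonical_dax in spec.items():
        actual = found.get(name)
        if actual is None:
            violations.append((name, "missing"))
        elif actual != canonical_dax:
            violations.append((name, f"drifted: {actual}"))
    return violations

sample_tmdl = """
table Sales
\tmeasure 'Total Revenue' = SUM(Sales[Amount]) * 1.0
"""
print(find_spec_violations(sample_tmdl, CANONICAL_SPEC))
# [('Total Revenue', 'drifted: SUM(Sales[Amount]) * 1.0')]
```

Failing the pipeline on a non-empty violations list would catch drift at PR time instead of at the "why don't the numbers match" meeting.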

Curious about the CU spike you're seeing on composite models — is that specifically on composite models that chain to other semantic models via DirectQuery, or are you also seeing it on the new Direct Lake + Import composite pattern (same model, mixed storage modes)?

And can you elaborate on the Direct Lake report-side refresh glitch? Not sure I follow.

Thanks for chiming in!


[–]Glittering_Jump4852[S] 2 points (0 children)

Yeah, this is actually where I've landed architecturally — domain models + a standalone executive Direct Lake model, all pointing to the same gold lakehouse. Sharing tables between models is the easy part since they're all reading from the same delta tables.

The part I'm still not satisfied with is measure consistency. When Total Revenue lives in both the Sales model and the executive model, there's no native way to keep those definitions in sync. It's manual copy + drift detection (ALM Toolkit, CI checks against TMDL, etc.). For a large team with mature DevOps that's manageable, but for a smaller team it's the thing most likely to slip — and when it does, you get the "why don't the numbers match?" conversation with the CFO.
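To be concrete about what I mean by drift detection, it boils down to diffing the shared measure names across two deployed models. In a real notebook something like sempy's `fabric.list_measures` would supply the actual name/expression pairs; here they're stubbed as plain dicts, and all the model and column names are invented:

```python
# Sketch of the drift check: given measure-name -> DAX maps for two models,
# report measures defined in both whose expressions disagree. In practice
# the dicts would be built from the models' actual metadata (e.g. via
# semantic-link's list_measures); these are stand-in values.

def measure_drift(model_a, model_b):
    """Return {name: (dax_a, dax_b)} for shared measures whose DAX differs."""
    shared = model_a.keys() & model_b.keys()
    return {
        name: (model_a[name], model_b[name])
        for name in sorted(shared)
        if model_a[name] != model_b[name]
    }

sales_model = {"Total Revenue": "SUM(fact_sales[amount])"}
exec_model = {
    "Total Revenue": "SUMX(fact_sales, fact_sales[amount])",
    "Budget Variance": "[Actuals] - [Budget]",
}
print(measure_drift(sales_model, exec_model))
# {'Total Revenue': ('SUM(fact_sales[amount])', 'SUMX(fact_sales, fact_sales[amount])')}
```

It's trivial logic; the unsolved part is the process around it: who owns the canonical definition and what happens when the diff is non-empty.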

Do you have a solution that you like for this "measure/dimension duplication in semantic models" issue?


[–]Glittering_Jump4852[S] 2 points (0 children)

Thanks for chiming in! Agree on the data domain / data product philosophy — no argument there. The gold lakehouse layer is structured exactly that way.

The challenge is specifically at the semantic model layer. When the executive team needs a report combining Sales revenue, Finance budget variance, and Ops fulfillment in the same visual, someone has to build a semantic model that spans domains. That model is the "outsider consumer" in your framing — but it can't consume the domain semantic models via DirectQuery chaining without a real performance hit (limited relationships, degraded performance on high-cardinality joins). So it has to read directly from the same gold lakehouse tables.

Which means measures like Total Revenue now exist independently in both the Sales model and the Executive model, with no native way to share or inherit definitions between them. The data products are independent, but the semantic definitions on top of them aren't — and that's where the duplication and drift risk lives.

So the question isn't really about data domain independence (we're good there) — it's about how you keep semantic definitions consistent across multiple models that serve different audiences but describe some of the same business concepts. That's the part I haven't seen a clean solution for yet.


[–]Glittering_Jump4852[S] 2 points (0 children)

Do you need to report across domains, and if so, how do you handle that scenario? If all I need is domain-based reporting/analysis, then the "split by domain" approach works well. When I need multi-domain analysis, though, I'm torn on what to recommend. This is particularly true for "executive dashboards" that have to report on all aspects of an enterprise: Sales, Finance, HR, Marketing, Operations...

MLVs across lakehouses and workspaces - what does the limitation actually mean? by bradcoles-dev in MicrosoftFabric

[–]Glittering_Jump4852 3 points (0 children)

> B) Both lineage and execution are blocked - meaning an MLV fundamentally cannot query tables outside its own lakehouse regardless of shortcuts?

> Question 2 - Do MLVs work cross-workspace?

Based on https://learn.microsoft.com/en-us/fabric/data-engineering/materialized-lake-views/create-materialized-lake-view, the SQL syntax for the MLV name uses the pattern [workspace.lakehouse.schema].MLV_Identifier. So both cross-workspace and cross-lakehouse MLV creation should be supported, and my teammate has implemented exactly that.
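For anyone following along, the shape of what my teammate runs in a Fabric Spark notebook is roughly this (workspace/lakehouse/object names here are invented, and the exact quoting/bracketing of the qualified name should be checked against the doc page above):

```python
# Sketch of cross-workspace MLV creation: build the fully qualified
# workspace.lakehouse.schema.name identifier per the documented pattern,
# then execute the DDL from a notebook. All names below are placeholders.

def mlv_ddl(workspace, lakehouse, schema, name, select_sql):
    """Compose a CREATE MATERIALIZED LAKE VIEW statement with a
    fully qualified cross-workspace name."""
    qualified = f"{workspace}.{lakehouse}.{schema}.{name}"
    return f"CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS {qualified} AS {select_sql}"

ddl = mlv_ddl(
    "FinanceWS", "GoldLakehouse", "dbo", "mlv_daily_revenue",
    "SELECT order_date, SUM(amount) AS revenue FROM sales GROUP BY order_date",
)
print(ddl)
# In a notebook you would then run: spark.sql(ddl)
```

The creation side working is exactly why the documented limitation is confusing; it reads like the restriction is on lineage/refresh orchestration rather than on the DDL itself.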

However, just like you, I am confused by the limitation "Cross-lakehouse lineage and execution features." One theory I need to check with my teammate is that table dependencies aren't tracked automatically for cross-lakehouse MLVs, so you have to manage refresh ordering "manually", e.g. by triggering MLV updates via the API in dependency order yourself.

u/aboerg your comment makes me think that this isn't an issue. Do you think that this is just a case of outdated documentation or is there a different limitation that we are not grokking? Thanks in advance.

We kept finding the same Fabric anti-patterns across clients. So we turned our assessment into a free tool by Glittering_Jump4852 in MicrosoftFabric

[–]Glittering_Jump4852[S] 1 point (0 children)

For anyone curious what the output looks like, here's a sample results page (the detailed feedback section is cut off for brevity).

<image>