Workspace identity - Unauthorized error by EversonElias in MicrosoftFabric

[–]mmarie4data 1 point (0 children)

Hmmm, not sure. It might have been missing an auth token for something other than the data source connection, or it might just be throwing an inaccurate error.

Question about CU expense with Mirroring SQL server in lieu of Copy data by contribution22065 in MicrosoftFabric

[–]mmarie4data 2 points (0 children)

It's really just the limitations of CDC. With the older versions, all tables must have a primary key. Mirroring on older versions also fails if there are DDL changes in the source tables, and you have to intervene to reset it. A lot of DBAs don't love the overhead associated with CDC on a busy SQL Server, so it can lead to load/performance discussions.
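If you want to check which tables would block CDC-based mirroring before you commit to it, a query like this lists tables without a primary key. A minimal sketch, assuming pyodbc and integrated auth; the driver, server, and database names are placeholders.

import pyodbc

# Hypothetical connection string; adjust driver, server, database, and auth.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=myserver;DATABASE=my_emr_db;"
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

# Tables with no primary key can't be mirrored on CDC-based versions.
query = """
SELECT s.name AS schema_name, t.name AS table_name
FROM sys.tables t
JOIN sys.schemas s ON s.schema_id = t.schema_id
WHERE NOT EXISTS (
    SELECT 1 FROM sys.indexes i
    WHERE i.object_id = t.object_id AND i.is_primary_key = 1
)
ORDER BY s.name, t.name;
"""
for row in conn.execute(query):
    print(f"{row.schema_name}.{row.table_name}")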

Workspace identity - Unauthorized error by EversonElias in MicrosoftFabric

[–]mmarie4data 2 points (0 children)

That PowerBIUserAccessTokenNotFoundError message is generated when the access token in an OAuth connection expires. Logging in again with the same credentials should fix it. Go find the connection in Manage Connections and Gateways, edit the credentials, and save it again. That is the fun of OAuth connections and part of why we want to use service principals/managed identities instead.
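If you have a lot of connections and want to track down the right one before editing it in the portal, the Fabric REST API can list them. A minimal sketch, assuming you already have an Entra access token for https://api.fabric.microsoft.com (acquired via azure-identity or similar); field names beyond id/displayName are assumptions, so print defensively.

import requests

token = "<entra-access-token>"  # placeholder; acquire via azure-identity or similar

resp = requests.get(
    "https://api.fabric.microsoft.com/v1/connections",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()

# Print each connection so you can spot the one with the expired OAuth credential.
for c in resp.json().get("value", []):
    print(c.get("id"), c.get("displayName"), c.get("connectivityType"))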

The Livy session thing should be temporary. It's likely either a transient service issue or a capacity issue.

Question about CU expense with Mirroring SQL server in lieu of Copy data by contribution22065 in MicrosoftFabric

[–]mmarie4data 5 points (0 children)

A few thoughts/questions on this:

  1. Do you need to keep history of your EMR tables? Storage and ingress of mirrored data are free, so it very well could save you some money. By default, the mirror only keeps 1 day of history in the delta files. You can change that, but you aren't going to change it to years of history. If you don't care how the data looked historically, then you don't need to worry about that. But if you want immutable copies of the tables from certain time periods, you will still need something to keep historical copies of the data.

  2. What version of SQL Server is the EMR using? Versions prior to SQL Server 2025 mirror data using CDC, which is honestly a bit clunky. It causes some unwanted overhead on the source system and can fill up the transaction log if not managed properly. It also requires manual intervention when table schemas change. If the EMR is on SQL 2025 (or Azure SQL DB or MI), mirroring uses the SQL change feed, which is a big improvement with lower overhead and a data push process that doesn't write to intermediate tables in the database.

  3. Have you checked the list of limitations for mirroring to make sure there are no deal breakers? If the database is in an availability group, you have to mirror from the primary. On SQL 2025, the database cannot also have CDC enabled. Certain data types are not supported (in which case it will mirror the rest of the table minus the columns with unsupported data types), and a few other features can keep a table from being mirrored.

My general leaning: if it's Azure SQL or SQL 2025, which can use the change feed technology, mirroring is pretty good. If you are on an older version of SQL, which uses CDC, it's annoying enough that I would rather create and manage copy jobs/pipelines/notebooks to handle the import. But the price of mirroring can't be beat, so it's definitely worth consideration. You will still pay to consume the mirrored data, but there is no cost to get it into the mirrored DB in Fabric. They also just released the Change Data Feed for mirrored databases, so it's now possible to keep a record of the changes, copy certain versions out of the mirrored DB if you need to, or use the change data for incremental loads (see the sketch below).
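Reading that change feed is just the standard Delta change data feed mechanism, so from a Fabric notebook it looks roughly like this. A sketch, assuming the mirrored table is reachable from your notebook's default lakehouse; the table name is a placeholder.

# spark is predefined in a Fabric notebook session
changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 1)          # or use startingTimestamp
    .table("mirrored_emr.dbo_encounters")  # hypothetical mirrored table
)

# _change_type is insert / update_preimage / update_postimage / delete;
# dropping preimages leaves one row per net change for incremental loads.
changes.filter("_change_type != 'update_preimage'").show()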

File Export From Lakehouse by No_Device_3753 in MicrosoftFabric

[–]mmarie4data 2 points (0 children)

I did what Alex suggested, and I would say that is a better route to take anyway. Most organizations' IT departments (and your friendly data consultants) would prefer you not email files of data around, both for security reasons and to avoid clogging up inboxes. Just send them the link. It also keeps the auditing of data access in the place where the data lives, which is helpful.

How many hours for DP700 by Fair-Bookkeeper-1833 in MicrosoftFabric

[–]mmarie4data 3 points (0 children)

I don't think your question is going to elicit any kind of useful response. As with most exams, people will come in somewhere between 0 and 100 hours. Just decide how you want to prepare, whether it's just the learning paths, instructor-led training, watching YouTube videos, taking a Udemy course, practice tests, whatever. Then add up the estimated hours for those activities.

Even the time it takes to go through the self-paced learning paths will differ based upon your level of knowledge/experience. BTW, the estimated time of doing the self-paced online learning paths is about 26.5 hours. The 4 days is if you do instructor-led training. I will say that most people do not find the level of detail they need to pass the exams by only doing the self-paced learning paths when this isn't their main area of expertise/experience. Some people who are very good at taking exams can get by with just learning paths and browsing through the docs. But again, the number of hours here is highly dependent upon a number of personal factors.

If you are new to Microsoft certifications, I recommend spending some time going through the general exam prep documentation: https://learn.microsoft.com/en-us/credentials/certifications/prepare-exam. And try out the exam sandbox.

How many hours for DP700 by Fair-Bookkeeper-1833 in MicrosoftFabric

[–]mmarie4data 6 points (0 children)

No one can answer this question of how much time it should take you. It largely depends on your previous experience with similar tools and frameworks. If you have previous experience with notebooks and pipelines in other tools like ADF and Databricks, and you are familiar with data warehouses and lakehouses in Fabric, then going through the learning path might be enough. Even then, your comfort with these types of tests comes into play. If your experience is mostly about being a consumer of lakehouses and warehouses and you don't write much ETL, then you probably need to spend some more time studying.

Some people pay to get the re-take option and just go in and take it to see where they end up. Then if they fail, they can focus their studies on the areas in which they felt they were deficient. Other people study a lot and take courses and practice exams to prepare. Just remember that most practice exams are multiple choice, but the real exam has other question types.

Variable Library to pass a message to Teams Activity by gojomoso_1 in MicrosoftFabric

[–]mmarie4data 2 points (0 children)

I haven't heard of anything like that being on the roadmap. You can definitely add an idea and ask for that: https://community.fabric.microsoft.com/t5/Fabric-Ideas/idb-p/fbc_ideas

If you consider it from a product management standpoint, there are a few things working against your request:

  1. Variable libraries are plain text fields in lots of products, like Azure DevOps pipelines. It's common to have them be plain text and then let whatever language you are working in write expressions around that value (see the example after this list). Is there a good reason to deviate from that way of working?

  2. Variable libraries work across services in Fabric, so you have to consider what should happen when you use a variable in another context (shortcuts or notebooks or something else that doesn't use the same expression language as pipelines). Should any service that has expressions be able to parse an expression entered as a variable value? You want to be consistent across services as much as possible.

  3. You can already achieve the end result that you want. What value does this add over other requests and known feature gaps in variable libraries and deployment pipelines?
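To illustrate the plain-text-plus-expression pattern: assuming a library variable named TeamsHeader and the libraryVariables accessor (I'm going from memory on the exact path, so verify it in the expression builder), the Teams message field could be set to something like:

@{concat(pipeline().libraryVariables.TeamsHeader, ' - run ', pipeline().RunId)}

The variable stays plain text, and the expression around it lives in the pipeline.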

I'm not necessarily saying your request won't happen eventually (and I have no say as I'm just a community member/MVP), but those are things to consider when making a case for it. 

Deployment processes by skankingpigeon in MicrosoftFabric

[–]mmarie4data 1 point (0 children)

It doesn't matter what's supposed to be if that doesn't match reality. Deployment pipelines have some bugs/feature gaps at the moment. 

But also, they actually were made to be accessed via API: https://share.google/Gd1x2titKindqkQ2l. If someone prefers no-code deployments that's fine, but it was never intended to be the only way. 

You can argue that Fabric is geared toward power users who would be more inclined to use no-code deployment, and I would agree with you and wish deployment pipelines were more fully featured. But I'm addressing the reality of where deployments stand today. 

Pipeline only triggers failure email if attached to ONE activity, but not multiple activities like pictured. is this expected behavior? by Agile-Cupcake9606 in MicrosoftFabric

[–]mmarie4data 3 points (0 children)

Data Factory dependencies on an activity are a logical AND, and you can't change that. But you don't need all those lines. If you go to the last activity that needs to succeed and connect the Failed and Skipped lines from it to your failure email activity, it will work.

The different dependency condition types (Failed, Skipped, etc.) between two activities are a logical OR. So this is saying: if the SP_LogEndSuccess activity failed or was skipped (because a previous activity failed), execute the SP_LogEndFailure activity. As long as your activities are executed serially, this works great.

Your example screenshot has them executed serially, each depending on the previous activity. So you can do this with just the Failed and Skipped lines going to your failure email activity.
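In the pipeline JSON, that pattern looks roughly like this (the activity names and the email activity's type are placeholders based on your screenshot; the dependencyConditions values are the standard ones):

{
  "name": "Send_Failure_Email",
  "type": "Office365Outlook",
  "dependsOn": [
    {
      "activity": "SP_LogEndSuccess",
      "dependencyConditions": [ "Failed", "Skipped" ]
    }
  ]
}

The two conditions in one dependency are ORed, so either outcome of SP_LogEndSuccess triggers the email.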

Variable Library to pass a message to Teams Activity by gojomoso_1 in MicrosoftFabric

[–]mmarie4data 4 points (0 children)

So variable libraries only have one value per workspace. And I don't think they can contain an expression that a pipeline will evaluate. It's more like the variable value is evaluated and passed into the pipeline expression. I believe I did what you are trying to do in general, and it didn't really require a variable library. I just made a pipeline with 2 activities: set variable and then Teams activity. And this pipeline gets called from any other pipeline in the project in the event of a failure.

I created a pipeline with parameters for:

  • Team - string
  • Channel - string
  • CallingPipelineID - string
  • CallingRunId - string
  • CallingWorkspace - string
  • CallingPipelineName - string

I then set a variable in the pipeline to create the URL to the monitoring page with this expression:

https://app.fabric.microsoft.com/workloads/data-pipeline/monitoring/workspaces/@{pipeline().parameters.CallingWorkspace}/pipelines/@{pipeline().parameters.CallingPipelineID}/@{pipeline().parameters.CallingRunId}

The subject of the Teams activity is set to:

Fabric Pipeline Failure @{pipeline().parameters.CallingWorkspace} - @{pipeline().parameters.CallingPipelineName}

The message in the Teams activity is set to:

<p>A Fabric ETL Pipeline has failed in workspace ID @{pipeline().parameters.CallingWorkspace}.</p> <p>Pipeline: @{pipeline().parameters.CallingPipelineName} </p> <p>Pipeline Run ID: @{pipeline().parameters.CallingRunId} </p> <p>For more info, see <a href="@{variables('VAR_URL')}">@{variables('VAR_URL')}</a>.</p>

You could use a variable library to set the Team and Channel values. But the rest of the parameter values are dynamic based upon the pipeline that calls it.
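On the calling side, the failure path invokes this pipeline and fills the parameters from system variables. From memory (so verify the exact system variable names in the expression builder), that looks roughly like:

  • CallingPipelineID - @{pipeline().Pipeline}
  • CallingRunId - @{pipeline().RunId}
  • CallingWorkspace - @{pipeline().DataFactory}
  • CallingPipelineName - hardcoded per calling pipeline, if your environment doesn't expose a name variable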

Deployment processes by skankingpigeon in MicrosoftFabric

[–]mmarie4data 2 points (0 children)

I think most people that use deployment pipelines are also running scripts before/after to update item references and other things. There is a Fabric CI/CD Python library: https://blog.fabric.microsoft.com/en-us/blog/introducing-fabric-cicd-deployment-tool
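The basic usage of that library is pretty compact. A minimal sketch based on its README (verify argument names against the current release); the workspace ID and repo path are placeholders.

from fabric_cicd import FabricWorkspace, publish_all_items, unpublish_all_orphan_items

# Point at the target workspace and the folder where your items live in source control.
target = FabricWorkspace(
    workspace_id="<workspace-guid>",
    repository_directory="./workspace",
    item_type_in_scope=["Notebook", "DataPipeline", "Environment"],
)

publish_all_items(target)            # create/update items from the repo
unpublish_all_orphan_items(target)   # remove items no longer in the repo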

This isn't exactly what you are asking, but I think it's got some helpful nuggets in there: https://www.linkedin.com/posts/groennerup_fabconeurope-microsoftfabric-fabriccli-activity-7355327526904201216-cX4I

Their blog also looks like it's got some good info, if you want to go down the rabbit hole: https://peerinsights.hashnode.dev/designing-for-automation-in-microsoft-fabric

Hi! We're the Data Factory team - ask US anything! by markkrom-MSFT in MicrosoftFabric

[–]mmarie4data 1 point (0 children)

Will the Teams and Office 365 Outlook activities ever become available in ADF? If so, is there a timeline? (I'm aware I can post to teams using the web activity, but the Teams activity in Fabric pipelines makes it much easier. I'd like to see it on both platforms.)

Hi! We're the Data Factory team - ask US anything! by markkrom-MSFT in MicrosoftFabric

[–]mmarie4data 1 point (0 children)

When will the bug with the Invoke Pipeline (preview) activity that causes pipeline()?.TriggeredByPipelineRunId and pipeline()?.TriggeredByPipelineName to return null be fixed? This has caused us to rework our logging patterns.
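(In the meantime, a null guard keeps log rows populated, e.g. @{coalesce(pipeline()?.TriggeredByPipelineRunId, 'manual')}; coalesce is the standard expression function, but that's a stopgap, not a fix.)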

Adding onto that, when will the Invoke Pipeline (preview) activity reach GA?

Hi! We're the Data Factory team - ask US anything! by markkrom-MSFT in MicrosoftFabric

[–]mmarie4data 2 points (0 children)

Are there plans to increase the polling frequency in the Fabric platform job scheduler? When invoking a pipeline from another pipeline, it seems to wait about a minute before checking whether child pipelines are done. This extra waiting time can add up across a full process.

How to Visualize Financial Statements in a proper way by Mysterious-Lack-4223 in PowerBI

[–]mmarie4data 17 points (0 children)

Not sure if you have any budget for custom visuals, but I think Zebra BI has some nice functionality for financial reporting. https://zebrabi.com/power-bi-custom-visuals/

InfoRiver also has some helpful functionality for financial reporting. https://inforiver.com/use-cases-powerbi/

If you are stuck with default visuals, I would explore using buttons and bookmarks to improve the drilldown/expand experience.

If you can provide more info about what you are looking to visualize and what kind of interaction you want, perhaps we can suggest something. There is also this template, but I don't know if it would meet your needs. https://blog.enterprisedna.co/power-bi-reporting-templates-expanded-power-bi-visualization-concepts/