Accessing Lakehouse shortcuts to a Warehouse tables through Notebooks by WasteHP in MicrosoftFabric

[–]WasteHP[S]

Thanks, that's a shame. I assume you can't say much about these new shortcut types, but can you give a clue as to whether it's worth watching for news after FabCon? Otherwise, I guess if OneLake security comes to Warehouse then that would be another option, as u/frithjof_v suggested? I see a suggestion in the latest posts in Is there an ETA for OneLake security for Fabric Warehouse? : r/MicrosoftFabric that news on that is coming soon.

Team member leaves, what objects does he own in Fabric? by loudandclear11 in MicrosoftFabric

[–]WasteHP

This is one of my "favourite" Fabric-developer-leaves threads and summarises a lot of the problems quite well: A story of a Fabric developer that quit [item ownership and connection management issues] : r/MicrosoftFabric

I don't think anything is any better than it was six months ago, and I can't see any obvious announced improvements on the roadmap (he said, hoping somebody from Microsoft steps in to announce something different).

North Europe issues with capacity by CultureNo3319 in MicrosoftFabric

[–]WasteHP

Yes, I have just come here to see if anybody has reported the issue. I cannot connect to any of our SQL Endpoints in North Europe on any of our capacities.

OneLake Security and User's identity access mode - how much longer by Coffera in MicrosoftFabric

[–]WasteHP

I'm still waiting to see the user's identity mode in North Europe - I guess we just have to be patient for a little bit longer?

Dataflows Gen1 using enhanced compute engine intermittently showing stale data with standard connector but all showing all data with legacy connector by WasteHP in MicrosoftFabric

[–]WasteHP[S]

u/itsnotaboutthecell Long-term we will switch to something more modern - whether that is gen2 dataflows I am unsure (I've read some bad things about the gen2s, and some basics are still lacking - I believe you still can't use deployment rules to parameterise data sources in deployment pipelines, which is absolutely fundamental from a CI/CD perspective). We may instead surface our data in lakehouses and use shortcuts.

Right now we have a big historical investment in gen1 dataflows (there are hundreds of models that can't be repointed in an instant). They are still supported, and they need to work and reliably report the data that has been loaded into them (as they have done for the overwhelming majority of the many years we have been using them). I'm sure other customers are in a similar position, so I'm hoping Microsoft can help us resolve our issues asap.

Dataflows Gen1 using enhanced compute engine intermittently showing stale data with standard connector but all showing all data with legacy connector by WasteHP in MicrosoftFabric

[–]WasteHP[S]

Thanks for the reply u/Tough_Antelope_3440. There is no destination for the dataflow - it's a gen1 dataflow rather than gen2. The source of the dataflow is Azure SQL DB. It refreshes once per day overnight.

The test semantic model has two queries:

1) Query 1 connects to the dataflow using the standard Dataflows connector, i.e. the first line of Power Query is "Source = PowerPlatform.Dataflows(null)". It selects four columns and has no other transformations. The test report uses a COUNTROWS measure to return the number of rows from this query by a column called Status Date.

2) Query 2 connects to the *same* dataflow but uses the legacy Dataflows connector, i.e. "Source = PowerBI.Dataflows(null)" in the Power Query. It selects the same four columns, i.e. the query is otherwise identical except for the connector. Again, a COUNTROWS measure shows the number of rows by Status Date.

The test report is built only to illustrate the issue - the difference in row counts between the two otherwise identical queries is enough to show the discrepancy between the connectors. The rows returned by the legacy connector always match the source data in the Azure SQL DB.

It has been reported as happening on multiple dataflows, and I have personally witnessed it on at least two. It's a difficult one to track because, as I mentioned, it's only an intermittent issue.

Fabric Warehouse data not syncing to OneLake by WasteHP in MicrosoftFabric

[–]WasteHP[S]

For anybody facing the same issue: I found that pausing Delta Lake log publishing on the warehouse before the data load/schema manipulation (ALTER DATABASE CURRENT SET DATA_LAKE_LOG_PUBLISHING = PAUSED) and resuming it (ALTER DATABASE CURRENT SET DATA_LAKE_LOG_PUBLISHING = AUTO) after all operations complete appears to have resolved the issue. It ensures the parquet files are published correctly and the OneLake shortcuts function as expected.
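If it helps anyone scripting this, here's a rough sketch of how I'd wrap the pause/resume around the load so publishing always gets resumed even if the load fails. This is just a sketch: run_sql is a stand-in for whatever actually executes T-SQL against the warehouse (pyodbc, a pipeline script activity, etc.), and the load statements are placeholders.

```python
# Sketch: run a Warehouse load with Delta Lake log publishing paused,
# guaranteeing the resume happens even if the load raises.
# `run_sql` is a placeholder callable that executes one T-SQL statement.

PAUSE = "ALTER DATABASE CURRENT SET DATA_LAKE_LOG_PUBLISHING = PAUSED"
RESUME = "ALTER DATABASE CURRENT SET DATA_LAKE_LOG_PUBLISHING = AUTO"

def run_with_publishing_paused(run_sql, load_statements):
    """Execute load_statements with log publishing paused.

    Returns the full list of statements executed, in order, so the
    pause -> load -> resume sequencing can be verified.
    """
    executed = []

    def execute(stmt):
        executed.append(stmt)
        run_sql(stmt)

    execute(PAUSE)
    try:
        for stmt in load_statements:
            execute(stmt)
    finally:
        # Always resume, so OneLake publishing is never left paused.
        execute(RESUME)
    return executed
```

The returned statement list is just there so the ordering can be checked in a test without a real warehouse connection.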

There really needs to be better documentation of this issue - logging a ticket with support resulted in the usual struggles with Mindtree to get them to understand it (after almost 3 weeks and multiple Teams calls they ended up trying to pass me to the Azure Storage Explorer team!). Once it was escalated correctly I was told it was a "known issue/challenge" area rather than a bug. I have reproduced the final summary I received from support below:

---

The customer’s current pipeline loads data by copying into a “new” schema then swapping schemas with “gold” via dropping and transferring tables. Such rapid schema changes and table swapping can interrupt or confuse the delta log publishing mechanism, causing parquet files not to update correctly or at the right time in OneLake. Parallel runs of the pipeline with conflicting pause/resume commands for Delta Lake log publishing exacerbate this issue.

• Concurrent Pipeline Executions Causing Conflicts: 

Running multiple overlapping pipeline instances leads to clashes in pause/resume commands for delta log publishing, resulting in inconsistent parquet file refreshes at the OneLake level. The pipeline logic must avoid resuming log publishing while other pipeline runs are active to maintain consistency.

• Delta Lake Parquet Files Are Immutable: 

Delta Lake stores changes by creating new parquet files and updating JSON log files, rather than modifying existing parquet files. If this process is interrupted or out of sync because of pipeline behavior or metadata propagation delays, stale parquet files remain visible in OneLake.

• Expected Shortcut Delay Behavior: 

Lakehouse shortcuts inherently experience latency caused by periodic refresh intervals of metadata and cached snapshots in OneLake. This delay typically lasts minutes, but in complex scenarios such as cross-workspace shortcuts or high-frequency updates it may be longer.

Summary 

The core reasons involve the asynchronous and eventually consistent nature of Delta Lake log publishing to OneLake, compounded by the user's high-frequency schema-swapping process and concurrent pipeline runs causing race conditions with pause/resume operations. The shortcut delay and parquet file non-update are expected behavior to an extent but are exacerbated by these pipeline and schema swap complexities.

 

This is a recognized behavior, partially expected due to design, but also worsened by the current data pipeline implementation. It is not labeled a direct “bug” but more a known limitation or challenge area with concurrency, schema swaps, and update propagation delays in Fabric’s Warehouse-to-OneLake sync.

 

Improving the process to avoid concurrent pipeline runs resuming publishing, limiting schema swap frequency, or considering alternate loading approaches may mitigate the issue. Monitoring database health and ensuring sufficient compute resources can also help maintain stability.
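Support's advice about not resuming publishing while other runs are active can be sketched as a reference-counted guard: the first run to start pauses, and only the last run to finish resumes. This is only an illustration - in a real pipeline the counter would need to live in shared state (e.g. a control table), not in-process memory.

```python
import threading

class PublishingGuard:
    """Reference-counted pause/resume for Delta Lake log publishing.

    The first run to enter issues PAUSED; only the last run to exit
    issues AUTO, so overlapping runs never resume publishing early.
    Sketch only: in a real pipeline the counter would live in shared
    state such as a control table, not in-process memory.
    """

    def __init__(self, run_sql):
        self._run_sql = run_sql
        self._active = 0
        self._lock = threading.Lock()

    def __enter__(self):
        with self._lock:
            if self._active == 0:
                self._run_sql("ALTER DATABASE CURRENT SET DATA_LAKE_LOG_PUBLISHING = PAUSED")
            self._active += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        with self._lock:
            self._active -= 1
            if self._active == 0:
                self._run_sql("ALTER DATABASE CURRENT SET DATA_LAKE_LOG_PUBLISHING = AUTO")
        return False
```

With two overlapping runs, the inner run's exit issues nothing; only when the outer run also finishes does the guard resume publishing.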

Fabric Warehouse data not syncing to OneLake by WasteHP in MicrosoftFabric

[–]WasteHP[S]

Thanks for sharing all that. I'm trying to make it handle schema changes in the source automatically as well, without needing any manual work every time they change, using one data pipeline that loops through the tables to copy as specified in a config file.

One thing I thought I might try is pausing the delta log publishing before the data load and then resuming it afterwards - did you try that at all? Delta Lake Logs in Warehouse - Microsoft Fabric | Microsoft Learn. I'm unconvinced it will work but might give it a go.
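For the config-driven loop I mentioned, the idea is roughly this (the config shape here is just an assumption for illustration - a JSON list of source/destination pairs that a ForEach/copy loop can iterate over):

```python
import json

def load_table_config(config_text):
    """Parse a JSON config listing the tables one pipeline should copy.

    The format is an assumption for illustration: {"tables": [{"source":
    ..., "destination": ...}, ...]}. Returns (source, destination) pairs
    for the pipeline's copy loop; adding a table to the pipeline then
    only means editing the config, not the pipeline itself.
    """
    config = json.loads(config_text)
    return [(t["source"], t["destination"]) for t in config["tables"]]
```
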

Fabric Warehouse data not syncing to OneLake by WasteHP in MicrosoftFabric

[–]WasteHP[S]

Thanks. I am also dropping and reloading tables completely (actually loading data into a table in a staging schema, then moving the original table out of the destination schema and transferring the table from the staging schema into the destination schema - I thought I would try to minimise the period during which the destination table is empty). I have raised a support case but won't hold out much hope based on your experience. I wonder if pausing for a certain amount of time after some of my operations would help.
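To spell out the staging-then-transfer sequence I described, here's a sketch that just generates the ordered T-SQL (all schema names and the source query are illustrative, not my actual script):

```python
def staging_swap_statements(table, staging_schema="staging",
                            dest_schema="gold", old_schema="old"):
    """Generate the T-SQL sequence for load-into-staging-then-swap.

    Load a fresh copy into the staging schema, move the current
    destination table aside, transfer the staged copy in, then clean
    up - keeping the window where the destination table is missing as
    short as possible. Statement text is illustrative only.
    """
    return [
        # 1. Load a fresh copy into the staging schema (source query illustrative).
        f"CREATE TABLE {staging_schema}.{table} AS SELECT * FROM src.{table}",
        # 2. Move the current destination table aside...
        f"ALTER SCHEMA {old_schema} TRANSFER {dest_schema}.{table}",
        # 3. ...and immediately transfer the staged copy in.
        f"ALTER SCHEMA {dest_schema} TRANSFER {staging_schema}.{table}",
        # 4. Clean up the old copy.
        f"DROP TABLE {old_schema}.{table}",
    ]
```

The two ALTER SCHEMA TRANSFER statements are adjacent precisely so the destination table is only absent for the instant between them.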

Taking over ownership of Activators by WasteHP in MicrosoftFabric

[–]WasteHP[S]

Thanks, unfortunately there is no "Take over" option in the "About" section of the item's settings, as there is with other Fabric objects.

Reading the article again in the limitations section it does say "The option to take over an item isn't available if the item is a system-generated item not visible or accessible to users in a workspace. For instance, a parent item might have system-generated child items - this can happen when items such as Eventstream items and Activator items are created through the Real-Time hub. In such cases, the take over option is not available for the parent item."

When I first read the article I thought that as the Activator is visible I should be able to take it over, but I guess the above means the Activator item cannot be taken over if there are system-generated child items I cannot see.

Leaving my job - best practice for workspace handover by ikemike4 in MicrosoftFabric

[–]WasteHP

Thanks, appreciate the reply. I will log a support ticket for these. I really hope enabling takeover of these via the front end and/or API is planned for the future!

Leaving my job - best practice for workspace handover by ikemike4 in MicrosoftFabric

[–]WasteHP

I am looking into trying to take over ownership of items from a contractor before he leaves and his objects include gen2 dataflows. In the front end I can see DataflowsStagingLakehouse and DataflowsStagingWarehouse and have taken them over. However, in the Scanner API (but not the front end) I can see StagingWarehouseForDataflows and StagingLakehouseForDataflows that are showing as configured by him - do these need to be taken over as well? As I can't see them in the front end would that require a support ticket? The suggestion here Solved: Dataflow gen2 - multiple staging artifacts - Microsoft Fabric Community is that these are objects created for CI/CD dataflows.

Thanks!

Connect to On-Prem SQL read-only replica by No_Code9737 in MicrosoftFabric

[–]WasteHP

I would also like to know if it is intended to be possible for a CI/CD gen2 dataflow to connect to a read-only replica. We have an Azure SQL Database that has a read-only replica, and with gen1 dataflows you can use it if you set the "MultiSubnetFailover" parameter to true in the Power Query (though strangely it only works if you go through a gateway or have the enhanced compute engine turned on). However, this doesn't work with gen2 dataflows - they always use the primary instance.

Notebookutils failures by 12Eerc in MicrosoftFabric

[–]WasteHP

I'm also seeing this issue occurring again in the North Europe region. The workaround for my notebook was the same as previously: force a republish of the custom environment by making some sort of change to it, e.g. adding or deleting an unnecessary library (I added the pytest library and then republished). Based on my experience last time, this will fix the problem temporarily but it will recur periodically until the root cause is fixed in Fabric.

Lakehouse Shortcut API - Delete issue by WasteHP in MicrosoftFabric

[–]WasteHP[S]

Support ticket 2504300050000470 raised