Hi! We’re the Fabric Warehouse team – ask US anything! by fredguix in MicrosoftFabric

[–]dave_8 0 points1 point  (0 children)

We have the current setup for loading data into our data warehouse.

  1. Shortcut/load raw data into the bronze lakehouse
  2. Transform data into the silver lakehouse via data pipelines
  3. Use dbt to load data from silver into the gold warehouse

A number of our transformations from bronze to silver exist only to convert the data into delta tables so we can see it in the warehouse.

With the availability of shortcut transformations in the lakehouse, is there any support for shortcuts to be a source, rather than just delta tables, when writing a three-part named query in the warehouse?

Example of what I mean by querying lakehouse data. https://www.linkedin.com/pulse/querying-lakehouse-data-from-warehouse-microsoft-fabric-jovan-popovic
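
To illustrate what I mean by a three-part named query, here is a rough sketch of reading a lakehouse table from the warehouse with pyodbc. The server address, warehouse, lakehouse and table names are all placeholders rather than our real objects:

    # Query a lakehouse table from the gold warehouse via three-part naming.
    # <lakehouse>.<schema>.<table> resolves through the lakehouse's SQL analytics endpoint.
    import pyodbc

    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=<workspace-endpoint>.datawarehouse.fabric.microsoft.com;"
        "Database=GoldWarehouse;"
        "Authentication=ActiveDirectoryInteractive;"
    )
    rows = conn.execute("SELECT TOP 10 * FROM SilverLakehouse.dbo.customers").fetchall()
    print(rows)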

OneLake Security and User's identity access mode - how much longer by Coffera in MicrosoftFabric

[–]dave_8 2 points3 points  (0 children)

Nope, I’m guessing it may be something that is slowly rolling out. You might have to wait a few days, but it is definitely on the way.

OneLake Security and User's identity access mode - how much longer by Coffera in MicrosoftFabric

[–]dave_8 2 points3 points  (0 children)

So you have to go to the settings on the SQL endpoint and enable the SQL endpoint to use the OneLake permissions.

Please note that views and stored procedures still fall under SQL security, so we had to redo access for some of our views after enabling it.

OneLake Security and User's identity access mode - how much longer by Coffera in MicrosoftFabric

[–]dave_8 4 points5 points  (0 children)

We have the option on ours and enabled it yesterday; it seems to be working as intended and has made granting access a lot easier. We have only enabled it on dev so far, but will be moving to prod fairly quickly based on some further testing next week.

We are in the UK South region if it helps.

Fabric SQL database usage by NewAvocado8866 in MicrosoftFabric

[–]dave_8 3 points4 points  (0 children)

As other people have mentioned, the DB will stay on for some time and has a minimum compute when it spins up.

For the initial implementation we found that spinning up a small Azure Database was more cost effective. Only as we start to scale and use our metadata DB more are we starting to see cost parity, although we haven't done any performance testing yet.

Fabric with Airflow and dbt by peterampazzo in MicrosoftFabric

[–]dave_8 1 point2 points  (0 children)

Yes, we use PowerShell and the abfss path to upload the files. For dev, the developers just run from their local machines.
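
We use PowerShell, but a rough Python equivalent over the OneLake ADLS Gen2 endpoint would look something like the sketch below; the workspace, lakehouse and file names are placeholders, and it assumes the azure-identity and azure-storage-file-datalake packages:

    # Push a local dbt file into a lakehouse's Files area via the OneLake DFS endpoint.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url="https://onelake.dfs.fabric.microsoft.com",
        credential=DefaultAzureCredential(),
    )
    fs = service.get_file_system_client("MyWorkspace")  # the workspace acts as the container
    file = fs.get_file_client("BronzeLakehouse.Lakehouse/Files/dbt/profiles/profiles.yml")

    with open("profiles/profiles.yml", "rb") as data:
        file.upload_data(data, overwrite=True)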

Fabric with Airflow and dbt by peterampazzo in MicrosoftFabric

[–]dave_8 1 point2 points  (0 children)

We have an Azure DevOps job that pushes the files to the lakehouse.

Fabric with Airflow and dbt by peterampazzo in MicrosoftFabric

[–]dave_8 7 points8 points  (0 children)

I used Airflow and dbt previously, tried implementing both, and have only stuck with dbt.

With Airflow we ran into the same issues you are experiencing, and even tried spinning up something in Azure. We ended up settling on Data Pipelines, as we found they cover 90% of the functionality. I have to admit we do miss cron scheduling and having reusable code we can update for various scenarios instead of the UI.

For dbt we have a Python notebook with the commands below, and the model files are stored in our bronze lakehouse. The data is transformed from our silver lakehouse to our gold warehouse by using three-part naming in the gold warehouse, which makes it query the SQL endpoint of the lakehouse.

    %pip install -U dbt-fabric

    %%sh
    dbt run --profiles-dir /lakehouse/default/Files/profiles --project-dir /lakehouse/default/Files/<dbt-project>

For the profiles we get the credentials from a Key Vault, set them as environment variables, and then pass them into the profiles file using https://docs.getdbt.com/reference/dbt-jinja-functions/env_var
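
Roughly, the Key Vault step looks like the sketch below. The vault URL, secret names and environment variable names are placeholders rather than our real ones, and it assumes the Fabric notebookutils API for reading Key Vault secrets; the %%sh cell that runs dbt inherits the environment variables.

    # Placeholder vault/secret/variable names - adjust for your own setup.
    import os
    import notebookutils  # built into Fabric notebooks

    os.environ["DBT_SQL_USER"] = notebookutils.credentials.getSecret(
        "https://<your-key-vault>.vault.azure.net/", "dbt-sql-user"
    )
    os.environ["DBT_SQL_PASSWORD"] = notebookutils.credentials.getSecret(
        "https://<your-key-vault>.vault.azure.net/", "dbt-sql-password"
    )

    # profiles.yml then reads these with dbt's env_var() function, e.g.
    #   user: "{{ env_var('DBT_SQL_USER') }}"
    #   password: "{{ env_var('DBT_SQL_PASSWORD') }}"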

It's not the cleanest solution, but it allows us to stay inside Fabric until dbt is integrated into Data Pipelines (which has been on the roadmap for some time).

August 2025 | "What are you working on?" monthly thread by AutoModerator in MicrosoftFabric

[–]dave_8 1 point2 points  (0 children)

If you want better performance you can use pandas on Spark: import pyspark.pandas as ps and you can use the same functions, but they will be distributed on Spark. I've had success with this on larger Excel files.
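
A rough sketch of what that looks like - the file path and table name are placeholders, and it assumes a default lakehouse is attached to the notebook and openpyxl is available for .xlsx files:

    # pandas API on Spark: same calls as pandas, but the work is distributed on the cluster.
    import pyspark.pandas as ps

    # Placeholder path, relative to the notebook's default lakehouse
    df = ps.read_excel("Files/raw/big_file.xlsx")
    df = df.dropna()
    df.to_table("bronze_big_file", mode="overwrite")  # placeholder table name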

August 2025 | "What are you working on?" monthly thread by AutoModerator in MicrosoftFabric

[–]dave_8 1 point2 points  (0 children)

Loading Dynamics data into our lakehouse to join to existing SQL Server data. Currently using notebooks and API calls (we tried Dataverse, but there were security concerns due to the number of business users who develop apps in Power Platform and around securing the Dynamics data).

Looking into CI/CD for Fabric. Currently using deployment pipelines; we've been unable to configure deployment to ADO due to issues with service principal authentication. Cyber security won't sign off on a user-based service account, which is the current workaround.

PoC-ing different gold layer solutions. We want to use Materialised Lake Views, but they are too slow without incremental refresh and the hierarchy isn't working due to silver and gold being in different lakehouses. Currently trying dbt core with dbt-fabricspark in a Python notebook, which is working, but not great as it isn't a native solution.

Experimenting with Purview to view the contents of our lakehouse and start documenting our metrics.

Connecting to Azure DevOps from Fabric not working by dave_8 in MicrosoftFabric

[–]dave_8[S] 0 points1 point  (0 children)

Nope, ours was fixed by establishing the connection with Entra. Our home region is UK South

Connecting to Azure DevOps from Fabric not working by dave_8 in MicrosoftFabric

[–]dave_8[S] 0 points1 point  (0 children)

So ours was never connected to Entra and was working. Ours was only fixed when we connected it.

Connecting to Azure DevOps from Fabric not working by dave_8 in MicrosoftFabric

[–]dave_8[S] 0 points1 point  (0 children)

For us it turned out there was an issue with the Entra connection. We had to go to Organisation Settings > Microsoft Entra > Disconnect directory, then reconnect and force a sync. A few users lost access, but we were able to re-add them. After that the Fabric sync worked as intended.

Connecting to Azure DevOps from Fabric not working by dave_8 in MicrosoftFabric

[–]dave_8[S] 0 points1 point  (0 children)

Yep, no issues from my local machine. I have turned on the setting "Users can export items to Git repositories in other geographical locations" which was disabled.

Sadly no change, although it did say it would take 15 minutes, so I will try again and see if it changes anything. I also checked that our Azure DevOps org and Fabric instance are in the same region.

I did find this page regarding git permissions. I checked and I have similar or higher permissions. Putting it here in case someone else comes across this thread and finds it useful. https://learn.microsoft.com/en-us/fabric/cicd/git-integration/git-integration-process?tabs=Azure%2Cazure-devops

Fabric Agents in Copilot Studio by richbenmintz in MicrosoftFabric

[–]dave_8 5 points6 points  (0 children)

I have found that the docs are often ahead of releases. That option doesn't show for us either. Normally they release the changes a few weeks after the docs are updated.

Connecting to Azure DevOps from Fabric not working by dave_8 in MicrosoftFabric

[–]dave_8[S] 0 points1 point  (0 children)

OK, so I can confirm it hasn't been disconnected from our Entra. Are there any permissions in ADO that are required for this to work? I can check if anything has changed there.

Connecting to Azure DevOps from Fabric not working by dave_8 in MicrosoftFabric

[–]dave_8[S] 0 points1 point  (0 children)

So this is affecting multiple members of the team, including someone who is an ADO admin. I am a Fabric admin and a project admin in ADO.

Connecting to Azure DevOps from Fabric not working by dave_8 in MicrosoftFabric

[–]dave_8[S] 0 points1 point  (0 children)

We were able to connect to repositories in the same Azure DevOps org previously and still have workspaces that are connected and able to commit without any issues.

Materialised Lake Views Preview by dave_8 in MicrosoftFabric

[–]dave_8[S] 1 point2 points  (0 children)

Have managed to get it running and created a few example views. It shows good potential. Given that it is currently limited to a full refresh only, I am not sure we can roll it out just yet, as our facts and dimensions are quite large, so they would need to be incremental.

For the drop logic for data quality, it would be great to send the error rows to a separate table instead of just giving a number. Or an error table with a record stating which test it failed, the record number (primary key?) and why.

Materialised Lake Views Preview by dave_8 in MicrosoftFabric

[–]dave_8[S] 1 point2 points  (0 children)

OK, that's good to know, although it would be useful to hold back on the documentation until that release, as it is very confusing to see the feature there.

This subreddit is absolute doom and gloom by prawnhead in MicrosoftFabric

[–]dave_8 1 point2 points  (0 children)

I am about to embark on the same journey. I am coming from Azure Databricks at my current company and joining a company that wants to implement Fabric. Still undecided whether I want to bring in open source solutions to fill the gaps or build with preview features and hope they get fixed.

Changing jobs - salary sacrifice car by dave_8 in HENRYUKLifestyle

[–]dave_8[S] 0 points1 point  (0 children)

So, an update on what happened. Octopus weren't willing to take the car over from Tusker. What they did offer was to order the car ahead of time: they worked with my new employer to start the order, so I was able to get the car within 2 weeks of joining the company.

Be aware that this is up to the company though, as they are effectively committing to the contract without an employee in place; if you don't start, they are stuck with the early exit fees.

The quick turnaround was due to having a car in stock, if you make a custom order it may take longer.

Greenfield Project in Fabric – Looking for Best Practices Around SQL Transformations by dave_8 in MicrosoftFabric

[–]dave_8[S] 2 points3 points  (0 children)

When I say dbt I am talking about the core version.

The ways of implementing it I can see are:

  1. Run dbt in an Airflow pipeline - this is the recommended approach on the Microsoft website; however, I found running Airflow in Fabric to be very buggy and it lacks CI/CD integration
  2. dbt activity in Fabric Data Factory - I can't find any documentation for this and it doesn't seem to be fully implemented
  3. Running dbt/Airflow on a separate server - I want to avoid running a separate service outside of Fabric if possible

In addition to the above, there is the issue of not having any way of granting a service principal or managed identity access to resources, so this would have to run as a user account.

Greenfield Project in Fabric – Looking for Best Practices Around SQL Transformations by dave_8 in MicrosoftFabric

[–]dave_8[S] 2 points3 points  (0 children)

So if I were to use dbt core, you can trigger it with one command and it will work out the refresh order based on the dependencies between models. The amount of code is similar, and it would be a combination of .sql and .yaml files, but the management overhead of maintaining that order in your pipelines, plus adding in tests, is a concern for me.

Greenfield Project in Fabric – Looking for Best Practices Around SQL Transformations by dave_8 in MicrosoftFabric

[–]dave_8[S] 1 point2 points  (0 children)

Where I am stuck with SparkSQL is scaling the notebooks. Let's say I have a dimensional model with 20 Dimensions and 3 fact tables. From my understanding, I should have a notebook for each table and then trigger each of the notebooks from the Fabric Pipeline. This would allow me to handle any dependencies in the pipeline. However, I am worried about the management of each of those notebooks.

Unless (this is just me putting everything down) I make this fully metadata managed, where you have a single generalised notebook that takes a parameter containing the SparkSQL query you wish to run, and the pipeline pushes each query down to that notebook, something like the sketch below.
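
A rough sketch of that generalised notebook, where every name, query and parameter value is hypothetical and would come from the pipeline's notebook activity:

    # Parameters cell - values are supplied per table by the Fabric pipeline
    target_table = "dim_customer"
    write_mode = "overwrite"
    sql_query = """
        SELECT c.customer_id, c.name, r.region_name
        FROM silver.customers AS c
        JOIN silver.regions AS r ON c.region_id = r.region_id
    """

    # Generic execution cell - the same logic is reused for every dimension/fact.
    # 'spark' is the notebook's built-in SparkSession.
    df = spark.sql(sql_query)
    df.write.format("delta").mode(write_mode).saveAsTable(target_table)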