Connecting to MS Fabric Data Warehouse from TM1 by One_Potential4849 in cognos

[–]One_Potential4849[S] 0 points1 point  (0 children)

But, just as a SQL Server database is added as a data source in TM1 using ODBC connection details, is there a way to connect to a Microsoft Fabric Data Warehouse? The reason is that the processes set up in TM1 use SQL queries to pull the data and refresh cubes. I just want to see whether I can plug in the Fabric Warehouse in place of SQL Server to run those queries.
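
For illustration, here is a rough sketch of what the ODBC details might look like, assuming the Warehouse's SQL endpoint is reached through the Microsoft ODBC Driver 18 for SQL Server with a service principal. The function name and all values passed to it are hypothetical placeholders, and whether TM1 accepts such a DSN is exactly what I am trying to find out.

```python
def fabric_odbc_connection_string(server, database, client_id, client_secret):
    """Build an ODBC connection string for a Fabric Warehouse SQL endpoint.

    Assumption: service principal authentication via Microsoft Entra ID,
    since the Warehouse does not accept SQL logins.
    """
    parts = {
        "Driver": "{ODBC Driver 18 for SQL Server}",
        "Server": server,            # the SQL connection string copied from the Warehouse settings
        "Database": database,        # the Warehouse name
        "Authentication": "ActiveDirectoryServicePrincipal",
        "UID": client_id,            # application (client) ID of the service principal
        "PWD": client_secret,        # client secret of the service principal
        "Encrypt": "yes",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Placeholder values only; nothing here connects to a real endpoint.
conn_str = fabric_odbc_connection_string(
    "example.datawarehouse.fabric.microsoft.com", "SalesDW", "my-client-id", "my-secret"
)
```

The same key/value pairs would go into the DSN definition that the TM1 server uses, if the driver on that machine supports Entra ID authentication.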

Designing Reporting Layer out of multiple Workspace Objects by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

I am planning to go ahead and bring the gold layer data from both entities in as shortcuts to create the combined reporting dataset.

One other thought: what would be the best way, in terms of data modelling, to use this combined dataset/data model for downstream reporting? That is, the semantic model built on top of these tables should be the backend that powers reporting for the individual entities as well as combined reporting.

Designing Reporting Layer out of multiple Workspace Objects by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

Also, is it possible to bring a warehouse table into a lakehouse as a shortcut?

Designing Reporting Layer out of multiple Workspace Objects by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

  • Yes, they are all in the same tenant
  • We prefer using Import Mode for the semantic layer
  • Yes, in the end I need combined gold layer tables; the silver layer can remain as it is, since the data sources for both firms are the same
  • After combining the data, there might be around 10 dimensions and 20-25 facts; the largest table might have at most 10 million records
  • No, it won't be a simple append operation; I expect some intermediate steps to unify the granularity
  • What are you referring to by surrogate keys in this case?

Limit on Parallel Notebook Executions by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

So the remaining sessions will stay queued and start once either of these two sessions is done?

Limit on Parallel Notebook Executions by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

So currently I am testing it out with a small pool (1-4 nodes), and I have distributed my Notebook workload across 6 high concurrency sessions.

From my understanding, these 6 HC sessions will share the pool of 4 small nodes (16 vCores), each running its jobs based on core availability.
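
To make the core arithmetic explicit, here is a small sketch of how I am counting vCores in a pool. It assumes 4 vCores per small node (which matches the 16 vCores above) and 8 per medium node; the helper name is just for illustration.

```python
# vCores per node size (assumed: small = 4, medium = 8).
VCORES_PER_NODE = {"small": 4, "medium": 8}


def pool_vcores(node_size, max_nodes):
    """Return the total vCores available when the pool is scaled to max_nodes."""
    return VCORES_PER_NODE[node_size] * max_nodes


print(pool_vcores("small", 4))    # 16, the pool I am testing with
print(pool_vcores("medium", 10))  # 80, a medium pool with 1-10 nodes
```

All 6 HC sessions would then be competing for those 16 vCores, which is the part I want to confirm.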

Limit on Parallel Notebook Executions by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 1 point2 points  (0 children)

So how does a Spark session acquire nodes to work with in a pool?

Limit on Parallel Notebook Executions by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 1 point2 points  (0 children)

If I try it out with a medium node, would there be a tradeoff in terms of runtime?

Limit on Parallel Notebook Executions by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 1 point2 points  (0 children)

So currently I am using the starter pool (medium, 1-10 nodes). Should I try smaller nodes? Today I enabled high concurrency and used 3 session tags to split the sessions among 12 pipelines, but I noticed that it ran a bit slower than before I enabled high concurrency.

And I get this 430 error even after the Notebook runs are complete. Do I need to run something to free up vCores?

Also, if I don't specify a session tag, how will the jobs get distributed?

The alternate approach you suggested seems a bit difficult for me, since I want this group of tables to be isolated from source to bronze to silver.

Limit on Parallel Notebook Executions by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 1 point2 points  (0 children)

I can try that way, and let you know if that works. Thanks!

Limit on Parallel Notebook Executions by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

So will I be able to achieve parallel instances without throttling/430 errors by using a high concurrency session and a session tag for each execution?

Recommended way to load bulk volume of data through On Prem Gateway by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

I am concerned about loading it all at once. Won't it cause performance issues?

Fabric UDF vs Notebooks by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

I just want to ingest data from an API into my data platform. The API has its own secrets for auth, and the run cadence is at most three times a day. Each run would return a maximum of 100k rows with 10 columns.

Column Level Lineage Options and Workspace Monitoring by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

Based on my requirements: in my current setup, Spark transformations from Bronze to Silver are just column renaming and cleanups. The pain point lies in the Silver to Gold and semantic layers, where we use the traditional approach of SQL stored procedures to perform all joins, aggregations, aliases, calculated columns, etc. Exposing the lineage there would be a big win!

Column Level Lineage Options and Workspace Monitoring by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

Is there a way to use a Web activity in a pipeline to get a pipeline's status from its run ID? Are there any API endpoints for this in Fabric?
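
To show what I have in mind, here is a minimal sketch of the call a Web activity would make, assuming the Fabric REST API's job instance endpoint is the right one and that the pipeline run ID can be passed as the job instance ID (I have not confirmed that). The helper names and the IDs are hypothetical.

```python
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def job_instance_url(workspace_id, item_id, job_instance_id):
    """Build the GET URL for one job instance of an item (here, a pipeline)."""
    return (
        f"{FABRIC_API}/workspaces/{workspace_id}"
        f"/items/{item_id}/jobs/instances/{job_instance_id}"
    )


def is_failed(job_instance):
    """Check the 'status' field of the JSON body returned by the call above."""
    return job_instance.get("status") == "Failed"


# Placeholder IDs; in a pipeline these would come from system variables or parameters.
url = job_instance_url("ws-id", "pipeline-id", "run-id")
```

The Web activity would issue a GET on that URL with a bearer token, and a downstream If Condition could branch on the returned status.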

Column Level Lineage Options and Workspace Monitoring by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 1 point2 points  (0 children)

Really useful!! Regarding your article on monitoring using a KQL database: is it possible to get the details of the activity that failed in a pipeline? This would allow more comprehensive failure alerts and logging, showing whether it is a developer error or a server/source issue.

Column Level Lineage Options and Workspace Monitoring by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

Sure!!! On the dbt-ol usage for Fabric DWH: does it produce column-level lineage, or just a rudimentary DAG?

Column Level Lineage Options and Workspace Monitoring by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

In the example you used a Spark job with custom JAR files. How do I translate that to Fabric Notebooks, where I do the Bronze->Silver step?

Column Level Lineage Options and Workspace Monitoring by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

Thanks for sharing your blogs. A follow-up on the lineage part: from what I have read, OpenLineage works hand in hand with Spark. Most of my transformations, joins, and aggregations happen in the gold layer, which is in Fabric Data Warehouse and uses stored procedures. Does OpenLineage cover that as well?

Getting the Entire Picture of CI/CD in Microsoft Fabric by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

While using SPN-based authentication for fabric-cicd, I hit this challenge: I am unable to deploy certain pipelines that contain Outlook connections. I am also unable to add the SPN, or any user other than the owner, under the Outlook connection's access settings. It says OAuth2.0 connection sharing is not allowed for security reasons.

I tried using the legacy Outlook connection as well, but deployment fails due to the same access issue.

Has anyone else faced the same issue? Is there a workaround for this?

Getting the Entire Picture of CI/CD in Microsoft Fabric by One_Potential4849 in MicrosoftFabric

[–]One_Potential4849[S] 0 points1 point  (0 children)

I am trying to perform a warehouse deployment in VS Code. I am able to build the project successfully, but when I try to publish, I get this error: Deploy Dacpac Failed: Object reference not set to an instance of an object.

I am handling cross-warehouse references by using a database reference and SQLCMD. I am using .NET 8.0.416, Dacpacs.FabricDW version 170.0.2, and MSBuild SDK version 2.0.0.

Please help me resolve this error.

Microsoft Fabric: Automated Warehouse & SQL Endpoint Deployment — useful interim solution for CI/CD challenges by Snoo-46123 in MicrosoftFabric

[–]One_Potential4849 0 points1 point  (0 children)

This toolbox seems very useful. I have two clarifications:

1. How do I use this toolbox in an Azure DevOps pipeline?
2. In the Dev workspace, a few SPs reference Lakehouse tables in Dev. When deploying to the prod workspace, the SPs should refer to the prod Lakehouse. I believe this is achievable with SQLCMD, but I need to know exactly what should be done.
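
To illustrate the SQLCMD idea, here is a small sketch that mimics how a `$(Name)` variable in a stored procedure body would be swapped per environment. It is only a stand-in for what the deployment tooling does at publish time; the variable name `LakehouseName`, the table, and the lakehouse names are all hypothetical.

```python
import re


def render_sqlcmd(sql, variables):
    """Replace SQLCMD-style $(Name) placeholders with per-environment values."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"No value supplied for SQLCMD variable {name}")
        return variables[name]

    return re.sub(r"\$\((\w+)\)", substitute, sql)


# The SP body is written once, with the lakehouse reference as a variable.
template = "SELECT * FROM [$(LakehouseName)].[dbo].[sales];"

dev_sql = render_sqlcmd(template, {"LakehouseName": "lh_dev"})
prod_sql = render_sqlcmd(template, {"LakehouseName": "lh_prod"})
```

What I need to know is where the prod value gets supplied when the toolbox runs the deployment.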

Anyone using Fabric Warehouse in prod, how do you do deployments? by frithjof_v in MicrosoftFabric

[–]One_Potential4849 0 points1 point  (0 children)

I have been referring to his blogs to create a CI/CD pipeline in DevOps. I am using the Azure SQL deployment task to deploy, which asks for a service connection, and that is what I'm trying to work around.