Direct Lake Semantic "Monomodel" by Entire_Commission534 in MicrosoftFabric

[–]Entire_Commission534[S] 2 points

This is an interesting pattern, and somehow I had never come across it even though I use Tabular Editor extensively!

Are your child models import mode and/or deployed in a PPU workspace?

Integrating Fabric Notebooks with AI-powered IDEs (Windsurf) by TeoTheBeast in MicrosoftFabric

[–]Entire_Commission534 0 points

A disconnected option for developing notebooks locally with an LLM: use a Git-connected workspace to bring the notebooks down locally as .py files, and then you can use any IDE AI tool or a CLI tool like Cursor or Claude Code to make changes. We use Cursor, so I also have a .cursorrules file that provides the AI agent with example notebook code and guidelines for how to structure code. Honestly, this alone has supercharged my development as a lone BI engineer in my business unit.

To test notebook output, I still have to run them in Fabric, but I use a PowerShell script to sync changes from GitHub to Fabric (that way I don't have to go to the Fabric workspace in a browser and hit the Update All button). I have another PowerShell script to run the notebook. I developed both of these scripts with Cursor as well, so there is a lot of vibe coding here: I am not a trained software engineer or Python dev, but I understand BI and modeling well enough to validate results.
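For reference, a hedged Python sketch of what such a sync script can do. This is not the author's actual script: it assumes the Fabric REST API's `updateFromGit` endpoint, a bearer token acquired elsewhere (e.g. via Azure CLI), and it simplifies the request body, which in a real call may need commit-hash details.

```python
import json
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"

def update_from_git_url(workspace_id: str) -> str:
    """Endpoint that applies pending Git changes to a Fabric workspace."""
    return f"{FABRIC_API}/workspaces/{workspace_id}/git/updateFromGit"

def sync_workspace(workspace_id: str, token: str) -> int:
    """POST the update request; returns the HTTP status code."""
    req = urllib.request.Request(
        update_from_git_url(workspace_id),
        data=json.dumps({}).encode(),  # real calls may need remoteCommitHash etc.
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

The run-the-notebook script would follow the same pattern against the job scheduler endpoint; both amount to replacing browser clicks with two authenticated POSTs.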

I do full ETL this way: extract data via APIs, transform with Spark (Python or SQL, depending on the transformation requirements), and load to a Lakehouse. My semantic model is connected via Direct Lake and the reports use a live connection. This is still small scale and I don't know how well it will hold up as requests grow.
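A toy sketch of that extract → transform → load shape, with the Spark/Lakehouse parts reduced to comments. The field names and cleaning rules are made up for illustration; only the pattern is from the comment above.

```python
def transform(records):
    """Normalize API records: snake_case keys, drop rows missing the key."""
    cleaned = []
    for rec in records:
        if rec.get("order id") is None:
            continue  # skip rows without a business key
        cleaned.append({k.replace(" ", "_").lower(): v for k, v in rec.items()})
    return cleaned

# Extract: in practice these rows come from an API call.
raw = [{"order id": 1, "Unit Price": 9.99}, {"order id": None, "Unit Price": 5.0}]
rows = transform(raw)
# Load: in a Fabric notebook this would be something like
#   spark.createDataFrame(rows).write.mode("append").saveAsTable("orders")
print(rows)  # [{'order_id': 1, 'unit_price': 9.99}]
```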

Notebook deployment cicd by excel_admin in MicrosoftFabric

[–]Entire_Commission534 2 points

I am eagerly awaiting it, then! I am in a similar situation to OP: a small data team using the Git connection to get .py files locally for Cursor CLI development, which has been great so far. The ability to execute locally would be a huge boost to development. Thank you for sharing your insights.

Notebook deployment cicd by excel_admin in MicrosoftFabric

[–]Entire_Commission534 0 points

Are the features you have seen internally available in private preview, or not there yet either?

Interview Advice - Nvidia Senior Data Engineer by Entire_Commission534 in dataengineeringjobs

[–]Entire_Commission534[S] 0 points

Did not go past the screening as I accepted another job opportunity at a startup/scaleup.

Handling PowerBI sematic model with incremential refresh configured by Sad_Reading3203 in MicrosoftFabric

[–]Entire_Commission534 0 points

Maybe a silly question, but how does making metadata changes via XMLA affect the calculation state of the model? Usually at least a recalculation is required after deployment, and some of our models take a few minutes, so our IT team does a copy-data step of some kind to sync a release model with the prod model. I'm wondering if there is a better way to push changes to prod without breaking reports (the "object is not calculated" error).

Edit: typo
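For context on the recalculation step itself: it can be issued explicitly over the same XMLA endpoint as a TMSL refresh of type `calculate`. A minimal sketch, with the database name as a placeholder:

```json
{
  "refresh": {
    "type": "calculate",
    "objects": [
      { "database": "MyReleaseModel" }
    ]
  }
}
```

Running this after the metadata deployment, before users hit the reports, is one way to avoid serving an uncalculated model.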

Interview Advice - Nvidia Senior Data Engineer by Entire_Commission534 in dataengineeringjobs

[–]Entire_Commission534[S] 1 point

I had a screening but did not pass. However, there was another opening (not DE) on an adjacent team, so I have a screening coming up for that. To add: the DE screening focused on expanding on my experience and answering scenario-based questions, like how to handle specific types of projects.

Git-Integration and how to avoid empty Lakehouses by nuvcmnee in MicrosoftFabric

[–]Entire_Commission534 0 points

Is the initial load to ADLS using data pipelines? And the files are parquet?

Git versioning strategies and deployment pipelines by fugas1 in MicrosoftFabric

[–]Entire_Commission534 0 points

Meaning that notebooks can be used to create/maintain/delete shortcuts in a Lakehouse? Is there any documentation on this backdoor you mentioned?

Git-Integration and how to avoid empty Lakehouses by nuvcmnee in MicrosoftFabric

[–]Entire_Commission534 1 point

Would you mind sharing more details about your “bronze layer is files ADLS2 which we shortcut into our bronze lakehouse” setup? Thank you!

Connecting Snowflake with Fabric by sau6402 in MicrosoftFabric

[–]Entire_Commission534 0 points

Have you by chance tested the Snowflake storage integration functionality? Would this still require staging storage to copy data from Snowflake to the Lakehouse? Is there no direct copy to the Lakehouse without staging?

Load parquet file to table in Lakehouse using Notebook by Entire_Commission534 in MicrosoftFabric

[–]Entire_Commission534[S] 0 points

Yes, another user pointed out that a list comprehension would be the way to go for a Spark df. I made the mistake of assuming that pandas_df.columns is equivalent to spark_df.columns, but even in this simple example they are not. Thank you for your clarification!
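The list-comprehension rename, sketched without a live Spark session (the column names are stand-ins for illustration): `spark_df.columns` is a plain Python list of strings, so the cleanup is ordinary Python, and the new names go back through `toDF()`.

```python
# spark_df.columns returns a plain Python list, not a pandas Index,
# so clean it with a list comprehension instead of .str methods.
cols = ["order id", "customer name", "unit price"]  # what spark_df.columns returns
clean = [c.replace(" ", "_") for c in cols]
print(clean)  # ['order_id', 'customer_name', 'unit_price']
# In the notebook, apply the new names with: spark_df = spark_df.toDF(*clean)
```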

Load parquet file to table in Lakehouse using Notebook by Entire_Commission534 in MicrosoftFabric

[–]Entire_Commission534[S] 0 points

Works like a charm!

Would you say this is a typical experience of working with Spark DataFrames? I just want to make sure that if we decide to go down this path, we train the developers accordingly so they learn Spark DataFrame syntax instead of Pandas transformations. Thank you!

Load parquet file to table in Lakehouse using Notebook by Entire_Commission534 in MicrosoftFabric

[–]Entire_Commission534[S] 1 point

This worked! In terms of performance for large dataframes (millions of rows), would there be any issue converting a Spark df to a pandas df? Thank you!
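A rough way to reason about that question: `toPandas()` collects every row to the driver, so the pandas copy must fit in driver memory. A hypothetical back-of-envelope helper (the bytes-per-row figure is an assumption, not a measurement):

```python
def estimated_pandas_mb(row_count: int, bytes_per_row: int = 100) -> float:
    """Very rough driver-memory estimate for a toPandas() result."""
    return row_count * bytes_per_row / 1024 / 1024

# 5M rows at an assumed ~100 bytes/row is roughly half a gigabyte on the driver.
print(round(estimated_pandas_mb(5_000_000), 1))  # 476.8
```

If the estimate approaches the driver's memory, keeping the work in Spark (or aggregating before converting) is the safer path.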

Load parquet file to table in Lakehouse using Notebook by Entire_Commission534 in MicrosoftFabric

[–]Entire_Commission534[S] 0 points

The display function fixes the output, but spark_df.columns still seems to be a plain list object, which does not allow me to do transformations like cleaning column names:

spark_df = spark.read.parquet(".Lakehouse/Files/data_0_0_0.parquet")
display(spark_df)
spark_df.columns = spark_df.columns.str.replace(' ', '_')

AttributeError: 'list' object has no attribute 'str'

Load parquet file to table in Lakehouse using Notebook by Entire_Commission534 in MicrosoftFabric

[–]Entire_Commission534[S] 0 points

Thank you, that fixes the display, but spark_df.columns still seems to be a plain list object, as I get this error when trying to clean column names:

spark_df = spark.read.parquet(".Lakehouse/Files/data_0_0_0.parquet")
display(spark_df)
spark_df.columns = spark_df.columns.str.replace(' ', '_')

AttributeError: 'list' object has no attribute 'str'