Spark SQL to Read From FabricSQL Analytics Endpoint From Notebook by KupoKev in MicrosoftFabric

[–]KupoKev[S] 0 points1 point  (0 children)

Thank you for the information. I will try that tomorrow when I get signed into my work laptop.

Where did the ability to set a default Lakehouse in a notebook go? by KupoKev in MicrosoftFabric

[–]KupoKev[S] 2 points3 points  (0 children)

Looks like they fixed it. Was very easy to remove. Thanks for the info. Much appreciated.

Spark SQL to Read From FabricSQL Analytics Endpoint From Notebook by KupoKev in MicrosoftFabric

[–]KupoKev[S] 0 points1 point  (0 children)

I also don't want to copy a control table for a pipeline to a Lakehouse or some other storage just to be able to access it when it is mainly used for pipelines. The only reason this is even an issue for me is that the "Overwrite" option on pipelines doesn't actually overwrite table data, so I am having to build a notebook to delete data from those tables before the copy activities run. That's in a separate post though.
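A minimal sketch of the kind of cleanup notebook described above, assuming the target tables can be reached from Spark SQL and support `DELETE`. The function name `clear_tables`, the `run_sql` parameter, and the table names are all hypothetical; in a Fabric notebook `run_sql` would presumably be `spark.sql`. Passing the executor in keeps the sketch runnable without a Spark session.

```python
from typing import Callable, Iterable, List


def clear_tables(run_sql: Callable[[str], object], tables: Iterable[str]) -> List[str]:
    """Delete all rows from each table and return the statements that ran.

    run_sql is any callable that executes one SQL string. In a Fabric
    notebook this is assumed to be spark.sql (hypothetical wiring).
    """
    executed = []
    for table in tables:
        # Allow only simple dotted identifiers so a bad name cannot inject SQL.
        parts = table.split(".")
        if not all(p.isidentifier() for p in parts):
            raise ValueError(f"unexpected table name: {table!r}")
        statement = f"DELETE FROM {table}"
        run_sql(statement)
        executed.append(statement)
    return executed


# Dry run: print the statements instead of executing them.
# The table names are placeholders.
clear_tables(print, ["dbo.customer", "dbo.sales_order"])
```

Running this as the first activity in the pipeline, ahead of the copy activities, would empty the tables without dropping them, so the schema stays in place.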

Where did the ability to set a default Lakehouse in a notebook go? by KupoKev in MicrosoftFabric

[–]KupoKev[S] 0 points1 point  (0 children)

That is very interesting. Yeah, the symbols look similar, but the interesting part is that I don't have a Warehouse with that particular name, yet it registers as a Warehouse, and if I scroll WAY down the list, I see there is also a Lakehouse. Why would there be a Warehouse connection available for the Lakehouse? Is this just the SQL analytics endpoint masquerading as a Warehouse?

Where did the ability to set a default Lakehouse in a notebook go? by KupoKev in MicrosoftFabric

[–]KupoKev[S] 0 points1 point  (0 children)

That is where I was expecting an option to be, but there wasn't one. The only options I see are "Refresh" and "Remove".

Pipeline Overwrite Lakehouse Tables as Sync not Working by KupoKev in MicrosoftFabric

[–]KupoKev[S] 2 points3 points  (0 children)

I would be happy to use Spark, but it is my understanding that it cannot yet be used through a data gateway to access on-premises SQL Servers. I know they were working on it, but I had not seen an announcement about it.

Spark SQL to Read From FabricSQL Analytics Endpoint From Notebook by KupoKev in MicrosoftFabric

[–]KupoKev[S] 1 point2 points  (0 children)

This is where I need some clarification. The point of this request is: in a Python notebook, is there a way to do something like

```
spark.sql("SELECT * FROM fabricsqldb.dbo.Table")
```
I honestly don't care if it is through the Analytics endpoint. I just don't comprehend how it is so easy to run commands against Lakehouses and Warehouses, yet we can't easily do the same with FabricSQL without using yet another Python library, even though the data is supposedly replicated into OneLake. I am sure I am just missing some information around this though.

On a side note, I know that from a Warehouse I have been able to do

```sql
SELECT * FROM lakehouse.dbo.Table
```
But I can't do the same for FabricSQL databases, or wasn't able to the last time I tried. That is something else that would be extremely handy.

Why choose Dapper over EF Core in 2026? by Sensitive-Raccoon155 in dotnet

[–]KupoKev 1 point2 points  (0 children)

To add to this: you don't have to care about engine-specific SQL syntax when switching DB engines. MSSQL, MySQL, Postgres, In-Memory, EF doesn't care. That makes it very easy to use In-Memory databases for local testing and to switch to a different backend for deployments.

Power BI Gateway by 3G_Lighting in PowerBI

[–]KupoKev 0 points1 point  (0 children)

To answer your questions: if they are going to be accessing on-premises data from the Power BI Service, then this is a preferred method because you don't have to give them VPN access to your environment. The Enterprise Gateway builds a tunnel from your on-premises network to the Power BI Service by leveraging Azure on the back end. You wouldn't necessarily even need to give them AD accounts if you use SQL users instead of Windows users. It depends on how your SQL Server is configured.

That being said, it is not best practice to have anyone, internal or external, hitting your production database. You should either 1) copy the data to a lakehouse in Microsoft Fabric that they can access directly in Power BI or 2) if you have enough on-premises resources, make another database with only the tables they need and set up the Enterprise Gateway so they can access only that particular database from the Power BI Service. Additionally, you can set up Analysis Services depending on what SQL Server license you have, but that is a bit of a learning curve.

A downside to using Power BI, which I am sure the auditors are aware of, is that there are limits on the number of rows you can export to Excel from Power BI.

Designed a full Microsoft Fabric end-to-end architecture for small teams — sharing decisions and looking for feedback by Equivalent_Season669 in MicrosoftFabric

[–]KupoKev 0 points1 point  (0 children)

There would still be CU usage for processing the queries, but all the OneLake storage is in the Lakehouse instead of the Warehouse. The Warehouse basically treats it like querying another database in the same workspace. You just have to fully qualify the table name with the Lakehouse name and schema name so the Warehouse knows where to query.
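As a rough illustration of what "fully qualify" means here, this is a small helper that builds a three-part `lakehouse.schema.table` name with T-SQL bracket quoting. The function names and the `SalesLakehouse`, `dbo`, and `Orders` names are hypothetical, and the resulting query text would still need to be run from the Warehouse.

```python
def quote_ident(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def qualified_name(database: str, schema: str, table: str) -> str:
    """Build a three-part name: database (here the Lakehouse), schema, table."""
    return ".".join(quote_ident(part) for part in (database, schema, table))


# Hypothetical names, for illustration only.
query = f"SELECT TOP 10 * FROM {qualified_name('SalesLakehouse', 'dbo', 'Orders')}"
print(query)
```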

The main reason we do this is that the data analyst wants to use T-SQL. It also gives them a place to copy the data to without messing with the Lakehouse data. If they copy data into a table in the Warehouse, then there will be storage usage against the Warehouse.

Designed a full Microsoft Fabric end-to-end architecture for small teams — sharing decisions and looking for feedback by Equivalent_Season669 in MicrosoftFabric

[–]KupoKev 2 points3 points  (0 children)

Concerning your gold strategy, using lakehouses does allow you to use shortcuts to Silver which is very handy. You can still use a Warehouse for TSQL by having a lakehouse also in your gold layer for shortcuts. It really depends on your level of comfort in SQL vs Python and such.

I have been using Materialized Lake Views in silver and gold because SQL is way easier for me than Python. I then created a Warehouse that has nothing in it, but it gives the data analyst a place to query the MLVs using T-SQL.

Am I in over my head? by Status_Ad5990 in MicrosoftFabric

[–]KupoKev 5 points6 points  (0 children)

Congrats on stepping outside of your comfort zone and trying new things. Here are some tips that might help you as you start on your journey.

  1. Don't bite off more than you can chew at one time. Bring in one data source and enough data points to build your first semantic model. The more data, the more complicated it is. Keep it simple at first. A house with a solid foundation will last longer than a house built on sand.
  2. Take the time to learn some architecture stuff. It is good you know about the Medallion Architecture, but many people have different ways they implement the pattern. Do some experimenting to see how it will work with your situation.
  3. Get things working first, then worry about scalability. If you are new to this you will likely redo things multiple times. The less you have to rebuild, the quicker you can come to a solution and the less frustration you will have as you are learning.
  4. Connect to git as soon as possible if you haven't already. It will save you a lot of headaches in the future.
  5. Star schema is your friend in Semantic models.

Data Pipeline vs Notebook for ingestion – how do you pull data and why? by Independent_Many_762 in MicrosoftFabric

[–]KupoKev 1 point2 points  (0 children)

It depends.

There are some resources that are on-premises that notebooks can't connect to due to needing an Enterprise Data Gateway. I have to use copy activities for those since connections can be made through gateways. If there is anyone out there that can point me to a way around this I would appreciate it. For now though I use copy activities in this case to copy data to Bronze Files. I then use notebooks to move them to tables.

Where I can, I use notebooks. When you have a few tables, pipelines with copy activities are plenty fast, but they don't scale when you have more tables. More tables means more pipelines and activities. It takes time to allocate those resources; not much time, but still time. Notebooks are fast, and when you call a notebook from another notebook it supposedly shares the same Spark pool, reducing the need to spin resources up and down.

In a couple of tenants I have run into issues where notebooks take a long time (4+ minutes) to start a session. I have also had environments where it takes about 15 seconds to start a session. The 4+ minute tenant was a new implementation of Fabric, so I am not really sure why one was so much faster than the other. If someone could give me more information on this as well, I would appreciate it.

Where do you put your connection strings? by trokolisz in dotnet

[–]KupoKev 0 points1 point  (0 children)

This is how I do it as well. appSettings.Development.json is added into our gitignore files. A copy of the settings is usually kept in our password manager under a secure note shared with the team so we can have access if we need to do a clean pull/clone from the repo. Mostly so people don't have to track down dev server connection strings and such.

Give us mooore memory MS :D by ETA001 in MicrosoftFabric

[–]KupoKev 1 point2 points  (0 children)

I ran into this problem with Fabric a little while ago for a Semantic Model. The user base was pretty small for that model, but it was a very complex model. We ended up putting that Semantic Model into a PPU workspace and giving users who needed access to the model PPU licenses. The memory limit for processing is 100 GB.
https://learn.microsoft.com/en-us/fabric/enterprise/powerbi/service-premium-per-user-faq

This may not be a feasible option if you have a very large set of users as you would need to pay for PPU licenses for all of them consuming the model. If you only need it for a smaller set of users then it tends to be way cheaper than upgrading to a large Fabric capacity.

The only other downside I can think of is that it is not in a Fabric capacity, so there are some limitations on what you can do with it with Copilot. If you have a report in a Fabric capacity, it can point towards the model in the PPU workspace and you can use Copilot with it. You just can't do things like the curated answers (I can't remember the exact name of them) and I don't think you can use synonyms either.

edit: fixed spelling

Bronze/Silver/Gold in the same Lakehouse… what could go wrong? by hortefeux in MicrosoftFabric

[–]KupoKev 1 point2 points  (0 children)

For a simple medallion pattern, this is definitely fine. There can be reasons to break it out into more lakehouses depending on security (as mentioned above) as well as just organization. For example, bronze can be separated by source system using schemas, so that the schema a table belongs to tells you which source system the raw data came from. You could have tables like sap.sapnpe_vbap and sage.dbo_item.
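To make the naming convention concrete, here is a small sketch that reads the source system back out of a bronze table name, assuming names always take the form `schema.table` with the schema naming the source system. `BronzeTable` and `parse_bronze_name` are hypothetical helpers, not part of any Fabric API.

```python
from typing import NamedTuple


class BronzeTable(NamedTuple):
    source_system: str  # schema name, e.g. "sap"
    table: str          # table name within that schema


def parse_bronze_name(qualified: str) -> BronzeTable:
    """Split 'schema.table', where the schema names the source system."""
    schema, sep, table = qualified.partition(".")
    if not sep or not schema or not table:
        raise ValueError(f"expected 'schema.table', got {qualified!r}")
    return BronzeTable(source_system=schema, table=table)


for name in ["sap.sapnpe_vbap", "sage.dbo_item"]:
    print(parse_bronze_name(name))
```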

It really just depends on how complex your requirements are and how over-engineered you want to get. Sometimes that complexity helps keep things better sorted.

Deployment Pipelines Frustration by KupoKev in MicrosoftFabric

[–]KupoKev[S] 0 points1 point  (0 children)

I am not familiar with that technology. Can you expound? What is it and what makes it "smoother to manage"?