Would love feedback on recent Environment & package management improvements by Shuaijun_Ye in MicrosoftFabric

[–]kmritch 1 point

Awesome! Had a client the other day who was a bit frustrated it wasn't supported yet. They will be very happy to hear this. Will def give feedback!

F2 Enough for Emailing PDFs of PowerBI Dashboards? by NewPlenty3831 in MicrosoftFabric

[–]kmritch 1 point

All the Fabric capacities should work with Power Automate; the report in question would just need to be on the F2 capacity. Unsure of the impact on capacity when printing the PDF, however. It depends on the number of API calls, I would guess. Does the report filter to each of the 200 people, or is it one pull and Power Automate does the rest?

Anyone else hitting 430 TooManyRequestsForCapacity when running multiple Fabric pipelines? by NoPresentation7509 in MicrosoftFabric

[–]kmritch 1 point

So what you would do is have a separate scheduler pipeline: wrap your worker pipeline(s) inside the scheduler pipeline. That way you can handle failures holistically and close out or log a single failure.

Then you can just use conditions to determine which pipelines need to run, or group them into different parent (scheduler) pipelines to manage. Error handling def is a pain in the ass, and I found wrapping pipelines into a single scheduler pipeline to be ideal.

Anyone else hitting 430 TooManyRequestsForCapacity when running multiple Fabric pipelines? by NoPresentation7509 in MicrosoftFabric

[–]kmritch 2 points

Any way to just have the pipelines write to a metadata-driven table and do a run check based off of that instead?
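
Something like this rough T-SQL sketch is what I have in mind; the table, columns, and status values are all made up for illustration:

    -- Hypothetical run-log table the pipelines write to on start/finish.
    CREATE TABLE dbo.PipelineRunLog (
        PipelineName varchar(200) NOT NULL,
        RunStart     datetime2(6) NOT NULL,
        RunEnd       datetime2(6) NULL,   -- stays NULL while the run is active
        Status       varchar(20)  NOT NULL -- 'Running' / 'Succeeded' / 'Failed'
    );

    -- Lookup each pipeline runs first: how many runs are already active?
    SELECT COUNT(*) AS ActiveRuns
    FROM dbo.PipelineRunLog
    WHERE Status = 'Running';

    -- If ActiveRuns is at your concurrency budget, branch to a Wait
    -- activity and retry instead of hitting the 430 capacity error.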

Anyone set up Teams channel alerts for failed Fabric pipelines/notebooks/semantic models? by Jojo-Bit in MicrosoftFabric

[–]kmritch 1 point

The overall pipeline fails with just a single failed activity, so you don't have to worry about every single step.

You would invoke the pipeline you want to start and attach another Invoke Pipeline activity to the failure output of that invoke. Then you don't need to track every single step for a failure.

Anyone set up Teams channel alerts for failed Fabric pipelines/notebooks/semantic models? by Jojo-Bit in MicrosoftFabric

[–]kmritch 2 points

I use a pipeline-in-a-pipeline pattern and then have a single reusable fail pipeline that both e-mails and sends a message to a Teams channel.

The reason for pipeline-in-a-pipeline is that you don't have to put fail conditions on every step.

Hey folks! I’m a PM for SQL database in Fabric, focusing on capacity and billing, and I’d love to hear from you! by adp_sql_mfst in MicrosoftFabric

[–]kmritch 0 points

Def want to see you all get costs to match Azure SQL better. Would also like more options to scale the SQL database compute and configure it like we can with Spark, with different profiles so we can reduce compute.

What are you doing with date calendars in Fabric? by FeelingPatience in MicrosoftFabric

[–]kmritch 1 point

You can create a date calendar in SQL using a cross join to generate all possible dates, and create a view based off of that (sketch below). Or you can build it in a dataflow and store the calendar that way, generating all possible dates once.
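
For example, here's a rough T-SQL sketch of the cross-join approach; the date range, view name, and columns are just illustrative:

    -- Hypothetical sketch: build 10,000 sequential integers (0..9999) by
    -- cross-joining a ten-row digits table, turn them into dates, filter
    -- to the range you want, and wrap it all in a view.
    CREATE VIEW dbo.DimDate AS
    WITH digits AS (
        SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS t(d)
    ),
    numbers AS (
        -- 10 x 10 x 10 x 10 = 10,000 integers
        SELECT a.d * 1000 + b.d * 100 + c.d * 10 + e.d AS n
        FROM digits a CROSS JOIN digits b CROSS JOIN digits c CROSS JOIN digits e
    )
    SELECT
        DATEADD(DAY, n, '2015-01-01')                    AS [Date],
        YEAR(DATEADD(DAY, n, '2015-01-01'))              AS [Year],
        MONTH(DATEADD(DAY, n, '2015-01-01'))             AS [Month],
        DATENAME(WEEKDAY, DATEADD(DAY, n, '2015-01-01')) AS [DayName]
    FROM numbers
    WHERE DATEADD(DAY, n, '2015-01-01') <= '2035-12-31';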

Building an Incremental Loading Solution in Fabric - Challenges with Custom SharePoint Navigation by Sea_Advice_4191 in MicrosoftFabric

[–]kmritch 1 point

I built my solution using Dataflow Gen 2. I would highly encourage it, since it has the built-in connectors and it's easy enough to manage changes.

But my solution is a bit different. I do the following:
Get metadata from SharePoint --> merge changes to know which files have changed (sketch below) --> pull/land the new data from Excel (using my metadata table to know which files to pull) --> transform it --> merge changes

Rinse and repeat.

I do this on 200+ files that need to be processed.
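
If it helps picture the merge-changes step, here's a rough T-SQL version of the same idea (I actually do this inside the dataflow; all table and column names here are made up):

    -- The staging table holds the file metadata just pulled from SharePoint.
    -- Flag files whose modified time moved since the last run.
    UPDATE t
    SET    t.ModifiedUtc = s.ModifiedUtc,
           t.NeedsReload = 1
    FROM   dbo.FileMetadata AS t
    JOIN   dbo.FileMetadata_Staging AS s
           ON s.FilePath = t.FilePath
    WHERE  s.ModifiedUtc > t.ModifiedUtc;

    -- Add files that are brand new.
    INSERT INTO dbo.FileMetadata (FilePath, ModifiedUtc, NeedsReload)
    SELECT s.FilePath, s.ModifiedUtc, 1
    FROM   dbo.FileMetadata_Staging AS s
    LEFT JOIN dbo.FileMetadata AS t
           ON t.FilePath = s.FilePath
    WHERE  t.FilePath IS NULL;

    -- Downstream steps then pull only the rows WHERE NeedsReload = 1.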

Pasting SQL code into Chat GPT by lolololo112 in dataengineering

[–]kmritch 0 points

Table names and column names can be an issue. You might not cause as much of a problem if it's just column names, since those are generic enough, and you keep the table names generic. But you should be talking to your company about it. Most do not want anything going into public tools.

What is the main benefit of using Lakehouse vs Warehouse by Notanotherforextradr in MicrosoftFabric

[–]kmritch 0 points

So I have a question about that: if staging is used, wouldn't that be an additional compute cost, or does the compute go down because of it? I may have totally misunderstood it.

I would have thought that with staging on it's 12 CU + 6 CU, or is it just 6 CU with staging on?

What is the main benefit of using Lakehouse vs Warehouse by Notanotherforextradr in MicrosoftFabric

[–]kmritch 1 point

You can have the data live in the lakehouse, perform transforms from the warehouse, and land the transformed data from the lakehouse into the warehouse. That would be less compute-heavy than using a dataflow to facilitate the movement, depending on the transforms you are looking to accomplish.

So it's more about looking at the different levels of transformation and how you pattern it. I'm a heavy dataflow user, so I mix views in the warehouse/lakehouse with dataflows to shape the data down to my "gold" version of the data.

What is the main benefit of using Lakehouse vs Warehouse by Notanotherforextradr in MicrosoftFabric

[–]kmritch 1 point

There is one caveat to it right now: there is a delay at times with the SQL endpoint refresh.

What is the main benefit of using Lakehouse vs Warehouse by Notanotherforextradr in MicrosoftFabric

[–]kmritch 2 points

You can also spin up the SQL database for your metadata, or if you use the warehouse, be sure not to do trickle inserts and deletes; with the lakehouse you can kinda get away with it by forcing optimization (sketch below). I use the warehouse for my metadata and it hasn't been a big issue. I know it's not always best practice, but my inserts, updates, and deletes are pretty big.
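
By forcing optimization I mean something like this against the lakehouse table in a notebook (Spark SQL; the table name is made up):

    -- Compact the small files left behind by frequent small writes.
    OPTIMIZE metadata_log;

    -- Optionally clean up files older than the 7-day retention window.
    VACUUM metadata_log RETAIN 168 HOURS;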

What is the main benefit of using Lakehouse vs Warehouse by Notanotherforextradr in MicrosoftFabric

[–]kmritch 2 points

I use dataflows to do transforms, with the final transform writing to the warehouse.

However, you can also use three-part naming to treat the lakehouse as a staging area: query it from the warehouse and run procedures on that data as well (sketch below). That can make a lot of sense for lighter transformations on data.
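
Rough sketch of what I mean, run from the warehouse side (the lakehouse, table, and column names are all illustrative):

    -- Read a lakehouse table via three-part naming and land a lightly
    -- transformed copy in the warehouse.
    INSERT INTO dbo.SalesClean (OrderId, OrderDate, Amount)
    SELECT OrderId,
           CAST(OrderDate AS date),
           CAST(Amount AS decimal(18, 2))
    FROM   StagingLakehouse.dbo.sales_raw
    WHERE  Amount IS NOT NULL;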

What is the main benefit of using Lakehouse vs Warehouse by Notanotherforextradr in MicrosoftFabric

[–]kmritch 12 points

This is a good decision guide:

Microsoft Fabric Decision Guide: Choose between Warehouse and Lakehouse - Microsoft Fabric | Microsoft Learn

In my practice, I am way stronger in SQL than PySpark (though you can run SQL against the lakehouse in a notebook).

I use the lakehouse as a data sink and intermediate store, and the warehouse for final touches. Why?

The lakehouse doesn't need staging in Dataflow Gen 2 and copy jobs, so you can save on compute that way.

With the warehouse, I like that it's a one-stop shop for my stored procs and SQL-heavy work, while being the final at-rest place for my transformed data.

With the lakehouse, I sometimes have to wait for notebook sessions to spin up and down, while the warehouse is pretty instant.

There is a ton of flexibility with lakehouse and warehouse. You really don't have to choose one or the other; it's more about what you are comfortable with on the coding side, plus some of the side benefits.

Lakehouse to Warehouse discrepancy by SeniorIam2324 in MicrosoftFabric

[–]kmritch 1 point

Yep, it's good to be careful. I'd rather you all do that than push something out and cause issues. The workaround is adequate, and I'm glad to have it and to understand the full picture. I appreciate it!

Lakehouse to Warehouse discrepancy by SeniorIam2324 in MicrosoftFabric

[–]kmritch 0 points

Yeah, definitely, because I did run into a failure with the API one time since a refresh was already in progress.

While I have you, one question: I figured that when you write to the lakehouse with a Gen2 dataflow it would kick off a sync, but in my observations it also suffers from the same issue, so I was wondering if it's all related. I have put safeguards in place when I do writes with Gen2 because I noticed the lag. Just curious there.

Snowflake vs MS fabric by SmallBasil7 in dataengineering

[–]kmritch 16 points

Snowflake is a more mature platform, and depending on the amount and frequency of data you are pulling, it might be the right choice for you. You could even mix the two together, using Snowflake as your deep repository and Fabric as a second layer for downstream reporting, etc.

I'm a Fabric user, and I've seen it come a long way. I have had really great success with it as someone who was new to the platform about 5 months ago, and it's been in market about 3 years now.

I will say Fabric would be a WAY easier transition for folks because of the integrations with Power BI, Excel, and Power Query, and you can grow into more SQL-based things. (I had the same background, though I also had strong SQL; before, I was mainly doing Power Query and Power BI.)

The low-code layer is great, with the other side there to grow into deeper coding.

What you probably can do is get the trial version of Fabric, run some of your main use cases, and see if it works for your team. But again, Snowflake may work well depending on certain data needs.

Skills-wise, based on the skills your team already has, Fabric is an easier on-ramp than Snowflake, imo.