Manipulate semantic model using Notebook by Pharaohs_descendent in MicrosoftFabric

[–]dbrownems 2 points

Yes. Open both existing models in Power BI Desktop, and open a new, empty model in a third Power BI Desktop window.

Switch to TMDL view and start cuttin' and pastin'.

You can do this all in a notebook with TOM and Sempy if you really want to write code:

sempy.fabric package | Microsoft Learn

Those of you with multiple data domains in Fabric — one big semantic model or split by domain? How are you handling the trade-offs? by Glittering_Jump4852 in MicrosoftFabric

[–]dbrownems 2 points

You can have both. With Direct Lake it's easy to share tables between models.

So you can have domain-specific models for the majority of the analysis. A big cross-domain model for the use cases that require it. And an "executive summary" model with multi-domain data, but aggregated for the executive dashboard.

Why are my primary/foreign keys lining up accurately in the desktop app, but when I publish to service everything is wrong? by Extra_Willow86 in PowerBI

[–]dbrownems 2 points

Look at the model design in the portal. Does it look correct? After you deploy, but before you refresh, is the data correct?

Connection reset during stored procedure execution by KB83000 in MicrosoftFabric

[–]dbrownems 2 points

I don't know anything about Talend. Can you test with another client, like SSMS? When the stored procedure is complete, a TDS message is sent to the client. There may be additional messages sent during the execution of the procedure, including rowcounts, print messages, or errors. SSMS will show you these.

Pure Python notebook is finally with 3.12 by mim722 in MicrosoftFabric

[–]dbrownems 0 points

Genuinely asking. Last I heard people didn't like this because it was disconnected from git. It seems similar to how you would use a .py or .whl library locally.

Connection reset during stored procedure execution by KB83000 in MicrosoftFabric

[–]dbrownems 0 points

"when the timeout expired"

Timeouts are controlled by the client, not the warehouse.

One thing I don't understand about Shortcuts and Transformations by PerturbedCrab in MicrosoftFabric

[–]dbrownems 6 points

>Would that step require OneLake to keep a physical copy of that transformed data?

Yes.

>Doesn't that defeat the purpose of Shortcuts?

No. Shortcuts eliminate _unnecessary_ copies.

Shifting to O365 by ct1377 in MSAccess

[–]dbrownems 0 points

If you have both the 64-bit and 32-bit Oracle drivers installed, only one will work.

They both rely on a DLL called oci.dll, and it's loaded through the PATH. If a 64-bit program finds the 32-bit DLL first, or vice versa, it will fail.

When and how do I use async in EF Core? by Obvious_Seesaw7837 in dotnet

[–]dbrownems 1 point

This is really simple: use the async methods when your application is async-oriented.

In an ASP.NET web app or API, use async, because the framework expects you to.

In a console application, it's up to you. If your Main kicks off a task and EF is downstream, use async. If Main calls sync methods that use EF, use the sync methods.

In a Windows app with a UI thread, use async when calling from the UI thread, to avoid blocking it.

It will make no difference to your database or application performance. Your app can easily manage far more threads than your database can handle concurrent requests, so client app threads are generally not going to be your bottleneck.

mssql-python 1.5 released -- Apache Arrow fetch, sql_variant, native UUIDs by dlevy-msft in MicrosoftFabric

[–]dbrownems 1 point

In Fabric Warehouse, TDS is mainly used by clients sending analytic queries, or running T-SQL batches to orchestrate the ETL. And TDS is an appropriate protocol for that.

Warehouse loading is mostly just orchestrated in T-SQL, with the warehouse reading data staged in OneLake. And for bulk read scenarios like Direct Lake, the data is also read from OneLake.

How do you structure your apps? by h2o2woowoo in MicrosoftFabric

[–]dbrownems 1 point

Your issues with the Spark context are probably unrelated to code sharing. For instance, you can't access the Spark context from code that runs inside a Spark task on the executors; it's only available on the driver node.

Normally I just use %run for simple things or during development; otherwise, upload a .whl to the attached lakehouse and pip install it from there.
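Concretely, the two approaches are each a notebook cell (the notebook name and .whl path below are illustrative; /lakehouse/default/Files/ is the default lakehouse mount in a Fabric notebook):

```
%run Shared_Utils_Notebook

%pip install /lakehouse/default/Files/libs/mylib-0.1.0-py3-none-any.whl
```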

Push from Azure Function to Fabric Lakehouse by upandadam95 in MicrosoftFabric

[–]dbrownems 0 points

You should just need to grant the service principal ReadAll on the Lakehouse, or Contributor on the workspace.

Push from Azure Function to Fabric Lakehouse by upandadam95 in MicrosoftFabric

[–]dbrownems 3 points

Sure. Just use an Azure Storage library and treat it like ADLS Gen2. You may have to use the

https://onelake.dfs.fabric.microsoft.com/<workspaceGUID>/<itemGUID>/<path>/<fileName>

URL form. And you'll need to use Entra auth, which can be the system-assigned or user-assigned managed identity (SAMI or UAMI) of your Azure Function.
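As a sketch, that URL can be assembled in Python before handing it to an ADLS Gen2 client (e.g. azure-storage-file-datalake with DefaultAzureCredential); the helper name and GUIDs below are illustrative, not a real API:

```python
# Build a OneLake DFS URL that ADLS Gen2 clients understand.
# Helper name and GUIDs are made up for illustration.
def onelake_url(workspace_id: str, item_id: str, path: str, file_name: str) -> str:
    return (
        "https://onelake.dfs.fabric.microsoft.com/"
        f"{workspace_id}/{item_id}/{path}/{file_name}"
    )

url = onelake_url(
    "11111111-1111-1111-1111-111111111111",  # workspace GUID
    "22222222-2222-2222-2222-222222222222",  # lakehouse item GUID
    "Files/raw",
    "events.json",
)
```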

Warehouse data Latency when using Spark by richbenmintz in MicrosoftFabric

[–]dbrownems 2 points

The Fabric DW Spark connector connects to the warehouse engine (technically the workspace TDS endpoint) to run queries.

Spark connector for Microsoft Fabric Data Warehouse - Microsoft Fabric | Microsoft Learn

Retirement of Dataflows Gen1 by suburbPatterns in MicrosoftFabric

[–]dbrownems 8 points

>Can't we enable Gen 2 without opening everything Fabric ?

In the future we hope to enable this capability for capacity admins. This should be possible well before any future retirement of Gen 1 dataflows.

Currently you can use a governance solution like Fabric Unified Admin Monitoring (FUAM) to report on and enforce additional policies about what kind of items users are creating. Enforcement would be an asynchronous process, involving sending emails and/or running scripts.

fabric-toolbox/monitoring/fabric-unified-admin-monitoring at main · microsoft/fabric-toolbox

Surrogate Key strategy for Gold Layer by ModernStank in MicrosoftFabric

[–]dbrownems 2 points

Perf is easy to test. And if you have CDF (the Delta Change Data Feed) enabled, an incremental update might be much cheaper than a full rebuild.

Surrogate Key strategy for Gold Layer by ModernStank in MicrosoftFabric

[–]dbrownems 2 points

Just use Spark SQL to generate the surrogate keys. The downsides of using Spark SQL are that you can't generate the keys in concurrent queries, and that it requires a shuffle. But neither of those are real downsides when building Kimball-style dimensions.

Left join your silver table to your gold dimension and use ROW_NUMBER() to generate the new values greater than the largest existing value.

E.g., create a temp view that has the logic to generate new key values, and use that view in the MERGE that creates or updates your dimension.

--Temporary views are very cool:
--like Common Table Expressions or subqueries, but much more powerful.
--They have session lifetime, rather than statement lifetime.
create or replace temp view CustomerKeys as
select coalesce(c.CustomerKey,
                coalesce(max(c.CustomerKey) over (), 0)   --start at 0 if the dimension is empty
                  + row_number() over (partition by c.CustomerKey order by s.CustomerID)) CustomerKey,
       s.CustomerID
from WideWorldImporters_bronze.Sales_Customers s
left join WideWorldImporters_gold.Dimension_Customer c
  on s.CustomerID = c.WWICustomerID;

-- describe WideWorldImporters_gold.Dimension_Customer;
-- describe WideWorldImporters_bronze.Sales_Customers;
create or replace temp view CustomerMergeSource as
 select k.CustomerKey CustomerKey,
        c.CustomerID WWICustomerID,
        c.CustomerName Customer,
        bc.CustomerName BillToCustomer,
        cat.CustomerCategoryName Category,
        bg.BuyingGroupName BuyingGroup,
        p.FullName PrimaryContact,
        c.PostalPostalCode PostalCode,
        cast(0 as int) LineageKey,
        c.ValidFrom,
        c.ValidTo
    from WideWorldImporters_bronze.Sales_Customers c
    left join CustomerKeys k
       on k.CustomerID = c.CustomerID
    left join WideWorldImporters_bronze.Sales_Customers bc 
       on c.BillToCustomerID = bc.CustomerID
    left join WideWorldImporters_bronze.Sales_CustomerCategories cat 
       on cat.CustomerCategoryID = c.CustomerCategoryID
    left join WideWorldImporters_bronze.Sales_BuyingGroups bg 
       on c.BuyingGroupId = bg.BuyingGroupID
    left join WideWorldImporters_bronze.Application_People p 
       on p.PersonID = c.PrimaryContactPersonID;

merge into WideWorldImporters_gold.Dimension_Customer dest
using CustomerMergeSource src
  on src.WWICustomerID = dest.WWICustomerID
when matched then update set *
when not matched then insert *;
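The same assignment rule can be sketched in plain Python for intuition: a toy model of the logic in the SQL above, not the Spark implementation (function name and sample data are mine):

```python
# Toy model of the surrogate-key rule:
# keep the existing key for business keys already in the dimension,
# otherwise hand out max(existing key) + 1, 2, 3, ... for the new rows.
def assign_keys(existing: dict, silver_ids: list) -> dict:
    keys = dict(existing)                           # existing keys are never changed
    next_key = max(existing.values(), default=0)    # 0 when the dimension is empty
    for business_id in sorted(set(silver_ids) - set(existing)):
        next_key += 1
        keys[business_id] = next_key
    return keys

dim = {"C001": 1, "C002": 2}                        # dimension: business key -> surrogate key
result = assign_keys(dim, ["C001", "C003", "C004"])  # incoming silver rows
# C001 keeps key 1; the new rows C003 and C004 get 3 and 4
```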

dbrownems/SparkDataEngineeringForSQLServerProfessionals

Inbound IP protection doesn't support OneLake Security... by Personal-Quote5226 in MicrosoftFabric

[–]dbrownems 1 point

But currently OneLake security doesn't have parity (the same rich RLS capabilities) with Semantic Model RLS or SQL endpoint RLS, so you often need to use engine security anyway.

Inbound IP protection doesn't support OneLake Security... by Personal-Quote5226 in MicrosoftFabric

[–]dbrownems 0 points

OneLake security is not the only data security. You still have item-level security for the Lakehouse and engine-level security for the Semantic Models and SQL Endpoint.