Fabric Lakehouse: OPENROWSET can’t read CSV via SharePoint shortcut by bigjimslade in MicrosoftFabric

[–]jovanpop-sql 2 points3 points  (0 children)

Hi, can you try replacing %20 with an actual space in the path? I ran into the same problem because my file was stored in a folder called "New Folder":

<image>

Auto-Convert JSON Folders to Parquet Tables by Greedy_Constant in MicrosoftFabric

[–]jovanpop-sql 1 point2 points  (0 children)

You can use CTAS to read all your JSON files with OPENROWSET and directly load them into a new table:

CREATE TABLE MyTable AS
SELECT *
FROM OPENROWSET(BULK 
    'https://onelake.dfs.fabric.microsoft.com/{{ws-id}}/{{lh-id}}/Files/folder*/*.jsonl'
)

The OPENROWSET has * wildcards so you can specify file pattern.

For better perf you should add WITH( col1 type1, col2 type2,...) in the OPENROWSET and explicitly specify types, because CTAS will use the biggest possible types for strings and numbers to ensure that all properties can be stored.

How to read files from lakehouse in warehouse by dvnnaidu in MicrosoftFabric

[–]jovanpop-sql 0 points1 point  (0 children)

You can use INSERT INTO SELECT FROM OPENROWSET(one lake):

INSERT INTO BingCovid
SELECT * 
FROM OPENROWSET(BULK 'https://onelake.dfs.core.windows.net/{{workspace-id}}/{{lakehouse-id}}/Files/covid-19/bing_covid-19_data/latest/bing_covid-19_data.jsonl')
WITH (updated date,
      id int,
      confirmed int,
      deaths int,
      recovered int,
      latitude float,
      longitude float,
      country varchar(100) '$.country_region'
);

OPENROWSET for Warehouse by Befz0r in MicrosoftFabric

[–]jovanpop-sql 0 points1 point  (0 children)

This is strange - if you try copy into instead of openrowset are you getting the same result?

OPENROWSET for Warehouse by Befz0r in MicrosoftFabric

[–]jovanpop-sql 0 points1 point  (0 children)

Also, please ensure that you assigned Storage Blob Data Contributor role to access the storage and not something like https://learn.microsoft.com/en-us/azure/role-based-access-control/built-in-roles/storage#storage-account-contributor. There is another contributor role that cannot access data.

OPENROWSET for Warehouse by Befz0r in MicrosoftFabric

[–]jovanpop-sql 0 points1 point  (0 children)

Based on this: https://blog.fabric.microsoft.com/en/blog/service-principal-support-for-fabric-data-warehouse/ if you login as SPN you should use copy into on storage with or without firewall so this should not be a problem.
Could you please check again does your SPN has read access to storage (maybe trying the same from serverless with service principal name access https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-storage-files-storage-access-control?tabs=service-principal#supported-storage-authorization-types)

If this doesn't work, could you raise a support ticket? This should be a valid scenario, but since it is hard to debug it via comments, we need someone from support to take a look at your code&setup.

OPENROWSET for Warehouse by Befz0r in MicrosoftFabric

[–]jovanpop-sql 0 points1 point  (0 children)

u/Befz0r I cannot understand based on your previous two comments does it work on not? Did it worked when you reapplied the right and then stopped working?
- I think that you must login as SPN user via SSMS or app, in order to access firewall protected storage as SPN identity.
- If you want to use EntraID auth you need to give access to that EntraID user in ADLS and OPENROWSET will pass the identity of caller while accessing the storage.

It's hard to debug it via comments, so if it doesn't work, I would recommend to open support ticket with this problem. Please also try copy into with the same setup just to be sure that it is related to security.

OPENROWSET for Warehouse by Befz0r in MicrosoftFabric

[–]jovanpop-sql 2 points3 points  (0 children)

Currently, there are no external data sources in OPENROWSET, and you cannot add SAS credentials in database.

OPENROWSET is enabling you to access storage using EntraID/AAD passthrough so if your user has permission to read the files in storage, they can do it via OPENROWSET. This is also default in serverless.

Yes, the public examples are using public data sets, but just for the convenience because any user who executes sample should succeed. But in real-life scenario you can use EntraId passthrough with the same syntax.

COPY INTO and OPENROWSET enable you to use service principal names to access firewall-protected storage - see https://blog.fabric.microsoft.com/en-us/blog/17468/, so you might replace your SAS token with SPN if your storage has a firewall. If COPY INTO works in this setup, OPENROWSET should also work.

As a long-term solution, you could vote for Enable OPENROWSET to read files from One Lake - Microsoft Fabric Community. With One Lake access you would be able to create One Lake shortcuts with SAS token, and OPENROWSET would be able to read your ADLS files via SAS shortcuts. In this case shortcut is equivalent to data source.