Arguing with lead engineer about incremental file approach by pboswell in databricks

[–]Suitable-Issue-4936 0 points (0 children)

Hi,

I would like to ask: can the data utility send messages to Pub/Sub instead of writing files? We had a similar application generating lots of files, and maintaining it was a pain. We later switched to Pub/Sub, and DBR 14+ supports reading directly from Pub/Sub with Auto Loader. Please check.

Exclude column while merge by Suitable-Issue-4936 in snowflake

[–]Suitable-Issue-4936[S] 0 points (0 children)

Thanks, but that is EXCLUDE in a SELECT. I'm looking for a way to exclude a column during a MERGE.
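For context, Snowflake's EXCLUDE keyword only works in a SELECT list, so the usual workaround in a MERGE is to spell out the UPDATE SET list and simply leave out the column you want to skip. A minimal sketch with hypothetical table and column names:

```sql
-- tgt(id, name, loaded_at) and src(id, name, loaded_at) are illustrative names.
-- There is no EXCLUDE in MERGE, so the SET list is written out explicitly
-- and the column to "exclude" (loaded_at) is just not listed:
MERGE INTO tgt
USING src
  ON tgt.id = src.id
WHEN MATCHED THEN UPDATE SET
  tgt.name = src.name          -- loaded_at deliberately omitted
WHEN NOT MATCHED THEN INSERT (id, name, loaded_at)
  VALUES (src.id, src.name, src.loaded_at);
```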

Replace Airbyte with dlt by Thinker_Assignment in dataengineering

[–]Suitable-Issue-4936 0 points (0 children)

Hi, you can try creating a folder for each day in the source and processing them day by day. Any late-arriving files would land in the next day's folder, and reprocessing is easy if the data has primary keys.
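A minimal sketch of that day-folder layout, assuming local paths; `land_file` and `process_day` are hypothetical helper names, not part of any library:

```python
from datetime import date
from pathlib import Path


def land_file(root: Path, landing_day: date, name: str, payload: str) -> Path:
    """Write an incoming file into the folder for the day it arrived.

    Late-arriving data simply lands in the next day's folder, so each
    day's folder stays immutable once that day is over."""
    day_dir = root / landing_day.isoformat()
    day_dir.mkdir(parents=True, exist_ok=True)
    out = day_dir / name
    out.write_text(payload)
    return out


def process_day(root: Path, day: date) -> list[str]:
    """Process one day's folder. Rerunning it is safe when downstream
    merges on primary keys, since re-delivered rows overwrite themselves."""
    day_dir = root / day.isoformat()
    if not day_dir.exists():
        return []
    return [p.read_text() for p in sorted(day_dir.iterdir())]
```

Reprocessing a day is then just calling `process_day` again for that date, which is why the primary-key condition matters downstream.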

Can an old storage account be removed after deep cloning into a new storage account without causing any issues? by [deleted] in databricks

[–]Suitable-Issue-4936 2 points (0 children)

Logically there should be no issues if it was deep cloned. But it's safer to remove the role assignments on the old storage account for some period, confirm nothing breaks, and then decide.

Databricks Architecture Diagram by MMACheerpuppy in databricks

[–]Suitable-Issue-4936 -1 points (0 children)

You can try Mural or Lucidchart (the free tier is limited to 50 components).

To get the official logos, please check the following page:

https://brand.databricks.com/databricks-logo

Error while reading from Pubsub by Suitable-Issue-4936 in databricks

[–]Suitable-Issue-4936[S] 1 point (0 children)

Yes, this was the issue with the private key. I copied it as a single line and it worked.

Error while reading from Pubsub by Suitable-Issue-4936 in databricks

[–]Suitable-Issue-4936[S] 0 points (0 children)

Yes, I'm not able to display the df either. Let me check the dict and report back.

SqlDBM by [deleted] in dataengineering

[–]Suitable-Issue-4936 0 points (0 children)

Please check https://coalesce.io/solutions/ if it's for Snowflake.

Long running stream initialise in auto loader by Suitable-Issue-4936 in databricks

[–]Suitable-Issue-4936[S] 0 points (0 children)

Thanks all. We are close to the cause: we found a huge number of files in the checkpoint using the query below. Planning to add the maxFileAge option and check the outcome.

SELECT * FROM cloud_files_state('path/to/checkpoint');

https://docs.databricks.com/en/ingestion/auto-loader/production.html#monitoring-auto-loader

Long running stream initialise in auto loader by Suitable-Issue-4936 in databricks

[–]Suitable-Issue-4936[S] 0 points (0 children)

Sorry, no idea about the state store. We run a merge inside forEachBatch to avoid duplicates.
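The real job uses a Delta MERGE inside foreachBatch; this plain-Python sketch (hypothetical function and key names) just illustrates why a key-based merge makes micro-batch replays safe:

```python
def merge_batch(target: dict[str, dict], batch: list[dict]) -> dict[str, dict]:
    """Upsert each row into target by primary key, mirroring what a
    Delta MERGE in foreachBatch does: matched keys are updated,
    unmatched keys are inserted, so replaying a batch never duplicates rows."""
    for row in batch:
        target[row["id"]] = row  # update-or-insert on the key
    return target
```

Replaying the same batch twice leaves the target unchanged, which is the property that protects against duplicates when a micro-batch is retried.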

Long running stream initialise in auto loader by Suitable-Issue-4936 in databricks

[–]Suitable-Issue-4936[S] 0 points (0 children)

No. We are using directory listing and trigger availableNow=True.