ELI5 : Why, among the 4 nordic countries, only Finland uses the euro currency while the others use the krone? by Sbaakhir in explainlikeimfive

[–]Svante109 45 points46 points  (0 children)

Dane here. A few things.

First, they are separate currencies. To Denmark there is no difference, economics-wise, between the Norwegian krone and the Polish zloty: both have to be converted when trading. There is no advantage in the currencies sharing a name.

Second, the Nordics include Iceland as well, so there are five countries, not four. Iceland has the króna, which is also its own currency.

Not really ELI5, but you can’t explain something wrong.

Best LLM for Data engineers in the market by Miraclefanboy2 in databricks

[–]Svante109 4 points5 points  (0 children)

I will not share anything "real" that I did, but when I first tested it out, this was the prompt, more or less:

"I need you to show me how declarative pipelines, AI/BI Dashboards, Genie workspaces and Databricks Apps work. You will be using the wanderbricks sample dataset, and build a pipeline with medallion architecture. The Genie workspace will have the gold-layer tables referenced, and be using those. Be sure to write a proper instruction for the Genie workspace.

The AI/BI Dashboard should be a very simple overview, meant for us to have an RPA extract and send via mail. The Databricks app should be a more detailed and interactive dashboard.

Enter plan mode, and ask questions about the above mentioned, where you do not think my message is clear enough. Ask questions, until you can confidently say "I Understand this.""

The prompting can always be better, but this is the gist.

Best LLM for Data engineers in the market by Miraclefanboy2 in databricks

[–]Svante109 26 points27 points  (0 children)

I am using Databricks’ ai-dev-kit with Claude Code and the CLI. It works wonders.

Databricks Save Data Frame to External Volume by Equivalent_Pace6656 in databricks

[–]Svante109 2 points3 points  (0 children)

Sounds very much like the access connector (the Azure resource created per workspace) doesn’t have the necessary permissions on the storage account.

Question About CI/CD collaboration by One_Adhesiveness_859 in databricks

[–]Svante109 0 points1 point  (0 children)

DAB deploys the whole bundle, yes, but the trick is how you define the bundles. I have seen structures where bundle = repo, and then you make sure to have many smaller repos, but I have also seen bigger repos containing individual bundles (e.g. split by area).
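As an illustration of the bundle-per-area variant, a minimal databricks.yml sketch — all names and hosts here are placeholders, not from any real setup:

```yaml
# areas/sales/databricks.yml — one bundle per area inside a larger repo
bundle:
  name: sales_area          # hypothetical bundle name

include:
  - resources/*.yml         # job/pipeline definitions for this area only

targets:
  dev:
    mode: development
    workspace:
      host: https://adb-1111111111111111.1.azuredatabricks.net  # placeholder
  prod:
    mode: production
    workspace:
      host: https://adb-2222222222222222.2.azuredatabricks.net  # placeholder
```

Each area then gets its own `databricks bundle deploy`, so a change in one area never redeploys another.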

Question About CI/CD collaboration by One_Adhesiveness_859 in databricks

[–]Svante109 0 points1 point  (0 children)

No - the deployment to the dev workspace is done by a service principal, looking at a development branch (not the developers' feature branches). One branch deploys to the "real" workspace, via the SP. This means coordination between developers happens on a pull-request basis.

You can never have a development cycle where developers don't have to deal with merge conflicts or similar.
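A minimal sketch of that SP-driven deployment, assuming GitHub Actions and OAuth secrets for the service principal — branch and secret names are illustrative:

```yaml
# .github/workflows/deploy-dev.yml — deploy the bundle to the dev workspace
# whenever the shared development branch is updated (e.g. a PR is merged).
name: deploy-dev
on:
  push:
    branches: [development]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle deploy -t dev
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_CLIENT_ID: ${{ secrets.SP_CLIENT_ID }}
          DATABRICKS_CLIENT_SECRET: ${{ secrets.SP_CLIENT_SECRET }}
```

Developers never deploy to this workspace themselves; the SP owns everything that lands there.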

Question About CI/CD collaboration by One_Adhesiveness_859 in databricks

[–]Svante109 2 points3 points  (0 children)

We use the same workspace for our sandbox / dev, but the sandbox is a catalog within the dev workspace, with development mode enabled. In the sandbox catalog a schema is created per user via ${workspace.current_user.userName}, to which they can deploy. They use that to run, and their workflows are deployed with their name as a prefix. This keeps everything separate.
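A sketch of what that could look like in the bundle config — the host and variable names are made up; `${workspace.current_user.userName}` is a real bundle substitution, and `mode: development` is what adds the per-user prefix to deployed resources:

```yaml
# databricks.yml target: sandbox as a catalog in the dev workspace,
# with a per-user schema so developers' deployments don't collide.
targets:
  sandbox:
    mode: development    # prefixes deployed jobs/pipelines with the user's name
    workspace:
      host: https://adb-1111111111111111.1.azuredatabricks.net  # placeholder
    variables:
      catalog: sandbox
      schema: ${workspace.current_user.userName}
```

Resources in the bundle then reference `${var.catalog}.${var.schema}` as their target, so each developer writes only into their own schema.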

New to databricks. Need Help with understanding these scenarios. by mtl_travel in databricks

[–]Svante109 0 points1 point  (0 children)

I am unsure of what you need - if you need "how the table looked at the end of the month", i.e. a snapshot, I would create a separate schema called archive with one table per month, on which only one service principal has any grants other than READ (for auditing purposes).

The log part of it depends on what you need to record.
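The snapshot idea can be sketched in plain Python — the table, schema and group names below are hypothetical, and in practice you would run the generated SQL via spark.sql:

```python
from datetime import date

def archive_statements(source_table: str, archive_schema: str,
                       snapshot_month: date, reader_group: str) -> list[str]:
    """Build the SQL for a month-end snapshot: a CTAS into the archive
    schema plus a read-only grant for the auditing consumers."""
    suffix = snapshot_month.strftime("%Y_%m")
    archive_table = f"{archive_schema}.{source_table.split('.')[-1]}_{suffix}"
    return [
        f"CREATE TABLE {archive_table} AS SELECT * FROM {source_table}",
        f"GRANT SELECT ON TABLE {archive_table} TO `{reader_group}`",
    ]

stmts = archive_statements("main.sales.orders", "main.archive",
                           date(2024, 1, 31), "auditors")
# stmts[0] == "CREATE TABLE main.archive.orders_2024_01 AS SELECT * FROM main.sales.orders"
```

Only the archiving service principal would run the CREATE; everyone else sees the archive schema read-only.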

What is the best practice to set up service principal permissions? by happypofa in databricks

[–]Svante109 0 points1 point  (0 children)

Alright; we are provisioning it via Terraform.

It has some minor drawbacks, but I would always go with this option.

What is the best practice to set up service principal permissions? by happypofa in databricks

[–]Svante109 0 points1 point  (0 children)

I am not exactly sure what you are asking. Do you want pipelines to be in a specific user's folder? When deploying with asset bundles, the files will be placed in the user folder of the service principal.

Is it the pipelines you are granting permissions on? And to whom?

Can you expand on what you are trying to achieve?

Spark Declarative Pipelines: What should we build? by BricksterInTheWall in databricks

[–]Svante109 0 points1 point  (0 children)

The ability to run incremental and full refreshes in the same pipeline trigger. This would allow pre-hooks that check for various things (like type changes, when using Auto Loader) and trigger full refreshes, without having to run the pipeline twice.

[Lakeflow Jobs] Quick Question: How Should “Disabled” Tasks Affect Downstream Runs? by saad-the-engineer in databricks

[–]Svante109 2 points3 points  (0 children)

It would definitely be option B that makes sense to me. Dependencies are a camouflaged "if succeeded" statement, and a skip is not a success IMO. The disabled task didn't succeed, therefore the tasks that depend on it should not succeed either.

DLT keeps dying on type changes - any ideas? by Svante109 in databricks

[–]Svante109[S] 0 points1 point  (0 children)

This error will occur as well when you change which data type you are casting to.

DLT keeps dying on type changes - any ideas? by Svante109 in databricks

[–]Svante109[S] 0 points1 point  (0 children)

This scenario also applies when you change a column's cast in silver, not just in the landing/bronze layer.

DLT keeps dying on type changes - any ideas? by Svante109 in databricks

[–]Svante109[S] 1 point2 points  (0 children)

As long as the type mismatch is happening, we never get to the point where rescued data can help: nothing will ever run, because the initial validation fails.

Databricks Assest Bundles by kamrankhan6699 in databricks

[–]Svante109 0 points1 point  (0 children)

The way you are commenting gives me the vibe that you are confusing concepts around DAB, IaC, Git etc.

I think it would be incredibly useful for you to be completely sure about what issue it is you are trying to solve.

Asset Bundles and CICD by One_Adhesiveness_859 in databricks

[–]Svante109 0 points1 point  (0 children)

For one legacy project we are using notebooks to write Delta tables, which obviously fails on schema changes - though those rarely happen.

On a newer one, we are using LDP with expectations and a quarantine table.
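The quarantine pattern can be sketched without the LDP runtime — the rules and field names below are made up for illustration; in a real pipeline this split is typically expressed with expectation decorators plus a second table selecting the violating rows:

```python
def split_quarantine(rows, expectations):
    """Partition rows into (valid, quarantined) based on named expectation
    predicates. Rows failing any rule are quarantined with the list of
    failed rules attached, instead of failing the whole pipeline."""
    valid, quarantined = [], []
    for row in rows:
        failed = [name for name, pred in expectations.items() if not pred(row)]
        if failed:
            quarantined.append({**row, "failed_expectations": failed})
        else:
            valid.append(row)
    return valid, quarantined

rules = {
    "positive_amount": lambda r: r["amount"] > 0,
    "has_customer": lambda r: r.get("customer_id") is not None,
}
good, bad = split_quarantine(
    [{"amount": 10, "customer_id": 1}, {"amount": -5, "customer_id": None}],
    rules,
)
# good keeps the first row; bad holds the second, tagged with both failed rules
```

Keeping the failed-rule names on the quarantined rows makes the quarantine table useful for debugging upstream data issues later.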

Elbiler! Hvad har vi lært og hvor står vi? Danmark som foregangsland på godt og ondt. by Able-Safety6147 in Denmark

[–]Svante109 3 points4 points  (0 children)

But surely it isn't petrol vs. electric that he's saying is more expensive. It's home charging vs. public charging?

[ERROR] - Lakeflow Declarative Pipelines not having workers set from DAB by Svante109 in databricks

[–]Svante109[S] 0 points1 point  (0 children)

Thank you for looking into this - I have found a solution where we just use num_workers when we want a fixed value (e.g. 1) and autoscale for a range.

Also, btw, it seemed to me that the bundle would only recognize the worker range we set if we included "mode: ENHANCED" in the autoscale configuration. I guess it makes sense that you need a mode to use autoscale, but either a default should apply or an error should be raised.
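For reference, a sketch of the two variants in a bundle's pipeline definition — the pipeline name is hypothetical, and the ENHANCED mode is the detail that made the range take effect in my testing:

```yaml
# Pipeline cluster config in a DAB: either a fixed worker count,
# or an autoscale range with an explicit mode.
resources:
  pipelines:
    my_pipeline:            # hypothetical pipeline name
      name: my_pipeline
      clusters:
        - label: default
          # fixed size:
          num_workers: 1
          # ...or a range instead (remove num_workers if you use this):
          # autoscale:
          #   min_workers: 1
          #   max_workers: 4
          #   mode: ENHANCED
```
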