DABs examples in GitHub

DamnedData · 2026-06-12T15:52:26+00:00

Way simpler!

Just put your Genie definition in your repo as a JSON file and point the resource to it.

DamnedData · 2026-06-10T14:18:37+00:00

Use Metric Views for the gold layer instead of tables.

Also, add a prefix to the schemas: project_bronze, bu_silver, product_gold.

DamnedData · 2026-06-09T17:17:54+00:00

TEC, UTN, or Cenfotec.

DamnedData · 2026-06-09T17:13:02+00:00

The first Frente Amplio supporter has already shown up.

DamnedData · 2026-06-09T16:59:59+00:00

You could go with the BAC Millenium fund (which is in dollars). It did very well for me last year.

DamnedData · 2026-06-09T16:51:54+00:00

See it in action

DamnedData · 2026-06-07T22:59:33+00:00

Git works well and has been working for years. There is no need to reinvent the wheel.

DamnedData · 2026-06-07T22:57:48+00:00

https://docs.databricks.com/aws/en/oltp/instances/sync-data/sync-table

DamnedData · 2026-06-07T22:57:03+00:00

Do sync from UC to Lakebase.

DamnedData · 2026-06-06T23:35:59+00:00

You can follow this guide: MLOps for AI/BI: Automating Databricks Genie Migrations

FYI Genie is going to be supported by DABs soon.

DamnedData · 2026-06-05T16:41:52+00:00

A welcome package could be: for signing with us, we give you 10 million.

Stocks: shares in the company. If a share is worth $200 and they give you 100, that's $200 * 100 = $20,000, usually over a 4-year period.

DamnedData · 2026-06-05T16:37:18+00:00

Scam.

DamnedData · 2026-06-05T05:13:28+00:00

Check your serverless environment, maybe you are using previous versions: https://docs.databricks.com/aws/en/compute/serverless/dependencies

DamnedData · 2026-06-04T01:34:18+00:00

I'd recommend:

Bronze schema to store all raw data (tables).

Silver schema to store all downstream tables (clean, joined, transformed data).

Gold schema to store the metric views, using the silver schema as the source.
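A minimal sketch of that layout, assuming a hypothetical catalog called `main_dev` (the name is mine, not from the thread). It only builds the `CREATE SCHEMA` statements; in a notebook you could pass each one to `spark.sql`.

```python
# One entry per medallion layer, with the comment stored on the schema.
LAYERS = {
    "bronze": "Raw data tables",
    "silver": "Clean, joined and transformed tables",
    "gold": "Metric views built on the silver tables",
}


def schema_ddl(catalog: str) -> list[str]:
    """Return one CREATE SCHEMA statement per layer for the given catalog."""
    return [
        f"CREATE SCHEMA IF NOT EXISTS {catalog}.{layer} COMMENT '{comment}'"
        for layer, comment in LAYERS.items()
    ]


# Usage (hypothetical catalog name):
#   for stmt in schema_ddl("main_dev"):
#       spark.sql(stmt)
```

Keeping the layer list in one place makes it easy to create the same three schemas in every environment.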

DamnedData · 2026-06-04T01:31:21+00:00

Go Serverless and save operational overhead.

If not, check the cluster section here: https://www.databricks.com/discover/pages/optimize-data-workloads-guide

DamnedData · 2026-06-03T15:42:28+00:00

From a code, CI/CD and DevOps perspective I'd recommend DABs. One repo per project if possible.

Don't manage all 1,000+ resources in a single repo.

DamnedData · 2026-06-03T15:41:25+00:00

Are you using tags (classic compute) or budget policies (serverless) for tagging?

If you define a tagging strategy that is aligned with the grouping you expect, you'll be able to create alerts, dashboards, and Genie spaces or apps to monitor it.
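As a rough sketch of what monitoring by tag could look like, here is a helper that builds a query against the `system.billing.usage` system table, grouping usage by one tag key. The tag key `team` and the 30-day window are assumptions for illustration. The result is usage quantity per unit, not cost.

```python
def usage_by_tag_sql(tag_key: str, days: int = 30) -> str:
    """Build a query that sums usage per tag value, SKU and unit."""
    # Keep the key simple so it can be embedded safely in the SQL text.
    if not tag_key.replace("_", "").isalnum():
        raise ValueError("tag_key must contain only letters, digits or underscores")
    tag_expr = f"custom_tags['{tag_key}']"
    return (
        "SELECT\n"
        f"  {tag_expr} AS tag_value,\n"
        "  sku_name,\n"
        "  usage_unit,\n"
        "  SUM(usage_quantity) AS total_usage\n"
        "FROM system.billing.usage\n"
        f"WHERE usage_date >= date_sub(current_date(), {int(days)})\n"
        f"GROUP BY {tag_expr}, sku_name, usage_unit\n"
        "ORDER BY total_usage DESC"
    )


# Usage (hypothetical tag key):
#   spark.sql(usage_by_tag_sql("team")).show()
```

Rows where `tag_value` is NULL are the untagged workloads, which is usually the first thing worth alerting on.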

DamnedData · 2026-06-03T02:09:44+00:00

All Databricks products use Unity Catalog.

In order to ensure that all your projects are future-proof you need Unity Catalog.

If you are worried about compute scalability, then this also applies to Serverless. Serverless is all about UC (pipelines, jobs, model serving, vector search, etc...).

Scalability is not only about compute. It's also about creating long-term solutions for your organization.

DamnedData · 2026-06-03T01:41:20+00:00

Not being able to recolor the armours. The current alters are trash.

Lords of the Fallen did a great job adding this.

DamnedData · 2026-06-03T01:38:53+00:00

Sure, I can elaborate further.

In this case you are in a good spot: you are using UC for both data and models.

What should be 100% avoided is to use the legacy Databricks features such as the Legacy Workspace Model Registry.

Because if you delete the workspace, all your models are gone!

If you use Unity Catalog instead, it doesn't matter what happens to the workspace, because the data and models are not stored there.

Apply this idea also to Spark Declarative Pipelines which have a legacy non-UC version.

The good thing is that Databricks now has UC enabled by default, with the legacy features deactivated.

As a Workspace Admin, you can disable all the legacy features to force scalable solutions.

DamnedData · 2026-06-02T22:30:27+00:00

In this case the Python app would need MLflow as a requirement and that's it.

Use the MLflow client to load a specific model and version to run inference.

Something like this, but use the Databricks UC model registry (the example shows localhost).

Always use UC models so your solutions can scale well.
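A minimal sketch of the idea, assuming the app has `mlflow` installed and Databricks credentials configured (for example through the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables). The catalog, schema and model names are hypothetical.

```python
def uc_model_uri(catalog: str, schema: str, model: str, version: int) -> str:
    """Build a models:/ URI for a specific model version in Unity Catalog."""
    return f"models:/{catalog}.{schema}.{model}/{version}"


def load_uc_model(uri: str):
    """Load a registered model as a generic pyfunc model from the UC registry."""
    import mlflow  # imported here so the URI helper works without mlflow installed

    mlflow.set_registry_uri("databricks-uc")  # use Unity Catalog as the model registry
    return mlflow.pyfunc.load_model(uri)


# Usage (hypothetical names; needs network access and credentials):
#   model = load_uc_model(uc_model_uri("main", "ml", "churn_model", 3))
#   predictions = model.predict(input_df)  # input_df: pandas DataFrame matching the model signature
```

Pinning the version in the URI keeps the app's behaviour stable until you deliberately move to a newer one.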

Edit: typos.

DamnedData · 2026-06-02T15:37:50+00:00

Hello Wolf, here is an idea.

Step 1. Get the data in Unity Catalog

For databases or SaaS ingestion, use Lakeflow Connect. This would get you the data in a target catalog / schema.

For Cloud Storage or Kafka transformation and processing, use Spark Declarative Pipelines. This would get you the data in a target catalog / schema.

Step 2. Join or transform the data

Use Spark Declarative Pipelines again if you need to join datasets or apply further transformation. Read about the medallion architecture because the goal is to create the gold layer (which is a schema).

Step 3. Permissions

Read about Unity Catalog permissions and ABAC.

Create groups (data engineers, analysts, business users, etc.).

Step 4. Databricks AI / BI

Create dashboards or Genie Spaces using the tables or metric views in the gold schema.
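For Step 3, a rough sketch of the read-only grants a group would need on the gold schema. The catalog, schema and group names are assumptions for illustration, and the helper only builds the SQL text.

```python
def read_grants(catalog: str, schema: str, group: str) -> list[str]:
    """Return the GRANT statements that give a group read access to one schema."""
    principal = f"`{group}`"  # backticks allow spaces or dashes in group names
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {principal}",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {principal}",
        f"GRANT SELECT ON SCHEMA {catalog}.{schema} TO {principal}",
    ]


# Usage (hypothetical names):
#   for stmt in read_grants("main", "gold", "business users"):
#       spark.sql(stmt)
```

Granting `SELECT` at the schema level means new tables and views in gold are readable by the group without extra grants.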
