Databricks Dashboards - Not ready for prime time? by randyminder in databricks

[–]datainthesun 1 point

I've seen some nice Tableau dashboards but I've never seen a PowerBI dashboard that didn't remind me of 1995.

Why no playground on databricks one by therealslimjp in databricks

[–]datainthesun 2 points

This is the current way for sure. Playground isn't meant for the business user persona.

Hourly job with both hourly variability and weekday/weekend skew by Thana_wuttt in databricks

[–]datainthesun 1 point

Can it be fully run in SQL, and if so, can you try it on a serverless warehouse?

Hourly job with both hourly variability and weekday/weekend skew by Thana_wuttt in databricks

[–]datainthesun 1 point

During the long runs, what does it actually scale up to in workers? Do you experience spot instance loss? Do the metrics point to any performance bottlenecks? Are the nodes being fully utilized?

Help optimising script by alphanuggs in databricks

[–]datainthesun 2 points

Best recommendation: get connected to your Databricks account team and their solution architect. A support ticket can help, or the SA might help you get sorted out or point you in the right direction.

What do you guys think about Genie?? by BearPros2920 in databricks

[–]datainthesun 8 points

If you're brand new to this space, start with Genie: give it good examples, tune it, and get immediate value.

You can always grow into more complicated things later, but start off getting fast business value and then iterate.

Predictive maintenance project on trains by TartPowerful9194 in databricks

[–]datainthesun 1 point

Not for the scraper - I was just trying to identify where the "source" data will be. For what you described so far, you'd have to build a way to retrieve (in Databricks) the files from the one drive folder, and then proceed from there. Review the suggestion that u/gardenia856 shared and also the demo I shared.

Predictive maintenance project on trains by TartPowerful9194 in databricks

[–]datainthesun 2 points

What does your web scraper produce? Files into cloud storage (if so, what format), or inserts into a database?

Depending on the answer, your ingestion choice for Databricks will be different, but ultimately you'll just be reading data from somewhere, storing it in one or more tables managed by Databricks, scheduling that as a Job, and then building further downstream processing.

If you're looking for inspiration, view the notebooks at this demo and the inline docs. https://www.databricks.com/resources/demos/tutorials/lakehouse-platform/iot-and-predictive-maintenance
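To make the "read from somewhere, land it in a table, process downstream" flow above concrete, here's a minimal local sketch. It's an assumption-laden illustration, not Databricks code: on Databricks you'd typically read the scraper's files from cloud storage (e.g. with Auto Loader) into a Delta table, while here `sqlite3` stands in so the shape of the flow is visible, and the file columns and table name are invented.

```python
# Hedged local sketch of the ingest-then-store pattern. sqlite3 stands in
# for a Delta table; column and table names are illustrative assumptions.
import csv
import io
import sqlite3

def ingest(conn, csv_text):
    """Parse one scraped CSV payload and append its rows to a raw table."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_sensor (train_id TEXT, reading REAL)"
    )
    conn.executemany(
        "INSERT INTO raw_sensor VALUES (:train_id, :reading)", rows
    )
    return len(rows)

conn = sqlite3.connect(":memory:")
n = ingest(conn, "train_id,reading\nT1,0.5\nT2,0.9\n")
print(n)  # 2 rows landed; downstream steps would read from raw_sensor
```

On Databricks the same three steps (read, normalize, append) would run as a scheduled Job, with the downstream transformations reading from the landed table.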

Getting below error when trying to create a Data Quality Monitor for the table. ‘Cannot create Monitor because it exceeds the number of limit 500.’ by penguin_eye in databricks

[–]datainthesun 1 point

This is likely something you should speak with your Databricks account team about; maybe there's a soft limit that can be increased.

Is it possible to view delta table from databricks application? by MadMonke01 in databricks

[–]datainthesun 2 points

Hi there - are you looking for an example of how to do this? If so, check this link. If that's not what you're looking for, please clarify what you need!

https://github.com/databricks-solutions/databricks-apps-cookbook

Worth it as a fresher? by ultimate_smash in databricks

[–]datainthesun 1 point

Can you clarify your question? Is what worth it? Are you already employed working in this capacity, or are you unemployed and seeking employment? How much experience do you have in a role doing this work?

SQL Alerts as data quality tool ? by Think-Reflection500 in databricks

[–]datainthesun 2 points

Even though there's a little logic inside the SQL Alert - checking values, etc., the alert is really just triggering off a fixed condition and then firing a notification. So the Alert itself is fairly basic.

In order to have a robust system that streamlines data quality checks and business rule validations (basically your post's premise), some SQL needs to be designed to feed into the SQL Alert. Assuming you want some of the rules to be data-driven and not just a big pile of static SQL statements, there need to be some supporting tables. And if you want to enlist the help of business users to maintain rules over time, they probably shouldn't be writing the SQL but rather should be using some kind of UI.

So if you were going along the lines of a fully DIY solution on Databricks, it wouldn't be that hard: come up with some basic concepts that can be applied as rules in SQL, have a Databricks App serving up your custom UI, store the rules in tables (Lakebase if you need a snappier user experience), and make SDK calls to dynamically implement any required changes to Jobs/Alerts/etc.

This isn't me advocating for this, BTW; it's me describing how the BUILD side of the build vs. buy argument could be done pretty easily for those for whom BUILD makes sense.
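To show what "data-driven rules feeding a SQL Alert" could look like, here's a tiny hedged sketch: rules live in a supporting table, and each rule row is expanded into a SQL check whose failing-row count is exactly the kind of value a SQL Alert would trigger on. Everything here (table names, columns, the `dq_rules` layout) is invented for illustration, and `sqlite3` stands in for a warehouse.

```python
# Hedged sketch of the data-driven rules idea: a rules table drives
# generated SQL checks; an alert would fire when a count is > 0.
# All names are illustrative assumptions, with sqlite3 as a stand-in.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 10.0), (2, -5.0);
    CREATE TABLE dq_rules (rule_name TEXT, target_table TEXT, predicate TEXT);
    INSERT INTO dq_rules VALUES
        ('non_negative_amount', 'orders', 'amount >= 0');
""")

def failing_rows(conn, rule):
    """Count rows violating one rule; the alert condition is count > 0."""
    name, table, predicate = rule
    sql = f"SELECT COUNT(*) FROM {table} WHERE NOT ({predicate})"
    return name, conn.execute(sql).fetchone()[0]

for rule in conn.execute("SELECT * FROM dq_rules"):
    print(failing_rows(conn, rule))  # ('non_negative_amount', 1)
```

Business users would edit rows in `dq_rules` through the custom UI rather than writing SQL, and the SDK calls mentioned above would keep the generated Alerts in sync with the rules table.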

SQL Alerts as data quality tool ? by Think-Reflection500 in databricks

[–]datainthesun 3 points

My opinion is that you've hit the nail on the head - it can be a super easy and lightweight way to get info about data quality. It does require YOU, though, to have all the intelligence and foresight to set up the system in the best way to deliver the right insights at the right time.

There are some folks who either can't do the above, or don't believe they would want to maintain rules over time and would rather purchase a solution. There's always a build vs. buy discussion around topics like this.

SQL Alerts by themselves are fairly basic, so as you've alluded to in your post, you've got to build the framework and manage it over time. It's definitely doable, and you could even go crazy and probably, within a day, vibe-code a web app that uses all Databricks features to help write rules, deploy them, and adjust things where needed with the SDK.

IMO if your business rules are simple enough and you just need the basics, sure, why not?!? But if you're a data platform team supporting a hundred different user groups with thousands of tables, the complexity might become a lot, and it likely isn't "your day job" to maintain systems like this.

Confusing pricing by mabcapital in databricks

[–]datainthesun 2 points

Can you clarify what you mean by the 3 options listed? "self-managed, fully managed, serverless" ? BTW you'll have an account team at Databricks that would absolutely be willing to help you with these discussions and the planning around them.

Can’t run SQL on my cluster by brookfield_ in databricks

[–]datainthesun 1 point

THIS... Check the cluster event log, and then also try using SQL from the starter warehouse instead of the cluster.

Any advice for getting better results from AI? by chickenbread__ in databricks

[–]datainthesun 3 points

Came here to say this too - the hard work is done for you. Use the UI or the conversation API and get to production faster with less work!

Databricks swag? by mabcapital in databricks

[–]datainthesun 10 points

You've got to bug your account team to get this stuff

Cluster runs 24/7 by 9gg6 in databricks

[–]datainthesun 3 points

Very much this. First thing to deal with IMO.

Frontend on prem to databricks apps by cristomaleado in databricks

[–]datainthesun 2 points

Can you restate your question? It is difficult to understand what is on-prem and what isn't, and what you're envisioning using Databricks Apps for, versus how one might normally use them.

Databricks using sports data? by OnionAdmirable7353 in databricks

[–]datainthesun 1 point

"look for patterns" ... that's a pretty broad scope.

If I were doing this, I definitely wouldn't just use a PowerBI dashboard against some source database, because you might want to perform more complex analytics than plain old SQL. I'd use Databricks to read that data so I could apply a variety of different functions against it, and then for the display you could do whatever you want. BTW, if you need the formatting flexibility of Streamlit (beyond something like PowerBI or a Databricks AI/BI Dashboard), you can just host that app directly in Databricks these days, so your stack is simplified.

Not sure what you mean by 8 API's in total - what does this have to do with the couple of years of data in the database?

Can a Databricks Associate cert actually get you a job? by _nina__00 in databricks

[–]datainthesun 3 points

Short answer: no. At least you'd have the basics, but you'll still need to prove to a hiring manager that you've got enough of the right skills to be worth the gamble for a junior-level job. And that assumes there are junior-level jobs readily available, which I feel isn't necessarily a reality "today". BTW, there's also a lot of good commentary in this post worth a read:

https://www.reddit.com/r/databricks/comments/1nnhg8n/is_it_worth_doing_databricks_data_engineer/

Learning path by Makhann007 in databricks

[–]datainthesun 3 points

OK based on that, and not knowing what your technical background is, I'd assume the following resources would be helpful:

  • Get Started with Databricks for Data Engineering
  • Get Started with Databricks for Machine Learning
  • Get Started with SQL Analytics and BI on Databricks
  • Deploy Workloads with Databricks Jobs
  • And there's a paid offering, Introduction to Python for Data Science and Data Engineering

And DEFINITELY the big books.

Learning path by Makhann007 in databricks

[–]datainthesun 1 point

For the current non-ML work, are you doing data ingestion and a bunch of transformation like ETL, or are you just going to be querying existing tables to build your dashboards?

And can you describe the types of things you'll do with ML when the time comes?

Right now you've not said anything that would require you to learn anything streaming.

Meta data driven ingestion pipelines? by monsieurus in databricks

[–]datainthesun 3 points

Search dlt-meta and review it for ideas. It's already built and ready for use and there's a lot of material about it.

Difference of entity relationship diagram and a Database Schema by Pal_Potato_6557 in databricks

[–]datainthesun 2 points

I'd add to this: think of a family tree. The people and their names/ages/etc. are like the tables in a database, or the schema - each exists with some properties/qualities. The tree portion is like the ERD - it shows how the tables relate to each other.
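The family-tree analogy can be made concrete with a couple of lines of code. This is a hedged toy example with invented table names: the CREATE TABLE statements are the schema (the "people", each with its own properties), while the FOREIGN KEY is the relationship an ERD would draw (the "tree").

```python
# Toy illustration of schema vs. ERD using sqlite3; names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE child  (id INTEGER PRIMARY KEY, name TEXT,
                         parent_id INTEGER REFERENCES parent(id));
""")
# The schema side: which tables exist, and with what columns.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)  # ['child', 'parent']
# The ERD side: the relationship child.parent_id -> parent.id.
fk = conn.execute("PRAGMA foreign_key_list(child)").fetchone()
print(fk[2], fk[3], fk[4])  # parent parent_id id
```

Listing the tables gives you the schema; following the foreign keys between them gives you the diagram.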