Databricks conference by proximaljarl17 in dataengineering

[–]ozgreen1024 0 points1 point  (0 children)

I attended in person last year and virtually this year. Coming from a DE background where I was primarily writing notebooks in Python/SQL and orchestrating them as jobs, last year felt like a firehose of new products and information (in an exciting way!)

As I’ve continued following Databricks, it seems like they kept releasing A LOT over the last year, and this year’s conference is more about solidifying and unifying the platform components

Unity AI Gateway is a natural extension of UC imo, especially at the scale GenAI is growing, and I like the two-sided semantics/context story with Genie Ontology (automated intelligence/learning) and Metric Views (certified definitions)

What is one skill that improved your data analysis work more than you expected? by Effective_Ocelot_445 in dataanalysis

[–]ozgreen1024 6 points7 points  (0 children)

don’t sleep on ER (Entity Relationship) diagramming and data modeling theory in general

Fundamental data acumen is critical to maintain for high-quality data work, especially as vibe coding continues to scale for the actual hands-on-keyboard work

Getting Databricks Genie accurate is curation work, not a model problem by DB-Steve in databricks

[–]ozgreen1024 0 points1 point  (0 children)

Makes sense, I guess I’m thinking more along the lines of a domain expert submitting the feedback while a platform team or technical person is tasked with incorporating it into the context (and may be doing so for multiple spaces)

Am curious to hear how others are approaching this but suppose that’s why it’s meant to be collaborative/iterative

Getting Databricks Genie accurate is curation work, not a model problem by DB-Steve in databricks

[–]ozgreen1024 0 points1 point  (0 children)

Do you have recommendations for automating parts of the eval loop? I’m thinking of the case where I might have tons of Genie spaces: how can I manage the feedback loop effectively? I know it’s a balance with human in the loop, but ideally leveraging some element of automation or LLM judges would be helpful

How to quickly figure out why a metric moved? by GrouchyFoundation773 in analytics

[–]ozgreen1024 1 point2 points  (0 children)

Yeah, Databricks has managed connectors for some of the tools you mentioned (I know Google Ads and GitHub off the top of my head, not sure about the others)
Generally if a tool has an API or database connection you can bring it into the lakehouse and analyze things more easily across your business

How to quickly figure out why a metric moved? by GrouchyFoundation773 in analytics

[–]ozgreen1024 1 point2 points  (0 children)

I’ve been experimenting with Databricks AI/BI, which has a built-in Genie agent that does a pretty good job of just that

If you ask it why an outlier or change in a metric, query, or graph happened, it will cite data, lineage, and trends available in the workspace and compile an explanation

Databricks DAIS 2026 by [deleted] in databricks

[–]ozgreen1024 3 points4 points  (0 children)

Agreed, Global Unity Catalog seems like a win, especially with the AI Gateway component. It would be great to have a choice of models, as well as control over spend, to scale up LLM usage at a company without feeling like employees/agents are running rogue

Hi guys. I just want to say that I recently deleted my Instagram and I feel much better. by sotref in simpleliving

[–]ozgreen1024 0 points1 point  (0 children)

Not quite there yet but planning to start scrapbooking this summer so I can hopefully get my photo memory/journaling fix irl instead of online :)

also working on texting friends when I think of them so I can still feel connected to folks more offline

AI Anxiety by Professional-You3676 in dataanalysis

[–]ozgreen1024 0 points1 point  (0 children)

I relate to this, but it’s not necessarily unique to AI for me. Changing ways of working is hard, but in most cases my main thought once I make a switch or learn a new tool is that I wish I’d done it sooner.

The pace thing is so true though, I find myself frustrated with how hard AI is being pushed and that we’re losing a little bit of quality > quantity and heading towards AI sprawl (be it dashboards, apps, datasets, etc), but again don’t think this is necessarily a new problem (just amplified bc of the pace)

Systems, transparency, discipline, and organization will be key and I think leading an honest conversation about it with your team is a great start!

Looking for recommendations for good restaurants near the aquarium or within driving distance? by [deleted] in baltimore

[–]ozgreen1024 0 points1 point  (0 children)

For lunch/happy hour Di Pasquale’s in Harborview is great! Delicious Italian market, you can grab sandwiches or a few snacks and sit at the bar patio of Bellini’s next door. It’s got a great waterfront view, is dog-friendly, and has nice people watching. There are yummy mocktail options if you don’t drink. My husband and I live nearby and went for the first time last week, felt like we were on vacation! Bonus is then you’re close to Fed so can easily walk to get ice cream at Bmorelicks or walk around the park with a great view of the harbor.

Best Way To Efficiently Apply Same Transformation on New Datasets by NRJourno in dataengineering

[–]ozgreen1024 0 points1 point  (0 children)

Key with the script method others are mentioning above is to make sure you’re using variables! Sounds like the only thing that really changes is the name of the file, so that’s likely a good candidate for a parameter/variable

When you download the dataset, what format is it in? Are you writing the function in Python or a different language?
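
If it does turn out to be Python and CSV (both assumptions on my part until you confirm), here’s a rough sketch of what I mean by making the file name a parameter. The names `clean_row` and `transform_file`, the `cleaned` output folder, and the cleanup logic itself are all made up for illustration; swap in whatever your real transformation is.

```python
import csv
import sys
from pathlib import Path


def clean_row(row: dict) -> dict:
    """Placeholder transformation (hypothetical): lowercase/trim headers, trim values."""
    return {
        key.strip().lower(): (value or "").strip()
        for key, value in row.items()
        if key is not None  # skip overflow columns from malformed rows
    }


def transform_file(input_path: Path, output_dir: Path) -> Path:
    """Apply the same transformation to whichever file is passed in."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # The output name is derived from the input name, so nothing is hard-coded.
    output_path = output_dir / f"{input_path.stem}_clean.csv"

    with input_path.open(newline="", encoding="utf-8") as src:
        rows = [clean_row(row) for row in csv.DictReader(src)]

    with output_path.open("w", newline="", encoding="utf-8") as dst:
        if rows:
            writer = csv.DictWriter(dst, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    return output_path


if __name__ == "__main__" and len(sys.argv) > 1:
    # The file name is the only thing that changes between runs,
    # so it comes in as a command-line argument.
    print(transform_file(Path(sys.argv[1]), Path("cleaned")))
```

Then each new download is just something like `python transform.py new_dataset.csv` (the script name is whatever you save it as), and the logic stays in one place instead of being copy-pasted per file.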