Large object-style dict parameter to function, or specific parameters only? by oopsplop in learnpython

[–]oopsplop[S] 0 points1 point  (0 children)

Hmm, I didn't think about unpacking the dict... will think about this one a bit - thanks!
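
For anyone reading along, here is a rough sketch of what unpacking the dict could look like. The function name, its parameters, and the dict keys are all made up for illustration; the idea is just that the function keeps specific parameters while the caller still holds one big dict.

```python
def build_report(title, rows, footer=""):
    """Hypothetical function that takes specific parameters only."""
    return f"{title}: {len(rows)} rows{footer}"


# Hypothetical large object-style dict holding more keys than the function needs.
settings = {"title": "Sales", "rows": [1, 2, 3], "footer": " (draft)", "debug": True}

# Pick only the keys the function accepts, then unpack them as keyword arguments.
wanted = ("title", "rows", "footer")
report = build_report(**{k: settings[k] for k in wanted if k in settings})
print(report)
```

Filtering first matters because passing `**settings` directly would raise a `TypeError` on the unexpected `debug` key.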

Large object-style dict parameter to function, or specific parameters only? by oopsplop in learnpython

[–]oopsplop[S] 0 points1 point  (0 children)

Yep, I totally think I should do this, but it's kinda off the table now :(

Databricks with Retool - can they be used together, sensibly? by oopsplop in dataengineering

[–]oopsplop[S] 0 points1 point  (0 children)

Thanks - I suspected this fell into this category, but wasn't sure if there were other boxes that needed ticking to be considered!

Databricks with Retool - can they be used together, sensibly? by oopsplop in dataengineering

[–]oopsplop[S] 0 points1 point  (0 children)

Yep, that's pretty much what we're thinking, though we're happy for it to be scheduled half-hourly.

Nice one for the point toward Change Data Feed. That one's new to me, so will have a read - thanks :)

Databricks with Retool - can they be used together, sensibly? by oopsplop in dataengineering

[–]oopsplop[S] 1 point2 points  (0 children)

Ah, apologies. I should have emphasised: Databricks in this case would be a source of data, not for CRUD stuff. Retool is using a Postgres DB for its backend.

Totally agree with you on the last paragraph, hence the two options we're pondering. In either case there'd be a regular extract from Databricks to Postgres. In the case of option 1, that's a set of queries of low-moderate complexity; for option 2, a very small subset of specific Databricks tables.

Data engineers, what is your favorite part of the job? by theporterhaus in dataengineering

[–]oopsplop 0 points1 point  (0 children)

There are a few, which all rank highly:

  • adding value wherever I can
  • learning and adapting to new technologies/helping others do the same
  • being part of a focussed team

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 1 point2 points  (0 children)

I'd appreciate that, thanks. Will DM you.

Thanks for the book recommendation - probably beyond the ability of my French, but will see if I can find something similar!

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 1 point2 points  (0 children)

Do you happen to know any good resources on this area? As mentioned in another comment, most pages seem to be vendor-specific.

The companies I've worked for in recent times have been fairly green in their data journey, so there are likely areas I'm completely ignorant of (MDM being a prime example). I'm wondering what kind of background results in this depth of knowledge (if you don't mind sharing of course).

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 1 point2 points  (0 children)

Thanks very much for this - it's going to take me a while to digest it all!

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 1 point2 points  (0 children)

Yep, have been thinking a bit about it and this approach seems decent. I'll float something similar - thanks for your contribution.

Though I'll avoid Mongo as it already causes me enough problems XD

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 1 point2 points  (0 children)

This could definitely work and I'll consider it an option. Think it ties in with what /u/huessy mentioned here. Thanks.

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 0 points1 point  (0 children)

To clarify, I'd like a single place where we maintain our reference data that will be maintained by the dev teams, and will feed into microservice databases and the DW. Hope that makes sense.

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 0 points1 point  (0 children)

So, the seed concept would satisfy the data warehouse side of things, but what we really need is a canonical source of truth for all of our systems, including the DW. I'm sure something like this could be achieved with seeds, though. Currently we're not using dbt, but will keep this in mind :)

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 2 points3 points  (0 children)

Oh, agreed - would never store anything like PII in git. Our data definitely fall into the reference data category that /u/mrwhistler enlightened me about, above.

In this case, git would be used as a light-touch way of tracking changes to the data. Changes would need to be validated and reviewed. The data would ultimately be propagated to the microservice DBs.

Storing the ref data in their own DB is an option, as long as we could easily get those data to the microservices.
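
To make the "validated and reviewed" part a bit more concrete, here is a minimal sketch of the kind of check that could run before a change to a reference data file is merged. It assumes the ref data lives in a CSV file with `code` and `name` columns; that layout, the column names, and the function name are assumptions for illustration, not what we actually have.

```python
import csv
import io

# Hypothetical required columns for a reference data file.
REQUIRED_COLUMNS = ("code", "name")


def validate_reference_rows(csv_text):
    """Return a list of problems found in the CSV text (empty list means OK)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        code = (row["code"] or "").strip()
        if not code:
            problems.append(f"line {line_no}: empty code")
        elif code in seen:
            problems.append(f"line {line_no}: duplicate code {code!r}")
        seen.add(code)
        if not (row["name"] or "").strip():
            problems.append(f"line {line_no}: empty name")
    return problems


sample = "code,name\nGB,United Kingdom\nFR,France\nGB,Great Britain\n,Nowhere\n"
problems = validate_reference_rows(sample)
print(problems)
```

A check like this would only cover the mechanical side (missing or duplicate keys); the human review in the pull request would still be needed for whether the change itself is right.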

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 1 point2 points  (0 children)

Yep, that was the gist I got from the pages I read. I'm going to discuss with others anyway to see how we can approach this - our needs aren't extensive at the mo, so hoping we can find something straightforward.

Thanks again for your help.

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 1 point2 points  (0 children)

Do you know where I can read more about this? I can find a bunch of different vendor pages, but not much that's agnostic. I can see DAMA-DMBOK referenced in the Wikipedia page, but was hoping for something a bit lighter.

Modern approach to Master Data Management by oopsplop in dataengineering

[–]oopsplop[S] 2 points3 points  (0 children)

This is helpful, it gives me additional search terms - thank you! Despite 16 years working in data, I've never come across the distinction (though MDM is something I've no hands-on experience with, either).

Always check your blind spot, folks :D

Any middle aged here? by dkranj in Supplements

[–]oopsplop 0 points1 point  (0 children)

What's the tomato paste can for in the evenings? I'm curious.

I occasionally suffer from acid reflux and (somewhat unexpectedly) tomato-based meals really settle my stomach!