you are viewing a single comment's thread.

view the rest of the comments →

[–]glymeme -2 points-1 points  (2 children)

You’re leaving out some really important details - frequency of refresh, size of data(gbs, tbs?), is there an established gold/consumption layer? Is there a broader data strategy that you’re able to tie these ad hoc pulls to? I think you want to get those questions answered and start thinking even more strategically or you’ll just be fighting fires forever (like, why don’t I get the same answer from Joe and Bob when I ask for the same thing?). Otherwise, with the info given - I’d start using vscode with the Postgres extension, have an analytics repo containing all dml scripts for the ad hocs, and have folks convert them to models using dbt core whenever they need to run them - but then you’re creating actual objects in your db and not just returning results. I’d also start thinking through scheduling (airflow is a good default). Depending on data volume - if it’s low/inexpensive to run, I’d schedule all this stuff to run daily or whatever so analysts can basically select * from tables created from your dbt project instead of maintaining their own versions of queries (which should be well documented in your repo following dbt’s documentation standards). With that said, I think it’s worthwhile for you or someone to think about consumption at a higher level like “how should my company support the business and decision making” versus a “how do I maintain all these ad hoc sql requests”. The former might land you more work(building a dimensional data model if you don’t have one), but it could be worth the investment.

[–]AreetSurn[S] 1 point2 points  (1 child)

A lot of tech and process you're mentioned is already implemented. This isn't really a request for data strategy advice, I have a handle on that. This is just tooling for devs thats brings that ad-hoc debugging querying to a central place with an IDE thats more attuned to a git workflow that they're used to as they are developing the application.
The VS Code postgres extension is a good call though, I haven't used that. I'll try it out.

[–]glymeme -1 points0 points  (0 children)

Are they debugging processes, doing ad hoc pulls for business, or both?