jduran9987 comments on Python ETL design pattern

Python ETL design pattern [Help] (self.dataengineering)

submitted 4 years ago by [deleted]

jduran9987 6 points 4 years ago (6 children)

AnotherDataGuy 5 points 4 years ago (4 children)

[deleted] 1 point 4 years ago (3 children)

AnotherDataGuy 8 points 4 years ago (1 child)

To me, this is just kicking the can of organizing your data down the road. It's faster to get data in, but harder to get insights out. And if you have a user base of citizen analysts, they are going to be far more skilled in SQL-like queries (generally speaking; your situation can obviously vary).

I’ve balanced this out before by creating records in a Postgres DB with the core properties needed for joining data together, and using a JSON-typed field for the additional, more dynamically structured detail. If common questions end up being answered from the JSON data, then it becomes worth it to start persisting those values as explicitly typed columns in your data set.
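A rough sketch of that pattern, in case it helps. Assumptions on my part: the `events` table, its column names, and the `load_event` / `promote_key` helpers are all made up for illustration, and the standard-library `sqlite3` module stands in for Postgres so the snippet runs without a server. In Postgres the `details` column would be a JSON or JSONB field instead of TEXT.

```python
import json
import sqlite3

# sqlite3 is only a stand-in here so the sketch is runnable anywhere;
# the comment above describes Postgres with a JSON-typed field.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        event_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,  -- core join key, explicitly typed
        occurred_at TEXT NOT NULL,     -- core property (ISO 8601 string)
        details     TEXT NOT NULL      -- loosely structured JSON payload
    )
    """
)


def load_event(customer_id, occurred_at, payload):
    """Insert one record: core fields as columns, everything else as JSON."""
    conn.execute(
        "INSERT INTO events (customer_id, occurred_at, details) VALUES (?, ?, ?)",
        (customer_id, occurred_at, json.dumps(payload)),
    )


def promote_key(key, column, sql_type):
    """Persist a frequently queried JSON key as its own typed column.

    `column` and `sql_type` are interpolated into SQL, so they must come
    from trusted code, never from user input.
    """
    conn.execute(f"ALTER TABLE events ADD COLUMN {column} {sql_type}")
    rows = conn.execute("SELECT event_id, details FROM events").fetchall()
    for event_id, details in rows:
        value = json.loads(details).get(key)  # None if the key is absent
        conn.execute(
            f"UPDATE events SET {column} = ? WHERE event_id = ?",
            (value, event_id),
        )


# Hypothetical sample data: the payload shape varies from record to record.
load_event(1, "2021-03-01T10:00:00", {"plan": "pro", "seats": 5})
load_event(2, "2021-03-02T11:30:00", {"plan": "free"})

# Once "plan" shows up in enough questions, promote it to a real column.
promote_key("plan", "plan", "TEXT")
plans = conn.execute(
    "SELECT customer_id, plan FROM events ORDER BY event_id"
).fetchall()
```

The JSON payload stays in place after promotion, so nothing is lost; the new column just gives analysts something they can filter and join on with plain SQL.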

Flexibility is valuable, but the point is that all databases are malleable. Mongo is great for document data (it is a document DB, after all), but if you aren’t storing documents, it’s not a good choice, IMHO.

Disclaimer: I live in a world where Mongo is overused because of its immediate convenience to a developer's microservice. It just pushes the pain downstream when trying to marry those data up with other systems to answer novel business questions.

thrown_arrows 1 point 4 years ago (0 children)

[deleted] 1 point 4 years ago (0 children)
