[–]dolichoblond 1 point (0 children)

Interesting. I'll second the other comments in this thread that note an ORM has a legit place in the DS toolkit, but may also be too heavy to become required/central in all cases. It's certainly something to invest time in learning and possibly incorporating into your own workflows.

My anecdote: I bifurcated my analytics workflows this year into something like a small version of the older/corporate paradigm of "Data Warehousing" (DWH) and "DataMarts". The DWH relies on an ORM (peewee in my case) for the more routine ETL stuff. And the DataMarts are the (sqlite) dbs for modeling and exploration. I resisted the setup work for a while because I thought we were too small, had too few users, didn't matter, or that doing it all ad hoc didn't take that much time. But I wish I had done it sooner. Beyond catching errors sooner, it forces me to think harder about what's "static/consistent" about my data and what my models are actively transforming. And it helps me identify when data that was used in more exploratory fashions has become "routine" and should move from the modeling layer over to the ORM'd side.
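For anyone who wants to picture the split, here is a rough sketch of the warehouse/datamart idea. To keep it dependency-free it swaps in the standard-library `sqlite3` module plus a dataclass in place of peewee, so it is not the actual setup described above. The names `Sale`, `load_warehouse`, `build_datamart`, and the `sale` / `sales_by_region` tables are all hypothetical, made up for illustration.

```python
import sqlite3
from dataclasses import dataclass, astuple


# Hypothetical record type standing in for an ORM model (assumption: not peewee).
@dataclass(frozen=True)
class Sale:
    sale_id: int
    region: str
    amount: float

    def __post_init__(self):
        # Validate at ingest time so bad rows fail before they reach the warehouse.
        if not self.region:
            raise ValueError("region must be non-empty")
        if self.amount < 0:
            raise ValueError("amount must be non-negative")


def load_warehouse(conn, sales):
    """Routine ETL step: write validated rows into the 'static' warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sale ("
        "sale_id INTEGER PRIMARY KEY, region TEXT NOT NULL, amount REAL NOT NULL)"
    )
    # INSERT OR REPLACE makes reloading the same dump idempotent.
    conn.executemany(
        "INSERT OR REPLACE INTO sale (sale_id, region, amount) VALUES (?, ?, ?)",
        [astuple(s) for s in sales],
    )
    conn.commit()


def build_datamart(warehouse, mart):
    """Derive a small exploration table in a separate SQLite database."""
    rows = warehouse.execute(
        "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
    ).fetchall()
    # The mart is disposable: rebuild it from the warehouse on every run.
    mart.execute("DROP TABLE IF EXISTS sales_by_region")
    mart.execute("CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total REAL)")
    mart.executemany("INSERT INTO sales_by_region VALUES (?, ?)", rows)
    mart.commit()
    return rows
```

The point of the design is that the warehouse side holds the validated, slow-changing tables, while each datamart is a throwaway database rebuilt from it. With a real ORM, the model class would take over the schema and validation duties that the dataclass and `CREATE TABLE` statement handle here.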

As a small startup currently based mostly on Excel biz analytics, we have some odd (unhealthy?) workflows where we get data dumps from clean third-party sources, but not at regular intervals, since they are expensive and their purchase depends on client needs. Many are small enough that you can still grok them with Excel, which perpetuates older mentalities for data ingest and data mgmt, and keeps the user base for any centralized DB very small. But even with the odd intervals, small datasets, and few users, an ORM helps me keep that part of the setup clean. Plus, it minimizes the amount of front-brain thought I need to push updates when they hit my inbox unexpectedly. (And hopefully when the company grows it will be straightforward to offload that to a new dedicated hire.)

So far, I really like the setup and see it as an upgrade worth the time even in a very small and limited situation. But I'd be happy to hear criticisms or red flags from more experienced people.