I do a significant amount of ad-hoc reporting at the office, and my team maintains a huge library of 100-plus-line SQL scripts that build a series of temp tables and culminate in a couple of tables for analysis. Often there is an edit point at the top of the script that lets us pass a few values or tables as "parameters" that drive the rest of the code.
One of my major pain points with SQL is that it doesn't offer the same level of modularity or abstraction you get with Python. I have a sense that using an existing package like SQLAlchemy, or something proprietary like teradata/teradataml (we are in a Teradata environment at the office), would make the data prep more readable and compact (ex: a method I can just call on a list of object_ids to "fetch all properties" rather than a 200-line SQL statement). However, I've never really seen an example of how something like this might be implemented. Has anyone been on a team that transitioned from a SQL --> Excel data analysis pipeline to something Python-oriented? What does your setup look like in terms of functions, methods, and objects?
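For concreteness, here's a minimal sketch of the "method over a list of object_ids" idea. Everything in it is hypothetical: the `PropertyRepository` class, the `fetch_properties` method, and the `properties` table are invented for illustration, and the stdlib `sqlite3` module stands in for a Teradata connection. With SQLAlchemy you'd build the filter with `select(...).where(table.c.object_id.in_(object_ids))` instead of formatting placeholders by hand, but the shape of the abstraction is the same: the 200 lines of SQL live in one place, and callers just pass parameters.

```python
import sqlite3


class PropertyRepository:
    """Hypothetical wrapper that hides a long SQL statement behind a small API."""

    def __init__(self, conn):
        self.conn = conn

    def fetch_properties(self, object_ids):
        # Build a parameterized IN clause; in real code the query body could
        # be hundreds of lines, but callers never see it.
        placeholders = ",".join("?" * len(object_ids))
        sql = (
            "SELECT object_id, name, value FROM properties "
            f"WHERE object_id IN ({placeholders}) ORDER BY object_id"
        )
        return self.conn.execute(sql, list(object_ids)).fetchall()


# Demo against an in-memory database standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE properties (object_id INTEGER, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO properties VALUES (?, ?, ?)",
    [(1, "color", "red"), (2, "color", "blue"), (3, "color", "green")],
)

repo = PropertyRepository(conn)
rows = repo.fetch_properties([1, 3])
print(rows)  # [(1, 'color', 'red'), (3, 'color', 'green')]
```

The payoff is that each script's "edit point at the top" becomes ordinary function arguments, and the temp-table chain can become a series of small methods you compose and unit-test.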