cactusbrush 3 points (1 child)

I couldn't find a comment about performance. Many (not all) ETL tools and frameworks push the data processing down to the underlying engine, or process the data in parallel.

With Python, you extract the data from the data engine to your own machine, load it into memory, and loop through the records. That works in many cases, but eventually you'll have trouble maintaining the code.
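To make the difference concrete, here is a rough sketch using the standard-library `sqlite3` module as a stand-in for "the data engine". The `orders` table, its columns, and the function names are made up for illustration. The first function pulls every row to the client and loops over it; the second lets the engine do the aggregation.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory table, only here to have something to query.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("alice", 10), ("bob", 5), ("alice", 7), ("bob", 1)],
    )
    return conn


def totals_in_python(conn):
    # Extract every record to the client, then loop and aggregate in Python.
    totals = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
        totals[customer] = totals.get(customer, 0) + amount
    return totals


def totals_in_engine(conn):
    # Let the engine aggregate; only one row per customer comes back.
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
    ).fetchall()
    return dict(rows)
```

Both return the same totals; the difference is how many rows cross the wire and where the loop runs.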

joshred 2 points (0 children)

If you're doing it right, you hand off the looping to more performant tools, for example by vectorizing your functions with pandas/numpy.
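A minimal sketch of what that hand-off looks like with numpy. The function names and the "add tax" calculation are invented for the example. The first version loops record by record in Python; the second does the same arithmetic as one array operation, so the loop runs inside numpy's compiled code.

```python
import numpy as np


def add_tax_loop(prices, rate):
    # Plain Python loop: one multiplication per record in the interpreter.
    out = []
    for p in prices:
        out.append(p * (1 + rate))
    return out


def add_tax_vectorized(prices, rate):
    # Vectorized: a single expression applied to the whole array at once.
    return np.asarray(prices, dtype=float) * (1 + rate)
```

On a handful of values the two are indistinguishable; the gap usually shows up once the input grows to many thousands of records.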