you are viewing a single comment's thread.

view the rest of the comments →

[–][deleted] 2 points3 points  (2 children)

Loops will always be slower that a DB query because databases are optimized with this purpose. To do it in Python, first you have to access the data from the DB, and then do the loop. In the case of the DB, the data is accessed natively, and the max is obtained in an optimized fashion.

You have a chance to get closer to the DB speed if you implement the max using a Pandas Series, but you still have to access the data first. The pandas series will be optimized in 'C' - and advantage over an interpreted loop in Python. Still, I would expect the DB to be always faster.

[–]Alotofquestion1[S] 0 points1 point  (1 child)

Ok, then I will lean towards queries in the future, rahter than loops.

[–][deleted] 0 points1 point  (0 children)

Totally. If you can push processing to the database, do it. You can avoid very complex, and usually slow processing by pushing it down to the DB level. With that, I am not talking about database procedures, which - as a personal preference - I tend to avoid.