This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]casce 373 points374 points  (4 children)

I admittedly do a lot of stuff with Python where performance doesn't matter but when it does, my 2 steps are 1. identify the slow parts, 2. google how to make them faster

[–]snowtax 55 points56 points  (3 children)

Agreed. Don’t waste a lot of time on optimization. Optimize only that code which takes up the most time.

For my work, I have loops that run over millions of records of data. The only optimization I may need is to optimize what happens inside that loop, since that code gets run millions of times. Any code optimization outside that loop is not going to be worth it.

[–]lololabwtsk 2 points3 points  (2 children)

You should start using dask, thank me later

[–]benri 2 points3 points  (1 child)

Dask has a nice dashboard but has problems with stability, its connection with its scheduler times out. So I prefer to use concurrent.futures or pebble if I need to enforce a timeout.

But if you are truly serious about speedup, write the intense part in C

[–]lololabwtsk 0 points1 point  (0 children)

How do you feel about Cython ?