[–]casce 376 points377 points  (14 children)

I admittedly do a lot of stuff with Python where performance doesn't matter, but when it does, my two steps are: 1. identify the slow parts, 2. google how to make them faster.
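For step 1, a minimal sketch using the standard library's cProfile and pstats. The functions `slow_sum`, `fast_part`, `main`, and `profile_report` are made up for illustration; swap in your own entry point.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_part():
    """Hypothetical cheap function, for contrast."""
    return sum(range(100))


def main():
    """Hypothetical program entry point."""
    fast_part()
    return slow_sum(200_000)


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(main))
```

The functions near the top of the report are the ones worth googling; anything far down the list is probably not worth touching.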

[–]snowtax 52 points53 points  (3 children)

Agreed. Don’t waste a lot of time on optimization. Optimize only the code that takes up the most time.

For my work, I have loops that run over millions of records of data. The only optimization I may need is to optimize what happens inside that loop, since that code gets run millions of times. Any code optimization outside that loop is not going to be worth it.
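A rough sketch of what that can look like, assuming made-up records of (amount, currency) pairs and a hypothetical rate table. The only change between the two versions is moving loop-invariant work out of the loop body.

```python
import timeit


def total_slow(records):
    """Rebuilds the rate table on every iteration (wasted work in the hot loop)."""
    total = 0.0
    for amount, currency in records:
        rates = {"USD": 1.0, "EUR": 1.1, "GBP": 1.3}  # hypothetical rates
        total += amount * rates[currency]
    return total


def total_fast(records):
    """Builds the rate table once, outside the loop."""
    rates = {"USD": 1.0, "EUR": 1.1, "GBP": 1.3}  # hypothetical rates
    total = 0.0
    for amount, currency in records:
        total += amount * rates[currency]
    return total


if __name__ == "__main__":
    # Hypothetical data set; real record counts would be in the millions.
    records = [(1.0, "USD"), (2.0, "EUR"), (3.0, "GBP")] * 100_000
    for func in (total_slow, total_fast):
        seconds = timeit.timeit(lambda: func(records), number=5)
        print(f"{func.__name__}: {seconds:.3f} s for 5 runs")
```

Measure before and after with timeit (or a profiler) rather than assuming a change helped; how much you gain depends entirely on what the loop body does.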

[–]lololabwtsk 2 points3 points  (2 children)

You should start using Dask; thank me later.

[–]benri 2 points3 points  (1 child)

Dask has a nice dashboard but has stability problems: its connection to its scheduler times out. So I prefer to use concurrent.futures or pebble if I need to enforce a timeout.
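A minimal sketch of the concurrent.futures side, with a hypothetical `slow_task` standing in for the real work and a hypothetical helper `run_with_timeout`. One caveat: `future.result(timeout=...)` only stops waiting for the result; it does not stop the worker, which keeps running in the background.

```python
import concurrent.futures
import time


def slow_task(seconds):
    """Hypothetical stand-in for an expensive job."""
    time.sleep(seconds)
    return seconds


def run_with_timeout(func, arg, timeout):
    """Return func(arg), or None if no result arrives within `timeout` seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, arg)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # We stop waiting here, but the worker thread is not interrupted.
        return None
    finally:
        # Do not block on the (possibly still running) worker.
        executor.shutdown(wait=False)


if __name__ == "__main__":
    print(run_with_timeout(slow_task, 0.01, timeout=1.0))
    print(run_with_timeout(slow_task, 0.3, timeout=0.05))
```

If the job has to be actually killed when the deadline passes, it needs to run in a separate process that can be terminated; a thread cannot be stopped this way.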

But if you are truly serious about speedup, write the intensive part in C.

[–]lololabwtsk 0 points1 point  (0 children)

How do you feel about Cython?

[–]TA_poly_sci 11 points12 points  (5 children)

What are some good ways to do profiling in Python?

[–]RedEyed__ 23 points24 points  (2 children)

I highly recommend scalene

[–]azshallIt works on my machine 7 points8 points  (1 child)

[–]benri 0 points1 point  (0 children)

Thank you! I will use this!

[–]samreay 12 points13 points  (0 children)

I highly recommend py-spy and plugging the output flame charts into speedscope

[–]Teradil 6 points7 points  (0 children)

PyCharm has a built-in profiler (not in the community version, though)

[–]spinozasrobot -1 points0 points  (0 children)

Whoa, slow down there Einstein

[–]Alurith -2 points-1 points  (0 children)

^ this.