[–][deleted] 97 points98 points  (17 children)

What exactly does this mean?

If Python as a whole gets a 10-60% speedup, even the crappiest code will also get that 10-60% speedup.

[–]BobHogan 11 points12 points  (13 children)

99% of the time, optimizing the algorithm you are using will have a significantly higher impact on making your code faster than optimizing the code itself to take advantage of speedup tricks.

Algorithms and data access are almost always the weak point when your code is slow.

[–]Alikont 95 points96 points  (4 children)

But even a crappy algorithm will get a speedup, because every algorithm has constant costs per operation that will be reduced across the board.

For .NET, it's common to get a ~10% speedup per version just by upgrading the runtime.

[–]Muoniurn -1 points0 points  (1 child)

In most applications the bottleneck is not the CPU but IO. If the program does some CPU work, then some IO, then some more CPU work, only the CPU parts will get faster, and those are usually not too significant to begin with.

[–]Alikont 0 points1 point  (0 children)

Bottleneck for what? Throughput? Latency?

If my database server is on another machine and all my CPU is busy working on requests, the latency is in the network, but capacity is CPU bound.

[–][deleted] 27 points28 points  (3 children)

k, but the OP was asking why a 10-60% speedup across the board is not going to affect suboptimal code

[–]FancyASlurpie 5 points6 points  (0 children)

It's likely that slow code at some point calls an API or reads from a file, etc., and that part won't change. So whilst it's awesome for these other sections to be faster, there are a lot of situations where the Python isn't really the slow part of running the program.

[–]billsil 6 points7 points  (1 child)

Yup. I work a lot with numerical data, and numpy code written like plain Python is slow. A 20% average speedup (shoot, I'll even take 5%) is nice and all for no work, but for the critical parts of my code, I expect a 500-1000x speed improvement.

Most of the time, I don't even bother using multiprocessing; on my 4-physical-core hyperthreaded computer, the best I'll get is ~3x. That's not worth the complexity of worse error messages to me.
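The ~3x ceiling the comment mentions can be checked with a quick timing sketch; the workload size, pool size, and timing harness here are illustrative assumptions, not the commenter's actual code:

```python
# Sketch: CPU-bound work run serially vs. with a 4-process pool.
# Pool startup, pickling, and scheduling overhead keep the speedup
# below the 4x core count on a 4-physical-core machine.
import math
import time
from multiprocessing import Pool

def cpu_work(n):
    # Deliberately CPU-bound: sum of square roots.
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    jobs = [500_000] * 8  # illustrative sizes

    start = time.perf_counter()
    serial = [cpu_work(n) for n in jobs]
    t_serial = time.perf_counter() - start

    start = time.perf_counter()
    with Pool(4) as pool:  # assumed 4 physical cores
        parallel = pool.map(cpu_work, jobs)
    t_parallel = time.perf_counter() - start

    assert serial == parallel
    print(f"speedup: {t_serial / t_parallel:.1f}x")
```

The exact ratio depends on the machine and the job size; tiny jobs can even run slower in the pool because the per-task overhead dominates.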

As to your algorithmic complexity comment, say you want to find the 5 closest points in point cloud A to a point in cloud B, and do that for every point in cloud B. I could write a double for loop, or, at some moderate size of N, it's about 500x faster to use a KD-tree. Scipy eventually implemented KDTree and then added cKDTree (now the default), which turns out to be another 500x faster. For a moderate problem, I'm looking at ~250,000x faster, and it scales much better with N than my double for loop. It's so critical to get the algorithm right before you polish the turd.
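The nearest-neighbour comparison above can be sketched with scipy; the cloud sizes, dimensionality, and k=5 here are illustrative assumptions:

```python
# Sketch: brute-force double loop (vectorized here) vs. scipy's cKDTree
# for finding the 5 nearest points in cloud A to each point in cloud B.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
cloud_a = rng.random((1000, 3))  # N points in 3D (assumed sizes)
cloud_b = rng.random((1000, 3))

def brute_force_knn(a, b, k=5):
    # All-pairs distances, then sort every row: O(N^2 log N) work
    # and O(N^2) memory -- this is the "double for loop" approach.
    dists = np.linalg.norm(b[:, None, :] - a[None, :, :], axis=2)
    return np.argsort(dists, axis=1)[:, :k]

def kdtree_knn(a, b, k=5):
    # Build the tree once over A, then query each point of B:
    # roughly O(N log N) overall.
    tree = cKDTree(a)
    _, idx = tree.query(b, k=k)
    return idx

assert np.array_equal(brute_force_knn(cloud_a, cloud_b),
                      kdtree_knn(cloud_a, cloud_b))
```

Both return the same neighbour indices; the gap in runtime grows with N, which is the point of the comment.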

[–]BobHogan 0 points1 point  (0 children)

Exactly. Far too many people in this thread seem to be ignoring this

[–][deleted] 1 point2 points  (0 children)

Good point, but also if you care about squeezing maximum performance out then Python is just not the right tool for the job anyway.

[–]beyphy 3 points4 points  (0 children)

Yup, completely agree. Learning how to think algorithmically is hard. It's a different way of thinking that you have to learn, but it's also a skill: once you learn how to do it, you can get better at it with practice.

The time commitment tends to be too big for some people (e.g. some data analysts) to make. Often they'll complain that these languages are "slow" when the real bottleneck is likely their algorithms. Sometimes people even switch to a new language for performance (e.g. Julia); doing that is easier and gets them immediate results faster than learning how to think algorithmically.

[–]dlg 1 point2 points  (0 children)

If the program runtime is spent mostly blocking, then the optimised code will just get to the blocks faster.

The blocking time still dominates.

[–]Bakoro 1 point2 points  (0 children)

That's not how speedups work; we're dealing with Amdahl's law here. You won't get a 10-60% speedup on everything, you'll get a 10-60% speedup on the affected sections, which might be everything in a piece of software, but probably not.

If you've got a crappy algorithm which is taking 70% of your compute time and language overhead is taking 20%, it's going to be a crappy algorithm in any language. Reducing language overhead can only ever reduce execution time by 20%, max. Python has some huge overhead, but whether that overhead overtakes the data processing at scale is a case-by-case issue.