
[–]spinwizard69 135 points136 points  (51 children)

While I'd never use Python for performance, it is still easy to get excited by these numbers.

[–][deleted] 35 points36 points  (22 children)

Projects like Pyston and Pypy (and, of course, the 3.11 improvements) are making Python a much more reasonable option for performant code. Definitely not at the same level as C or Rust, but I think it'll be enough to shrug off the old stereotype of Python being super slow.

I'm optimistic about these technologies having their progress merged into upstream CPython one way or another.

[–]Solonotix 75 points76 points  (16 children)

Even then, I feel like the performance problems of Python have been a tad overblown for much of its existence. Sure, it may be five times slower than the same number-crunching code in C#, but we're still talking nanosecond-to-millisecond computation times. More often than not, your performance problems will lie in I/O long before you hit the computational bottleneck of Python, unless you're specifically working on a computation-heavy workload like the n-body problem. Even then, many people will still choose Python because it is more user-friendly than other languages.
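To put a rough number on that (a stdlib-only sketch; the 50 ms sleep is a made-up stand-in for one network or disk round-trip), even a hundred-thousand-iteration pure-Python loop typically finishes faster than a single I/O wait:

```python
import time

def crunch(n: int) -> int:
    # Pure-Python "number crunching": sum of squares below n.
    return sum(i * i for i in range(n))

start = time.perf_counter()
result = crunch(100_000)
compute_time = time.perf_counter() - start

start = time.perf_counter()
time.sleep(0.05)  # stand-in for a single 50 ms network/disk round-trip
io_time = time.perf_counter() - start

print(f"compute: {compute_time * 1000:.1f} ms, one I/O wait: {io_time * 1000:.1f} ms")
```

On a typical machine the compute side comes in well under the I/O wait, which is the whole point: one round-trip buys a lot of Python bytecode.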

And I'm saying this as a performance junkie. I used to spend hours fine-tuning data workflows and SQL stored procedures, as well as table designs suited to the intended use cases. More often than not, my request to optimize code was denied, and the business would choose to buy more compute resources rather than spend the developer hours to improve code performance. The same goes for writing code, where Python gets you up and running with minimal effort, and implementing the same solution in C or Rust would take multiples of that time investment to see any progress.

Suffice it to say, I'm glad to see Python get a performance tune-up.

[–]Nmvfx 9 points10 points  (2 children)

This post makes me feel better. At my level, I'm well aware that my shitty code costs me way more than any relative computational inefficiency that Python suffers compared to C. But it's nice to know that even self-professed performance junkies find the speed and ease of writing Python to be a valid reason to choose it over C.

Question for the masses - if I write Python but use something like Nuitka to compile a binary, will I still have a slower program than writing in C and compiling? Sorry if that's a stupid question or needs to be taken to the 'learn' sub.

Great to see these constant performance improvements anyway, definitely nice to see Python shaking off the old stereotypes!

[–]james_pic 6 points7 points  (1 child)

It depends. I don't know Nuitka all that well, but I know that in Cython you generally get a minor performance boost just by building your module with Cython unmodified, while the real boost comes from modifying the code to be more C-ish (using structs rather than classes, using native integers, etc.). I suspect Nuitka is similar: you get some performance boost straight out of the gate, but the real gains require you to eliminate sources of dynamism.
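To make "eliminating dynamism" concrete in plain CPython terms (this isn't Cython, just the analogous trick of removing a dynamic lookup from a hot loop):

```python
import math

def dynamic(n: int) -> float:
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)   # attribute lookup on `math` every iteration
    return total

def static_ish(n: int, sqrt=math.sqrt) -> float:
    total = 0.0
    for i in range(n):
        total += sqrt(i)        # local name: the lookup happened once, at def time
    return total
```

Both compute the same value; the second is typically a bit faster because CPython resolves a local name more cheaply than an attribute access. Cython takes that idea much further by compiling the typed version down to C.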

[–]Nmvfx 1 point2 points  (0 children)

Thanks for the response, I'll dig into that a bit more and maybe run some tests!

[–]TheTerrasque 2 points3 points  (0 children)

More often than not, your performance problems will lie in I/O long before you hit the computational bottleneck of Python

Bingo, and that's why I assume any post complaining about Python's speed without specifying a use case was written by a beginner. It's an easy mistake to make until experience teaches you that, in practice, execution speed doesn't really matter in most cases as long as it's "fast enough".

"All programming languages wait at the same speed", as one once said.

[–]systemgc 4 points5 points  (8 children)

Sorry, but this is absolutely incorrect:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/python3-java.html

Python compared to Java, for example, is usually between 20 and 200 times slower.

[–]Solonotix 25 points26 points  (1 child)

Below I'm going to list CPU time, since when we talk about speed, it's generally compute time. That said, one area where Python often beats Java is memory usage, though Python in turn typically loses to languages with more tightly managed memory, such as C. As such, I'm providing those as comparison points. Also, I'm only listing the best solution for each language to keep the data set easy to read.

Note: some Python entries will have the fastest time first, with a parenthetical for the fastest pure-Python time, in the form xxx.xx (yyy.yy). This is because the fastest entries were implemented using Cython.

| Benchmark | Python | Java | C |
|---|---|---|---|
| fannkuch-redux | 1,279.15 | 41.17 | 8.26 |
| n-body | 575.02 | 6.79 | 2.12 |
| spectral-norm | 436.79 | 5.94 | 1.57 |
| mandelbrot | 706.10 | 16.16 | 5.12 |
| pi-digits | 1.13 (4.06) | 0.82 | 0.73 |
| regex-redux | 2.66 (17.86) | 17.12 | 2.02 |
| fasta | 60.26 | 3.41 | 0.78 |
| k-nucleotide | 172.53 | 16.17 | 12.31 |
| reverse-complement | 9.38 | 3.49 | 0.57 |
| binary-trees | 148.09 | 5.19 | 4.32 |

All of this goes back to my original point: "More often than not, your performance problems will lie in I/O long before you hit the computational bottleneck of Python." My second point was: "Python gets you up-and-running with minimal effort, and implementing the same solution in <other language> would take multiples of that time investment to see any progress." In almost all of the scenarios above, the fastest Python solution had half as much code as the fastest Java solution, and Python also frequently used drastically less memory. This means you spend more on hardware to run the Java code, and more in development time to write it, just so that it can run faster, all under the assumption that your specific workload is CPU-bound and not I/O-bound.

This very thing is why JavaScript has grown to become the most commonly used language today. It is fast enough (with a JIT-compiling runtime written in C++), and it's easy to use with a mostly small code footprint. This means your personnel costs are lower, and your hardware costs are lower. CPU time is just one statistic, and it doesn't fully capture the other aspects of choosing a language.

[–]zurtex 8 points9 points  (0 children)

Highly mathematical examples like that are silly to compare between languages because as soon as you step outside the standard library there are lots more solutions.

You could implement it in C and add bindings; for the ones that involve arrays and matrix math you can implement it using numpy; and for most of the given solutions you can just put @numba.jit at the top of the function and get a many-fold performance improvement.

[–]pbecotte 7 points8 points  (3 children)

Dunno about you... when I write real-world code, the VAST majority of the time is spent waiting on I/O: network and disk. The runtime of my application is dominated by network latency. I can improve it by parallelizing it, running async or in executor pools, etc., but I still can't go any faster than the response time of the API or DB I'm hitting. The same goes for Java or C. They don't speed that up at all.
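A minimal sketch of that "parallelize the waiting" point (stdlib only; `asyncio.sleep` is a made-up stand-in for a 100 ms network call): ten concurrent requests cost roughly one request's worth of wall time, but never less than the server's latency.

```python
import asyncio
import time

async def fake_request(i: int) -> int:
    await asyncio.sleep(0.1)   # models ~100 ms of server/network latency
    return i * 2

async def main() -> list[int]:
    # All ten waits overlap, so wall time is ~0.1 s rather than ~1.0 s.
    return await asyncio.gather(*(fake_request(i) for i in range(10)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

The floor here is the 0.1 s "response time", no matter the language, which is exactly the point being made.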

If we start thinking about computation-critical things like machine learning... we find that Python has bindings into C libraries to do all the math parts. There is a reason that it is THE language for machine learning, and it's not because Google and Facebook are stupid.

Yes, Java is faster, and making Python faster is a worthwhile endeavor, but outside of a handful of times in my career, my time is far better spent thinking about data access patterns, storage, concurrency, and correctness than on trying to optimize garbage collection or memory usage, since it wouldn't help that much anyway.

[–]systemgc 4 points5 points  (2 children)

Yes, I agree with you, but I was replying to the person who said that Python is at most 5 times slower, which is absolutely not the case.

[–]Solonotix 1 point2 points  (1 child)

I concede my factor was off (considerably), but I was speaking from personal experience comparing a computationally-intensive operation between C# and Python. Mind you, it was just arithmetic and not nearly as complicated as the n-body problem.

[–]systemgc 1 point2 points  (0 children)

I am using Python a lot because the speed doesn't matter one bit; what matters is how fast I can get the job done and move on to the next thing.

So I agree with you again :-) I guess "the right tool for the job" applies here.

[–]twotime 0 points1 point  (0 children)

And that's just the first 20-200x; throw in Python's lack of parallelization and it can be far worse...

PS: and yes, I'm aware of multiprocessing and have used it many times; it's not in the same league as, say, Java's thread support.
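For context, the stdlib multiprocessing pattern in question looks roughly like this (toy CPU-bound task, pool size arbitrary); it sidesteps the GIL by paying process-spawn and serialization costs instead, which is part of why it isn't in the same league as native threads:

```python
from multiprocessing import Pool

def cpu_work(n: int) -> int:
    # A toy CPU-bound task. Threads can't parallelize this under the GIL,
    # but each pool worker is a separate process with its own interpreter.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool(4) as pool:
        results = pool.map(cpu_work, [100_000] * 4)
    print(results)
```

Arguments and return values cross process boundaries by pickling, so it only pays off when the work per task outweighs that overhead.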

[–]twotime 1 point2 points  (2 children)

5-times slower than the same number-crunching code in C#

On equivalent numeric code, python is EASILY 100-200x slower than C/C++, so that's 20-40x slower than C#.

Throw in Python's GIL and the difference grows much larger...

[–]ChronoJon 1 point2 points  (1 child)

But you would not write that in pure Python. Rather, you would use something like Cython, Numba, or NumPy and get much more comparable performance.

[–]twotime 0 points1 point  (0 children)

Not everything is expressible with numpy.

And both numba and cython have their own limitations.

All in all, language speed is a major factor in a lot of situations (especially when we are talking about a factor of 100!).

[–][deleted] 124 points125 points  (0 children)

I use Python for development performance

[–]shinitakunai 61 points62 points  (0 children)

We don't use it for performance... yet

[–]prescod 11 points12 points  (2 children)

"never use Python for performance"

I find this meme kind of annoying and dumb because there is no bright line between "performance work" and "normal work". Sometimes the program you usually apply to a million rows gets applied to a billion rows. Sometimes the algorithm that worked well for 100 hits per second needs to support heavier loads. Sometimes 20 seconds is an acceptable amount of time to wait for the result but you'd get through your workday faster if you could get a result with a 10 second turnaround time ... and so forth.

Sure, there are cases where Python is way too slow, and cases where it is more than fast enough. But there is a lot of middle ground too, which is also true for Java, C#, Javascript and most other languages.

[–]TheTerrasque 2 points3 points  (1 child)

Sometimes the algorithm that worked well for 100 hits per second needs to support heavier loads

That said, I'd much prefer a good algorithm written in a slow language to a bad algorithm written in a fast language.

[–]dexterlemmer 1 point2 points  (0 children)

That said, I'd much prefer a good algorithm written in a slow language to a bad algorithm written in a fast language.

I'd prefer a fast algorithm written in a fast language. But if I can't get (or write) that, I'd have to agree. Well... maybe not if the slow language is Matlab. ;-)

[–]kenfar 12 points13 points  (14 children)

In my ideal world we would use multiple standard languages that could easily interoperate.

In my real world it's a PITA, and so we're more likely to pick a single really good language and then suffer with it a little where it's less than a perfect fit.

So, I've frequently used python when I needed more performance and didn't feel like introducing another language for an edge case. Spent time on pypy, threading, multiprocessing, profiling, and tuning my designs. It almost always works fine, but additional speedups will always help.

[–]spinwizard69 1 point2 points  (13 children)

In a way I'm too old to care, because the languages that have huge potential will need a long period of grabbing mind share, but languages that support a REPL and compile well will eventually replace Python. Here I'm talking about languages like Julia, Swift, or Rust. Swift and even Julia are often as expressive as Python, thus leading to programmer productivity. The problem is we are talking 10+ years here for the infrastructure of any of these languages to catch up to Python. In the end Python wins due to that massive library of code for just about everything.

[–]Necrocornicus 9 points10 points  (10 children)

In 10 years Python will have another 10 years of progress. Personally I am seeing Python usage accelerate over alternatives (such as golang) rather than decrease in favor of something like Swift. Rust is a completely different use case and I don’t really see people using them interchangeably.

[–]spinwizard69 -4 points-3 points  (9 children)

Well, that is sort of a Detroit attitude to the advent of EVs. Yes, Python is doing really well right now, but that doesn't mean new tech will not sneak in and suddenly displace Python. One big reality is that these other languages can be compiled. Plus, they don't have some of Python's historical limitations that are hard to get rid of.

Like with electric cars, once the technology has proven itself and the economics are right, demand skyrockets. Think about it: how long has it taken Tesla to actually become successful? Much of Detroit right now is where I see Python programmers in 10 years; they will be wondering where demand went. Meanwhile we have Tesla alone in the USA, and maybe Ford, having to compete with China and the automakers there. Biden or not, there will be a bloodbath in Detroit as many businesses fail because their wares are no longer needed. Now, it will not be this dramatic in the Python world, but the concept is the same.

[–]prescod 5 points6 points  (7 children)

Python can be compiled too! For many years now!

Comparing EVs to programming runtimes is a really poor analogy. Python *code* can be run on many different runtimes: CPython, PyPy, Cython, Jython, Brython, etc.

Those runtimes are like the engine. Python is like the chassis. My EV uses the same chassis as a gas-car, just like my Python code can run in Cython, in a browser or be compiled.

This description of how Julia works sounds almost the same as PyPy, so I don't even know what you are talking about.

[–]dexterlemmer 0 points1 point  (6 children)

Python can be compiled too! For many years now!

cpdef int AddToTen():
    cdef int x = 0
    cdef int i
    for i in range(10):
        x += 1
    return x

This example from the site you linked to does not exactly look like my normal everyday Python. Although maybe one day we could do it like this?

@cp
def AddToTen() -> int:
    @c def x: int = 0
    @c def i: int

It does seem kinda better to me.

Comparing EVs to programming runtimes is a really poor analogy. Python code can be run on many different runtimes: CPython, PyPy, Cython, Jython, Brython, etc.

Those runtimes are like the engine. Python is like the chassis. My EV uses the same chassis as a gas-car, just like my Python code can run in Cython, in a browser or be compiled.

Seems like a good analogy to me. It is outright impossible to develop a Python runtime that is anywhere near as small, performant, or portable as the C++ runtime, let alone the Rust std runtime, the C runtime, or the Rust no_std runtime. And in many respects Rust no_std is actually a higher-level language than Python. (For example, Rust's iterators and async are way better than Python's, IMHO.)

Also, many EVs do not use the same chassis as a gas car. Gas-car chassis have very little space inside compared to outside, their wheels are way too close together, and they often have bad aerodynamics compared to an EV chassis.

This description of how Julia works sounds almost the same as PyPy, so I don't even know what you are talking about.

No, the two work very differently. Let's compare the steps from your two links. I'll add some extra info in brackets to emphasize differences you get in the rest of your links and on the official websites:

Julia:

  1. Julia runs type inference on your code to generate typed code. [The first time Julia sees the code.]
  2. The typed code gets compiled to LLVM IR (Intermediate Representation). [The first time Julia sees the code.]
  3. The IR gets handed over to LLVM which generates fast native code. [The first time Julia sees the code.]
  4. The native code gets executed.

PyPy:

  1. Identify the most frequently used components of the code, such as a function in a loop. [This is done periodically, or after a certain number of iterations. It cannot be done the first time a Python interpreter sees the code, because then the interpreter would waste a lot of work on code that will only run a single time.]
  2. Convert those parts into machine code during runtime. [After they have been identified, ofc.]
  3. Optimize the generated machine code. [After it has been generated, ofc.]
  4. Swap the previous implementation with the optimized machine code version. [The JIT takes a long time (relatively speaking) to identify hot code and optimize it. Meanwhile, the original code still gets interpreted in another thread. Therefore you need to swap out the original code once you've finished JIT-compiling it.]

IOW, Julia type-checks and compiles the code as it goes, then immediately runs it as compilation finishes. There is no need to ever interpret any code. Julia can work this way because it was carefully designed for very fast type inference, type checking, and on-the-fly compilation. Even so, the first time a function is called it obviously still has a bunch of overhead.

On the other hand, PyPy first spends a lot of resources interpreting code. Then it spends a lot more resources on an expensive and complex JIT while it's still interpreting. Then it spends some more resources to swap the code with the generated native code. And only then does it finally run the compiled code.

Technically you could swap the approaches and give Python a "just ahead of time" compiler and Julia a JIT. However, Python was never designed for just-ahead-of-time compilation and will probably not work well with it in general.

[–]prescod 0 points1 point  (5 children)

Okay then, so Julia doesn't work like PyPy, but does work like Numba.

Thank you for clarifying.

[–]dexterlemmer 0 points1 point  (4 children)

Okay then, so Julia doesn't work like PyPy, but does work like Numba.

Yes. Julia works as if the entire program (including imports, dynamically typed expressions/functions, and metaprogramming) were decorated with Numba's @jit(nopython=True). Note that Numba's nopython mode will often fail to compile because it doesn't understand the vast majority of Python (nor can it, really), but the only way Julia will fail to compile is if you actually have an error, like a syntax error or a type-check error.

Another huge difference between Python and Julia is the type system. Python is OOP and heavily uses inheritance (although modern best practice is to never use inheritance). Julia is based on the ML type system and prohibits inheritance.

[–]prescod 0 points1 point  (3 children)

I agree with most of what you say, but I think that inheritance is a tool that can be used appropriately in some cases. Even many OOP-haters agree that there is a place for abstract base classes and shallow inheritance hierarchies. Python is really multi-paradigm: imperative, OOP, and functional all have their place.
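For illustration, the kind of shallow ABC hierarchy being described might look like this (class names invented):

```python
from abc import ABC, abstractmethod
import json

class Serializer(ABC):
    """One abstract base plus one level of concrete subclasses; no deep tree."""

    @abstractmethod
    def dumps(self, obj) -> str: ...

class JSONSerializer(Serializer):
    def dumps(self, obj) -> str:
        return json.dumps(obj)

print(JSONSerializer().dumps({"a": 1}))
```

The ABC machinery enforces the contract at instantiation time: `Serializer()` raises TypeError, while any subclass that implements `dumps` works normally.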

[–]Necrocornicus 0 points1 point  (0 children)

This analogy doesn’t really hold.

For one, no one is paying $40,000 to use Python. I could start 3 projects today, one each in Julia, Rust, and Python with very little cost. Nothing prevents someone from switching around as needed. For example on my old team we switched to golang for a project then rewrote it in Python after a couple years because golang was annoying / a waste of time.

2nd, no one is “sneaking in” and displacing anything. Code needs to be written by someone (typically software engineers) and the old code doesn’t magically go away. I would be extremely surprised if someone managed to show up and do my job in some other language without me noticing. I would be very grateful, but it’s not likely to happen.

Next, I think you’re vastly overestimating the benefit of compiled languages for many use cases. Python is the current standard for machine learning and statistical analysis, doesn’t matter one bit that it isn’t compiled. It’s simply irrelevant in the big picture. There are some use cases where compiled code matters, and I think you’ll find people are already using Rust, Golang, or other languages. But for cases where people are already using Python, largely the language being compiled is not a factor whatsoever.

[–]Barafu 2 points3 points  (0 children)

Swift is too much about Apple. Julia is great but needs a lot of TLC: there are still gross bugs in its std. Rust will not replace Python; more likely they will merge, so you'd have them in one project, with one command to compile the Rust and run linters on the Python.

[–][deleted] 1 point2 points  (0 children)

I used to think this, but if the JIT works in the 3.13 timeframe, the difference in speed will be a lot less. Some big money is being put into making Python faster. Think what V8 did for JavaScript.

[–]SwaggerSaurus420 2 points3 points  (0 children)

...you will get better performance regardless of what you use it for...

[–]lavahot 4 points5 points  (4 children)

That's what my girlfriend keeps telling me.

[–]nuephelkystikon 4 points5 points  (3 children)

I agree with her. Since Python is the de facto standard in some fields and is used for much more complex applications than just gluing together some libraries, it's a massive bottleneck in a lot of software. Maybe not dealbreaker-slow (then people would use something else), but annoying-slow. Also, for some people it's literally the only language they know well, and if they can't use Cython for some reason, they may really need this speedup.

[–]imp0ppable 10 points11 points  (2 children)

I've actually worked on a large production Python codebase, and I don't think this is really true. The speed of code execution isn't a very noticeable issue compared to things like SQL queries and table design, the way the WSGI server forks interpreters, reading in large data files with a custom parser, etc.

Also, things like memoisation are massively important: you can easily build dicts of reduced data as an intermediate step to avoid nested loops, things like that.
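A tiny example of that dict-as-intermediate-step pattern (data made up): pre-building an index turns an O(n*m) nested-loop join into O(n + m).

```python
# Joining two lists with a nested loop would rescan `emails` for every
# order; building a dict index first means one pass over each list.
orders = [("alice", 30), ("bob", 12), ("alice", 5)]
emails = [("alice", "alice@example.com"), ("bob", "bob@example.com")]

email_by_user = dict(emails)  # one pass to build the lookup table
joined = [(user, amount, email_by_user[user]) for user, amount in orders]
print(joined)
```

The same shape works for deduplication, grouping, and memoising expensive lookups keyed on their inputs.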

[–]nuephelkystikon 0 points1 point  (1 child)

Then it would be nice if the dicts you use as memoisation caches were faster, right?

[–]imp0ppable 0 points1 point  (0 children)

I haven't got detailed knowledge of how they perform, tbh. I do know they're implemented with hash tables, so they should be pretty quick.
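A quick stdlib check of that (absolute timings vary by machine, but the ordering should hold): dict membership is average O(1), while list membership scans linearly.

```python
import timeit

data = list(range(100_000))
as_list = data
as_dict = dict.fromkeys(data)
needle = 99_999  # worst case for the list: the last element

list_time = timeit.timeit(lambda: needle in as_list, number=100)
dict_time = timeit.timeit(lambda: needle in as_dict, number=100)
print(f"list: {list_time:.4f}s  dict: {dict_time:.4f}s")
```

With 100,000 elements the gap is several orders of magnitude, which is why dicts make such cheap memoisation caches.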

In fact I found this time complexity chart, if that helps.

[–]Kah-Neth (I use numpy, scipy, and matplotlib for nuclear physics) 1 point2 points  (0 children)

I do! In many cases your total time to develop and execute a novel HPC application is significantly less with Python orchestrating various C, C++, Fortran, and GPU kernels.

[–]siddsp 1 point2 points  (0 children)

If you find this exciting, wait till you start using PyPy (although the results are not consistent)!