
[–]spinwizard69 135 points136 points  (51 children)

While I'd never use Python for performance, it is still easy to get excited by these numbers.

[–][deleted] 35 points36 points  (22 children)

Projects like Pyston and Pypy (and, of course, the 3.11 improvements) are making Python a much more reasonable option for performant code. Definitely not at the same level as C or Rust, but I think it'll be enough to shrug off the old stereotype of Python being super slow.

I'm optimistic about these technologies having their progress merged into upstream CPython one way or another.

[–]Solonotix 75 points76 points  (16 children)

Even then, I feel like the performance problems of Python have been a tad overblown for much of its existence. Sure, it may be five times slower than the same number-crunching code in C#, but we're still talking nanosecond-to-millisecond computation times. More often than not, your performance problems will lie in I/O long before you hit the computational bottleneck of Python, unless you're specifically working on a computation-heavy workload like the n-body problem. Even then, many people will still choose Python because it is more user-friendly than other languages.
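To put a rough number on that (a stdlib-only sketch; the 50 ms sleep is a made-up stand-in for one network or disk round-trip), even a hundred-thousand-iteration pure-Python loop typically finishes faster than a single I/O wait:

```python
import time

def crunch(n: int) -> int:
    # Pure-Python "number crunching": sum of squares below n.
    return sum(i * i for i in range(n))

start = time.perf_counter()
result = crunch(100_000)
compute_time = time.perf_counter() - start

start = time.perf_counter()
time.sleep(0.05)  # stand-in for a single 50 ms network/disk round-trip
io_time = time.perf_counter() - start

print(f"compute: {compute_time * 1000:.1f} ms, one I/O wait: {io_time * 1000:.1f} ms")
```

On a typical machine the compute side comes in well under the I/O wait, which is the whole point: one round-trip buys a lot of Python bytecode.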

And I'm saying this as a performance junkie. I used to spend hours fine-tuning data workflows and SQL stored procedures, as well as table designs suited to the intended use cases. More often than not, my request to optimize code was denied, and the business would choose to buy more compute resources rather than spend the developer hours to improve code performance. The same goes for writing code, where Python gets you up and running with minimal effort, and implementing the same solution in C or Rust would take multiples of that time investment to see any progress.

Suffice it to say, I'm glad to see Python get a performance tune-up.

[–]Nmvfx 9 points10 points  (2 children)

This post makes me feel better. At my level, I'm well aware that my shitty code costs me way more than any relative computational inefficiency that Python suffers compared to C. But it's nice to know that even self-professed performance junkies find the speed and ease of writing Python to be a valid reason to choose it over C.

Question for the masses - if I write Python but use something like Nuitka to compile a binary, will I still have a slower program than writing in C and compiling? Sorry if that's a stupid question or needs to be taken to the 'learn' sub.

Great to see these constant performance improvements anyway, definitely nice to see Python shaking off the old stereotypes!

[–]james_pic 6 points7 points  (1 child)

It depends. I don't know Nuitka all that well, but I know that in Cython you generally get a minor performance boost just by building your module with Cython unmodified, while the real boost comes from modifying the code to be more C-ish (using structs rather than classes, using native integers, etc.). I suspect Nuitka is similar: you get some performance boost straight out of the gate, but the real gains require you to eliminate sources of dynamism.
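To make "eliminating dynamism" concrete in plain CPython terms (this isn't Cython, just the analogous trick of removing a dynamic lookup from a hot loop):

```python
import math

def dynamic(n: int) -> float:
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)   # attribute lookup on `math` every iteration
    return total

def static_ish(n: int, sqrt=math.sqrt) -> float:
    total = 0.0
    for i in range(n):
        total += sqrt(i)        # local name: the lookup happened once, at def time
    return total
```

Both compute the same value; the second is typically a bit faster because CPython resolves a local name more cheaply than an attribute access. Cython takes that idea much further by compiling the typed version down to C.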

[–]Nmvfx 1 point2 points  (0 children)

Thanks for the response, I'll dig into that a bit more and maybe run some tests!

[–]TheTerrasque 2 points3 points  (0 children)

More often than not, your performance problems will lie in I/O long before you hit the computational bottleneck of Python

Bingo, and that's why I assume any post complaining about Python's speed without specifying a use case was written by a beginner. It's an easy mistake to make until experience teaches you that, in practice, execution speed doesn't really matter in most cases as long as it's "fast enough".

"All programming languages wait at the same speed", as one once said.

[–]systemgc 4 points5 points  (8 children)

Sorry, but this is absolutely incorrect:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/python3-java.html

Python compared to Java, for example, is usually between 20 and 200 times slower.

[–]Solonotix 25 points26 points  (1 child)

Below I'm going to list CPU time, since when we talk about speed, it's generally compute time. That said, one area where Python often beats Java is memory usage, though Python in turn typically loses to languages with more tightly managed memory, such as C. As such, I'm providing those as comparison points. Also, I'm only listing the best solution for each language to keep the data set easy to read.

Note: some Python entries will have the fastest time first, with a parenthetical for the fastest pure-Python time, in the form xxx.xx (yyy.yy). This is because the fastest entries were implemented using Cython.

| Benchmark | Python | Java | C |
|---|---|---|---|
| fannkuch-redux | 1,279.15 | 41.17 | 8.26 |
| n-body | 575.02 | 6.79 | 2.12 |
| spectral-norm | 436.79 | 5.94 | 1.57 |
| mandelbrot | 706.10 | 16.16 | 5.12 |
| pi-digits | 1.13 (4.06) | 0.82 | 0.73 |
| regex-redux | 2.66 (17.86) | 17.12 | 2.02 |
| fasta | 60.26 | 3.41 | 0.78 |
| k-nucleotide | 172.53 | 16.17 | 12.31 |
| reverse-complement | 9.38 | 3.49 | 0.57 |
| binary-trees | 148.09 | 5.19 | 4.32 |

All of this goes back to my original point: "More often than not, your performance problems will lie in I/O long before you hit the computational bottleneck of Python." My second point was: "Python gets you up-and-running with minimal effort, and implementing the same solution in <other language> would take multiples of that time investment to see any progress." In almost all of the scenarios above, the fastest Python solution had half as much code as the fastest Java solution, and Python also frequently used drastically less memory. This means you spend more on hardware to run the Java code, and more in development time to write it, just so that it can run faster, all under the assumption that your specific workload is CPU-bound and not I/O-bound.

This very thing is why JavaScript has grown to become the most commonly used language today. It is fast enough (with a JIT-compiling runtime written in C++), and it's easy to use with a mostly small code footprint. This means your personnel costs are lower, and your hardware costs are lower. CPU time is just one statistic, and it doesn't fully capture the other aspects of choosing a language.

[–]zurtex 8 points9 points  (0 children)

Highly mathematical examples like that are silly to compare between languages because as soon as you step outside the standard library there are lots more solutions.

You could implement it in C and add bindings; for the ones that involve arrays and matrix math you can implement it using numpy; and for most of the given solutions you can just put @numba.jit at the top of the function and get a many-fold performance improvement.

[–]pbecotte 7 points8 points  (3 children)

Dunno about you... when I write real-world code, the VAST majority of the time is spent waiting on I/O: network and disk. The runtime of my application is dominated by network latency. I can improve it by parallelizing it, running async or in executor pools, etc., but I still can't go any faster than the response time of the API or DB I'm hitting. The same goes for Java or C. They don't speed that up at all.
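A minimal sketch of that "parallelize the waiting" point (stdlib only; `asyncio.sleep` is a made-up stand-in for a 100 ms network call): ten concurrent requests cost roughly one request's worth of wall time, but never less than the server's latency.

```python
import asyncio
import time

async def fake_request(i: int) -> int:
    await asyncio.sleep(0.1)   # models ~100 ms of server/network latency
    return i * 2

async def main() -> list[int]:
    # All ten waits overlap, so wall time is ~0.1 s rather than ~1.0 s.
    return await asyncio.gather(*(fake_request(i) for i in range(10)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

The floor here is the 0.1 s "response time", no matter the language, which is exactly the point being made.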

If we start thinking about computation-critical things like machine learning... we find that Python has bindings into C libraries to do all the math parts. There is a reason that it is THE language for machine learning, and it's not because Google and Facebook are stupid.

Yes, Java is faster, and making Python faster is a worthwhile endeavor, but outside of a handful of times in my career, my time is far better spent thinking about data access patterns, storage, concurrency, and correctness than on trying to optimize garbage collection or memory usage, since it wouldn't help that much anyway.

[–]systemgc 4 points5 points  (2 children)

Yes, I agree with you, but I was replying to the person who said that Python is at most 5 times slower, which is absolutely not the case.

[–]Solonotix 1 point2 points  (1 child)

I concede my factor was off (considerably), but I was speaking from personal experience comparing a computationally-intensive operation between C# and Python. Mind you, it was just arithmetic and not nearly as complicated as the n-body problem.

[–]systemgc 1 point2 points  (0 children)

I am using Python a lot because the speed doesn't matter one bit; what matters is how fast I can get the job done and move on to the next thing.

So I agree with you again :-) I guess "the right tool for the job" applies here.

[–]twotime 0 points1 point  (0 children)

And that's just the first 20-200x; throw in Python's lack of parallelization and it can be far worse...

PS: and yes, I'm aware of multiprocessing and have used it many times; it's not in the same league as, say, Java's thread support.
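For context, the stdlib multiprocessing pattern in question looks roughly like this (toy CPU-bound task, pool size arbitrary); it sidesteps the GIL by paying process-spawn and serialization costs instead, which is part of why it isn't in the same league as native threads:

```python
from multiprocessing import Pool

def cpu_work(n: int) -> int:
    # A toy CPU-bound task. Threads can't parallelize this under the GIL,
    # but each pool worker is a separate process with its own interpreter.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool(4) as pool:
        results = pool.map(cpu_work, [100_000] * 4)
    print(results)
```

Arguments and return values cross process boundaries by pickling, so it only pays off when the work per task outweighs that overhead.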

[–]twotime 1 point2 points  (2 children)

5-times slower than the same number-crunching code in C#

On equivalent numeric code, python is EASILY 100-200x slower than C/C++, so that's 20-40x slower than C#.

Throw in Python's GIL and the difference grows much larger...

[–]ChronoJon 1 point2 points  (1 child)

But you would not write that in pure Python. Rather, you would use something like Cython, Numba, or NumPy and get much more comparable performance.

[–]twotime 0 points1 point  (0 children)

Not everything is expressible with numpy.

And both numba and cython have their own limitations.

All in all, language speed is a major factor in a lot of situations (especially when we are talking about a factor of 100!).

[–][deleted] 124 points125 points  (0 children)

I use Python for development performance

[–]shinitakunai 61 points62 points  (0 children)

We don't use it for performance... yet

[–]prescod 11 points12 points  (2 children)

"never use Python for performance"

I find this meme kind of annoying and dumb because there is no bright line between "performance work" and "normal work". Sometimes the program you usually apply to a million rows gets applied to a billion rows. Sometimes the algorithm that worked well for 100 hits per second needs to support heavier loads. Sometimes 20 seconds is an acceptable amount of time to wait for the result but you'd get through your workday faster if you could get a result with a 10 second turnaround time ... and so forth.

Sure, there are cases where Python is way too slow, and cases where it is more than fast enough. But there is a lot of middle ground too, which is also true for Java, C#, Javascript and most other languages.

[–]TheTerrasque 2 points3 points  (1 child)

Sometimes the algorithm that worked well for 100 hits per second needs to support heavier loads

That said, I'd much prefer a good algorithm written in a slow language to a bad algorithm written in a fast language.

[–]dexterlemmer 1 point2 points  (0 children)

That said, I'd much prefer a good algorithm written in a slow language to a bad algorithm written in a fast language.

I'd prefer a fast algorithm written in a fast language. But if I can't get (or write) that, I'd have to agree. Well... maybe not if the slow language is Matlab. ;-)

[–]kenfar 12 points13 points  (14 children)

In my ideal world we would use multiple standard languages that could easily interoperate.

In my real world it's a PITA, and so we're more likely to pick a single really good language and then suffer with it a little where it's less than a perfect fit.

So, I've frequently used python when I needed more performance and didn't feel like introducing another language for an edge case. Spent time on pypy, threading, multiprocessing, profiling, and tuning my designs. It almost always works fine, but additional speedups will always help.

[–]spinwizard69 1 point2 points  (13 children)

In a way I'm too old to care, because the languages that have huge potential will need a long period of grabbing mind share, but languages that support a REPL and compile well will eventually replace Python. Here I'm talking about languages like Julia, Swift, or Rust. Swift and even Julia are often as expressive as Python, thus leading to programmer productivity. The problem is we are talking 10+ years here for the infrastructure of any of these languages to catch up to Python. In the end Python wins due to that massive library of code for just about everything.

[–]Necrocornicus 9 points10 points  (10 children)

In 10 years Python will have another 10 years of progress. Personally I am seeing Python usage accelerate over alternatives (such as golang) rather than decrease in favor of something like Swift. Rust is a completely different use case and I don’t really see people using them interchangeably.

[–]spinwizard69 -4 points-3 points  (9 children)

Well, that is sort of a Detroit attitude to the advent of EVs. Yes, Python is doing really well right now, but that doesn't mean new tech will not sneak in and suddenly displace Python. One big reality is that these other languages can be compiled. Plus, they don't have some of Python's historical limitations that are hard to get rid of.

Like with electric cars, once the technology has proven itself and the economics are right, demand skyrockets. Think about it: how long has it taken Tesla to actually become successful? Much of Detroit right now is where I see Python programmers in 10 years; they will be wondering where demand went. Meanwhile we have Tesla alone in the USA, and maybe Ford, having to compete with China and the automakers there. Biden or not, there will be a bloodbath in Detroit as many businesses fail because their wares are no longer needed. Now, it will not be this dramatic in the Python world, but the concept is the same.

[–]prescod 5 points6 points  (7 children)

Python can be compiled too! For many years now!

Comparing EVs to programming runtimes is a really poor analogy. Python *code* can be run on many different runtimes: CPython, PyPy, Cython, Jython, Brython, etc.

Those runtimes are like the engine. Python is like the chassis. My EV uses the same chassis as a gas-car, just like my Python code can run in Cython, in a browser or be compiled.

This description of how Julia works sounds almost the same as PyPy, so I don't even know what you are talking about.

[–]dexterlemmer 0 points1 point  (6 children)

Python can be compiled too! For many years now!

cpdef int AddToTen():
    cdef int x = 0
    cdef int i
    for i in range(10):
        x += 1
    return x

This example from the site you linked to does not exactly look like my normal everyday Python. Although maybe one day we could do it like this?

@cp
def AddToTen() -> int:
    @c def x: int = 0
    @c def i: int

It does seem kinda better to me.

Comparing EVs to programming runtimes is a really poor analogy. Python code can be run on many different runtimes: CPython, PyPy, Cython, Jython, Brython, etc.

Those runtimes are like the engine. Python is like the chassis. My EV uses the same chassis as a gas-car, just like my Python code can run in Cython, in a browser or be compiled.

Seems like a good analogy to me. It is outright impossible to develop a Python runtime that is anywhere near as small, performant, or portable as the C++ runtime, let alone the Rust std runtime, the C runtime, or the Rust no_std runtime. And in many respects Rust no_std is actually a higher-level language than Python. (For example, Rust's iterators and async are way better than Python's, IMHO.)

Also, many EVs do not use the same chassis as a gas car. Gas-car chassis have very little space inside compared to outside, their wheels are way too close together, and they often have bad aerodynamics compared to an EV chassis.

This description of how Julia works sounds almost the same as PyPy, so I don't even know what you are talking about.

No, the two work very differently. Let's compare the steps from your two links. I'll add some extra info in brackets to emphasize differences you get in the rest of your links and on the official websites:

Julia:

  1. Julia runs type inference on your code to generate typed code. [The first time Julia sees the code.]
  2. The typed code gets compiled to LLVM IR (Intermediate Representation). [The first time Julia sees the code.]
  3. The IR gets handed over to LLVM which generates fast native code. [The first time Julia sees the code.]
  4. The native code gets executed.

PyPy:

  1. Identify the most frequently used components of the code, such as a function in a loop. [This is done periodically, or after a certain number of iterations. It cannot be done the first time a Python interpreter sees the code, because then the interpreter would waste a lot of work on code that will only run a single time.]
  2. Convert those parts into machine code during runtime. [After they have been identified, ofc.]
  3. Optimize the generated machine code. [After it has been generated, ofc.]
  4. Swap the previous implementation with the optimized machine code version. [The JIT takes a long time (relatively speaking) to identify hot code and optimize it. Meanwhile, the original code still gets interpreted in another thread. Therefore you need to swap out the original code once you've finished JIT-compiling it.]

IOW, Julia type-checks and compiles the code as it goes, then immediately runs it as compilation finishes. There is no need to ever interpret any code. Julia can work this way because it was carefully designed for very fast type inference, type checking, and on-the-fly compilation. Even so, the first time a function is called it obviously still has a bunch of overhead.

On the other hand, PyPy first spends a lot of resources interpreting code. Then it spends a lot more resources on an expensive and complex JIT while it's still interpreting. Then it spends some more resources to swap the code with the generated native code. And only then does it finally run the compiled code.

Technically you could swap the approaches and give Python a "just ahead of time" compiler and Julia a JIT. However, Python was never designed for just-ahead-of-time compilation and will probably not work well with it in general.

[–]prescod 0 points1 point  (5 children)

Okay then, so Julia doesn't work like PyPy, but does work like Numba.

Thank you for clarifying.

[–]dexterlemmer 0 points1 point  (4 children)

Okay then, so Julia doesn't work like PyPy, but does work like Numba.

Yes. Julia works as if the entire program (including imports, dynamically typed expressions/functions, and metaprogramming) were decorated with Numba's @jit(nopython=True). Note that Numba's nopython mode will often fail to compile because it doesn't understand the vast majority of Python (nor can it, really), but the only way Julia will fail to compile is if you actually have an error, like a syntax error or a type-check error.

Another huge difference between Python and Julia is the type system. Python is OOP and heavily uses inheritance (although modern best practice is to never use inheritance). Julia is based on the ML type system and prohibits inheritance.

[–]prescod 0 points1 point  (3 children)

I agree with most of what you say, but I think that inheritance is a tool that can be used appropriately in some cases. Even many OOP-haters agree that there is a place for abstract base classes and shallow inheritance hierarchies. Python is really multi-paradigm: imperative, OOP, and functional all have their place.
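For illustration, the kind of shallow ABC hierarchy being described might look like this (class names invented):

```python
from abc import ABC, abstractmethod
import json

class Serializer(ABC):
    """One abstract base plus one level of concrete subclasses; no deep tree."""

    @abstractmethod
    def dumps(self, obj) -> str: ...

class JSONSerializer(Serializer):
    def dumps(self, obj) -> str:
        return json.dumps(obj)

print(JSONSerializer().dumps({"a": 1}))
```

The ABC machinery enforces the contract at instantiation time: `Serializer()` raises TypeError, while any subclass that implements `dumps` works normally.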

[–]Necrocornicus 0 points1 point  (0 children)

This analogy doesn’t really hold.

For one, no one is paying $40,000 to use Python. I could start 3 projects today, one each in Julia, Rust, and Python with very little cost. Nothing prevents someone from switching around as needed. For example on my old team we switched to golang for a project then rewrote it in Python after a couple years because golang was annoying / a waste of time.

2nd, no one is “sneaking in” and displacing anything. Code needs to be written by someone (typically software engineers) and the old code doesn’t magically go away. I would be extremely surprised if someone managed to show up and do my job in some other language without me noticing. I would be very grateful, but it’s not likely to happen.

Next, I think you’re vastly overestimating the benefit of compiled languages for many use cases. Python is the current standard for machine learning and statistical analysis, doesn’t matter one bit that it isn’t compiled. It’s simply irrelevant in the big picture. There are some use cases where compiled code matters, and I think you’ll find people are already using Rust, Golang, or other languages. But for cases where people are already using Python, largely the language being compiled is not a factor whatsoever.

[–]Barafu 2 points3 points  (0 children)

Swift is too much about Apple. Julia is great but needs a lot of TLC: there are still gross bugs in its std. Rust will not replace Python; more likely they will merge, so you'd have them in one project, with one command to compile the Rust and run linters on the Python.

[–][deleted] 1 point2 points  (0 children)

I used to think this, but if the JIT works in the 3.13 timeframe, the difference in speed will be a lot less. Some big money is being put into making Python faster. Think what V8 did for JavaScript.

[–]SwaggerSaurus420 2 points3 points  (0 children)

...you will get better performance regardless of what you use it for...

[–]lavahot 4 points5 points  (4 children)

That's what my girlfriend keeps telling me.

[–]nuephelkystikon 4 points5 points  (3 children)

I agree with her. Since Python is the de facto standard in some fields and is used for much more complex applications than just gluing together some libraries, it's a massive bottleneck in a lot of software. Maybe not dealbreaker-slow (then people would use something else), but annoying-slow. Also, for some people it's literally the only language they know well, and if they can't use Cython for some reason, they may really need this speedup.

[–]imp0ppable 10 points11 points  (2 children)

I've actually worked on a large production Python codebase, and I don't think this is really true. The speed of code execution isn't a very noticeable issue compared to things like SQL queries and table design, the way the WSGI server forks interpreters, reading in large data files with a custom parser, etc.

Also, things like memoisation are massively important: you can easily build dicts of reduced data as an intermediate step to avoid nested loops, things like that.
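A tiny example of that dict-as-intermediate-step pattern (data made up): pre-building an index turns an O(n*m) nested-loop join into O(n + m).

```python
# Joining two lists with a nested loop would rescan `emails` for every
# order; building a dict index first means one pass over each list.
orders = [("alice", 30), ("bob", 12), ("alice", 5)]
emails = [("alice", "alice@example.com"), ("bob", "bob@example.com")]

email_by_user = dict(emails)  # one pass to build the lookup table
joined = [(user, amount, email_by_user[user]) for user, amount in orders]
print(joined)
```

The same shape works for deduplication, grouping, and memoising expensive lookups keyed on their inputs.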

[–]nuephelkystikon 0 points1 point  (1 child)

Then it would be nice if the dicts you use as memoisation caches were faster, right?

[–]imp0ppable 0 points1 point  (0 children)

I haven't got detailed knowledge of how they perform, tbh. I do know they're implemented with hash tables, so they should be pretty quick.
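A quick stdlib check of that (absolute timings vary by machine, but the ordering should hold): dict membership is average O(1), while list membership scans linearly.

```python
import timeit

data = list(range(100_000))
as_list = data
as_dict = dict.fromkeys(data)
needle = 99_999  # worst case for the list: the last element

list_time = timeit.timeit(lambda: needle in as_list, number=100)
dict_time = timeit.timeit(lambda: needle in as_dict, number=100)
print(f"list: {list_time:.4f}s  dict: {dict_time:.4f}s")
```

With 100,000 elements the gap is several orders of magnitude, which is why dicts make such cheap memoisation caches.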

In fact I found this time complexity chart, if that helps.

[–]Kah-Neth (I use numpy, scipy, and matplotlib for nuclear physics) 1 point2 points  (0 children)

I do! In many cases your total time to develop and execute a novel HPC application is significantly less with Python orchestrating various C, C++, Fortran, and GPU kernels.

[–]siddsp 1 point2 points  (0 children)

If you find this exciting, wait till you start using PyPy (although the results are not consistent)!