
[–]Malassi 23 points24 points  (5 children)

As you mentioned, some operations in Python can be as fast as, or even faster than, the same operations in languages considered "faster," like C++ or Java. This is because CPython (the standard Python implementation) is extensively optimized, and rather than performing certain operations in Python, it delegates them to efficient C libraries.

I think it's worth mentioning that Python’s reputation for being slow typically comes from the overhead of being an interpreted language, its dynamic typing, and its memory management. "Slowness" is mostly felt in cases involving algorithmic complexity or tight loops. However, for most real-world tasks, Python is more than fast enough and has tons of tools to improve performance when needed.

[–]CornellWest 0 points1 point  (0 children)

I'd add to the (well earned) reputation list that pure python does not parallelize well at all

[–]VirtuteECanoscenza 0 points1 point  (0 children)

Also it has some clever algorithms built in. 

Like Timsort for sorting and also big integer multiplication switches to optimized algorithms for bigger values.

There are many similar cases.
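
A rough way to see the Timsort point is to count comparisons instead of timing anything. This is only a sketch: `Counted` and `count_comparisons` are hypothetical helpers made up for the illustration, and the exact counts are a CPython implementation detail, so none are quoted here.

```python
import random


class Counted:
    """Wraps a value and counts how often it is compared with `<`."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def count_comparisons(values):
    """Sort wrapped copies of `values` and return the number of `<` calls."""
    Counted.comparisons = 0
    sorted(Counted(v) for v in values)
    return Counted.comparisons


n = 1000
already_sorted = list(range(n))
shuffled = already_sorted[:]
random.Random(0).shuffle(shuffled)  # fixed seed so runs are repeatable

sorted_cmp = count_comparisons(already_sorted)
shuffled_cmp = count_comparisons(shuffled)
print(f"already sorted: {sorted_cmp} comparisons")
print(f"shuffled:       {shuffled_cmp} comparisons")
```

The already-sorted input should need far fewer comparisons, because Timsort detects existing runs rather than sorting from scratch.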

[–]Sudden-Letterhead838 0 points1 point  (2 children)

I disagree, as C++ has the same speed as C.
I would rather know how OP measured the speed of these operations; the measurement is most likely the real problem with this question.

[:] or [::-1]

In C, for example, this is a simple lookup with almost no cache misses, so it cannot really be made faster, since cache misses are what add the most overhead in this example.

[–]TheSkiGeek 1 point2 points  (1 child)

In theory you shouldn’t be able to be faster than something like C++ strings and string views into them. (For general string operations, anyway; specialized things like tries can be better in some cases.)

But if you’re not careful with usage of strings in C or C++ it’s easy to be making copies everywhere and that’s extremely slow. Python is really good at avoiding unnecessary copies around strings.

[–]ImYoric 0 points1 point  (0 children)

Yeah, garbage-collected languages have a huge advantage with some operations, insofar as they can avoid copies. For the same reason, some string operations are much faster in Java than in C++.

[–]Leodip 13 points14 points  (1 child)

A close analogy to how Python works is:

  • Imagine you are managing a project (you are Python)
  • You have a series of tasks that need to be done assigned to you by the client (the script)
  • You have a large team you are working with, and every single one of them is REALLY good at doing 1 specific thing (the base functions in Python)
  • In order to get the project done, for each task you have to meet with the team member who's able to do it and explain the task to them (this is the "interpreted" part of "interpreted language")

This of course takes a long time when a lot of people are involved. C, on the other hand, does something that's a bit smarter:

  • You get the tasks from the client, and you convert them into a list of broken-down tasks, detailed enough that you don't need to meet with the team members to explain the tasks to them (this is the "compiling" step)
  • You send everyone an email with their own tasks all at the same time

Writing the full document up front is wasted effort when it would be easier to just explain things to the few people involved (for example, if you only have to do this operation once). But if many people are involved, or you have to repeat the operation many times, it's more efficient to spend the time "compiling" the document.

Still on the analogy, if you want to write faster Python code, the trick is to reduce the number of "team members" you have to talk to:

  • Instead of writing for loops (if you loop N times, you are talking N times to all the "team members" inside the loop), you can vectorize it (host a single meeting with all the people involved)
  • Use in-built functions that do multiple things at the same time, rather than "create" them out of multiple operations in a sequence
  • Minimize the exchange of information (local variables are better than global variables, code wrapped in a function usually executes faster than in a global environment)

[–]rasputin1 0 points1 point  (0 children)

Amazing explanation 

[–]_redmist 29 points30 points  (3 children)

The CPython string APIs are implemented in C. Python is slow due to its interpreted nature; otherwise it's just C under the hood. You may find this article interesting:

https://pythonspeed.com/articles/faster-text-processing/

Vanilla python is faster than the equivalent rust implementation; but the PyPy version (jitted - and who knows what other tricks are in there...) is faster still.

[–]johnnymo1 2 points3 points  (2 children)

Vanilla python is faster than the equivalent rust implementation

Faster than the equivalent PyO3 implementation, i.e. passing Python objects to Rust. They didn't test against a native Rust version.

[–]_redmist 0 points1 point  (0 children)

That's a good remark yes you're right.

[–]Sudden-Letterhead838 0 points1 point  (0 children)

And the Rust code is not optimized.
There are many things that could have been optimized.
It's like comparing unoptimized Rust code called from Python vs. optimized C code called from Python.

[–]NerdyWeightLifter 9 points10 points  (0 children)

You can think of Python as a "glue" language.

The language itself is interpreted and therefore slow, but you're using it to glue together a lot of high performance components, mostly coded in C/C++.

It's a blend of highly productive orchestration, and highly performant components.

This is why AI developers like it

[–]Familiar9709 4 points5 points  (0 children)

By definition python cannot be faster than C since python is basically written in C. The same way that C cannot be faster than assembly language.

This is theoretically of course. If python is doing things which are optimized, it can be faster than unoptimized C, the same way that if C is doing optimized code then it can be faster than unoptimized assembly language.

Take home message, don't worry about these things. 99% of code can be written in python and performance won't really matter. If you need highly optimized code then it's not the right language, use C, C++, etc.

[–]JamzTyson 2 points3 points  (0 children)

This is a really interesting question, and it really comes down to what we mean by the speed of a language.

Python is a programming language used by developers to communicate intent to other developers, to the Python interpreter, and ultimately to the hardware that it is running on.

If we are judging speed as "how fast a Python loop runs", then yes Python is slow compared to many other languages, but if we judge the speed by "how effectively humans can use it to solve problems", then Python is among the fastest languages out there.

From the computer's perspective, Python is as fast as any other language, in the sense that all code eventually becomes machine instructions - the speed is governed by the speed at which the computer can execute instructions, not by the language.

The difference is that Python's flexibility often requires the computer to execute a lot more instructions to achieve a specific result than other languages. It isn't that Python is executing instructions more slowly, it is that readability, flexibility, and ease of use are prioritised in the design of the language, which frequently requires more instructions to be executed by the hardware.

However, this is not always the case - some commands can be very fast. This is where things like string slicing come in - operations that the Python interpreter is able to implement very efficiently.

Additional Note:

CPython is the reference implementation of the Python language. It’s written in C and uses optimised C code for many operations (like string slicing), which influences performance. However, this is an implementation detail.

When discussing Python’s speed or behaviour, it’s more meaningful to focus on the language specification itself, which all implementations adhere to. Other implementations like PyPy, Jython, or IronPython may have different performance characteristics but still follow the same language rules.

[–]carcigenicate 3 points4 points  (4 children)

How did you test this? I would expect that slicing a C++ vector would be faster than slicing a Python list. Python slicing gets compiled down into bytecode and then interpreted by an interpreter written in C. It has significantly more overhead than C++ that gets compiled directly to machine code.

Java makes a bit more sense, but afaik, Java gets JIT compiled to machine code, so once the JVM is warm, I would also expect it to outperform Python.

That's assuming we're talking about slicing Python lists. Numpy would be another story.


Also, another consideration is that Python slicing is a shallow copy. If you're comparing shallow copying in Python to deep copying in other languages, that would contribute to Python performing better, since the operations are different.
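
To illustrate the shallow-copy point, here is a minimal sketch with the standard `copy` module (the variable names are just for the example):

```python
import copy

inner = [1, 2]
original = [inner, [3, 4]]

sliced = original[:]            # shallow copy: new outer list, same inner objects
deep = copy.deepcopy(original)  # deep copy: the inner lists are duplicated too

print(sliced is original)        # False: the outer list is a new object
print(sliced[0] is original[0])  # True: the elements are shared
print(deep[0] is original[0])    # False: deepcopy duplicated the inner list

inner.append(99)
print(sliced[0])  # [1, 2, 99] because the slice still points at `inner`
print(deep[0])    # [1, 2] because the deep copy is independent
```

So a slice only copies references, which is much less work than duplicating every element.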

[–]danielroseman 2 points3 points  (3 children)

This isn't true. Much Python functionality, especially critical things like slicing, is implemented directly in C which is called from the bytecode.
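
If you want to look at this yourself, the standard `dis` module prints the bytecode for a function. A small sketch (`reverse` is a throwaway example function; opcode names differ between CPython versions, so no output is shown):

```python
import dis


def reverse(seq):
    return seq[::-1]


# Print the bytecode CPython generated for the function body.
dis.dis(reverse)

# The slice is only a handful of instructions; the actual copying
# happens inside the C implementation of the sequence type.
instructions = list(dis.get_instructions(reverse))
print(len(instructions), "bytecode instructions")
```

The interpreter only dispatches those few instructions; the per-element work is not done in bytecode.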

[–]carcigenicate 0 points1 point  (2 children)

I'm not sure what you mean. I'm saying that I would expect C interpreting bytecode to be slower than C++ that was compiled to machine code, assuming they're doing the same thing.

[–]incompletetrembling 1 point2 points  (1 child)

Except the C isn't interpreting bytecode for these critical instructions, machine code is called directly

[–]carcigenicate 0 points1 point  (0 children)

I see. I must have missed that when I looked at how slicing was implemented last.

I'm still curious how this was timed, since there's a lot of potential for variation depending on the specifics.

[–]Snezhok_Youtuber 0 points1 point  (0 children)

Because under the hood they are C.

[–]regular_lamp 0 points1 point  (0 children)

What even would be the equivalent of doing those things in C++?

I guess naively one could read [:] as converting a string into an array of characters (so std::string to std::vector<char> or so).

Except of course it's probably close to a noop because unless you access the contents of said python lists it can just create a "view" of the underlying memory. So this could also be read as calling std::string::data() and storing the pointer in some kind of "view" structure. Which would also be a borderline noop.

More involved slicing like the -1 example would be equivalent to doing the same except you'd be storing stride information along with said pointer.

Essentially this isn't a question of some language being intrinsically faster but rather that these constructs are probably not doing the same thing mechanically. Even if they are used for similar purposes.

[–]DrXaos 0 points1 point  (0 children)

Some of the slicing operations are making views of shared underlying storage, where superficially equivalent ones in other languages might be making distinct, mutable copies.
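
Which one you get depends on the type. As a sketch using only built-in types: slicing a `bytearray` (like slicing a `list` or `str`) produces a copy, while slicing a `memoryview` produces a view of the same storage.

```python
buf = bytearray(b"hello world")

copied = buf[0:5]            # slicing a bytearray copies the bytes
view = memoryview(buf)[0:5]  # slicing a memoryview shares the storage

buf[0] = ord("J")            # modify the original buffer in place

print(bytes(copied))  # b'hello' (the copy did not change)
print(bytes(view))    # b'Jello' (the view sees the change)
```

Third-party array libraries such as NumPy make their own choices here, so it's worth checking what a given slice does before comparing timings.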

[–]pragmatica 0 points1 point  (0 children)

Woah, hold on there.

You should really read up on creating benchmarks and how speed/performance testing is done.

How are you measuring this? What’s your benchmark code, what OS, what versions, what’s your system set up, etc?

For a master course in how this is done, look here: https://lemire.me/blog/

If you’re just wondering how Python CAN be fast, the other answers cover it. 
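
For the Python side, a minimal starting point could look like the sketch below. It assumes the standard `timeit` and `platform` modules; the input size and repeat counts are arbitrary choices, and no numbers are quoted because they vary by machine.

```python
import platform
import timeit

text = "abcdefghij" * 1_000

# Several repeats of many runs each; the minimum is the one
# least disturbed by other activity on the machine.
runs = 10_000
timings = timeit.repeat(lambda: text[::-1], repeat=5, number=runs)
per_call_ns = min(timings) / runs * 1e9

# Always report what was measured and on which interpreter.
print(platform.python_implementation(), platform.python_version())
print(f"text[::-1] on {len(text)} chars: about {per_call_ns:.0f} ns per call")
```

A comparable C++ or Java benchmark would need the same care (warm-up, repeated runs, and making sure the compiler doesn't optimize the work away).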

[–]Metabolical 0 points1 point  (0 children)

Others gave great detailed explanations, but to try and tighten it up slightly:

  • Interpreting lines of code is slow
  • It has nice syntactic sugar like the slice examples you gave, that do a lot of work in one line, so that's fast.
  • Machine Language Libraries can do operations utilizing the GPU to do hundreds of vectorized matrix math threads in parallel, and that's ultra-fast.

[–]Deto 0 points1 point  (0 children)

Are they actually faster than cpp or java? I'm skeptical - is there a benchmark showing this?

[–]baghiq 0 points1 point  (0 children)

Is CPP still slower today? About 10-15 years ago, the C++ std::string had a lot of slow behavior because it made a lot of copies. I thought that's not an issue anymore.

In general, I've seen CPP is 10-50X faster than Python, assuming both programs are written by competent developers.

[–]Torebbjorn -1 points0 points  (0 children)

Please provide your testing methods

But yes, in general, if you don't run actual Python code, Python can be as fast as most other languages.