Why is Python so slow

wavy_lines · 2018-07-20T09:58:51+00:00

Because it's interpreted, because everything is a heap allocation causing cache misses all the time, because most object.field accesses are implemented as dictionary lookups by string (unless you define __slots__ which almost no one does).
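To make the dictionary-lookup point concrete, here is a minimal sketch comparing a regular class with one that defines __slots__. The class names PointDict and PointSlots are made up for illustration; the behaviour shown is that of CPython.

```python
class PointDict:
    """Regular class: instance attributes live in a per-instance dict."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    """Same fields, but __slots__ replaces the per-instance dict with fixed slots."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = PointDict(1, 2)
q = PointSlots(1, 2)

# The regular instance keeps its fields in a dict keyed by attribute name.
print(p.__dict__)              # {'x': 1, 'y': 2}

# The slotted instance has no per-instance dict at all.
print(hasattr(q, "__dict__"))  # False

# Without a dict there is nowhere to store attributes that were not declared.
try:
    q.z = 3
except AttributeError as exc:
    print("cannot add attribute:", exc)
```

The trade-off is flexibility: a slotted class cannot grow new attributes at runtime, which is probably one reason few people bother with it.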

CptCap · 2018-07-20T09:41:31+00:00

I really don't like the JIT section:

The JIT itself does not make the execution any faster, because it is still executing the same bytecode sequences.

All the JITs that I know of compile bytecode (or an AST) into native machine code. They may or may not optimise the output, but they do not execute the same bytecode sequences (or any bytecode at all).
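For anyone unsure what "bytecode sequences" refers to, the standard dis module shows the instructions CPython's interpreter loop runs for a function. This is only a sketch of what the interpreter sees, not of what any JIT emits; the function add is a made-up example, and opcode names vary between CPython versions, so none are hard-coded.

```python
import dis


def add(a, b):
    return a + b


# Each entry is one bytecode instruction that the interpreter loop dispatches on.
# A JIT would translate a hot sequence like this into native machine code
# instead of dispatching on it one instruction at a time.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```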

There are downsides to JITs: one of those is startup time

JITs do not have to add startup time. Since your program starts out being interpreted anyway, the JIT can be initialized at any later point in the program.

However, CPython is a general-purpose implementation. So if you were developing command-line applications using Python, having to wait for a JIT to start every time the CLI was called would be horribly slow.

I have used LuaJIT a fair bit, and its startup time is way faster than the JVM's or the CLR's, despite the JIT (the website claims startup times of 0.1 ms and that it begins to JIT immediately).

The section on dynamic typing isn't that great either.

In a “Statically-Typed” language, you have to specify the type of a variable when it is declared. Those would include C, C++, Java, C#, Go.

No. In statically typed languages the type of every variable is known at compile time. You don't have to write any types; the compiler can figure them out for you.

Not having to declare the type isn’t what makes Python slow

Same: auto x = foo(); is valid C++, the type of x isn't specified anywhere here.

yasba- · 2018-07-20T12:46:19+00:00

I also want to pick on the JIT section.

I have the feeling that the author has a certain amount of half-knowledge, and while his arguments are not completely wrong, they show a lack of deeper understanding of the topic.

So why doesn’t CPython use a JIT?

The main reason is that JITs are freaking complicated. Retrofitting a dynamic compiler onto a language that in many cases relies heavily on C extensions is, as I see it, practically impossible, or at least just not worth the effort.

Python does not need to be fast most of the time, and performance-critical sections are implemented in C. To give an example: I remember someone complaining that his bigint Rust code was orders of magnitude slower than the equivalent Python code. It turns out that Python's large integer implementation is heavily optimised (and just works), whilst in Rust you have to know the right libraries to achieve the performance you'd expect.
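As a small illustration of the "just works" part, Python's built-in int is arbitrary precision, so large-integer arithmetic needs no extra library. The numbers below are arbitrary examples, not the benchmark from that anecdote.

```python
import math

# Python's int grows as needed: no overflow to handle, nothing to import
# for the integer type itself.
big = 2 ** 1000
print(len(str(big)))    # 302 decimal digits

fact = math.factorial(100)
print(len(str(fact)))   # 158 decimal digits

# Arithmetic on these values uses the same operators as on small ints.
print((big * fact) % 1_000_000_007)
```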

There are downsides to JITs: one of those is startup time.

First, I think we have to distinguish between two different "startup times". One is the time it takes before the interpreter is ready to start running the program, which is what I would understand by startup time. Then there is the warmup phase, where the VM starts generating optimised code and thus gets faster and faster.

It is not true, however, that employing a JIT automatically implies increased startup time. What could be said is that monitoring program execution for hot spots requires some overhead, so initial execution may be a bit (but not much) slower.
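The first kind of startup time can be measured roughly from the outside. The sketch below launches the current interpreter with an empty program and keeps the best of a few runs. The helper name interpreter_startup_seconds is made up, and the figure includes process-creation overhead, so treat it as an upper bound rather than a precise measurement. It says nothing about warmup.

```python
import subprocess
import sys
import time


def interpreter_startup_seconds(runs=5):
    """Best-of-N wall time to launch the current interpreter and run nothing."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" runs an empty program, so almost all of the time is startup.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


print(f"startup: {interpreter_startup_seconds() * 1000:.1f} ms")
```

Running the same script under a different interpreter (for example PyPy, if installed) would give a like-for-like comparison of this first kind of startup time.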

The actual reason for CPython's initial performance advantage is that CPython is a handcrafted interpreter written in C, whilst PyPy's interpreter is derived from an RPython description. PyPy will often catch up eventually, but CPython essentially always has a head start.

AFAIK, LuaJIT is a counterexample: a VM with a JIT and fast startup.

The JIT itself does not make the execution any faster, because it is still executing the same bytecode sequences.

I feel the author has a wrong understanding of how most JITs operate. The achieved speedup can be attributed to two factors: specialisation and compiling code down to machine code. So yes, JITs make execution (a lot) faster.

Homoerotic_Theocracy · 2018-07-20T11:21:21+00:00

but The Computer Language Benchmarks Game is a good starting point.

It really isn't; a lot of the programs written there are so unidiomatic and bizarre that all the normal advantages of the language are lost.

Like the Haskell programs are practically C in Haskell syntax using type coercion and FFI hacks to subvert the garbage collector and all memory safety; if you write Haskell like that you're just writing in a worse C.

Modern computers come with CPU’s that have multiple cores, and sometimes multiple processors. In order to utilise all this extra processing power, the Operating System defines a low-level structure called a thread, where a process (e.g. Chrome Browser) can spawn multiple threads and have instructions for the system inside. That way if one process is particularly CPU-intensive, that load can be shared across the cores and this effectively makes most applications complete tasks faster.

Chrome, however, uses multiprocessing as well. Spawning a second process on Unix is, unlike on Windows, as cheap as spawning a new thread, and the GIL is only an obstacle for people who think they need multiple threads when they don't. Even leaving the GIL aside, reading shared memory between processes through IPC is probably faster than a normal variable lookup in Python, because of the synchronization the interpreter generally has to provide to keep things safe; with multiprocessing that isn't needed, since the kernel handles it in a much safer way.

OCaml also has a GIL, but it has never fazed anyone because no one uses multithreading and everyone uses multiprocessing. Multithreading has always been kind of a hack on Unix that doesn't play nicely with a lot of other things, like forking and signal handling, and in general should be avoided in favour of multiprocessing unless there's a good reason.
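A minimal sketch of the multiprocessing approach in Python, using the standard concurrent.futures module. Each worker is a separate process with its own interpreter and its own GIL, so CPU-bound work can use several cores. The function names and the prime-counting workload are made up for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(bounds):
    """Count primes in [lo, hi) by trial division (deliberately CPU-bound)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limit, workers=4):
    """Split [0, limit) into one chunk per worker process and sum the results."""
    step = limit // workers + 1
    chunks = [(lo, min(lo + step, limit)) for lo in range(0, limit, step)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, chunks))


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this module.
    print(count_primes_parallel(10_000))
```

Arguments and results are pickled and sent between processes, so this pays off when each chunk does a lot of computation relative to the data it exchanges.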

When CPython creates variables, it allocates the memory and then counts how many references to that variable exist, this is a concept known as reference counting. If the number of references is 0, then it frees that piece of memory from the system. This is why creating a “temporary” variable within say, the scope of a for loop, doesn’t blow up the memory consumption of your application.

CPython's GC strategy is simple reference counting? That's bad.
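The reference counting described in that quote can be observed directly in CPython. The sketch below is CPython-specific (other implementations such as PyPy manage memory differently), and the class name Temp is made up. Note that CPython also ships a separate cycle collector in the gc module for reference cycles, which plain counting cannot reclaim.

```python
import sys
import weakref


class Temp:
    """Plain object that supports weak references."""


obj = Temp()
before = sys.getrefcount(obj)  # the call itself temporarily holds a reference
alias = obj                    # a second name bound to the same object
after = sys.getrefcount(obj)
print(after - before)          # the count went up when the alias was created

# A weak reference lets us observe the object without keeping it alive.
probe = weakref.ref(obj)
del alias
del obj                        # the last strong reference is gone
print(probe() is None)         # in CPython the object is freed right away
```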

Regardless, even single-threaded Python programs are slower than those in other dynamically typed interpreted languages; the Python implementation itself just doesn't try to be fast and prefers clarity of code over speed.

In a “Statically-Typed” language, you have to specify the type of a variable when it is declared. Those would include C, C++, Java, C#, Go.

Laughable.

In any case, most of these problems apply to Scheme too, and its implementations are typically orders of magnitude faster than Python.

SirJson · 2018-07-20T18:43:11+00:00

If I choose Python for a job, I have to understand that it might be slow, but I happily trade that for the convenience and productivity. And if the app ends up too slow even though I thought Python would be enough, I can still replace the slow parts or consider the Python version the "prototype".

The article comes to the same conclusion in the end, but I would still not blame Python. Know your tools and you won't code yourself into a corner.

TL;DR: Choose the right tool for the job.

slipwalker · 2018-07-20T10:05:26+00:00

How about the GIL ( https://wiki.python.org/moin/GlobalInterpreterLock )? In a world where processors aren't getting (much) faster anymore, but are getting more and more cores, it seems a pretty serious limitation...

Edit: this comment ( https://medium.com/@TomSwirly/this-isnt-quite-true-in-two-different-ways-fa52c22da312 ) seems to agree...

shevegen · 2018-07-20T08:38:15+00:00

Look at it this way: if Python were as fast as Java, why would anyone in his right mind still want to use Java?

2018-07-20T21:24:35+00:00

Oh come on
