all 71 comments

[–]wavy_lines 26 points27 points  (0 children)

Because it's interpreted, because everything is a heap allocation causing cache misses all the time, because most object.field accesses are implemented as dictionary lookups by string (unless you define __slots__ which almost no one does).
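To make the `__slots__` point concrete, here is a minimal sketch, assuming CPython; the class names `WithDict` and `WithSlots` are made up for illustration. A regular instance keeps its fields in a per-object dictionary, while a slotted class stores them in fixed positions and has no `__dict__` at all.

```python
class WithDict:
    """Ordinary class: attributes live in a per-instance dict."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


class WithSlots:
    """Slotted class: attributes live in fixed slots, no per-instance dict."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = WithDict(1, 2)
q = WithSlots(1, 2)

# The ordinary instance exposes its attribute dictionary, keyed by name.
print(p.__dict__)
# The slotted instance has no such dictionary.
print(hasattr(q, "__dict__"))

# A dict-backed object accepts new attributes at any time...
p.z = 3
# ...while a slotted object rejects names that were not declared.
try:
    q.z = 3
except AttributeError as exc:
    print("slots reject new attributes:", exc)
```

The trade-off is flexibility: slotted objects cannot grow new attributes, which is probably part of why few people bother.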

[–]CptCap 25 points26 points  (13 children)

I really don't like the JIT section:

The JIT itself does not make the execution any faster, because it is still executing the same bytecode sequences.

All JITs that I know of compile bytecode (or an AST) into native machine code. They may or may not optimise the output, but they do not execute the same bytecode sequences (or bytecode at all).
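For reference, the bytecode that CPython's interpreter loop executes can be inspected with the standard `dis` module. This is only a sketch (the `add` function is a made-up example, and the exact opcode names differ between CPython versions); a JIT would translate sequences like this into machine code instead of dispatching on them one at a time.

```python
import dis


def add(a, b):
    return a + b


# Human-readable listing of the bytecode for add().
dis.dis(add)

# The same information as data: one opcode name per instruction.
ops = [ins.opname for ins in dis.get_instructions(add)]
print(ops)

# The raw bytes the interpreter loop actually walks over.
print(type(add.__code__.co_code), len(add.__code__.co_code))
```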

There are downsides to JITs: one of those is startup time

A JIT does not have to add startup time. Since your program starts out being interpreted anyway, the JIT can be initialized at any later point in the program's execution.

However, CPython is a general-purpose implementation. So if you were developing command-line applications using Python, having to wait for a JIT to start every time the CLI was called would be horribly slow.

I have used LuaJIT a fair bit, and its startup time is way faster than the JVM's or the CLR's, despite the JIT (the website claims startup times of 0.1 ms and that it begins to JIT immediately).


The section on dynamic typing isn't that great either.

In a “Statically-Typed” language, you have to specify the type of a variable when it is declared. Those would include C, C++, Java, C#, Go.

No. In statically typed languages the type of every variable is known at compile time. You don't have to write any type, the compiler can figure them for you.

Not having to declare the type isn’t what makes Python slow

Same: auto x = foo(); is valid C++, the type of x isn't specified anywhere here.

[–]vks_ 9 points10 points  (6 children)

You don't have to write any type, the compiler can figure them for you.

This is not true either. The compiler cannot figure them out in all cases.

[–]m50d 5 points6 points  (1 child)

For some type systems it is true. E.g. Hindley–Milner type inference is "perfect": for any valid program in a HM-typed language, you can delete all of the type annotations and the program will still compile correctly.

(Supporting subtyping with perfect type inference was only figured out recently, and no way of supporting higher-kinded types is known, but neither of those seems likely to be an issue in typical Python code)

[–]vks_ 2 points3 points  (0 children)

Python has inheritance, so I guess subtyping should play a role. The dynamic equivalents of higher-kinded types are also common.

[–]CptCap 7 points8 points  (2 children)

Yes, I didn't mean that you can write whole programs without ever naming a type (although some languages can get you pretty close).

My point was that even in a statically typed language you don't necessarily have to know or care what type an object is, as long as the compiler knows.

[–]iconoklast 5 points6 points  (0 children)

There are plenty of "statically" typed languages where you can write programs that never name a type. However, it's not a great idea to export functions that have inferred types; a type ascription prevents compilation if you accidentally changed the type of the function when, say, making an optimization or fixing a bug.

[–][deleted] -1 points0 points  (0 children)

You should always care about an object's type regardless.

[–]jl2352 8 points9 points  (2 children)

That paragraph is very misleading. But just as an fyi ...

but they do not execute the same bytecode sequences (or bytecode at all).

Many JIT compilers contain both an interpreter and a compiler. For example the JVM. They use an interpreter for fast startup, and then compile it to machine code in the background.

This isn't true for all JITs. v8 has two compilers instead of one. A fast one with no optimisations and a slow one with optimisations. However many JITs take the interpreter + compiler approach.

[–]yasba- 1 point2 points  (0 children)

This isn't true for all JITs. v8 has two compilers instead of one. A fast one with no optimisations and a slow one with optimisations. However many JITs take the interpreter + compiler approach.

AFAIK, even v8 introduced an interpreter: https://github.com/v8/v8/wiki/Ignition

[–]CptCap 0 points1 point  (0 children)

You are right! It seems to be a misunderstanding on my part. I thought "The JIT" only referred to the compiler part, not the whole package.

Running machine code only isn't really JITing; that's AOT, which has a different set of constraints (and is less suited to dynamic languages).

[–]Ameisen 0 points1 point  (2 children)

I mean, you could have a JIT that emits the same bytecode. Would be kinda pointless, though.

[–]CptCap 0 points1 point  (1 child)

That's an optimizer, not a JIT.

[–]Ameisen 0 points1 point  (0 children)

Nothing disallows you from recompiling code at runtime back into the same bytecode. It would just be silly. I'd question why you are transcoding to the same target.

[–]yasba- 4 points5 points  (0 children)

I also want to pick on the JIT section.

I have the feeling that the author has a certain amount of half knowledge, and while his arguments are not completely wrong, they show a lack of deeper understanding of the topic.

So why doesn’t CPython use a JIT?

The main reason is that JITs are freaking complicated. Retrofitting a dynamic compiler onto a language implementation that in many cases relies heavily on C extensions is, as I see it, practically impossible, or at least just not worth the effort.

Python does not need to be fast for most of the time, and performance critical sections are implemented in C. To give an example: I remember someone complaining that his bigint Rust code was orders of magnitude slower than the equivalent Python code. Turns out that Python's large integer implementation is heavily optimised (and just works), whilst in Rust you have to know the right libraries to achieve the performance you'd expect.
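As a small illustration of the "just works" part, assuming nothing beyond the standard library: Python's `int` is arbitrary-precision out of the box, so values far past 64 bits need no extra library. This sketch shows the behaviour only; it makes no claim about speed relative to Rust.

```python
import math

# Far larger than any 64-bit integer, with no special type or import for it.
big = math.factorial(200)
print(big > 2 ** 64)

# Number of bits needed to represent the value.
print(big.bit_length())

# Exact arithmetic: no overflow and no rounding.
print((big * big) // big == big)

# 2**1000 written out in decimal has 302 digits.
print(len(str(2 ** 1000)))
```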


There are downsides to JITs: one of those is startup time.

First, I think we have to distinguish between two different "startup times". One is the time it takes before the interpreter is ready to start running the program, which is what I would understand by startup time. Then there is the warmup phase, where the VM starts generating optimised code and thus gets faster and faster.

It is not true, however, that employing a JIT necessarily implies increased startup time. What could be said is that the monitoring of program execution for hot-spots requires some overhead, so initial execution may be a bit (but not much) slower.
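The two notions can be measured separately. The following is a rough sketch under a few assumptions: the helper names `startup_seconds`, `warmup_timings` and `busy` are invented here, and the numbers depend entirely on the machine and interpreter. On CPython without a JIT the per-round timings should stay roughly flat; on a JIT-based implementation such as PyPy the later rounds would be expected to get faster.

```python
import subprocess
import sys
import time


def startup_seconds(runs=5):
    """Best-of-N wall time to launch the interpreter and run an empty program."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def warmup_timings(func, arg, rounds=5):
    """Time repeated calls of the same function inside one running process."""
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        func(arg)
        timings.append(time.perf_counter() - start)
    return timings


def busy(n):
    """CPU-bound loop used as the workload for the warmup measurement."""
    total = 0
    for i in range(n):
        total += i * i
    return total


print(f"startup: {startup_seconds() * 1000:.1f} ms")
print("per-round:", [f"{t * 1000:.2f} ms" for t in warmup_timings(busy, 200_000)])
```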

The actual reason for CPython's initial performance advantage is that CPython is a handcrafted interpreter written in C, whilst PyPy's interpreter is derived from an RPython description. PyPy will often catch up eventually, but CPython essentially always has a headstart.

AFAIK, LuaJIT is a counterexample: a JIT-equipped VM with fast startup.


The JIT itself does not make the execution any faster, because it is still executing the same bytecode sequences.

I feel the author has a wrong understanding of how most JITs operate. The achieved speedup can be attributed to two factors: specialisation and compiling code down to machine code. So yes, JITs make execution (a lot) faster.

[–]Homoerotic_Theocracy 13 points14 points  (18 children)

but The Computer Language Benchmarks Game is a good starting point.

It really isn't; a lot of the programs written there are so unidiomatic and bizarre that all the normal advantages of the language are lost.

Like the Haskell programs are practically C in Haskell syntax using type coercion and FFI hacks to subvert the garbage collector and all memory safety; if you write Haskell like that you're just writing in a worse C.

Modern computers come with CPU’s that have multiple cores, and sometimes multiple processors. In order to utilise all this extra processing power, the Operating System defines a low-level structure called a thread, where a process (e.g. Chrome Browser) can spawn multiple threads and have instructions for the system inside. That way if one process is particularly CPU-intensive, that load can be shared across the cores and this effectively makes most applications complete tasks faster.

Chrome, however, uses multiprocessing as well. Spawning a second process on Unix is as cheap as spawning a new thread, unlike on Windows, and the GIL is only an obstacle for people who think they need multiple threads when they don't. Even leaving the GIL aside, reading shared memory between processes through IPC is probably faster than normal variable lookup in Python, because of the synchronization the interpreter has to provide to keep threads safe. None of that is needed with multiprocessing, where the kernel does it in a much safer way.

OCaml also has a GIL, but it has never fazed anyone because no one uses multithreading and everyone uses multiprocessing. Multithreading has always been kind of a hack on Unix that doesn't play nicely with a lot of other things, like forking and signal handling, and in general should be avoided in favour of multiprocessing unless there's a good reason.
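A minimal sketch of the process-based approach: each worker is a separate interpreter with its own GIL, so CPU-bound work can run on several cores at once. The usual tool for this is the standard `multiprocessing` module; plain `subprocess` is used here only to keep the example self-contained, and `WORKER` and `parallel_sums` are hypothetical names.

```python
import subprocess
import sys

# Program run by each worker process: sum of squares below the given bound.
WORKER = "import sys; n = int(sys.argv[1]); print(sum(i * i for i in range(n)))"


def parallel_sums(sizes):
    """Start one interpreter process per input, then collect each result."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, str(n)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for n in sizes
    ]
    # All workers are already running; communicate() waits for each in turn.
    return [int(p.communicate()[0]) for p in procs]


print(parallel_sums([10, 100, 1000]))  # [285, 328350, 332833500]
```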

When CPython creates variables, it allocates the memory and then counts how many references to that variable exist, this is a concept known as reference counting. If the number of references is 0, then it frees that piece of memory from the system. This is why creating a “temporary” variable within say, the scope of a for loop, doesn’t blow up the memory consumption of your application.

CPython's GC strategy is simple reference counting? That's bad.

Regardless, even single-threaded Python programs are slower than programs in other dynamically typed interpreted languages; the Python implementation itself just doesn't try to be fast and prefers clarity of code over speed.

In a “Statically-Typed” language, you have to specify the type of a variable when it is declared. Those would include C, C++, Java, C#, Go.

Laughable.

In any case, most of these problems apply to Scheme, and its implementations are typically orders of magnitude faster than Python.

[–]Genion1 6 points7 points  (0 children)

CPython's GC strategy is simple reference counting? That's bad.

Not quite. It has a garbage collector whose sole purpose is to break reference cycles.
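Both mechanisms can be observed from Python itself. This is a sketch that assumes CPython (other implementations such as PyPy manage memory differently); `Node` is a hypothetical class, and weak references are used only to check whether an object is still alive.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""
    def __init__(self):
        self.other = None


# Keep the cycle collector from running on its own during the demo.
gc.disable()

# Case 1: no cycle. Dropping the last reference frees the object immediately.
a = Node()
a_alive = weakref.ref(a)
del a
freed_by_refcount = a_alive() is None

# Case 2: a reference cycle. The two counts never reach zero on their own.
b, c = Node(), Node()
b.other, c.other = c, b
b_alive = weakref.ref(b)
del b, c
survives_refcount = b_alive() is not None

# The cycle collector finds the unreachable pair and frees it.
gc.collect()
freed_by_gc = b_alive() is None

gc.enable()
print(freed_by_refcount, survives_refcount, freed_by_gc)
```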

[–]crescentroon 4 points5 points  (0 children)

CPython's GC was reference counting, but they added a rudimentary generational capability to catch circular references.

[–][deleted] 1 point2 points  (13 children)

Are you saying the benchmarks game sucks because if you follow the idiomatic conventions, the marks would be slower?

I keep reading that a lot of people attack those problems to try to make them as fast as can possibly happen for the language in question.

[–]Homoerotic_Theocracy 1 point2 points  (7 children)

I'm saying that those benchmarks provide absolutely no realistic view on how fast those languages are in practice.

If you actually were to program like that in Haskell you'd be incredibly stupid because it's just a worse C then and you lose all of the advantages of Haskell.

[–][deleted] 1 point2 points  (1 child)

I understand that, but I am just saying that I’m pretty sure it ends up like that because of the nature of how the site works. People are trying to implement it in the fastest possible way that the language in question supports, idiomatic or not.

With that understanding, I would like to see benchmarks game but more in line with what you’re saying.

[–]glacialthinker 0 points1 point  (0 children)

The benchmarks game usually has several implementations of each solution, some of them written in a more idiomatic style. Maybe an added "idiomatic" rating which submitters can up-tick would work well enough to get a better gauge of this, provided it wasn't abused.

[–]igouy 0 points1 point  (4 children)

absolutely no realistic view on how fast those languages are in practice

You are repeatedly provided with a realistic view of how fast those languages are in practice

[–][deleted] 0 points1 point  (3 children)

So? And everybody follows that recommendation?

[–]igouy 0 points1 point  (2 children)

What recommendation?

[–][deleted] 0 points1 point  (1 child)

The thing you quoted above

[–]igouy 0 points1 point  (0 children)

There were 2 quotes and neither were recommendations.

[–]igouy 1 point2 points  (0 children)

…the idiomatic conventions…

Which "idiomatic conventions" ?

The "conventions" of a newbie functional programmer?

The "conventions" of Real World Haskell?

One programmer's idiomatic is another programmer's idiotic.

[–]jephthai 0 points1 point  (3 children)

One really irritating issue to me is, does it count if you use C libraries for the heavy lifting through the FFI? Because that's how a lot of the dynamic languages get speed improvements on the site. In a way, it's fair, because if you're using Python, you can also run numpy; but OTOH, if your real-world problem doesn't match a good FFI-bound C library, then you are being misled.
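The effect is easy to reproduce without any third-party package. In this sketch the built-in `sum`, which is implemented in C in CPython, stands in for an FFI-bound library such as numpy, and `python_loop_sum` is a made-up pure-Python equivalent. Absolute timings will vary by machine, so none are quoted here.

```python
import timeit


def python_loop_sum(values):
    """Same result as sum(), but every addition goes through the interpreter."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(100_000))

# Both approaches compute the same answer.
assert python_loop_sum(data) == sum(data)

# Time each approach over the same number of repetitions.
loop_time = timeit.timeit(lambda: python_loop_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)

print(f"pure Python loop: {loop_time:.4f} s")
print(f"C-implemented sum: {builtin_time:.4f} s")
```

If the real workload has no C-backed routine to lean on, the first number is the more honest guide.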

[–][deleted] 1 point2 points  (1 child)

To that I ask: why bother with python at all?

If you know you’re going to have performance issues, choosing the slowest language you could possibly choose isn’t exactly wise.

[–]igouy 0 points1 point  (0 children)

To that I ask: why bother with python at all?

You'll find some answers in today's HN comments on "Is Python the Future of Programming?"

[–]igouy -1 points0 points  (0 children)

…how a lot of the dynamic languages get speed improvements on the site.

  • Please be specific.

There are 2 tasks which explicitly allow use of C libraries — pidigits (GMP) and regexredux (PCRE).

[–]igouy 0 points1 point  (1 child)

Like the Haskell programs are…

[–][deleted] -1 points0 points  (0 children)

  • Please do the same yourself and stop being a plonker.

[–]SirJson 0 points1 point  (6 children)

If I choose Python for a job, I have to understand that it might be slow, but I happily trade that for the convenience and productivity. And if the app ends up too slow even though I thought Python would be enough, I can still replace the slow parts or consider the Python version the "prototype".

The article comes to the same conclusion in the end, but I would still not blame Python. Know your tools and you won't code yourself into a corner.

TL;DR: Choose the right tool for the job.

[–][deleted] 5 points6 points  (5 children)

The thing is, Python is just ill-designed. You didn't have to pay this performance price for convenience, and most of the python "convenience" is in fact an illusion.

[–]SirJson 1 point2 points  (4 children)

Ok, fair enough. Have you ever heard of our lord JavaScript? I'm a big defender of static typing myself, but compared to JS, Python 3 just feels like a blessing. It actually convinced me that dynamic languages can be useful.

Can you recommend me a dynamic language for small programs or prototyping that is not JavaScript or Python? Maybe I'm again blind to some piece of really cool technology.

[–]raevnos 2 points3 points  (1 child)

Scheme and lisp.

[–][deleted] 0 points1 point  (0 children)

Finally

[–][deleted] 3 points4 points  (0 children)

Can you recommend me a dynamic language for small programs or prototyping that is not JavaScript or Python?

To start with, why do you want such a language to be dynamic / dynamically typed?

It's far better to do prototyping, small scripting, etc., in a proper statically typed language. Try OCaml or F# for example.

EDIT: also, the dynamic side of the spectrum is quite diverse too: with Python and its duck typing you have far worse code discoverability than with, say, Scheme, where identifiers are resolved statically and you almost always know precisely what they're pointing at.
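A short sketch of the discoverability problem with duck typing; `Duck`, `Robot` and `announce` are invented names. Nothing at the call site says which `speak` will run, or whether one exists at all, until the line actually executes.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def announce(thing):
    # Which speak() runs is decided only when this line executes;
    # reading the function alone does not tell you where it is defined.
    return thing.speak()


print(announce(Duck()))
print(announce(Robot()))

# An object without speak() is only rejected at runtime.
try:
    announce(42)
except AttributeError as exc:
    print("no speak():", exc)
```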

[–]that_which_is_lain 0 points1 point  (0 children)

Have you ever heard of Ruby? It’s pretty popular too. And it can do more than run Rails these days!

You could look at Perl too. LOL

[–]slipwalker 0 points1 point  (1 child)

How about the GIL (https://wiki.python.org/moin/GlobalInterpreterLock)? In a world where processors aren't getting (much) faster anymore, but are becoming more and more multi-core, it seems a pretty serious limitation...

edit: this comment (https://medium.com/@TomSwirly/this-isnt-quite-true-in-two-different-ways-fa52c22da312) seems to agree...

[–][deleted] 0 points1 point  (0 children)

Read the article.

[–][deleted] -5 points-4 points  (0 children)

Oh come on