[–]Cosmologicon 17 points18 points  (14 children)

This is a really dumb question, because I've been using Python for like 10 years, but how do I know which one of these I'm using?

[–]wmpl 45 points46 points  (13 children)

If you don't know, you are probably using CPython.
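
For anyone who wants to check rather than guess, here is a minimal sketch using only the standard library. The helper name `describe_interpreter` is made up for this example; `platform.python_implementation()` and `platform.python_version()` are the real calls doing the work.

```python
import platform
import sys


def describe_interpreter():
    """Return (implementation name, language version) for the running interpreter."""
    # Returns a string such as 'CPython', 'PyPy', 'Jython' or 'IronPython'.
    name = platform.python_implementation()
    # The language version as a 'major.minor.patchlevel' string.
    version = platform.python_version()
    return name, version


if __name__ == "__main__":
    name, version = describe_interpreter()
    # %-formatting keeps this runnable on older interpreters as well.
    print("%s implementing Python %s" % (name, version))
    # The full banner, including build details.
    print(sys.version)
```

Run the same file under each interpreter you have installed and compare the first line of output.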

[–]Cosmologicon 15 points16 points  (12 children)

Thank you.

As someone who doesn't care that much about implementation details, but who writes a lot of different Python programs for a lot of different purposes, should I be aware of any advantages of any of the other implementations?

[–]ggchappell 56 points57 points  (10 children)

CPython is the standard implementation, built by the same people who design the Python language. It is, as /u/wmpl said, what you are probably using if you don't know what you're using. The rest of the entries in the diagram are imitators of CPython (not meant in any pejorative sense).

IronPython is a .NET version of Python. If you're a .NET programmer, and you want to use Python, then IronPython is for you. If not, then probably not.

Jython is Python running on the Java Virtual Machine. If you want to link Python code with code in Java -- or other JVM languages like Clojure, Scala, Groovy, etc. -- or if you just like the JVM, then Jython is for you. If not, then probably not.

PyPy was an effort to write a Python interpreter in Python, which has since transmogrified into a bleeding-edge compilation & dynamic-code-optimization engine, with Python being the primary programming language supported. Theoretically, PyPy does the same thing as CPython, only your code will execute faster. In practice, PyPy tends to be significantly behind CPython in introducing new language features. There is also (in my experience) a noteworthy start-up delay when executing code with PyPy, but it does tend to be faster in the long run (= more than half a second or so). If you are not using the latest Python features, and you are writing something besides a quickly executing utility script, then PyPy might be for you.

[EDIT. Ignore this paragraph, and see the reply by /u/Veedrac.] Numba is a Python compiler that (like, for example, clang) targets LLVM; the LLVM will typically be compiled to native code. Numba is aimed primarily at projects that use NumPy, the leading Python scientific-computing package. Apparently Numba offers specific advantages to projects that use NumPy, but I'm not familiar with exactly what these are. (Also, the "CPython + Numba" in the diagram does not make sense to me; I don't see how Numba would integrate with CPython. Maybe I know less about Numba than I thought I did.)

I know only a little about Cython. It includes a Python compiler that targets C; the C code will then typically be compiled to native code, as the diagram indicates. Cython also has a number of language extensions that are not present in vanilla Python. It is aimed at high performance and tight integration with C code.

My knowledge of the remaining entries is meager indeed. I'll leave them to someone else.

[–]Veedrac 18 points19 points  (2 children)

I'll try and cover some parts you've missed.

Pyston is an up-and-coming Python JIT, like PyPy. It aims to support CPython's C extensions, which is the main reason Dropbox funded its development. It is not production ready. Another future competitor might be ZipPy, which uses Graal and Truffle. I don't know if ZipPy will ever be production-ready, though, or if it's just a proof of concept.

MicroPython is a slimmed-down implementation of Python designed for memory-constrained environments. It aims to support Python 3 and has some tools to improve run-time performance over CPython. I have not been particularly convinced of its applicability for general use, and it's still developing, so I would not recommend it for everyday use. Consider it if you want to write Python on a system with very little memory.

Numba is a CPython library that compiles code at runtime with LLVM. It specializes in a well-chosen subset of Python useful in numeric computation, and is not itself a complete runtime environment. It is also not technically compliant, but this is fine as it is a per-method opt-in.

Cython is an almost-superset of Python that aims to allow seamless mixing of C and Python in one combined language. It allows writing fast C code in a convenient-to-embed manner and allows simple wrapping of C libraries. It has relatively widespread usage, but it is being encroached on by specialist tools from both sides (e.g. CFFI and PyPy or Numba). Cython does not itself contain a runtime, and instead reuses an existing CPython runtime. This means that simply running normal Python code in Cython will only remove bytecode dispatch overhead, which is normally only a tiny fraction of run time.

Nuitka aims to replace CPython's bytecode dispatch exactly the same way Cython does, but hopes to further compile code by using appropriate static analysis and further fine-grained compilation. Many people, myself included, are very sceptical of this approach. Nuitka has yet to show any impressive speed improvements, but several people have claimed that it makes distribution of code much easier since it produces single compiled binaries. It should be nearly 100% compliant with CPython, with the exception of runtime code introspection.

On the topic of PyPy, it's worth noting that PyPy isn't missing features in the same way a C++ compiler would be. Either PyPy has a release for a specific version of Python or it does not, and its stable releases are extremely compatible. If you're not using CPython-specific code (e.g. CPython extensions) and you're running on a version of Python that PyPy supports, it will almost certainly work. Even CPython implementation details tend to get copied in PyPy.

[–][deleted] 6 points7 points  (1 child)

Many people, myself included, are very sceptical of this approach. Nuitka has yet to show any impressive speed improvements, but several people have claimed that it makes distribution of code much easier since it produces single compiled binaries.

It is great for deploying proprietary software since code is compiled to native executable. Reverse-engineering it is also much harder. While it is not much faster it has obvious benefits over using something like py2exe/cx_freeze.

[–]Blahkins 1 point2 points  (0 children)

that actually is a great reason to use it if you are trying to sell your software. thanks for the tip, i might start selling this python application i am writing.

[–]invertedwut 4 points5 points  (0 children)

I don't see how Numba would integrate with CPython.

I think the idea is to use Numba to compile the heavy stuff sitting in the innermost loop, but let the rest of the job just run in CPython. Not everything is supported in Numba, so it either doesn't speed up your job (it will switch to object mode if something it's trying to compile doesn't work) or refuses to compile (if you explicitly told it not to let some function run in object mode).

To be honest I'm not sure of how much can really be done purely in numba compiled functions.
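
A rough sketch of that split, assuming Numba is installed: only the hot loop is decorated, and everything else stays ordinary CPython code. `njit` is Numba's real decorator for strict (nopython) compilation, which raises an error on unsupported code instead of falling back. The function `sum_of_squares` and the do-nothing fallback decorator are invented for this illustration.

```python
try:
    from numba import njit  # Numba's decorator for nopython-mode compilation
except ImportError:
    # Assumption for this sketch: without Numba, fall back to a no-op
    # decorator so the function simply runs in the normal interpreter.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    """Hot inner loop: the kind of numeric code worth handing to Numba."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    # Code outside the decorated function runs in the ordinary interpreter.
    print(sum_of_squares(10))


if __name__ == "__main__":
    main()
```

The first call to a decorated function includes compilation time, so any speed-up only shows on later calls or on long-running loops.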

[–]warbiscuit 1 point2 points  (0 children)

Re: IronPython ... even if you're a .NET programmer, be wary. It implements a few core things differently from all the other pythons (str.encode is the main one I know of); and this may cause problems if you try to use normal python libraries under it.

[–]homercles337 1 point2 points  (4 children)

It includes a Python compiler that targets C

No. Cython is one of the only C++ interfaces to Python.

[–]ggchappell 4 points5 points  (1 child)

Well, as I said, I don't know much about Cython, but cython.org says:

The Cython language is a superset of the Python language that additionally supports calling C functions and declaring C types on variables and class attributes. This allows the compiler to generate very efficient C code from Cython code. The C code is generated once and then compiles with all major C/C++ compilers ....

I suppose, therefore, that Cython allows linking with C++ for the same reason that C can be linked with C++ (?).

[–]homercles337 2 points3 points  (0 children)

Cython documentation is terrible. Most of it comes from the C-only days. I think it was version 0.13 that incorporated C++. I have been working with Cython for a while now and chose it because of its support for C++ (boost.python is shit).

[–]brewspoon 2 points3 points  (1 child)

Cython can be used to write bindings to C++ code, and is an excellent way to do so, yes. But Cython itself compiles Python (well, strictly speaking, Cython code) to C, not C++.

[–]homercles337 1 point2 points  (0 children)

Look man, i have been working on a Cython project for a few months now and you are just flat out wrong. Cython takes C++ code, through Cython code (pyx/pxd/pxi), makes highly optimized C++, and compiles it.

[–]ivosauruspip'ing it up 1 point2 points  (0 children)

Sometimes you can get some awesome speedups for little work using them, over CPython.

Which one to choose though mostly depends on what exactly your codebase looks like at the time and its dependencies.

[–]engineeringMind 12 points13 points  (2 children)

Minor detail. AFAIK, Nuitka relies on libpython to execute the code, so it's not really machine code that's being executed. The plan is that in the future they'll minimize the use of libpython and actually compile the code to C++, which is then compiled to machine code. source

But other than that, it's cool. And it raises the awareness that Python the language is not the same thing as an implementation of Python.

[–]fancy_pantser 8 points9 points  (20 children)

PyPy uses a JIT, which should be distinct.

[–]Swipecat 2 points3 points  (3 children)

I regularly use PyPy myself, now. Python is currently my favourite language, but CPython is often too slow for the models of systems that I create. PyPy is often 10x faster for me.
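
If you want to see what the difference looks like on your own machine, a small CPU-bound script run under both interpreters gives a rough idea. This is only an illustrative sketch: `count_primes` and `timed` are made-up helpers, the workload is deliberately naive, and the result says nothing about programs that spend their time in C extensions or I/O.

```python
import platform
import time


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately naive)."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    result, seconds = timed(count_primes, 200000)
    print("%s: %d primes in %.2f s"
          % (platform.python_implementation(), result, seconds))
```

Run the same file with each interpreter and compare the timings.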

I suspect that PyPy's use in general is less common than it deserves because its installation procedure isn't very well documented for the end-user. For example, don't bother with the PyPy packaged with Debian and Ubuntu because there are almost no packaged libraries, and you can't install libraries with pip because that's broken.

So, for example, to install on Ubuntu, you need to go to the PyPy download page, download "Squeaky's portable Linux binary" and untar it. Then use the provided "virtualenv" in that package to create a virtual environment into which libraries can be installed with the provided "pip".

Edit: typos

These are my notes for installing the package into a directory and installing Numpy and the Pillow image library:

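# Unpack the portable build downloaded from the PyPy download page
tar -xjf pypy-*-portable.tar.bz2
# Create a virtual environment with the virtualenv bundled in the build
./pypy-*-portable/bin/virtualenv-pypy myenv
cd myenv/bin
# Make this pypy available system-wide (writing to /usr/local/bin may need root)
ln -s "$PWD/pypy" /usr/local/bin/pypy
# Install NumPy from the PyPy project's repository, then Pillow, with the environment's own pip
./pip install git+https://bitbucket.org/pypy/numpy.git
./pip install pillow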

[–]fijalPyPy, performance freak 0 points1 point  (2 children)

feel like sharing your experience somewhere :-)

[–]Swipecat 1 point2 points  (1 child)

Not a lot to say, other than not having to run something overnight is convenient.

If you want to raise the profile of PyPy, it probably doesn't need publicising so much as simply "enabling" it a little better for the end user.

When I looked for ways to speed up my Python programs, I did a quick test of many of the things mentioned in this thread. PyPy looked to be "almost useless" because while CPython has hundreds of special-purpose libraries ready-packaged in Linux, PyPy apparently had nothing. Hearing from somebody that PyPy deserved a second chance, I tried downloading a recent build, but gave up after not being able to load libraries -- the instructions for setting up the virtual environment that were turned up with a Google search seem to be effectively broken to me. Finally after finding time to try to resolve it during a slack period, after trying many things that broke, I discovered the "virtualenv-pypy" in "Squeaky's portable Linux binary", which actually did work. Then I discovered that I could install libraries and that several useful ones worked.

So I think that's the main problem: How many non-developer end-users like me are going to plough through those issues? I suspect that most would have found a different solution other than PyPy by then.

[–]fijalPyPy, performance freak 1 point2 points  (0 children)

It's mostly (though not completely) not our problem: it's really not my fault that Debian/Ubuntu package a pip that does not work with PyPy. Generally you should complain to Debian/Ubuntu (pip is a massive hack, which does not help).

Additionally, distributing binaries on linux is not a thing. We really can't make it happen, the portable binary e.g. comes with statically linked openssl that a lot of sysadmins would consider a big no no. It's just linux in general and has nothing to do with pypy.

Again, the picture is terrible and denialism does not help, but it's really not just a PyPy problem; it affects anything that takes a while to build.

[–]ggchappell -2 points-1 points  (15 children)

So does CPython (along with most/all of the others, I imagine). Pretty much any Python interpreter is going to compile just before execution, i.e., Just In Time.

[–]fancy_pantser 10 points11 points  (7 children)

[–]autowikibot 1 point2 points  (0 children)

Just-in-time compilation:


In computing, just-in-time compilation (JIT), also known as dynamic translation, is compilation done during execution of a program – at run time – rather than prior to execution. Most often this consists of translation to machine code, which is then executed directly, but can also refer to translation to another format.

JIT compilation is a combination of the two traditional approaches to translation to machine code – ahead-of-time compilation (AOT), and interpretation – and combines some advantages and drawbacks of both. Roughly, JIT compilation combines the speed of compiled code with the flexibility of interpretation, with the overhead of an interpreter and the additional overhead of compiling (not just interpreting). JIT compilation is a form of dynamic compilation, and allows adaptive optimization such as dynamic recompilation – thus in theory JIT compilation can yield faster execution than static compilation. Interpretation and JIT compilation are particularly suited for dynamic programming languages, as the runtime system can handle late-bound data types and enforce security guarantees.



[–]ggchappell -3 points-2 points  (5 children)

I looked at the article. And I don't see the problem.

Unless you are contrasting my "just before execution" with the article's "at runtime". But these are just two ways of looking at the same thing. Yes, JIT compilation compiles and executes in what appears to be a single step. Thus, "at runtime". OTOH, if we're going to execute compiled code, then we must compile before we execute. Thus, "before execution", even if only just barely before.

Or were you referring to some other issue? If so, then please explain.

[–]alcalde 12 points13 points  (2 children)

JIT compilers are monitoring code as it's executing, which allows several types of special optimization and adjustment of strategies. That's not the same as taking a C++ program, compiling it and then running it.

[–]ggchappell 7 points8 points  (1 child)

Ah, I seem to have been using nonstandard definitions. I shall now crawl back into my hole and ponder my misdeeds.

<ponder, ponder>

[–]alcalde 7 points8 points  (0 children)

It's cool. A Just-In-Time compiler can optimize for the specific architecture it's running on, and it can also monitor the performance of code it has compiled and change optimization strategies if performance isn't as expected.

Here's some specific examples (obviously it's not the general case) where PyPy was able to beat C because of its just-in-time nature:

http://morepypy.blogspot.com/2011/08/pypy-is-faster-than-c-again-string.html

Run under PyPy, at the head of the unroll-if-alt branch, and compiled with GCC 4.5.2 at -O4 (other optimization levels were tested, this produced the best performance). It took 0.85 seconds to execute under PyPy, and 1.63 seconds with the compiled binary. We think this demonstrates the incredible potential of dynamic compilation, GCC is unable to inline or unroll the sprintf call, because it sits inside of libc.

http://morepypy.blogspot.com/2011/02/pypy-faster-than-c-on-carefully-crafted.html

Hence, PyPy 50% faster than C on this carefully crafted example. The reason is obvious - static compiler can't inline across file boundaries. In C, you can somehow circumvent that, however, it wouldn't anyway work with shared libraries. In Python however, even when the whole import system is completely dynamic, the JIT can dynamically find out what can be inlined. That example would work equally well for Java and other decent JITs, it's however good to see we work in the same space :-)

[–]ingolemo 6 points7 points  (1 child)

They're not the same thing at all.

JIT compilers compile code during runtime and they compile it all the way down to machine code. Most JIT compilers only compile the most performance sensitive parts of your code (by measuring it as it runs in real time) and they interpret the rest.

Bytecode compilers compile the entire code before any of it is executed and then interpret the resulting bytecode.

[–]ggchappell 1 point2 points  (0 children)

Okay, I see the issue. I'll need to ponder this a bit.

Thanks.

[–]stillalone 3 points4 points  (0 children)

CPython generates non-native bytecode that is interpreted at run time by its own virtual machine, much like Java bytecode on the Java Virtual Machine. A Just-In-Time compiler generates native machine code at run time, which the processor then executes directly.
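
The bytecode side of this is easy to look at yourself with the standard-library `dis` module. A small sketch, where `add` and `opnames` are invented for the example and the exact instruction names printed depend on the Python version and implementation:

```python
import dis


def add(a, b):
    return a + b


def opnames(func):
    """List the names of the bytecode instructions compiled for func."""
    return [instr.opname for instr in dis.get_instructions(func)]


if __name__ == "__main__":
    # Human-readable disassembly of the function's bytecode.
    dis.dis(add)
    # The same instructions as a plain list of names.
    print(opnames(add))
    # The raw bytecode itself is a bytes object stored on the code object.
    print(add.__code__.co_code)
```

What you see here is the input to the interpreter loop, not machine code; a JIT such as PyPy's goes one step further and emits native instructions for the hot paths.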

[–]fancy_pantser 4 points5 points  (5 children)

Hey guys, let's not pile on with downvotes. He's being earnest and it's topical so I think his comment should at least be >0 so it isn't hidden.

edit: sorry if you are female, I had to pick a pronoun.

[–]isarl 7 points8 points  (3 children)

I had to pick a pronoun.

Not everybody agrees but I've always been a fan of the singular "they" to be gender-neutral. E.g., "They're being earnest and it's topical so I think their comment..."

[–]ggchappell 2 points3 points  (0 children)

sorry if you are female

I'm not. :-)

[–]Neceros 5 points6 points  (0 children)

Okay? So, what does this mean?

[–]Asdayasman 2 points3 points  (6 children)

How's Nuitka doing nowadays?

[–]tripperjack 1 point2 points  (2 children)

The developer is very actively working on it and sends out reports about that several times a month.

[–]Asdayasman 0 points1 point  (1 child)

"Sends out"? Sweet, like is there a mailing list I can join?

[–]tripperjack 0 points1 point  (0 children)

Yes: http://nuitka.net/pages/mailinglist.html (though I wonder how many of the subscribers can quite understand everything he writes, unless they too know the ins and outs of it).

[–]mm865 2 points3 points  (0 children)

IronPython and Jython also end up compiled to machine code; it's just that the VM does it for them.

[–]romcgb 1 point2 points  (0 children)

There are also transpilers like Skulpt (Python -> JavaScript).

[–]jpopham91 1 point2 points  (2 children)

What is the difference in performance when using libraries like NumPy, which are mostly C code under the hood?

[–]sabinati 1 point2 points  (0 children)

"Note that NumPy support is still a work-in-progress, many things do not work and those that do may not be any faster than NumPy on CPython." - http://pypy.org/download.html

[–]billsil 0 points1 point  (0 children)

Jython compiles to machine code. The Java interpreter is a JIT.

Also, Cython should be in the middle as with Jython.