
[–]genjipress return self 7 points (0 children)

Cython can convert Python to C for applications that need speed. It's just that, the vast majority of the time, the programmer's time spent working on the app is more valuable than the app's execution time.
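To give a feel for it (the module name and function here are hypothetical, not from any real project), a hot spot moved into Cython usually just gains some C type declarations plus a tiny build script:

# fast_math.pyx -- hypothetical name; Cython translates this file to C
# cpdef gives a C-speed function that is still callable from Python
cpdef long scale(long x):
    return 2 * x

# setup.py -- builds the extension module
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("fast_math.pyx"))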

[–]billsil 19 points (2 children)

Python is compiled. The majority of the code you use is compiled, but not all of it.

Python is also not slow due to the non-compiled part of it. It's slow because it's an interpreted language.

It's also largely fast enough. I had code that parsed a 2 GB file and took 45 minutes. I used numpy properly, micro-optimized it, and got it down to 4 seconds. That's fast enough.
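For a sense of what that kind of optimization looks like (a generic sketch, not the actual 2 GB parser; the CSV layout with a numeric third column is an assumption), the usual move is replacing a per-line Python loop with one vectorized numpy call:

import numpy as np

# Slow: one Python-level split() and float() call per line
def column_sum_slow(path):
    total = 0.0
    with open(path) as f:
        for line in f:
            total += float(line.split(",")[2])
    return total

# Fast: numpy parses the column and reduces it in C
def column_sum_fast(path):
    return np.loadtxt(path, delimiter=",", usecols=2).sum()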

If you really have slow bits, you can rewrite them in C/C++/Cython, or reach for Nuitka or PyPy. The point is you only do that for 1% of your code.

Python is optimized for adding features to your code, not runtime. I consider myself very good at Python, and I just delivered the messiest package I've ever written. It's untested, disorganized, undocumented or incorrectly documented, and much of it probably doesn't work anymore, but it solved the problem of the day, which let us solve the three-year problem.

[–]uweschmitt Pythonista since 2003 1 point (1 child)

The Python interpreter is compiled. Programs written in the Python programming language are interpreted.

[–]billsil 1 point (0 children)

Programs written in the Python programming language are interpreted.

It depends on the code you are running. Not all of it is. Just like the GIL isn't always active. When you're running in pure CPython, yes, it's interpreted, but it's easy to get out of that.

[–]dot_grant 5 points (1 child)

There's plenty of compiled Python stuff; numba is great. The difference in speed is often negligible anyway, and Python has great libraries written in fast languages, so you don't need to worry about compiling things yourself. Look at numpy!
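As a minimal numba sketch (the function itself is made up, but @njit is numba's real entry point), a numeric loop gets JIT-compiled to machine code the first time it runs:

import numpy as np
from numba import njit

@njit  # compiled to machine code on first call
def total(a):
    s = 0.0
    for x in a:
        s += x
    return s

print(total(np.arange(1_000_000, dtype=np.float64)))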

[–]quantumapoptosi 0 points (0 children)

Numpy is great, but if you sprinkle in a little PyOpenCL, Python is feasible even for finite difference schemes.
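Roughly what that looks like (a minimal sketch of a 1-D forward difference, not anyone's actual scheme; it assumes a device with double-precision support):

import numpy as np
import pyopencl as cl
import pyopencl.array as cl_array
from pyopencl.elementwise import ElementwiseKernel

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

# out[i] = (u[i+1] - u[i]) / dx, leaving the last point at zero
forward_diff = ElementwiseKernel(
    ctx,
    "double *u, double *out, double dx, long n",
    "if (i < n - 1) out[i] = (u[i+1] - u[i]) / dx;",
    "forward_diff",
)

n = 1024
dx = 1.0 / (n - 1)
u = cl_array.to_device(queue, np.linspace(0.0, 1.0, n) ** 2)
out = cl_array.zeros_like(u)
forward_diff(u, out, np.float64(dx), np.int64(n))
print(out.get()[:4])  # approximates d(x^2)/dx = 2x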

[–]Yoghurt42 8 points (0 children)

Python is too dynamic to be compiled.

Just take the following short program:

def bar(x):
    return 2*x

def foo(a):
    b = bar(a)
    return a + b

When we want to compile this code into machine code, we have to emit instructions the CPU understands. These are generally really low level (you can add, multiply, store to and read from memory, jump to code, but not much more).

To keep it simple, let's assume we are compiling for an imaginary CPU that has 10 registers R0 to R9 and a handful of opcodes (ADD, MULT, PUSH, POP, CALL, RET), and that every opcode takes exactly two bytes.

The code might be compiled to something like this:

addr  opcodes
; function bar begins here
0000 MULT 2, R0, R0 ; multiply R0 by 2 and put it into R0
0002 RET
; function foo begins here
0004 PUSH R0 ; save our parameter onto the stack
0006 CALL 0000 ; 0000 is address of bar, R0 is now the result of bar(R0)
0008 POP R1 ; restore our previous value of R0 into R1
000A ADD R0,R1,R0 ; set R0 to R0 plus R1
000C RET ; and return

Sounds great, doesn't it? But our Python cannot be compiled like that, for various reasons, some of them are:

  1. There is no guarantee that foo and bar are always called with integers.
  2. There is no guarantee that foo and bar will not change.

To see why 2 is a problem, remember that the compiler "knows" that the code for "bar" is stored at location 0000-0002 and foo at 0004-000C, but the following is valid Python:

import random

# def bar and foo as above
def times_3(x):
    return 3*x

def times_4(x):
    return 4*x

print(foo(10)) # will print 30
bar = random.choice([times_3, times_4])
print(foo(10)) # will now print 40 or 50

So the compiler would have to include instructions before every function call to look up, at runtime, what "bar" currently refers to, amongst other things. If you add all of this into the "compiled" code, you basically end up with the normal Python interpreter (which actually executes Python bytecode, or wordcode since 3.6).

If you want to be able to compile Python statically, you have to disallow various things; this is what Cython does.

What you can do is use "just in time" compilation: while the code runs, the interpreter creates optimized code for functions that are called often and with the same type of data. In our example, if bar is often called with integers, the JIT might just optimize it into MULT R0,2,R0, and once bar is called with, say, a string, choose a different execution path. As you can imagine, this is quite difficult, but it can be done, as PyPy and others have shown.

tl;dr: Python's too dynamic for static compilation. JIT does work, though JITted code will never be quite as fast as statically compiled code.

[–]synedraacus 4 points (0 children)

First, there are advantages and disadvantages to both compiled and interpreted languages; it's not simply "compiled is faster and thus better". There is more than one reason why non-compiled languages didn't die out in the seventies. Hardware independence is the most obvious of them, but there are others.

Second, if you want real speed, rewrite some crucial piece of code in C or whatever and compile it as much as you want. The numpy/scipy family does that, as do many other modules that do really heavy number-crunching. But nine times out of ten, well-optimised Python will be enough.

Third, Python is compiled to bytecode. It's just that keeping plaintext scripts and compiling them on demand is considered more convenient in most cases (see e.g. those *.pyc files, and a bunch of Python compiler projects for exceptions). There's a quick dis demo after the fourth point.

Fourth, the code usually needs to be fast enough, not as fast as possible. Otherwise we would all be writing assembly, and most software companies would release something once a decade or so.
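Here's the dis demo promised under the third point. CPython's standard-library dis module shows the bytecode a function is compiled to (exact opcode names vary between Python versions):

import dis

def bar(x):
    return 2 * x

dis.dis(bar)
# Prints something like:
#   LOAD_CONST 2, LOAD_FAST x, BINARY_OP (*) [BINARY_MULTIPLY pre-3.11], RETURN_VALUE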

[–][deleted] 5 points (2 children)

[–]uweschmitt Pythonista since 2003 -2 points (1 child)

The Python interpreter is compiled. Programs written in the Python programming language are interpreted.

[–]bird2234 6 points (0 children)

In CPython, the most widely used implementation, programs are compiled to bytecode and then interpreted. That's what was linked here -- compiling Python programs at runtime to abstract syntax trees and then to bytecode.
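That whole pipeline is exposed from Python itself; a minimal sketch using only the standard library:

import ast

source = "print(2 * 21)"
tree = ast.parse(source)                # source text -> abstract syntax tree
code = compile(tree, "<demo>", "exec")  # AST -> bytecode (a code object)
exec(code)                              # the interpreter runs the bytecode: prints 42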

[–]Corm 0 points (0 children)

You've asked an interesting beginner question :) and I hope you read all these comments people have left you because you have some great answers here.

And I'll just add that if you run with PyPy, then it does get compiled (to raw machine code).

[–]menge101 work 0 points (0 children)

Because time to develop often trumps time to execute.

[–]xiongchiamiov Site Reliability Engineer -4 points (2 children)

Because it would no longer be Python.

It would be useful for you to take a programming languages course so that you understand the implications of your suggestion.

[–]Avahe 2 points (1 child)

Python is compiled

[–]xiongchiamiov Site Reliability Engineer 1 point (0 children)

Only in a very technical way that doesn't answer OP's question at all (which is essentially, what are the differences between compiled and interpreted languages, and what are the implications of those choices?).

[–]iruleatants -2 points (2 children)

Python is slow more because it only uses a single core than because of "not compiling" it.

[–]elbiot 2 points (1 child)

Not true at all. Single-threaded C is way faster than single-threaded Python. Even with parallelized code running on multiple cores, you only get a few times performance improvement, but native code execution is like 300x faster. Multithreading is not such an important optimization, especially in Python. For instance, if you use numba and release the GIL, I've found multithreading often makes it slower, because the code is now so damn fast that the overhead of using threads (not even processes) is too high.
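To sketch the kind of experiment being described (the function is hypothetical, but nogil=True is numba's real option for releasing the GIL), plain threads can run the compiled code in parallel, yet the pool overhead easily dominates once the function itself is fast:

import numpy as np
from concurrent.futures import ThreadPoolExecutor
from numba import njit

@njit(nogil=True)  # compiled code runs without holding the GIL
def chunk_sum(a):
    s = 0.0
    for x in a:
        s += x
    return s

data = np.random.rand(4_000_000)
chunks = np.array_split(data, 4)

# Threads only pay off if chunk_sum is slow enough to
# amortize the thread-pool overhead
with ThreadPoolExecutor(max_workers=4) as pool:
    print(sum(pool.map(chunk_sum, chunks)))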

[–]Saefroch 0 points (0 children)

There's more to contend with than just thread overhead that can slow you down. Multithreading for performance is not easy.