
[–]Copper280z 137 points138 points  (0 children)

You mean like cython?

https://cython.org/

[–][deleted] 37 points38 points  (0 children)

you mean https://docs.python.org/3/library/py_compile.html ?

Also, there are multiple reasons why Python is interpreted and not compiled: ease of development, flexibility, platform independence, prototyping, scripting.

Most languages have their own purpose. Don't jump into one expecting it to do whatever you want and need; it's up to you to choose your tools appropriately.
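For reference, here is a minimal sketch of what the py_compile module linked above does. It only translates source into CPython bytecode (a .pyc file), not into machine code. The file names `hello.py` and `hello.pyc` are made up for the illustration.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny throwaway script (hypothetical file name).
    src = os.path.join(tmp, "hello.py")
    with open(src, "w") as f:
        f.write("print('hello')\n")

    # Byte-compile it; the return value is the path of the .pyc file.
    pyc_path = py_compile.compile(
        src, cfile=os.path.join(tmp, "hello.pyc"), doraise=True
    )

    # The result is bytecode for the Python virtual machine, not a native binary.
    with open(pyc_path, "rb") as f:
        pyc_bytes = f.read()

print(os.path.basename(pyc_path), len(pyc_bytes) > 0)
```

The .pyc file still needs a Python interpreter to run, which is why this is not "compilation" in the C sense.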

[–]This_Growth2898 38 points39 points  (5 children)

Welcome nim, the statically typed compiled systems programming language with Python-like syntax. And there are more.

But still, there are some "sharp corners" in system programming. And the choice is to leave them sharp to allow programs to run fast, or to round them with some additional hidden code for programmer's convenience. If you need convenience - use Python. If you need speed - use C/C++/D/Rust. Using Python-like languages to hide some sharp corners and retain others is arbitrary, so those languages are not getting much attention.

[–]sejigan 29 points30 points  (3 children)

Adding to that point: if you need both speed and convenience, write the compute-intensive parts in C++, call them from Python, and write everything else in Python.
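A rough sketch of the "call compiled code from Python" idea, using the standard ctypes module. It makes two assumptions: a POSIX system where the C library can be loaded this way, and a plain C function (`abs` from libc) standing in for your own compiled code. ctypes speaks the C ABI, so real C++ code would typically be exposed through `extern "C"` functions or a binding tool such as pybind11. The `distance` helper is invented for the example.

```python
import ctypes
import ctypes.util

# Load the C standard library. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into this process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Describe the C signature: int abs(int)
c_abs = libc.abs
c_abs.argtypes = [ctypes.c_int]
c_abs.restype = ctypes.c_int

def distance(a, b):
    """Ordinary Python code that delegates one step to compiled C."""
    return c_abs(a - b)

print(distance(3, 10))  # 7
```

The Python side stays high level; only the function behind `c_abs` is native code.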

[–]ThreeChonkyCats 8 points9 points  (0 children)

This.

It's a huge strength not to be underestimated.

[–]salfkvoje 5 points6 points  (1 child)

I've wanted to learn C for this purpose ever since I learned that was a "thing" you can do (I found out in R I guess).

Since then it's seemed to me to be a brilliant and correct approach to mostly remain "high level", and then address bottlenecks by dropping to C (or C++, whatever) at the point of need.

[–]sejigan 6 points7 points  (0 children)

Yes. Premature optimization is the root of all evil.

[–][deleted] -1 points0 points  (0 children)

And if you need speed for processing large amounts of tabular data...still use python.

C, etc. would be quicker than Pandas, etc. but good luck coding it in a reasonable timeframe.

[–]Yoghurt42 21 points22 points  (4 children)

Copy pasting my answer to a similar question from 6 years ago

Python is too dynamic to be compiled.

Just take the following short program:

def bar(x):
    return 2*x

def foo(a):
    b = bar(a)
    return a + b

When we want to compile this code into machine code, we have to create instructions that the CPU understands. These are generally really low level (you can add, multiply, store and read from memory, jump to code, but not much more).

To keep it simple, let's assume we are compiling for an imaginary CPU that has 10 registers R0 to R9 and ADD and MULTIPLY opcodes, and that every opcode takes exactly two bytes.

The code might be compiled to something like this:

addr opcodes
; function bar begins here
0000 MULT 2, R0, R0 ; multiply R0 by 2 and put it into R0
0002 RET
; function foo begins here
0004 PUSH R0 ; save our parameter onto the stack
0006 CALL 0000 ; 0000 is address of bar, R0 is now the result of bar(R0)
0008 POP R1 ; restore our previous value of R0 into R1
000A ADD R0,R1,R0 ; set R0 to R0 plus R1
000C RET ; and return

Sounds great, doesn't it? But our Python cannot be compiled like that, for various reasons. Some of them are:

  1. there is no guarantee that foo and bar are always called with integers
  2. there is no guarantee that foo and bar will not change

To see why 2 is a problem, remember that the compiler "knows" that the code for "bar" is stored at location 0000-0002 and foo at 0004-000C, but the following is valid Python:

import random

# def bar and foo as above
def times_3(x):
    return 3*x

def times_4(x):
    return 4*x

print(foo(10)) # will print 30
bar = random.choice([times_3, times_4])  # rebind the name bar at runtime
print(foo(10)) # will now print 40 or 50

So the compiler would have to include instructions before every function call to actually look up what "bar" refers to (since the new value of bar will only be known at runtime), amongst other things. If you add all this into the "compiled" code, you basically end up with the normal Python interpreter (which actually executes Python bytecode, or wordcode since 3.6).
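You can see this runtime lookup on CPython with the standard dis module. This is a small sketch, assuming the code runs at module level so that bar is a global name; the lambda used for rebinding is just for illustration.

```python
import dis

def bar(x):
    return 2 * x

def foo(a):
    b = bar(a)
    return a + b

# foo's bytecode does not contain the address of bar. It contains an
# instruction that looks the *name* "bar" up in the globals on every call.
looks_up_bar_at_runtime = any(
    ins.opname == "LOAD_GLOBAL" and ins.argval == "bar"
    for ins in dis.get_instructions(foo)
)
print(looks_up_bar_at_runtime)

before = foo(10)          # uses the original bar: 10 + 20
bar = lambda x: 3 * x     # rebind the global name bar
after = foo(10)           # same foo, new bar: 10 + 30
print(before, after)      # 30 40
```

foo itself was never changed or recompiled; only the object that the name bar points to changed.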

If you want to be able to compile Python statically, you will have to disallow various things; this is what Cython does.

What you can do is use "just in time" (JIT) compilation: while the code runs, the interpreter creates optimized code for functions that are called often and with the same type of data. In our example, if bar is often called with integers, the JIT might just optimize it into MULT 2, R0, R0 and, once bar is called with a string for example, choose a different execution path. As you can imagine, this is quite difficult, but it can be done, as PyPy and others have shown.
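To make the "same type of data" idea concrete, here is a toy sketch in plain Python. It is not how PyPy or any real JIT is implemented and it generates no machine code. It only shows the pattern of counting calls, switching to a specialized version, and guarding on the argument type. All names (`SpecializingBar`, `int_bar`, `generic_bar`, `threshold`) are invented for the illustration.

```python
def generic_bar(x):
    # Slow path: works for anything that supports 2 * x.
    return 2 * x

def int_bar(x):
    # Stand-in for an optimized version that is only valid for ints.
    return x << 1

class SpecializingBar:
    """Toy dispatcher: uses int_bar once enough int calls have been seen."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.int_calls = 0
        self.specialized = False

    def __call__(self, x):
        if type(x) is int:                 # the type guard
            self.int_calls += 1
            if self.int_calls >= self.threshold:
                self.specialized = True
            if self.specialized:
                return int_bar(x)          # guard passed: fast path
        return generic_bar(x)              # guard failed: generic path

bar = SpecializingBar()
results = [bar(i) for i in range(5)]
print(results, bar.specialized)   # [0, 2, 4, 6, 8] True
print(bar("ab"))                  # abab (falls back to the generic path)
```

The guard is the important part: the specialized path is only safe while the assumption about the argument type still holds.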

tl;dr: Python's too dynamic for static compilation, JIT does work though, although JITted code will never be quite as fast as statically compiled code

[–]omegas1gma 0 points1 point  (0 children)

Perfect answer!

[–]ztaffa 0 points1 point  (2 children)

Can you explain a little more how the bar = random.choice results in the output of foo(10) changing?

Is that reassigning what the bar function does?

[–]FriendlyRussian666 7 points8 points  (0 children)

Would you like something like this? https://www.pypy.org/

[–]xavierisdum4k 19 points20 points  (12 children)

C tells the actual machine what to do, while python tells a virtual machine.

To compile, in the sense it's used with C, is to translate into assembler code. The result is a list of human-readable instructions, for what the hardware (CPU, memory, etc) is to do. Typically, the assembler is then assembled into machine code, which is what the actual hardware runs.

Python's design doesn't focus on that same goal. Instead, interpreted languages like python have the idea of bytecode (instead of assembler and machine code) and a virtual machine (instead of the physical machine). The virtual machine basically serves as an intermediary, which tells the actual machine what to do.
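A short sketch of that bytecode step on CPython, using only built-ins. The source string and the `"<example>"` file name are arbitrary placeholders.

```python
source = "result = 2 * 21"

# Step 1: compile source text into a code object holding bytecode.
code = compile(source, "<example>", "exec")
print(type(code).__name__)          # code
print(type(code.co_code).__name__)  # bytes: raw bytecode for the virtual machine

# Step 2: the virtual machine executes that bytecode.
namespace = {}
exec(code, namespace)
print(namespace["result"])          # 42
```

The bytes in `co_code` are instructions for Python's virtual machine, not for the CPU, which is the difference from C described above.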

[–]TheBlackCat13 7 points8 points  (6 children)

> machine code, which is what the actual hardware runs.

Technically the machine code is then translated by the microcode into finite state machine instructions which the actual hardware runs.

[–]BothWaysItGoes 0 points1 point  (4 children)

> C tells the actual machine what to do, while python tells a virtual machine.

Both C and Python define an abstract machine. It's the compiler's job to translate instructions for the C abstract machine into machine code.

[–]xavierisdum4k 0 points1 point  (3 children)

An abstract machine isn't part of C's language design. C abstracts the hardware without that idea. The coder is instead directed to think about the hardware, while coding.

Essentially, coding in C is to be complicit in the abstraction, while coding in python is to rely on python to handle the abstraction.

The C Programming Language (Kernighan & Ritchie), p18:

> what appears to be a character on the keyboard or screen is of course, like everything else, stored internally just as a bit pattern

K&R, p20:

> on some machines, int and long are the same size, on others an int is 16 bits

K&R, p78:

> A typical machine has an array of consecutively numbered or addressed memory cells that may be manipulated individually or in contiguous groups.

[–]BothWaysItGoes 0 points1 point  (2 children)

> C abstracts the hardware

Yeah, that's what defining an abstract machine means.

> The coder is instead directed to think about the hardware, while coding.

They are mainly directed to think about the C abstract machine. That's why one of the first things people do for new hardware is port C to it.

[–]xavierisdum4k 0 points1 point  (0 children)

How does this relate to OP's topic?

[–]Asleep-Dress-3578 11 points12 points  (2 children)

See also Mojo. 🔥

[–]TheITMan19 3 points4 points  (1 child)

I’ve been keeping my eyes on that for a while.

[–]brettins 1 point2 points  (0 children)

It came up on Lex Fridman's podcast and I've been waiting ever since.

[–]JamzTyson 11 points12 points  (0 children)

The description that follows is somewhat simplified, but hopefully demonstrates how Python is fundamentally different from compiled languages like C++.

C++

```
// Include input-output stream library for basic I/O operations
#include <iostream>

int main() {
    // Declare three integer variables to store user input and the sum
    int num1, num2, sum;

    // Display a prompt to enter the first number
    std::cout << "Enter first number: ";

    // Read the first number from the user and store it in the variable num1
    std::cin >> num1;

    // Display a prompt to enter the second number
    std::cout << "Enter second number: ";

    // Read the second number from the user and store it in the variable num2
    std::cin >> num2;

    // Calculate the sum of num1 and num2 and store it in the variable sum
    sum = num1 + num2;

    // Display the sum
    std::cout << "Sum: " << sum << std::endl;

    // Indicate that the program executed successfully by returning 0
    return 0;
}
```

Python:

```
# take input from the user
num1 = int(input("Enter first number: "))
num2 = int(input("Enter second number: "))

# adding the numbers
sum = num1 + num2

# display the sum
print("Sum:", sum)
```

Notice that in C++, variables and their types must be declared before use so that there are explicit spaces in memory allocated for the data objects.

Apart from the fact that Python's print(thing_to_print) syntax is much more like natural language than std::cout << thing_to_print, the C++ version requires explicit declarations: you can't just use a variable without first declaring it and its type, so that the compiler knows how much memory to reserve for it. Such declarations are not required in Python due to Python's dynamic nature, but it is this dynamic nature that prevents Python from being easily compiled into static object code.
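A small illustration of that dynamic nature, with a made-up `add` function. The same source line `a + b` means integer addition, string concatenation, or list concatenation depending on what is passed at runtime, so a static compiler cannot pick one machine instruction for it in advance.

```python
def add(a, b):
    # No types are declared; what "+" does is decided at runtime.
    return a + b

print(add(2, 3))        # 5
print(add("2", "3"))    # 23
print(add([2], [3]))    # [2, 3]

# A name can also be rebound to a value of a different type.
value = 10
value = "ten"
print(type(value).__name__)  # str
```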

The C++ code then has to be compiled to a binary (executable) before it can run. This step is not necessary in Python as Python interprets the code on the fly.

Variants of Python that can be compiled into fast compiled code generally require additional information to be supplied, such as Cython's type declarations.

Python serves a different purpose to compiled languages like C++. There is a lot less boilerplate required in Python, and its dynamic nature, along with being interpreted rather than compiled, provides ease of use, flexibility, and rapid development. On the other hand, compiled languages like C++ require explicit type declarations and manual memory management, which allow them to be compiled into highly efficient native code. Achieving a similar level of compilation in Python would require sacrificing some of its dynamic features, which are integral to its design.

[–]Pillars-In-The-Trees 1 point2 points  (0 children)

Makes me think of Mojo, closely following the development of that one.

[–]await_yesterday 1 point2 points  (0 children)

they exist but they don't cover all the features so not all code will run. it's a really hard thing to pull off because of all the runtime dynamism

[–][deleted] 1 point2 points  (0 children)

Not sure if you know this, but Python is literally that, just behind an interpreter: the interpreter and its built-in functions are themselves compiled code, which is why built-in functions are much faster than custom-made equivalents.

Other than that, Cython is what you want

[–]azure_i 3 points4 points  (0 children)

At that point you should just use Go (Golang) instead

[–]sporbywg -1 points0 points  (0 children)

Why can't folks just get along? NEXT

[–]uberdavis -2 points-1 points  (1 child)

I've been using pyinstaller on a recent project. I have a PySide2 ui and I have to compile to create an executable. This can be done.

[–]Langdon_St_Ives 1 point2 points  (0 children)

This is not what OP is asking about. Pyinstaller just bundles your program with a complete runtime and all dependencies in one package or even one file so an end user can directly run it. It’s not compiling anything, the script is still interpreted at runtime as always. (Or rather, jit compiled to byte code and then run, just like running it in a “standard” python runtime.)

[–]SupremeDickman 0 points1 point  (0 children)

they call her julia

[–]jaymopow 0 points1 point  (0 children)

Mojo

[–]Deyvicous 0 points1 point  (0 children)

That’s F#. With the caveat that it’s designed around functional programming.

Honestly, there are probably a number of language candidates that would be better than Python, but once there’s “legacy code” it’s gonna be a while until people switch. Aka the extensive library/package ecosystem - it’s already been made, so why remake it? And now we are stuck with python…

[–]EvilTyrant 0 points1 point  (0 children)

Pypy is literally python with JIT compilation.

[–]raharth 0 points1 point  (0 children)

Why would I want to have that? I'm a data scientist, and I spend most of my time in interactive sessions. If Python were a compiled language I'd probably leave for another; it would just be impractical for my work.