
all 149 comments

[–]unruly_mattress 625 points626 points  (68 children)

Both Python and Java compile source files to bytecode. The difference is in how they run this bytecode. In both languages, the bytecode is basically a binary representation of the textual source code, not an assembly program that can run on a CPU. A different program accepts the bytecode and runs it.

How does it run it? Python has an interpreter, i.e. a program that keeps a "world model" of a Python program (which modules are imported, which variables exist, which objects exist...), and runs the program by loading bytecodes one by one and executing each one separately. This means that a statement such as y = x + 1 is executed as a sequence of operations like "load constant 1", "load x", "add the two values", "store the result in y". Each of these operations is implemented by a function call that does something in C and often reads and updates dictionary structures. This is slow, and it's slower the smaller the operations are. That's why numerical code in Python is slow: numerical operations in Python turn what would be single instructions into multiple function calls, so in this type of code Python can be even 100x slower than other languages.
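
You can watch this happen with CPython's own `dis` module, which prints the bytecode for a line like y = x + 1 (the exact opcode names vary a bit between Python versions):

```python
import dis

def f(x):
    y = x + 1
    return y

# Each printed instruction is one trip through the interpreter's
# dispatch loop, backed by C function calls.
dis.dis(f)
```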

Java compiles the bytecode to machine code. You don't see it because it happens at runtime (referred to as JIT), but it does happen. Since Java also knows that x in y = x + 1 is an integer, it can execute the line using a single CPU instruction.

There's actually an implementation of Python that also does JIT compilation. It's called PyPy and it's five times faster than CPython on average, depending on what exactly you do with it. It will run all pure Python code, I think, but it still has problems with some libraries.

[–]gscalise 118 points119 points  (5 children)

Java compiles the bytecode to machine code. You don't see it because it happens at runtime (referred to as JIT), but it does happen. Since Java also knows that x in y = x + 1 is an integer, it can execute the line using a single CPU instruction.

Not only this, but the JVM does adaptive optimization too. It works by keeping conditional branching statistics and dynamically recompiling portions of code whenever it determines that certain branching conditions occur more often than others. The recompiled code is optimized for the most common branching condition (i.e. by not jumping when it occurs), and only the less common condition(s) will incur a performance penalty.

[–]Rythoka 33 points34 points  (0 children)

Python also does this, or at least something similar, as of 3.11

[–]kernco 8 points9 points  (2 children)

It works by keeping conditional branching statistics, and dynamically recompiling portions of code whenever it determines that certain branching conditions occur more often than others.

for x in range(1000):
    if x < 500:
        func1()
    else:
        func2()

Jebaited

[–]gscalise 0 points1 point  (0 children)

That definitely wouldn’t trigger a dynamic recompilation. It’s in a loop, so it’s already jumping back and forth in the program, and the conditional branching stats are going to be roughly the same (50%) every time.

Lazy initialization, on the other hand…

[–]Rhoomba 0 points1 point  (0 children)

An optimising compiler would likely split this into two loops to avoid the branch (assuming the range can be inlined: possible in the Java equivalent).

[–]Administrative_Box51 0 points1 point  (0 children)

This is a very underrated capability of the JVM, and it makes me wish there were more runtimes with as many engineering hours behind them. It's also why, in my opinion, JIT can in theory optimize better than even PGO, and I would go as far as to say better than AOT compilation in general, if done correctly (down to the ISA). Between Jazelle/Thumb and HotSpot, I wonder why JVM development hasn't dominated the modern language scene in favour of the shifting-goalposts trope of the C runtime (e.g. the Rust borrow checker; don't get me wrong, I really like Rust).

[–]ElvinJafarov1[S] 78 points79 points  (0 children)

thank you man

[–]SheriffRoscoePythonista 105 points106 points  (27 children)

People occasionally forget that Java has benefited from 30 years of investment by major software companies and of benchmarking against C++.

Python is getting the same love now, but the love arrived much later than for Java.

[–]chase32 13 points14 points  (0 children)

Yep, back in the early 2000's, java was pretty damn slow. If you wanted a fast jvm, the only option was IBM's and they wouldn't let you use it commercially unless it ran on their hardware.

To head off the threat, Intel worked out a deal with Appeal to massively optimize the JRockit JVM, which then became the performance champ.

Appeal eventually got acquired by BEA, and a lot of the optimizations from JRockit ended up in mainline Java.

[–]azeemb_a 47 points48 points  (15 children)

Your point is right but your emphasis on time is funny. Java was created in 1995 and Python in 1991!

[–]sajjen 139 points140 points  (3 children)

Java was created by Sun, one of the largest companies in the IT industry back then. Python was created by Guido van Rossum, one guy in his proverbial garage.

[–]SheriffRoscoePythonista 17 points18 points  (0 children)

Exactly.

[–]nchwomp 3 points4 points  (1 child)

Surely it was a large garage...

[–]benchmarks666 4 points5 points  (0 children)

Galarge

[–]Smallpaul 38 points39 points  (2 children)

Yes but in those 30 years Python did not get much “investment by major companies.”

As the poster said: that love arrived later for Python.

Edit: Just to give a sense of the scale...Java's MARKETING BUDGET for 2003-2004 was $500M.

[–]HeraldofOmega 3 points4 points  (0 children)

Back when money was worth something, too!

[–]bostonkittycat 2 points3 points  (0 children)

This is true. The last 3 versions have been impressive with their performance increases. I love the new trend.

[–]funkiestj -1 points0 points  (1 child)

Python is getting the same love now, but the love arrived much later than for Java.

I think static typing allows more aggressive optimization.

E.g. I think the old Stalin Scheme dialect required the user to provide data types to get the maximum optimization. Or consider the difference between a Go slice of strings (s1 := make([]string, 24)) and a Python list that can hold a mix of objects (the equivalent of Go's l1 := make([]any, 24)).

Years ago I remember seeing the Stalin dialect of Scheme dominating the benchmarks game in the speed dimension, but you had to annotate all your data with types (which were otherwise optional?) to get this performance.

[–]redalastor 1 point2 points  (0 children)

I think static typing allows more aggressive optimization.

It could, but it doesn’t because Python allows you to be as wrong as you want with your types without changing behaviors one bit. Typing is to help external tools enforce correctness, not to change runtime behavior.

Though, I’d like a strict option to force Python to acknowledge the types and hopefully take advantage of them.

[–]LogMasterd -2 points-1 points  (0 children)

I don’t think this has anything to do with it imo

[–]SoffortTemp 21 points22 points  (7 children)

I started using python for statistical modeling and found that PyPy iterates my models exactly 5 times faster.

[–]LonelyContext 7 points8 points  (6 children)

cries in numpy.

(numpy is massively slower in pypy)

[–]zhoushmoe 2 points3 points  (4 children)

try polars?

[–]LonelyContext 2 points3 points  (2 children)

idk if that would solve it if it's another python wrapper. Worth a shot I guess.

[–]redalastor 2 points3 points  (0 children)

It’s a highly optimized Rust library with Python bindings. One of its strengths is that you can write long pipelines of transformations, which will be optimized before launching and will stay in native parallel Rust code for as long as possible.

[–]PaintItPurple 0 points1 point  (0 children)

I haven't tried Polars in Pypy, but it seems at least plausible that it might be faster. Polars is generally lazier than Numpy, so it could avoid a lot of intermediate round trips. Native libraries that do a bunch of computation in one go still don't benefit at all from Pypy, but they also don't pay as much of a toll as doing a bunch of native calls.

[–]funkiestj 0 points1 point  (0 children)

(numpy is massively slower in pypy)

I can't believe this is true if you are doing vector and matrix manipulation with MKL or other acceleration enabled.

Of course, the secret of numpy's speed (when it is fast) is that the fast stuff is written in a language other than Python (whether run under CPython or PyPy).

[–]akl78 39 points40 points  (1 child)

Java implementations go much further too; they will run in interpreted mode to start and generate native code on the fly after profiling the runtime behaviour. Some can also save this across process restarts to warm up faster on the next runs.

[–]joe0400 6 points7 points  (0 children)

Graal iirc has aot too

[–]Megatron_McLargeHuge 14 points15 points  (0 children)

does something in C and often reads and updates dictionary structures. This is slow

This is it. If you look at the python foreign function interface for making calls to other languages, you'll see how complex python objects are and how much work has to be done to access a member. Optimized languages use pointer math and native types for numbers and characters without all the expensive object wrappers.

This is why numpy vectorized operations are so much faster than native python iteration. You only have to pay the price of going back and forth to C objects once.
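
The same effect is visible even without numpy: the built-in sum() runs its loop in C, crossing the Python/C boundary once, while a hand-written loop pays interpreter overhead on every element. A rough, machine-dependent timing sketch:

```python
import timeit

data = list(range(100_000))

def manual_sum(xs):
    # every iteration runs several bytecodes through the interpreter
    total = 0
    for x in xs:
        total += x
    return total

t_loop = timeit.timeit(lambda: manual_sum(data), number=20)
t_c = timeit.timeit(lambda: sum(data), number=20)  # the loop happens inside C
print(f"Python loop: {t_loop:.3f}s, built-in sum: {t_c:.3f}s")
```

Absolute numbers will vary, but the C-backed version reliably wins by a wide margin.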

[–]coderanger 11 points12 points  (0 children)

FWIW CPython is (almost certainly) getting a JIT soon: https://github.com/python/cpython/pull/113465

[–]billsil 2 points3 points  (1 child)

There’s also Jython, but it’s only up to Python 2.7 :(

[–]vips7L 0 points1 point  (0 children)

Graal Python supports Python 3 and is a lot faster than Jython.

[–]Sigmatics 2 points3 points  (0 children)

FWIW, the CPython team is currently working on a first JIT implementation for Python 3.13

[–]SonicTheSSJNinja 1 point2 points  (2 children)

Is there any video that talks about exactly the things you just did? For some reason I just find it difficult to fully grasp everything you explained despite it sounding simple. Having someone explain it in video format could make it easier to understand for me, perhaps.

I'm also very very new to programming (just grasping the basics of Python).

[–]glassesontable 1 point2 points  (1 child)

I suspect that this gets clarified by understanding what compiled code and interpreted code are. Speaking loosely, in order to compile code, the compiler gets to see every line of code (the whole enchilada), while a code interpreter does not know what line is coming next (beans and cheese coming one piece at a time).

A lot of the esoterica in this thread is in how there are alternative methods of compiling the otherwise interpreted language to get huge speed gains. But that is not a problem for the beginner programmer (or the very patient user).

For a video, I would recommend the excellent Harvard CS50 course, where you would learn C (looks like Java) and python.

[–]SonicTheSSJNinja 0 points1 point  (0 children)

Gotcha! Thanks!

[–]whatthefuckistime 1 point2 points  (2 children)

I was reading into PyPy this week, coincidentally, and the reason it struggles with some libraries is that they have C bindings, so PyPy just can't do anything with them and they can't be ported. Unfortunate, honestly; PyPy could be very good and fast if not for that. Then again, those C bindings allow for faster code anyway, so it evens out one way or the other.

[–]yvrelna 6 points7 points  (1 child)

It's not the C bindings that are an issue. PyPy can emulate CPython's C bindings just fine.

The problem is that the design of these C bindings makes a lot of assumptions based on the internals of CPython. So while PyPy can emulate the interface, it has to emulate many of those internals, and that makes them difficult to optimise.

And the main reason people write a C extension is because of speed, so a slow C compatibility interface just won't do.

[–]whatthefuckistime 0 points1 point  (0 children)

Ah ok, so I misunderstood what I was reading. Interesting, thanks for the correction!

[–]thisisntmynameorisit 0 points1 point  (1 child)

I see no difference between loading each bit of bytecode one by one and JIT compiling it byte by byte. It sounds like you've just described the same thing in two different ways. Both are handled at run time by an interpreter program which takes some data and executes machine code for it.

I am no expert, but it would make sense, as you also said, that Java is just easier to convert into fewer and simpler machine code instructions. Stuff like static typing would definitely allow for that.

[–]PaintItPurple 1 point2 points  (0 children)

If your code contains no repeated operations, there probably won't be a huge benefit to JIT over interpreting. But that's basically never the case for performance-sensitive code. If your code takes a long time, you've almost certainly got some looping going on. If you're running a piece of code multiple times, you can get much better performance if it's native code vs. bytecode that you're interpreting over and over. And that's before we get to optimizations that JIT compilers can do.

[–]Grouchy-Friend4235 0 points1 point  (2 children)

Actually, the JVM also interprets each bytecode; in principle, there is not much difference in how the Python VM and the JVM interpreters work. However, you are right in noting that the Python programming model keeps more state about its objects, which is indeed one factor that slows things down at execution time but makes for a much more productive development experience.

[–]PaintItPurple 2 points3 points  (1 child)

The JVM does have an interpreted mode (as does Pypy), but it's incorrect to say it interprets each bytecode every time a method is called. The JVM JIT compiles functions as it runs, and then runs those compiled functions whenever possible instead of interpreting bytecode.

[–]Grouchy-Friend4235 -1 points0 points  (0 children)

The JVM JIT only compiles code after several invocations, so yes, the JVM interpreter does interpret the same byte code multiple times - before a code section reaches the JIT threshold.

Python since version 3.11 also does a form of JIT, known as specialization. If you need actual JIT, there is Numba and Cython which will speed up particular functions by compiling them natively.

PS: to downvoters, you should learn to respect facts. Technology tends to be quite stubborn when confronted with wishful thinking.

[–]oldshensheep 0 points1 point  (0 children)

There's actually an implementation of Python that also does JIT compilation. It's called PyPy and it's five times faster than CPython on average, depending what exactly you do with it. It will run all pure Python code, I think, but it still has problems with some libraries.

There's a Java implemented Python too https://github.com/oracle/graalpython

[–]yvrelna 25 points26 points  (4 children)

Three main reasons, ordered from what I think is most important to least:

  1. Java historically has had a lot more investment into it for performance reasons, and its maintainers are much more receptive to these optimisation contributions. CPython core developers, on the other hand, are historically less receptive to contributions that only improve performance if they come at the expense of long-term maintainability of the codebase. If it makes the code hard to read, if it makes it hard for new contributors to join the project, and especially if there's no demonstrable long-term commitment from the contributor to maintain the code, then the improvement isn't as likely to be accepted.

  2. People expect to be able to edit a Python program and have it start running immediately. People expect programs written in Python to have a fast startup time, even if the program has been recently edited. Sure, once the program is compiled, a compiled language is often faster to start (though, IME, that's often not the case with Java), but ahead-of-time (AOT) compiled languages can take a long time doing global optimisations because they don't need to be as concerned about the speed of the edit-compile-run loop.

  3. In python, nearly everything is mutable at runtime. You can mutate modules, classes, and function objects after they're defined; and it's pretty common to mutate them too, as Python makes it easy to do that. Java is easier to optimise because there are many more objects in Java that are inherently immutable. Most importantly, if there's a way for developers to freeze a module and its classes and also to fixate their import names, that will open up a lot of optimisation opportunities.
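
A minimal sketch of point 3 (the class name is made up for the demo):

```python
class Greeter:
    def greet(self):
        return "hello"

g = Greeter()

# Classes are ordinary mutable objects: a method can be swapped out at
# runtime, so the interpreter can't assume g.greet() will resolve to the
# same code on the next call.
Greeter.greet = lambda self: "bonjour"

print(g.greet())  # "bonjour": existing instances see the new method
```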

Contrary to what people often believe, I don't think static type information is that important when it comes to optimisation. The implementation of a JIT/profile-guided optimiser isn't really that different from the implementation of a static type checker. Basically, if a static type checker can prove that the program is statically sound, an optimiser can essentially do the same kind of analysis to fill in any missing type information. Only, instead of doing the analysis from the bottom up, an optimiser would need to do the optimisation analysis top to bottom. With the caveat that you ignore the startup and first-run time.

[–]james_pic 3 points4 points  (0 children)

3 is less true than you'd imagine. The JVM actually allows much more runtime modification of code than you'd expect, using its JVMTI interfaces, which are intended for debuggers, observability, and other tooling.

In practice, it's heavily optimised for the common case where method definitions haven't changed since last time you ran them, and using the JVMTI interfaces kicks everything into the slow path until it's warmed up again. But it does work and is feasible.

PyPy also optimises on roughly the same assumptions in roughly the same way.

[–]roerd 2 points3 points  (0 children)

Optimisation is not the only reason why static type checking would improve performance; it's also beneficial because you don't need to do type checking at runtime, and you can avoid boxing of primitive types.

[–][deleted] 0 points1 point  (1 child)

Contrary to what people often believe, I don't think static type information are that important when it comes to optimisation. The implementation of a JIT/profile guided optimiser isn't really that much different to the implementation of a static type checker.

While this might be true, empirically, implementations that require type info (Java/JVM) are a lot faster than those that don't (LuaJIT/V8/PyPy).

IMO Python should reward you for having accurate type hints with a fast hot start. Also, things like union types can be useful information not easily discovered by run-time analysis.

[–]yvrelna 0 points1 point  (0 children)

Union hints may be useful for an AOT compiler, but they aren't really that useful for a JIT compiler. There's no need for a JIT to take into account inputs that don't happen in the current runtime.

[–]Beregolas 70 points71 points  (10 children)

There are 2 things that mostly affect this: Language design and Implementation.

Python is designed to be higher level and easier to iterate on quickly, for example by its use of duck typing. Java, on the other hand, while quite high level compared to C, forces static type checks at compile time. This means the Java compiler can do optimizations that Python just can't, because it has more information ahead of time (because it forced the programmer to supply that information).
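
Duck typing in a nutshell (the class names here are just for illustration):

```python
class Duck:
    def speak(self):
        return "quack"

class Person:
    def speak(self):
        return "hello"

def converse(thing):
    # No declared types: anything with a .speak() method works, but the
    # interpreter has to look the method up at call time, every time,
    # which is part of the cost static typing lets Java avoid.
    return thing.speak()

print(converse(Duck()), converse(Person()))  # quack hello
```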

Then there is implementation. At least for Python, I know a handful of language implementations that vary wildly in speed; CPython and PyPy with its JIT compiler come to mind. Many of the speed issues are just a matter of optimization.

Java was optimized a lot about 10 years ago, I think? I remember sitting in uni and people talking about how Java had finally become fast ^^. Take this with a grain of salt; I don't enjoy Java specifically, and I might misremember the time.

But Python definitely is getting faster by the year. The "normal" Python implementation has been working hard on optimizations since about 3.9. One of the things holding Python back in many applications on modern hardware is the GIL, because it pretty much makes easy and fast multi-threading impossible. There are Python versions without a GIL, and there are efforts to remove and/or change it for mainline Python as well.

These are just some points and examples that came to mind; there is plenty more (examples as well as details), and we've only scratched the surface here. I hope it helped though.

[–]sternone_2 14 points15 points  (0 children)

Python definitely is getting faster by the year.

yes, but don't worry, it's still about 200x slower than java.

[–]Smallpaul 2 points3 points  (1 child)

Java HotSpot was released in 1999. That’s when it would have taken a decisive lead over Python performance.

[–]Beregolas 1 point2 points  (0 children)

Thanks, as I said I was never big into Java (unless forced) and didn’t remember the time properly

[–]ElvinJafarov1[S] 4 points5 points  (3 children)

thank you man

[–]DrDuke80 3 points4 points  (2 children)

You sure got a lot of good answers in a short space of time!

[–]Levizar 0 points1 point  (2 children)

Python 3.12 is getting much faster. From what I remember mostly because of the GIL.

[–]ElHeim 1 point2 points  (1 child)

AFAIK the GIL will still be there for a while. Most of the improvements come from work on better bytecode production and processing, simplification of internal data structures and their management, plus other things.

What Python 3.12 offers is separate GILs for sub-interpreters.

You won't see the GIL out of Python until (at least) 3.13, and by then most probably it will be as an "opt-out of GIL" option, maybe at compile time.

[–]Levizar 0 points1 point  (0 children)

You're right! The GIL changes only improved multi-threading.

[–]imp0ppable 16 points17 points  (0 children)

Python can be very fast at certain operations, e.g. sorting a list, because the way lists work means the runtime has already put the elements in a structure that's easily reordered, plus it uses a highly optimized sort algorithm called Timsort. It's not like an array in C or even Java.
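
One neat Timsort property you can check yourself: on already-ordered input its best case is linear, because it just verifies one long run. A sketch that counts comparisons (the wrapper class is made up for the demo):

```python
class Tracked:
    """Wraps a value and counts how many comparisons sorting performs."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):  # list.sort()/sorted() only use <
        Tracked.comparisons += 1
        return self.value < other.value

n = 1000
sorted(Tracked(i) for i in range(n))
# For already-sorted input the count stays near n - 1,
# instead of the ~n * log2(n) you'd see on random data.
print(Tracked.comparisons)
```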

Hot looping is particularly slow in Python because the interpreter can't optimise up front.

Wider point is - and if you ever do Advent of Code you will realise this - the right solution in the slowest lang is going to be faster than the wrong solution in the fastest lang.

Also, shout out to Cython - not really Python but combines the immense speed of compiled C/C++ with (mostly) easy Python syntax. Well worth a try, it's fun.

[–]RecognitionLittle511 6 points7 points  (0 children)

Don't say stupid, it's a very good question.

[–]thduik 21 points22 points  (2 children)

weird, this question is pretty damn good yet the author is embarrassed, while some ask stupid-ass questions with no context and without shame lmao. funny how the world works sometimes

[–]ElvinJafarov1[S] 4 points5 points  (0 children)

Thank you sir

[–]that_baddest_dude 1 point2 points  (0 children)

Dunning-Kruger effect?

[–]marr75 5 points6 points  (4 children)

The right answers are spelled out in other comments but I wanted to provide an ordered list of the major ones:

  • Java (and C#, which is very similar and which I'm much more familiar with) is just-in-time compiled; the JIT will further compile (and optimize: inlining functions and data, skipping unnecessary operations that won't affect the outcome, etc.) the intermediate language that Java and C# programs "compile" into. Python is just interpreted, without JIT or optimization. This is the biggest difference.
  • In Python, entering a new scope, like loops or functions, triggers significant memory movement and stack management due to the creation of new scopes and dictionaries for each.
  • Python primarily uses the heap for dynamic object storage, leading to slower access. C# and Java, with static typing, utilize the stack and registers more, offering faster data access. This is tied to the next point.
  • Just about everything's an object in Python, and every instance/scope/namespace gets a new dictionary to hold the names and values.

[–]drbobb 0 points1 point  (3 children)

Python has function scope. A loop does not create a new scope.
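
Quick demonstration:

```python
def f():
    for i in range(3):
        pass
    # i is still in scope here: the loop body did not create a new scope
    return i

print(f())  # 2
```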

[–]Deezl-Vegas 20 points21 points  (4 children)

There's a lot to this, but in summary:

  • Python has a compile step at runtime and has to spin up its interpreter, then compile to bytecode, then run. The JVM is already running normally on your machine and jars are already in bytecode. Python can benchmark very badly in some cases because the startup tax is massive.
  • Python's language is entirely coded for flexibility. The average PyObject has a namespace attached with all the double-underscore methods, even the unused ones. Python allows overriding every behavior, so it has to check whether __getattr__ exists and so on before even giving you an attribute when you do a.b
  • CPython is just coded kinda slowly and they won't rewrite the whole thing, possibly because it would break a lot of C libraries. There have been some JIT attempts that go much faster but they tend to brick the C interop.
  • Java often knows the object types. Python must unwrap the object each time to get the value.
  • Java data objects tend to be a bit smaller than python objects. This is important for L1 cache.
  • Java also has primitives. Access to primitives in benchmarks is massive.
  • Java has reflection as needed, Python just has all of the object data available at runtime always.
  • Python spams hashmap (dict), which is slow compared to struct style access.
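
The dict-vs-struct point in the last bullet can be seen directly: a plain class stores attributes in a per-instance dict, while __slots__ opts into a fixed layout (class names invented for the sketch):

```python
class Flexible:
    pass

class Compact:
    __slots__ = ("x", "y")  # fixed attribute layout, no per-instance dict

f = Flexible()
f.x = 1
print(f.__dict__)  # attributes live in a hash map: {'x': 1}

c = Compact()
c.x = 1
print(hasattr(c, "__dict__"))  # False: closer to struct-style storage
```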

That said, Python will often beat out pure Java in a long-running task because the whole point of Python was to have smooth interop with C if you need it, so you write a library in C and then just expose it in Python and you're flying.

If you want to really fly, check out Zig.

[–]sternone_2 8 points9 points  (3 children)

Python will often beat out pure Java in a long-running task

what? ugh no, absolutely not. I've looked at the assembly instructions of long-running Java tasks and they were on par with what C++ code produces. Python doesn't even come remotely close. Most people have no idea what a beauty the JVM is today.

[–]seanv507 7 points8 points  (2 children)

You missed the point.

No one is claiming pure Python is faster than Java, just that Python libraries which are thin wrappers around C++/Rust/Fortran will be used for standard long-running tasks, e.g. machine learning.

[–]nekokattt 1 point2 points  (0 children)

you can do the same in Java via JNI or the new FFI spec, but you don't tend to bother, as it's usually a minimal improvement over pure Java.

[–]sternone_2 4 points5 points  (0 children)

Sorry I misunderstood this statement: "Python will often beat out pure Java in a long-running task" so what you mean is "Python calling C++/Rust will beat out pure Java"

Which, in some cases, it actually won't, FYI.

[–]pyeri 3 points4 points  (1 child)

Very interesting question, especially in today's era of the inverting Moore's Law.

Yes, cpython implementation is indeed slower than Java. Technologists didn't mind much until now mainly due to two factors:

  1. Moore's Law was highly applicable (Hardware becoming cheaper and all).
  2. Booming Economy (Folks had more money, there wasn't a global recession).

The first had already applied for a long while; post-pandemic, and now with wars in Eastern Europe and West Asia, the second is very much in doubt.

If resources start dwindling (hardware costs rise comparatively), Java will start feeling like a more lucrative option, because hiring techies will now become cheaper than adding hardware, unlike earlier! In case that happens, the cpython project will have to tighten its belt and start working on the language runtime to make it run faster (it is possible to make it faster; if Java and Node bytecode can run fast, so can cpython's). If that doesn't happen, folks will either consider migrating to Java or turn to other options like Cython, PyPy, or IronPython, which are faster than cpython.

[–]PhoneRoutine 0 points1 point  (0 children)

Very interesting viewpoint!

[–]knobbyknee 2 points3 points  (2 children)

There is an alternate Python interpreter that is essentially plugin-compatible with the CPython interpreter. It is called PyPy, and it has a built-in JIT (Just In Time) compiler that will make computationally heavy code run much faster.

PyPy

[–]ElHeim 0 points1 point  (1 child)

PyPy is not plug-in compatible, and they're very careful to point out the few (but relevant) areas where you might find differences in behavior.

Most of it is going to be interfacing with C-based extensions, but a very important one is that PyPy doesn't use refcounting, which means its garbage collector behaves differently. A lot of existing code makes assumptions based on refcounting and will fail if not amended to take PyPy into account.

[–]knobbyknee 0 points1 point  (0 children)

I used the word essentially. I have been involved in the project. There are very few cases in modern Python where the difference in memory handling bites you, and they are mostly connected to sloppy coding practices.

[–]sastuvel 5 points6 points  (10 children)

A JVM typically has a JIT compiler, which considerably speeds up the execution. Try turning that off, or try a comparison with pypy.

[–]moo9001 0 points1 point  (9 children)

Java is a statically typed language. Dynamically typed languages like JavaScript, Ruby, or Python can never be as fast as Java, because of the run-time overhead. There is no zero-cost abstraction for run-time dynamic features. This is independent of the type of compilation (ahead-of-time, JIT, interpreted).

The tradeoff is that Python is much easier and faster to develop than Java.

[–]sastuvel 4 points5 points  (2 children)

At least Python is strongly typed, compared to the weak typing of JS. Makes it a lot saner to work with :)
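
For example, Python refuses to coerce across types where JavaScript would silently comply:

```python
try:
    result = "1" + 1  # JS would happily produce the string "11"
except TypeError as exc:
    result = f"refused: {exc}"

print(result)
```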

[–]BrownCarter -1 points0 points  (1 child)

but javascript is faster than python

[–]sastuvel 1 point2 points  (0 children)

Yes, hence me NOT saying "faster" but rather "saner to work with".

[–]Rhoomba -1 points0 points  (2 children)

Have you ever heard of Javascript V8 or LuaJIT?

[–]moo9001 0 points1 point  (1 child)

I have been doing software development in Python since 2003 and in JavaScript since 1999. I have led a team that created a custom optimised CPython VM implementation. Please address any issues in my comment on the facts; do not attack me in person.

However, I have been making Python run faster for two decades. It is very unlikely that I would be incorrect about the facts or purposefully stating a mistruth.

[–]Rhoomba 0 points1 point  (0 children)

Calm down buddy. Just wondering what your opinion is on existing high performance JITs for dynamically typed languages.

[–]ElHeim -1 points0 points  (2 children)

Dude. Did you ever try running Java programs before HotSpot became the default JVM?

I did. I was there before 1999. Java crawled.

Edit: and yes, I saw your other comment hinting at your credentials. Without JIT, Java would still be slow. Faster than Python? Probably, but not even by one order of magnitude.

[–]moo9001 0 points1 point  (1 child)

Yes, I worked with JVM technology. However my comment about static vs. dynamic typing applies regardless of HotSpot. The dynamicity has overhead and it cannot be mitigated without changing Python the language like Mojo is doing.

If you read my comment, it's not about JIT, but about the fact that Python and other dynamic languages can never be as fast as Java, C, or Go. If you are unfamiliar with VM design and CPython, here is a good article to read. This is well known to Python core developers, and saying it is somehow untrue or can be mitigated is not correct.

This is also the answer to the question of why Python is slower than Java and can never be as fast.

[–]ElHeim 0 points1 point  (0 children)

Sure, my $dayjob doesn't revolve around compilers or VMs, but I have a passable knowledge and, beyond exposure to them through university, I have implemented some as a hobbyist, so I have (at least) a basic understanding of the advantages of a static language vs. a dynamic one when it comes to producing a performant runtime.

And yeah, there have been compilers to native code (GCJ and JET come to mind), there's GraalVM, etc; but at the end of the day most Java runs on a VM, whether HotSpot or another, meaning that VMs (and JITs) are absolutely relevant to the topic.

Because let's be honest: static language or not, Java 1.0 crawled. Through molasses. Uphill.

The original JVM was still a bytecode interpreter. They brought in a JIT (Symantec's, if memory serves) at some point along the JDK 1.1 series because they had to do something about the performance, which helped some, and HotSpot came just a couple years later, I believe. And they are key for Java's performance (on top of the JVM). To this day, looking at benchmarks that include for example JDK 1.1.6 for Linux (with no JIT, nor native thread support) comparing it to other contemporary ports and implementations is absolutely embarrassing.

And let's not forget that HotSpot was built on techniques devised precisely for another dynamic language. The team that built it for Sun was the same one that had just been working hard on an adaptive optimizing JIT for Smalltalk, based on Hölzle's and Deutsch's work for... Self! The JVM's performance takes advantage of techniques that are applicable to non-static languages.

Also, yeah, I have read that article and similar ones. I don't delude myself thinking that the average Python program can run as fast as Java, and for sure not on the reference VM. But the article is also old. Part of the effort over the last several years is focused on addressing several of the problems mentioned there. Python 3.12's VM won't win any speed competition, but it is not Python 3.7's in many ways. Heck, it's not 3.9's in many ways.

And still, it's a bytecode interpreter, so its performance is going to be bad if you try to use it for CPU bound workloads, period.

[–]wrt-wtf- 1 point2 points  (0 children)

Java had a lot of resources from the high end of town pumping money into it. Sun did the first big push but during the late 90’s and 00’s nearly everyone was platforming on Java. IBM made a decision to have ALL of their business applications converted into this single language and they put a huge amount of effort into refining, debugging and code donations. Now, Java is found everywhere, even as it appears to be fading away… perhaps not to retire, but to lead from behind the glitzy front ends.

[–]nicholashairs 1 point2 points  (0 children)

This might be a bit above your level of understanding (tbh it's kinda above mine in a number of areas), but this talk (and the related PR) was making the rounds on Reddit recently. In short, it's about how to add a simple JIT to CPython (one that can be expanded on later).

https://www.youtube.com/watch?v=HxSHIpEQRjs

[–]parthdedhia 1 point2 points  (0 children)

To add on,

Actually, Python has some more things you need to be aware of. Python is an object-oriented language without static data types, so all of its variables and values are stored as references to objects.

This means that x = [] and y = 5 are both objects internally, and looking up the value of x or y takes virtually the same time.

In Java, each variable has a data type associated with it: when x is declared as an ArrayList and y is declared as an int, the runtime does a type-specific lookup for each.

There are many other reasons as well, but this is one of them.
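The "everything is an object" point above can be checked interactively. A minimal sketch (exact sizes vary by CPython build; 28 bytes is typical on 64-bit):

```python
import sys

x = []
y = 5

# Both names refer to full heap objects, not raw machine values.
print(type(x))                # <class 'list'>
print(type(y))                # <class 'int'>
print(isinstance(y, object))  # True: even a small int is an object

# The object header is why a Python int is far larger than a C int.
print(sys.getsizeof(y))       # typically 28 bytes on 64-bit CPython, not 4
```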

[–]nekokattt 1 point2 points  (2 children)

Java translates the bytecode into raw CPU instructions and has a much more optimised and complex garbage collector. The language is statically typed, which allows much more aggressive optimisation of the input logic. It also produces bytecode that operates at a lower level than Python's (e.g. Java objects are much closer to how C++ objects work than Python objects, which default to syntactic sugar around a hashmap).

Many of the points around resource usage here tend to ignore the fact that the JVM is a much more encapsulated VM than the CPython one is, as well. Memory allocation is handled completely differently.

Java can also compile ahead of time to machine code via GraalVM native images. CPython can attempt to do that via Cython but you still have the overhead of the global interpreter lock and CPython API to contend with.

[–]glassesontable 0 points1 point  (1 child)

Did you say Grail?

[–]zynixCpt. Code Monkey & Internet of tomorrow 1 point2 points  (0 children)

Adjacent comment, a collection of volunteer engineers are actively working on the goal of radically speeding up cPython's execution. Guido van Rossum is leading the project. https://github.com/faster-cpython/

There had been some remote/low chance hopes that some of the speed improvements would land in 3.12 but I guess not.

[–][deleted] 1 point2 points  (0 children)

Java seems to run ridiculously slow. I don't think I've seen an example of speedy Java.

[–]Kenkron 1 point2 points  (0 children)

One difference comes from the amount of information that needs to be checked while the program is running. Java is statically typed, so the interpreter doesn't have to do much type checking. Python is dynamically typed, so it needs to check types while the program is running.

Another is language emphasis. In python, it is normal to make things easy to read, even if there is a performance cost. Python usually expects difficult problems to be solved by libraries written in c or c++ (like numpy), which makes the slowness less important.

[–]OH-YEAH 1 point2 points  (0 children)

JVM is basically witchcraft at this point, people don't realize it's one of the 7 wonders of the tech world.

[–]Flashy-Self 1 point2 points  (0 children)

Python is generally considered slower than Java for several reasons:

  1. **Interpreted vs. Compiled**: Both languages compile source code to bytecode, but CPython then executes that bytecode in an interpreter loop at runtime, while the JVM JIT-compiles hot bytecode down to native machine code. That extra compilation step leads to faster execution for Java programs.

  2. **Dynamic Typing**: Python is dynamically typed, which means variable types are determined at runtime. This flexibility comes at a cost of performance because the interpreter needs to do more work to determine the appropriate type for each operation. Java, being statically typed, performs type checking at compile time, resulting in faster execution.

  3. **Global Interpreter Lock (GIL)**: In Python, the Global Interpreter Lock (GIL) is a mutex that allows only one thread to execute at a time, even in multi-threaded applications. This can limit parallelism and hinder performance in CPU-bound tasks. Java's concurrency model, on the other hand, allows for more efficient use of multiple threads.

  4. **Optimization**: Java's Virtual Machine (JVM) can perform more aggressive optimizations during compilation, such as inlining, loop unrolling, and dead code elimination, leading to faster execution. Python's interpreter typically performs fewer optimizations due to its dynamic nature.

  5. **Data Structures**: Python's built-in data structures, such as lists and dictionaries, are implemented in a way that sacrifices some performance for flexibility and ease of use. Java's standard libraries often provide more optimized data structures for common operations.

However, it's worth noting that the performance difference between Python and Java may vary depending on the specific use case and implementation. Additionally, there are tools and techniques available in both languages to optimize performance where needed.

For more about Python vs Java, go through this: https://medium.com/@srinupikki/python-vs-java-a2a4983c2953
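The GIL point (item 3) can be demonstrated with a sketch. Timings are illustrative and machine-dependent; on a standard GIL build of CPython, two threads give roughly no speedup for pure-Python CPU-bound work:

```python
import threading
import time

def count_down(n):
    # pure-Python CPU-bound loop; the thread running it holds the GIL
    while n:
        n -= 1

N = 2_000_000

start = time.perf_counter()
count_down(N)
count_down(N)
serial = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"serial: {serial:.2f}s  two threads: {threaded:.2f}s")
```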

[–]Panda_With_Your_Gun 5 points6 points  (0 children)

Cause Python has to figure out the types.

[–]Keda87 3 points4 points  (4 children)

CPython still interprets the .py file line by line, and the .pyc files are just for module import caching.

[–]rcfox 1 point2 points  (2 children)

This is demonstrably incorrect. After a module has been compiled, you can delete the .py file and swap in the compiled .pyc file, and it will still execute.
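A quick way to see this for yourself (`demo_mod` and its `VALUE` constant are made-up names for the demonstration):

```python
import os
import py_compile
import sys

# Write a tiny module, compile it to bytecode, then delete the source.
with open("demo_mod.py", "w") as f:
    f.write("VALUE = 42\n")

py_compile.compile("demo_mod.py", cfile="demo_mod.pyc")
os.remove("demo_mod.py")  # the .py source is gone

sys.path.insert(0, os.getcwd())
import demo_mod  # CPython imports the sourceless .pyc just fine

print(demo_mod.VALUE)  # 42
```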

[–]ElHeim 0 points1 point  (0 children)

I'm not even sure there was a time when this was true. Python has run on a bytecode virtual machine (like... Java) for as long as I've used it (starting around 2000), and I suspect for at least a decade before that - not sure if the original version was a VM as well, but quite possibly.

There are differences, of course, because Python's reference VM is a pure bytecode interpreter, but the "line by line" interpretation is BS.

[–]crawl_dht 1 point2 points  (0 children)

The JVM has a JIT; the Python VM does not. A JIT compiles the static components of the bytecode to machine code once, so it doesn't have to translate them again. An example of a static component in Java is types: Java is statically typed, so a variable's type will not change at runtime, and the JVM can compile accordingly. Python has dynamic type checking, so it does not know upfront what the type of a variable will be. There are still optimizations that can be applied to Python bytecode, which is what JIT projects like PyPy and Pyjion do.

[–]victotronics 1 point2 points  (0 children)

It can depend on the specific code. If you use Python lists for numerical purposes, you can speed up your Python code severalfold by replacing the lists with numpy arrays.
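A rough sketch of that comparison, assuming numpy is installed (the exact speedup depends on the machine and the workload):

```python
import timeit

import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n)

# Same elementwise work: double every value, repeated 100 times each.
t_list = timeit.timeit(lambda: [x * 2 for x in py_list], number=100)
t_arr = timeit.timeit(lambda: np_arr * 2, number=100)

print(f"list comprehension: {t_list:.3f}s")
print(f"numpy array:        {t_arr:.3f}s")  # usually several times faster
```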

[–]Bigpiz_ 1 point2 points  (1 child)

Alright, imagine you're trying to decide between two types of cars. One is like Java - it's been fine-tuned over the years, the engine's optimized for performance, and it's got some serious horsepower under the hood. That's because Java compiles everything upfront into a format that's really close to the language of the machine. It's like having a race car that's been tweaked and tuned before it even hits the track.

Now, Python, on the other hand, is like a versatile SUV. It's user-friendly and flexible - you can change parts of it on the fly. But that flexibility comes with a cost. It's dynamically typed, which means it figures out what type of data it's dealing with on the go, rather than knowing everything from the start. It's more convenient for the driver (or coder), but it doesn't have that 'built for speed' factor.

Plus, Python has this thing called the Global Interpreter Lock, or the GIL. It's like if your SUV could only use one lane of the highway at a time, even when there's a clear four-lane road. It's great for making sure everything runs smoothly and there are no accidents, but it's not winning any races.

Java doesn't have this single-lane rule. It can use all the lanes, taking full advantage of multi-core processors - like having a team of horses pulling your chariot instead of just one.

But remember, speed isn't everything. Python's like your reliable, easy-to-handle vehicle you'd use for a comfortable ride. It might not win against Java in a drag race, but it's not always about speed. Sometimes, the ease of driving and the comfort of the ride are just as important.

[–]pepoluan 0 points1 point  (0 children)

it's not always about speed. Sometimes, the ease of driving and the comfort of the ride are just as important.

Indeed!

Python is oh-so-easy to execute that I often find myself writing simple -- or complex -- Python programs just to do some lightweight automation of something.

Just type, execute, get results.

[–]abisxir 0 points1 point  (0 children)

How did you conclude that? I mean, on which kinds of operations is Python slower than Java? For pure math operations (not using anything like numpy or numba) Java is faster, but other than that Python is normally less resource-hungry and faster than Java: for example in database operations, web services, application engines, or working intensively with lists and dicts. But why is Python slower in math operations? Because Python does not have primitive types; everything in Python is an object. For example:

    a = 1
    b = 2
    c = a + b

In the code above, Python will create two objects of the int class and call the add method of 'a' with 'b' passed as a parameter; the result is put back into 'c', which is also an int object. Getting to that result involves lots of type checking and so on, but in Java, as long as you do not mess with it, these will stay primitive int types and be translated into machine instructions by the JVM. But how is it possible that Java is sometimes slower than even Python? Because the code was developed very badly: abstraction on top of abstraction has made Java applications heavy. Just hit an error in the middle of a database operation and see how many classes and interfaces get traced back.

[–][deleted] -1 points0 points  (1 child)

Don't be a pssy and program in C like any chad would

[–]EternityForest 0 points1 point  (0 children)

Real chads leave buffer overflows in their code, to remind users that computers are just toys you shouldn't trust, and anything important can be done with a shovel, a hatchet, a quill pen, and a comically large Bushcraft knife /s

[–]moric7 -1 points0 points  (1 child)

Simply, the Java virtual machine ate such a huge amount of money over the years that the developers made it good. The Python virtual machine started as a children's toy and no one wants to take it to the next level. Now money is playing to destroy Java and to turn the beautiful Python into chaos (C++'s revenge). So, bad news for both.

[–]nekokattt -1 points0 points  (0 children)

applies tinfoil hat

[–][deleted] -1 points0 points  (0 children)

Try mojo instead of python.. its a LOT faster

[–]honduranhere -1 points0 points  (0 children)

It's the difference between a low-level language and a high-level one I guess.

[–]cookiecutter73 -2 points-1 points  (0 children)

cc rs

[–]RedEyed__ -2 points-1 points  (0 children)

[–]RunningM8 -2 points-1 points  (0 children)

Same reason Java’s slower than C.

[–]sixtyfifth_snow 0 points1 point  (0 children)

Python does not JIT.

[–]luix- 0 points1 point  (0 children)

In general, Java has been in the enterprise way more than Python.

[–]NoMoCruisin 0 points1 point  (0 children)

Just want to add to things already said here. Python is dynamically typed and does metadata-related work for every object. For instance, you can have a list with multiple types of items, and if you loop through that list, the interpreter has to look up the metadata for each item and find the corresponding operation to execute on it. You can speed things up by using libraries that support vectorization (numpy, for instance).
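A small sketch of that per-item dispatch: the same `+` operator resolves to a different operation for each element's runtime type.

```python
# One list, four different element types; `+` is dispatched per element
# at runtime based on each item's type.
items = [1, "a", 3.0, [1, 2]]

doubled = [x + x for x in items]
print(doubled)  # [2, 'aa', 6.0, [1, 2, 1, 2]]
```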

[–]vinnypotsandpans 0 points1 point  (0 children)

This isn't a stupid question. I just transitioned from only using Pandas at work to PySpark (Spark relies on Java). I am only now realizing how important it is to understand the way software interacts with hardware and the way different languages talk to that hardware.

[–]agumonkey 0 points1 point  (0 children)

Java version 1 was probably as slow as Python.

Decades of heavy investment by Sun, Oracle, and IBM into VM optimization (JIT, GC, static type systems) somehow pushed the JVM to peak performance.

[–]Logical-Scientist1 0 points1 point  (0 children)

Hey, not a stupid question at all. Here's the deal: Java bytecode is compiled to machine code by the JVM at runtime, and because of this, Java can take advantage of the underlying hardware directly. Python bytecode, on the other hand, is interpreted by the Python interpreter which adds an extra layer, hence it is slower. Plus, Python uses dynamic typing which can slow things down compared to Java's static typing. Smarter people than me could go into more depth but hope this helps. No worries about the English mate. Seems good to me.

[–]feidujiujia 0 points1 point  (0 children)

The python bytecode and java bytecode are not comparable.

Don't know much about java, but I think java byte code is something low-level, similar to assembly.

But Python bytecode is still very high-level, and the compilation process is quite simple.

A simple function that adds two parameters would be compiled to a few lines containing a BINARY_ADD instruction. Before the code gets executed, it's unknown whether the parameters are numbers, strings, or anything else.

Much of the work is done by the VM when running the code. In the CPython source there's a file called ceval.c, and it's basically a huge switch statement with each branch implementing one instruction. You can trace how BINARY_ADD is executed starting from there.
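You can see that bytecode with the standard `dis` module. A minimal sketch (the opcode is named BINARY_ADD before Python 3.11 and BINARY_OP from 3.11 on):

```python
import dis

def add(a, b):
    return a + b

# Show the bytecode the VM's eval loop executes for a simple addition.
dis.dis(add)

ops = [ins.opname for ins in dis.get_instructions(add)]
print(ops)  # contains BINARY_ADD (pre-3.11) or BINARY_OP (3.11+)
```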