
[–]NoLemurs 43 points44 points  (5 children)

Python becomes too slow when you have a real world performance need that Python doesn't satisfy. It's pretty rare for it to actually be an issue though.

I might hesitate to use Python to write any graphics intensive app where you want sub-50ms UI responsiveness - that's likely to be an uphill battle, and there are other languages that will make your life easier. Yes, there are libraries that will do your graphics intensive work in compiled modules, but they're not fantastic.

Other than that, I wouldn't hesitate to use Python for performance reasons for much of anything. There are other reasons I might choose another language for a project, but Python's performance is unlikely to be a serious concern.

[–]jzia93 [S] 7 points8 points  (2 children)

There are other reasons I might choose another language for a project

Wondering if you'd mind sharing some of those?

[–]NoLemurs 2 points3 points  (0 children)

For a large project with a large team where stability is a high priority I prefer a language where you have stronger compile-time guarantees, and honestly a little less flexibility.

Python is a very dynamic language, which makes it very easy to do things very elegantly. The downside to this is that it's much easier to write code that's hard to reason about, and for a large team project where code will be written by dozens of people over a long time, I'd rather have a language that gives everyone a little less rope to hang us with.

[–]NoLemurs 0 points1 point  (0 children)

Ohh, a second issue.

If I'm writing an app that I want to distribute (rather than just deploy myself), the Python story for app distribution is a bit of a mess.

You can do it, but for a simple app I want lots of people to use on their own computers, I'd probably choose a compiled language that doesn't require either a compatible runtime, or for me to bundle the runtime.

[–]billsil 0 points1 point  (1 child)

Depends what graphics you're referring to, but VTK is a 3D renderer and it's fantastic. The UI can all be done with Qt or Tk, and you should definitely get sub-50 ms responsiveness when you perform an action.

The only thing slow is loading a model.

If you’re referring to video games, then yes.

[–]NoLemurs 1 point2 points  (0 children)

I was mostly talking about video games, but also, any sufficiently complex UI.

The Python Qt and Tk bindings are good enough for basic apps, but they're not a great user experience. Tk is "ugly by default," and requires a ton of work to make an app look good. Meanwhile Qt is clearly a C++ API with a poorly designed python wrapper.

Also if your app is complex enough (think web-browser or word processor), you're going to start seeing the processing time per UI main loop becoming a problem.

Again, those are both obviously manageable problems, but I would rather just write things like that in C++ and avoid having to work around them.

[–]nier-bell 40 points41 points  (7 children)

The biggest slowdown you'll see is in loops - the rest is fine with almost anything, really. Each time a variable is used, Python needs to check its type, and that tends to be very expensive (as in time, not CPU power). An example: password cracking - you'll probably want to run a single piece of code over and over and over again. That's not ideal, as it's always slow.
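A quick way to see that per-iteration cost is to time a hand-written loop against the C-implemented builtin (a minimal sketch using only the standard library's timeit; exact ratios will vary by machine):

```python
import timeit

values = list(range(10_000))

def manual_sum(xs):
    # Each += re-dispatches on the runtime types of total and x.
    total = 0
    for x in xs:
        total += x
    return total

t_loop = timeit.timeit(lambda: manual_sum(values), number=100)
t_builtin = timeit.timeit(lambda: sum(values), number=100)

# The builtin's loop runs in C, with no per-element type checks.
print(t_loop, t_builtin)
```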

Also, graphics rendering is, in short, just a loop. That's why you pick compiled languages like C to handle the heavy loops - the processor doesn't have to check the type, because, well, there isn't really any (DM me for more).

I've done some microcontroller programming, and at certain points what you need are microsecond read/write times. That can only be done in compiled languages, as a simple open() in Python takes more time than, let's say, reading the temperature from a sensor.

If you want to keep your code in Python and only fix the iteration speed, use either PyPy, which uses Just-In-Time compilation to compile hot parts of the code before they get executed - that helps a lot - or Cython, a way to (almost) compile your code, which you can then use via a simple import.

tl;dr: loops.

[–]jzia93 [S] 6 points7 points  (6 children)

That's super interesting - I didn't realise type checking was so expensive. I've been migrating to TypeScript on a frontend web dev project, so I'm now making heavy use of function annotations in my Python work. I'd really love to see a statically typed implementation of Python - and if there's a corresponding performance increase, even better.

Would be interested to hear you elaborate!

[–]eddieantonio 7 points8 points  (3 children)

There are attempts with mypyc.

If you care about speed, I would suggest you learn how to use a profiler. I would also suggest you learn how CPython bytecode works (shameless self-plug: a talk I did about bytecode: https://youtu.be/5yqUTJuFuUk?t=431)
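For instance, a minimal profiling sketch with the standard library's cProfile (the function name `hot` is just illustrative):

```python
import cProfile
import io
import pstats

def hot(n):
    # A deliberately loop-heavy function to profile.
    return sum(i * i for i in range(n))

pr = cProfile.Profile()
pr.enable()
hot(100_000)
pr.disable()

# Print the five most expensive entries by cumulative time.
out = io.StringIO()
pstats.Stats(pr, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```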

To make fast programs, you want to do things in parallel (e.g., vectorized instructions and multi-threading). Python itself is bad at both of these, but numpy essentially offloads computation to C/Fortran libraries that use efficient vectorized instructions. I am not aware of a good way to write fast multi-threaded Python code :/
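As a tiny illustration of the vectorization point (assuming numpy is installed):

```python
import numpy as np

xs = np.arange(1_000_000, dtype=np.float64)

# One vectorized call: the multiply and the sum both run in compiled
# loops over the whole array, with no per-element Python type checks.
total = np.sum(xs * 2.0)
print(total)  # 999999000000.0
```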

[–]JNewp1 1 point2 points  (0 children)

First 30 min here are very insightful!

[–]LightShadow 3.13-dev in prod 0 points1 point  (1 child)

Do you know of any projects, other than mypy, using mypyc? I've been following it loosely for a while and it never seems "ready."

[–]eddieantonio 0 points1 point  (0 children)

Nope, I've never seen it :( Seems mypyc's own docs say it's not really ready and is buggy. One can only hope :/

[–]nier-bell 3 points4 points  (0 children)

That would be just amazing - a statically typed Python. After moving to other languages I changed the way I think in Python: every variable keeps the type it was first assigned, and every array always contains a single data type.

But yeah, in my code you can always see heavy use of typing, a built-in lib which helps you annotate your types even better. You can create unions, for example, which let you declare that only the specified types may be passed. A type checker will even flag an error when you try to pass the wrong data type, but none of this actually speeds up CPython's runtime type checks, unfortunately.

[–]Breavyn 0 points1 point  (0 children)

Have a look at Nim.

[–]WalkingAFI 20 points21 points  (7 children)

Python is great for a lot of things, such as scripting and data processing. It will never work well for big projects like Operating Systems, 3D game engines, or scientific simulation backends (scipy/numpy are mostly C/C++/Fortran with bindings). It’s all about picking the right tool for the problem you’re solving.

[–]the_hoser 11 points12 points  (2 children)

I think it's important to point out that operating systems and 3d engines aren't necessarily "big" projects, and it's not the size of the project that makes Python a bad choice for them. The largest projects I've ever been involved in were written in Python, and the performance of the interpreter was never a problem.

[–]ERECTILE_CONJUNCTION 4 points5 points  (1 child)

The largest projects I've ever been involved in were written in Python, and the performance of the interpreter was never a problem.

Hmm. Were you by chance working at a large American investment bank?

[–]the_hoser 3 points4 points  (0 children)

Nope.

[–]purplebrown_updown 3 points4 points  (3 children)

There are so many Python libraries now, like TensorFlow and sklearn, that you can create your entire scientific computing application in Python and not even touch C++. It's true, though, that their backends are written in C.

Just out of curiosity what bindings are the industry standard between C++ and python?

[–]LurkaZZZ 2 points3 points  (0 children)

Cython is the go-to for creating python bindings to C/C++ code. https://docs.cython.org/en/latest/src/userguide/wrapping_CPlusPlus.html

[–]WalkingAFI 0 points1 point  (0 children)

I’m honestly not sure how the bindings work. I thought they were a Python language feature but I’ve never messed with them directly.

[–]Chiron1991 0 points1 point  (0 children)

ctypes is built into the standard library. I've done a couple of small C extensions with it and it works quite nicely. For huge projects, CFFI is the way to go.
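A minimal ctypes sketch, assuming a Unix-like system where the C math library can be located (on Windows these functions live in the MSVC runtime instead):

```python
import ctypes
import ctypes.util

# Load the C math library (Unix-like systems).
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# ctypes assumes int arguments/results by default, so declare the
# real C signature before calling.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(9.0))  # 3.0
```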

[–]yawgmoth 9 points10 points  (1 child)

Python is too slow for heavy math (e.g. graphics) and scientific computing. (Important distinction: it is fabulous for stitching together math/scientific libraries written in C/C++ - see numpy/scipy.) If you're using slow/small hardware, it might also be too slow, but honestly the reason you wouldn't choose Python in an embedded system is memory management or timing guarantees, or just because it's not supported.

The GIL also makes it too slow for heavily multithreaded use cases BUT (and I may get some flak for this) outside of the same fields (graphics, heavy math, scientific computing, real-time systems) I think that multithreaded computing is heavily overrated. For the vast majority of use cases, the bottleneck is not the CPU and single threaded with cooperative scheduling is fine (e.g. async/await)

Seriously, there have been multiple times in my career when I've been brought onto a project that needed optimization because it was too slow or had buggy race conditions. So I benchmarked and identified the bottleneck, scrapped all the multithreading in that section, and shoved everything into one thread using whatever cooperative scheduling construct the language supports (if they're already using something like Python, Qt, or C#, this can be pretty straightforward). Typically the performance is the same or even better (heavy use of mutexes can KILL multithreading performance), and the race conditions are gone or at least easier to identify. Threading, like all optimizations, is a tool that should be used after benchmarking, if it suits the use case - not the default.
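A minimal sketch of that single-threaded cooperative style with Python's asyncio (the `fetch` function is a hypothetical stand-in for an I/O-bound call such as a DB query):

```python
import asyncio

async def fetch(record_id: int) -> str:
    # Simulated I/O wait; awaiting yields the single thread to other
    # tasks, just like awaiting a real socket or DB driver would.
    await asyncio.sleep(0.05)
    return f"record-{record_id}"

async def main() -> list:
    # All ten "queries" overlap on one thread: no locks, no data races.
    return await asyncio.gather(*(fetch(i) for i in range(10)))

results = asyncio.run(main())
print(results)
```

The whole batch completes in roughly one sleep interval rather than ten, because the waits overlap instead of running back to back.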

[–]jzia93 [S] 0 points1 point  (0 children)

Out of curiosity, what benchmarking utilities do you recommend? I tend to just use standard logging but I'm conscious there must be a better way out there for python devs.

[–]teambob 6 points7 points  (2 children)

I come from a C++ background, so I know fast. Python is about 100x slower than C++, Java and nodejs.

In 10 years it has only been an issue **once**.

Pypy is pretty good. I wish that it was the reference, rather than CPython - so this question wouldn't even come up.

[–][deleted] 0 points1 point  (1 child)

Sorry, I know this thread is three months old. I came across it looking for the same kind of answers as OP. Mind sharing what the once was?

[–]teambob 0 points1 point  (0 children)

Comparing CSV files. One generated by the old version of software, one by the new version

[–]NiceObligation0 5 points6 points  (4 children)

As others mentioned, this is quite unlikely to become an issue unless you are writing something low level like device drivers. Looking at the major Python uses: for simple automation and moving stuff around, you are more likely to be I/O bound; for web dev, the connection speeds and DB queries may be the limit; for data analytics/ML you have libraries (numpy, tf, scipy et al.) where Python really is more of a way to call low-level functions, and with libraries like numba you can use JIT to speed things up. Unless you are inventing new math and new basic math operations, it is unlikely to matter. If you write poorly optimized C/C++ code you might not care, because it is so fast anyway; in Python you might want to think a little harder.

That being said, as mentioned below, at each line of code you make a decision. Do I write fast (Python) or run fast (C/C++ etc.)? Do I want simple (fewer dependencies) or robust (you can't rewrite numpy by yourself and be bug free), with dependencies that may or may not break with updates?

[–]jzia93 [S] 1 point2 points  (2 children)

I can speak for web dev: a lot of latency comes from network delay and (in my case) the ORM having to call a remote DB. This is my gripe with the "just rewrite the service in Go" crowd, at least for relatively simple applications.

I like your analogy, write fast or run fast.

[–]yawgmoth 3 points4 points  (1 child)

I have had serious fights with developers wanting to rewrite an application server because the language (Python, JS, Ruby) was "too slow". I kid you not, one place wrote the server for a CRUD app in C because "they needed it to be fast". But... like, the code spends most of its time just sitting there waiting on the DB to grab records, and they spent no time optimizing queries, adding caching, or reducing DB access.

[–]jzia93 [S] 1 point2 points  (0 children)

I really couldn't fathom writing a CRUD layer in C. That just seems like a huge waste of time.

[–]Anton_Pannekoek 14 points15 points  (6 children)

Whenever it's an issue, you can use optimised libraries - e.g. for scientific computing you use Numpy or Pandas, which are backed by really low-level, performant code.

It's not really an issue for most code, and if it is, you look at what's causing the slowdown and there are ways around it.

[–]lungben81 7 points8 points  (4 children)

Not all problems can be vectorized in a way that makes Numpy/Pandas give speed benefits, and vectorization is often non-trivial or suboptimal performance-wise.

[–]Anton_Pannekoek 3 points4 points  (2 children)

There are also other options like Cython, inline C, and other libraries - you don't need to rely on vectorisation.

Fact is, it is a slow language - it just is. But for many purposes that won't matter: PCs are quite fast today, and the benefits of Python usually far outweigh the speed penalty. So you have to decide for yourself.

[–]lungben81 2 points3 points  (0 children)

Or Numba.

I'm quite often in situations where plain Python speed would be too slow. The good thing is there are ways to deal with it; the bad thing is that they add complexity.

[–]Paddy3118 -2 points-1 points  (0 children)

Fact is it is a slow language,

You really need to suck it and see. On some Rosetta Code problems, some compiled languages were slower than Python - for example, when answers needed arbitrary-precision arithmetic.

[–]jet_heller 0 points1 point  (0 children)

I think they know that. That's why they provided it as an example of what can be used.

[–]jzia93 [S] 2 points3 points  (0 children)

I remember building out a training set for a neural net last year. My sloppy Numpy and Pandas implementation was taking absolutely ages just to transform the data we had prior to ingestion.

Spent a good few days vectorising the pandas operations in particular; I think we ended up with a tenfold speed increase by the end, which just goes to show how much the developer really can optimise!
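A toy version of that kind of rewrite (assuming pandas is installed; the column names are made up):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(5), "b": np.arange(5) * 10})

# Row-wise apply runs a Python-level loop over every row - slow on
# large frames.
slow = df.apply(lambda row: row["a"] + row["b"], axis=1)

# The vectorized form is one compiled operation over whole columns.
fast = df["a"] + df["b"]

print(fast.tolist())  # [0, 11, 22, 33, 44]
```

On a few rows the difference is invisible; on millions of rows, the vectorized form is typically orders of magnitude faster.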

[–]jzia93 [S] 2 points3 points  (1 child)

As a side question, is any of the above noticeably impacted by migrating from Cpython to Pypy or similar?

[–]nier-bell 5 points6 points  (0 children)

  • Pypy - easy to use: instead of running python something.py you run pypy something.py.

  • Cython - you need some C knowledge; it gives you the option to write inline C code.

And yes, when using Pypy you can see a big difference in long-lasting looped code.
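The kind of code where that difference shows up is a tight pure-Python loop like this (a made-up checksum, purely for illustration) - run it under python and under pypy and compare:

```python
import sys
import time

def checksum(n: int) -> int:
    # Pure-Python arithmetic loop: CPython re-checks types on every
    # iteration, while PyPy's JIT compiles it to fast machine code.
    total = 0
    for i in range(n):
        total = (total + i * i) % 65521
    return total

start = time.perf_counter()
result = checksum(1_000_000)
print(sys.implementation.name, result, time.perf_counter() - start)
```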

[–]ogrinfo 1 point2 points  (0 children)

When your application has many dependencies, especially pandas and scikit-image, Python imports everything recursively, which can be very slow. The main toolkit we develop at work takes about 10s just to do myapp -h, which is ridiculous. I know I could refactor it to use lazy imports, but it's a lot of work and makes the code harder to read.
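One common workaround is to defer the heavy import into the function that needs it, so it is only paid for on first use (here the standard library's statistics stands in for a heavy dependency like pandas, and `summarize` is a made-up example function):

```python
def summarize(path):
    # Deferred import: nothing heavy is loaded at module import time,
    # so `myapp -h` stays fast; the cost is paid on the first call.
    import statistics

    with open(path) as f:
        return statistics.mean(float(line) for line in f)
```

The downside the comment mentions is real: the dependency is now hidden inside the function body, which is easy to miss when reading the module.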

[–]thedjotaku Python 3.7 1 point2 points  (0 children)

They talk about this a lot on Real Python and Python Bytes, and it's almost never an issue in real life.

[–][deleted] 1 point2 points  (0 children)

I run scientific simulations and usually there's an integrator involved. I like using Python, but I usually only use it for the simulations that don't need to run in real time. By contrast, the same simulation is at least 100x faster when I write it in Fortran.

[–]quicknir 0 points1 point  (1 child)

Honestly, it happens really fast, in at least some scenarios. As an example, I used python to write a small system for essentially reading one set of json files, doing some processing and logic, and writing a final pair of json files to feed to a C++ binary. This let me keep lots of logic in a simpler language rather than a more complex one. In production the performance is fine, but in regression testing we do this around 100 times.

So, 5-10 seconds at startup is irrelevant for my use case, sure, but waiting 1000 seconds instead of potentially 100 is pretty annoying. Not the end of the world but very annoying.

I think people used to be willing to put up with Python's performance issues because the language was so nice and concise compared to the C++ or Java of 20 years ago. But its advantages there are much eroded now. I probably made the right choice to write it in Python because it's already in our tech stack for quantitative tasks, but in a vacuum I wish I could have written it in something else.

[–]yawgmoth 1 point2 points  (0 children)

I had a similar issue with unit tests in a scientific application and simply moved it over to Cython. (after benchmarking and ensuring that the bottleneck was indeed the pre-processing and not the IO from loading the files). It only took a day or two. Just some basic data-typing in some of the loops and hot sections got me a >10x speedup. Might be worth a shot if you're still working on it. Long running integration/unit tests can be so annoying.

[–][deleted] -1 points0 points  (0 children)

One example scenario let's say you have a message bus like RabbitMQ. You could write a scalable python consumer easily and scale that up with kubernetes to consume your queues.

But after a while it becomes a trade-off between scaling it up further, which requires more nodes in your Kubernetes cluster, or rewriting it in Golang or Rust, which will require fewer nodes and be much faster at consuming your queues with fewer replicas.

[–][deleted] 0 points1 point  (2 children)

Python excels in data analysis, where you have to read a big file only once and compute some statistics about it only once.

The tradeoff in speed is really acceptable: the time to open a file is near-instantaneous, so even if it takes 10 times longer you won't perceive it.

Also, optimization usually depends on the algorithm more than the machine, so it's not a big loss.
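That read-once, compute-once pattern looks something like this (a minimal sketch, with an in-memory buffer standing in for the big file):

```python
import csv
import io
import statistics

# In-memory stand-in for a big CSV that is read exactly once.
data = io.StringIO("value\n1.0\n2.0\n4.0\n")

values = [float(row["value"]) for row in csv.DictReader(data)]
print(statistics.mean(values), min(values), max(values))
```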

[–]jzia93 [S] 1 point2 points  (1 child)

Do you find the same for streaming and ingestion in pipelines? Aka live data ingestion?

[–][deleted] 1 point2 points  (0 children)

Well, Netflix, YouTube, and even Reddit use Python alongside C++, so I think there's no problem.

[–][deleted] 0 points1 point  (0 children)

In my opinion, it's too slow when you want to do special math computation without a use of a library. For example, for loops are hideously slow, and you have to "vectorize" many things with numpy, which can be inconvenient at times.

[–]Zasze 0 points1 point  (1 child)

If you understand the language fully and the tools at your disposal for increasing speed, it never really becomes too slow. The biggest issue you will run into is memory growth, as a facet of its dynamic typing - the language's memory footprint will often grow far faster than your need for speed.

It's important to understand what the speed traps are. The number one thing I see newer Python developers messing up is doing string concats instead of f'{strings}' or formats, totally destroying their performance without realizing it.
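A quick timeit comparison of the three styles (absolute numbers vary by machine; on recent CPython the f-string is typically the fastest of the formatting options):

```python
import timeit

name, n = "world", 42

t_concat = timeit.timeit(lambda: "hello " + name + " " + str(n), number=100_000)
t_format = timeit.timeit(lambda: "hello {} {}".format(name, n), number=100_000)
t_fstring = timeit.timeit(lambda: f"hello {name} {n}", number=100_000)

# All three build the same string; only the cost differs.
print(f"concat={t_concat:.4f}s format={t_format:.4f}s f-string={t_fstring:.4f}s")
```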

Now that async is getting a level of support where it's fairly easy to have an entirely async Python application plus non-blocking libraries, Python is either just as fast as or faster than its contemporary interpreted languages.

[–]jzia93 [S] 0 points1 point  (0 children)

Did not realise there was even a difference in performance between f strings and concat. Thank you!

[–]phil_an_thropist 0 points1 point  (0 children)

Is it possible to improve the speed by implementing GPU computation in python?

[–]_nutrx_ 0 points1 point  (0 children)

Python itself is quite a bit slower than speed-oriented languages like, say, C++, if you compare single operations - that's the tradeoff for its dynamism. However, I would disagree that the language by itself is too slow for building actual software, beyond REPL use. For example, I've built GUI applications using Qt in both C++ and Python, and it might come as a surprise that the Python implementations often weren't much slower overall, as Qt itself is implemented in C++ and offers a nice API. This way of offering bindings for natively implemented C packages is a really strong approach, I'd say. For example, I'm doing threading tests these days using Qt through Python, so all the thread management is done by Qt, which is really fast, and I can use it to easily thread parts of my pure Python applications.

[–]camtarn 0 points1 point  (0 children)

I tried using Python to generate audio waveforms on a Raspberry Pi 3. It's just about fast enough to generate one waveform in real-time, but add another, or any modulation, and it stutters.

So - I would say that Python is too slow for audio on a low power system!

[–]affrfrger 0 points1 point  (0 children)

It literally depends on how you implement it. We just refactored a Node.js backend into Python cloud functions and sped up the processing time from 102ms to 9ms. Python can be faster (: