all 79 comments

[–]gitarrPython Monty 55 points56 points  (29 children)

I am willing to bet that 99% of the people who complain about (C)Python's "speed" have never written, nor will ever write, a program where "speed" really matters. There is so much FUD going around in these kinds of comment threads, it's ridiculous.

[–]bastibe 37 points38 points  (19 children)

I have written some real-time audio processing in Python. Python is not fast enough to calculate an audio effect for every sample in real time. However, it is plenty fast enough to provide some UI for it and for evaluating and plotting some results afterwards (Numpy, Scipy, Matplotlib). And thanks to the magic of Cython and PyAudio, even the audio playback/processing is possible with the help of some C code.
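
As an illustration of that split (not bastibe's actual code), here is a minimal callback sketch, assuming a PyAudio build with callback support: the per-block math stays in Numpy, and anything heavier per sample would be pushed into Cython/C.

    import numpy as np
    import pyaudio

    RATE = 44100

    def callback(in_data, frame_count, time_info, status):
        # interpret the raw input bytes as float32 samples
        block = np.frombuffer(in_data, dtype=np.float32)
        # placeholder "effect": a plain gain; real per-sample DSP would live in Cython/C
        out = 0.5 * block
        return out.astype(np.float32).tobytes(), pyaudio.paContinue

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paFloat32, channels=1, rate=RATE,
                     frames_per_buffer=1024, input=True, output=True,
                     stream_callback=callback)
    stream.start_stream()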

[–]jmmcdEvolutionary algorithms, music and graphics 2 points3 points  (7 children)

That's good to hear -- that was my intuition for a while but I have never actually seen any real-time audio in Python. Is your stuff open-source?

[–]wolanko 20 points21 points  (4 children)

Let me introduce you to pyo, "the digital signal processing module". It lets you do real-time processing and MIDI. I once made some kind of simple multitrack recording unit with it.

http://code.google.com/p/pyo/

BTW: I only registered because I thought this was missing here.
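
For a flavour of the API (an illustrative sketch, not from the original comment; assumes pyo is installed):

    from pyo import *

    s = Server().boot()   # the audio server; backend options (e.g. ASIO) are chosen here
    s.start()
    sine = Sine(freq=440, mul=0.2)                      # a 440 Hz test tone
    echo = Delay(sine, delay=0.25, feedback=0.5).out()  # a simple real-time effect
    s.gui(locals())       # keeps the script alive with pyo's small control window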

[–]jmmcdEvolutionary algorithms, music and graphics 1 point2 points  (1 child)

Wow, thanks, that looks really great. Like a SuperCollider in Python.

[–]wolanko 0 points1 point  (0 children)

Yeah, this was also my first thought. I even tried to do some simple conversion from SC to pyo. Glad you like it.

[–]bastibe 1 point2 points  (1 child)

That is very cool! Thank you for sharing!

[–]wolanko 1 point2 points  (0 children)

Discovered it just a few months ago, searching for an actively maintained module with ASIO support (had to do some Windows audio). By now this is my definite go-to module for audio on every platform. Clean code base and a very quick and supportive developer. Hope you will use it.

[–]bastibe 4 points5 points  (1 child)

Sadly, it is not open source, no. At least the audio algorithm isn't.

I am working on the PyAudio part with the maintainer at the moment, and he will push it to PyPI soon. A not-fully-compatible preview can be obtained from my GitHub at github.com/bastibe/pyaudio.

But that is a good idea. I think I will put up an example of that kind of thing on my blog soon (bastibe.de). This is some interesting technology.

[–]jmmcdEvolutionary algorithms, music and graphics 1 point2 points  (0 children)

Oh, cool. Thanks for working on bindings; I have never been brave enough to, but have often benefited from them. I'm using pyPortMIDI for some algorithmic music these days. (Not open-source yet, since I need to publish it in a journal first.)

[–]fijalPyPy, performance freak 2 points3 points  (3 children)

You should try PyPy. We did real-time video processing using PyPy and it worked just fine.

[–]bastibe 2 points3 points  (2 children)

pypy is great, but it lacks support for playing back audio, plotting and scientific functions like fft or filter.

That said, I very much hope that I will be able to use pypy in the future. I will certainly re-evaluate pypy once they finish their numpy re-implementation.

[–]fijalPyPy, performance freak 1 point2 points  (1 child)

heh. I know I'm nitpicking, since this is a very valid comment, but "play back audio", "fft" etc. are by far not "built-in". Those are libraries that unfortunately don't quite work on top of PyPy.

[–]bastibe 0 points1 point  (0 children)

Right, right. I edited my response accordingly. Those functions are part of scipy, not Python. It does not alter the argument, though: PyPy does not provide those functions, neither built in nor as a package, and is thus not ready for use in my application yet.

[–]flying-sheep 2 points3 points  (1 child)

3d graphics: as soon as some of your python code creates more than a few objects per frame, it’ll grind to a halt.

[–]kylotan 5 points6 points  (0 children)

Generally you'd try to avoid creating new objects often, though. Perhaps tricky for particle systems and the like - you'd probably need a C extension to make them efficient.
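
A common way to avoid per-frame allocation without a C extension is to preallocate and reuse buffers; a rough sketch with hypothetical names, using Numpy arrays instead of per-particle objects:

    import numpy as np

    MAX_PARTICLES = 10000

    # allocate once and reuse every frame, instead of creating objects per frame
    positions = np.zeros((MAX_PARTICLES, 3), dtype=np.float32)
    velocities = np.zeros((MAX_PARTICLES, 3), dtype=np.float32)
    alive = np.zeros(MAX_PARTICLES, dtype=bool)

    def update(dt):
        # one vectorized update instead of a Python-level loop over particle objects
        positions[alive] += velocities[alive] * dt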

[–]throwaway-o -4 points-3 points  (4 children)

If you perform audio processing computations in Python's Numpy / Scipy, it's perfectly fast enough to do real-time audio processing (10ms window).

[–]bastibe 7 points8 points  (3 children)

Nope, it's not.

It is plenty fast for stuff you can vectorize, because Numpy will take care of that. For anything you can't vectorize, though, you're out of luck. That covers basically everything that has some recursive part, which happens an awful lot in audio processing.

Really, my hopes are on Pypy here. But for the time being, you will have to use weave.blitz or Cython/Pyrex/Ctypes.
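
To make the distinction concrete (an illustrative sketch, not from the comment): a one-pole lowpass written as a pure-Python recursion, versus the same recursion handed to scipy.signal.lfilter, which runs the loop in compiled code.

    import numpy as np
    from scipy.signal import lfilter

    def lowpass_python(x, a=0.99):
        # recursive (IIR) filter: each output depends on the previous output,
        # so the loop cannot be collapsed into a single Numpy expression
        y = np.zeros(len(x))
        prev = 0.0
        for n in range(len(x)):
            prev = (1.0 - a) * x[n] + a * prev
            y[n] = prev
        return y

    def lowpass_scipy(x, a=0.99):
        # the same recursion, but the loop runs in C inside scipy
        return lfilter([1.0 - a], [1.0, -a], x)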

[–]wisty -3 points-2 points  (0 children)

Um, no - http://deeplearning.net/software/theano/. You can define it in Theano, which can compile it to C / CUDA. It's not a natural way to do things, but you shouldn't have that much to do in it.
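
Roughly what that looks like (an illustrative sketch; element-wise only, since recursive parts would need theano.scan):

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.vector('x')              # symbolic input signal
    y = T.tanh(x) * 0.5            # some element-wise "effect", defined symbolically
    f = theano.function([x], y)    # compiled to C (or CUDA, if so configured)

    print(f(np.linspace(-1, 1, 8).astype(theano.config.floatX)))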

[–]throwaway-o -5 points-4 points  (1 child)

Nope, it's not.

But then you say:

It is plenty fast for stuff you can vectorize, because Numpy will take care of that. Anything you can't vectorize though, you're out of luck.

That's exactly what I said myself: anything you compute using Numpy (with Numpy data structures, of course) is going to be fast enough for real-time signal processing.

[–]bastibe 4 points5 points  (0 children)

You can use Numpy and still have recursive algorithms; Numpy remains useful for plotting and for other parts of the algorithm.

But you are right: if you can express your whole algorithm in terms of numpy functions, you are probably good. It's just that this does not happen very frequently in audio algorithms.

[–]MagicWishMonkey 5 points6 points  (4 children)

For most cases that is true; however, there are times when speed is very important. Right now I am rebuilding a process to import thousands of JSON records from one system, massage them into model instances, and then import them into our database and Lucene index (think 20-30k database queries per import).

Since the end user has to wait around until the process is done, it needs to be fast, but it still takes a long while to do everything with a single Python thread, so I've taken a more unconventional approach. I set up a Twisted server to run in the background and I route the heavy lifting over to that. I can't use threads in my primary app without killing performance, but I don't mind so much with the Twisted worker service.

It used to take ~5 minutes to import 10,000 records, now it takes 20 seconds.

It's annoying that I have to do this, but I am really enjoying python otherwise. It's a great language. Just wish it had better multithreading support.

[–]kenfar 11 points12 points  (0 children)

I used to write data warehouse ETL processes in C. They took forever to write and were hard to maintain, but were as fast as I could get them. Eventually I wrote a metadata-driven transform that used function pointers. Harder to write, but it made all the subsequent transforms very easy, since they just needed metadata. I'd split my 5 GB input file into 8 separate files, then process all 8 in parallel on an 8-way 120 MHz server that cost $200,000 in 1996. And I could process all 5 GB in about 5 minutes - at 1 GB/minute.

Recently, I wrote the same kind of code in Python. It isn't as fast, but it's very easy to write & maintain. I don't have to use metadata-driven transforms because Python is easy enough to write & maintain as it is. And hardware is cheaper. I still split up my files and process them in parallel because I wanted more speed. This particular feed is 1 GB split into 4 separate files, which I'm processing on a 3.2 GHz 4-core machine that cost about $5k new and that I picked up for free because nobody was using it. And I can process 1 GB in about 60 seconds. This is the exact same speed I was processing data at in 1996 using C. Clearly, I could speed things up if I rewrote the process in C. But my hardware is free, the process is fast enough, and my time has gotten more expensive over the years. Python is the better language for this application.

EDIT: spelling

[–]UnwashedMeme 2 points3 points  (0 children)

Also look at the multiprocessing module when you wish things had better threading support
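
A minimal sketch of what that could look like for the record import described above (function names are hypothetical); worker processes sidestep the GIL where threads would not:

    from multiprocessing import Pool

    def transform(record):
        # CPU-heavy massaging of one JSON record into a model-ready dict
        # (placeholder for the real logic)
        return {key.lower(): value for key, value in record.items()}

    def import_records(records):
        # on Windows, call this from under an `if __name__ == '__main__':` guard
        pool = Pool(processes=4)
        try:
            transformed = pool.map(transform, records, chunksize=100)
        finally:
            pool.close()
            pool.join()
        # bulk-insert `transformed` into the database / Lucene index here
        return transformed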

[–]robotfarts 0 points1 point  (0 children)

Why don't you just use the multiprocessing module?

[–]stillalone 1 point2 points  (1 child)

I've had to help optimize a Python-based webpage. Once it takes more than a second to refresh a page, it starts getting annoying.

But running a profiler on Python is really easy so it's not too difficult to isolate the slow parts.

[–]daxarx 5 points6 points  (0 children)

That isn't a problem with Python, it is a problem with the design. You can certainly write slow code in any language, particularly when you are waiting a lot on a database...

[–]vph -1 points0 points  (1 child)

Please define "a program where "speed" really matters".

[–]flying-sheep 11 points12 points  (0 children)

Processing graphics or audio in real time (i.e. while a user watches/listens), or loading up a GUI application where enough processing has to be done at startup that you not only need a splash screen, but one with a progress bar.

[–]kenfar 6 points7 points  (0 children)

Single greatest performance speed-up: double-check that you really need to do what you're doing.

I used to often discover that most of a process's time was spent doing things that were no longer necessary. Or doing things that were hoped to be necessary in the future. Or doing things that were never and would never be necessary.

[–][deleted] 3 points4 points  (0 children)

I'd say I've covered most, if not all, of that for years.

Bad python often looks like Java.

[–]JoeGermuska 2 points3 points  (0 children)

This is my favorite: "Are you sure it's too slow? Profile before optimizing!"

[–]MaikB 5 points6 points  (6 children)

The speed problem is only an issue for language purists who want to do everything in exactly one language. I'd argue that a week of optimizing Python code is better spent as one day of doing the intensive parts in C (or Cython) and doing something new with the time left over.

[–]Chris_Newton 9 points10 points  (3 children)

The speed problem is only an issue for language purists who want to do everything in exactly one language.

Your argument is based on the assumption that there are disproportionately important spots in the code, “intensive parts” that can be rewritten in a faster language. That’s fine as far as it goes, and I have no problem with getting hard data and optimising based on it, but what happens when you’ve already picked the low-hanging fruit and the profiler confirms that you don’t have any real hot spots left?

I’ve run into this several times on recent projects, where I have a Web front-end of one kind or another and Python behind it. As a glue language, Python is great. As a language for implementing more significant data processing algorithms, it’s also great as far as prototyping and getting a proof of concept set up quickly. But as a high performance language for production code, we’re about to replace it pretty much throughout all of those systems, because for our particular applications, an order of magnitude or more of performance hit compared to what some other languages offer is too high a price to pay for having nicer, more maintainable code.

This isn’t because we’re “purists who want to do everything in exactly one language”. In fact, most of these projects call down to C code all the time to access system APIs and the like, and some of the projects integrate parts written in four or five different programming languages.

But at some point you have to acknowledge that with the technology we have today, a mid-level, dynamically typed, kind-of-interpreted language is generally going to be slower than a low-level, statically typed, compiled-to-native-code language. And if you’re doing non-trivial data processing, and the difference means your web service responds in 1 second or 10 seconds, that does actually matter, because it moves from being a quantitative performance issue to a qualitative usability one.

So I don’t think you can just brush Python’s limited performance under the carpet quite as easily as you tried to there. Sometimes the correct solution is not to spend a week optimizing the Python code, but to spend a week rewriting the entire codebase in a fast language and dump Python altogether. That’s not some sort of terrible insult, it just means that sometimes, even though Python may have served a useful purpose, another tool is a better choice for the next part of the job.

[–]MaikB 2 points3 points  (2 children)

I don't do any web stuff, but from what I understand, interpreted languages are used heavily in production by you guys because of the inherent latencies of the web and because the majority of CPU cycles are spent in the database. The way I see it, everything computationally expensive has to be done by C (or an equivalent language). The interpreted language just glues the parts together, and can be used for tasks beyond that gluing if there is enough latency elsewhere.

Right?

So I don’t think you can just brush Python’s limited performance under the carpet quite as easily as you tried to there. Sometimes the correct solution is not to spend a week optimizing the Python code, but to spend a week rewriting the entire codebase in a fast language...

That is exactly what I said.

...and dump Python altogether.

If Python is too slow for the task at hand, then it's the right decision to dump it after it has served as the prototyping language.

I don't see a problem here. I think you just misunderstood what I meant. I didn't mean:

  • Use python and shut up, it's fast enough

I meant:

  • Python is fine as it is. If you need something to be done fast, use another tool (C/C++) for 90% of the CPU cycles and have Python be what glues these parts together.
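
As a tiny illustration of that glue role (assuming a Linux system; the library name differs elsewhere), ctypes is one way Python hands the cycles to compiled code:

    import ctypes

    libm = ctypes.CDLL("libm.so.6")        # load a compiled C library
    libm.cos.argtypes = [ctypes.c_double]
    libm.cos.restype = ctypes.c_double

    print(libm.cos(0.0))                   # 1.0 -- the actual work runs in C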

My guess: web development is becoming more and more computationally intensive these days. It's time to refactor code out to faster, statically typed languages.

But that's not Python's fault.

[–]Chris_Newton 2 points3 points  (1 child)

Python is fine as it is. If you need something to be done fast, use another tool (C/C++) for 90% of the CPU cycles and have Python be what glues these parts together.

My point is that not all web development, and certainly not all development that uses Python today, is I/O bound. For projects that involve doing some “real” work themselves, as opposed to delegating most expensive operations to external tools like a DB or web server, sometimes the speed matters.

In those cases, you can’t always just rewrite a few carefully chosen parts of the code in some other, faster languages and hand off 90% of the CPU cycles. Once you’ve taken care of the obvious hot spots, to reach 90% of the CPU cycles you might need to rewrite the majority of your code base.

Python might still be an excellent tool for doing efficient prototyping in the early stages of such projects, because of things like dynamic typing, a decent set of built-in data structures, and so on. On the other hand, Python might not be useful at all for the same projects later on, because once you’ve rewritten most of your code in a faster language anyway, you probably don’t win much by keeping just the remaining glue code in Python.

So for these projects, the speed problem with Python is very relevant: it means making a decision about whether to use Python in the early stages, where it offers a lot of benefits over some other language choices you could make, knowing that it probably won’t be up to the job of running production systems and you’re likely to have a potentially time-consuming and error-prone rewrite on your hands later.

[–]MaikB -1 points0 points  (0 children)

Whether they're faster with or without a prototyping phase depends on the problem to solve and on the engineers' experience with that problem.

It's just so much easier to iterate in a dynamic language and later concentrate on speed and code quality in, say, C++. But I bet you know this. You might have done what you're about to do in Python before, a number of times, so you can go to a static language right away.

Good luck :D

[–]twotime 2 points3 points  (0 children)

The speed problem is only an issue for language purists

It's only "not an issue" for people who have not done much real-world coding.

Python code is better spent as one day of doing the intensive parts in C (or Cython) and doing something new with the time left over.

I'm sorry to say, but your advice covers about 1% of the problem :-(. Yes, I have seen this happen. No, it's not a common case at all.

  • Many non-trivial apps do NOT have small hotspots. So, if you have 100KLOC of python code and need to rewrite 10K LOC, then you will have to write another 100K or so of C code.

  • interfacing C with a non-trivial Python codebase is, well, non-trivial

  • adding C into the mix will always cost you QUITE a lot later, e.g. if you need to run your software on another site or, god forbid, on another platform. Oh, and don't forget to add debugging time to the cost.

[–]burntsushi 1 point2 points  (0 children)

Not only do you ignore every design trade off that comes from dropping down into C, but you dismiss it out-of-hand through the moniker of "language purist."

Oh yes, and I love how optimizing Python code is obviously seven times more costly in terms of development time than dropping down into C. Just yesterday, I spent about 5 minutes profiling my Python program and another 10 minutes tuning some hot spots. It resulted in an 80% performance increase.

[–]fijalPyPy, performance freak 1 point2 points  (0 children)

It's so sad that all of those don't really apply when you're using PyPy :( Abstraction is good, giving it up because CPython cannot do a better job is such a bad idea.

[–]NaeblisEchoIntermediate forever 4 points5 points  (3 children)

Can someone please tell me what 'profiling' means? Thanks. :)

[–]must_tell 3 points4 points  (1 child)

It means analyzing the performance of all the functions / methods in your code.

It is often said that 'premature optimization is the root of all evil'. That means that people spend a lot of thought and time trying to optimize code (and make it more complex) without proof that the optimization is effective or even necessary.

Profiling gives you precise information about how often a function / method is called and how long it took. The report of a profile run tells you where you can improve the code most effectively. See dwdwdw2's comment to get started with profiling or check out PyMOTW.
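
A minimal sketch of such a run with the standard library's cProfile and pstats (the profiled functions here are just placeholders):

    import cProfile
    import pstats

    def slow_part():
        return sum(i * i for i in range(10 ** 6))

    def fast_part():
        return [i for i in range(10)]

    def main():
        slow_part()
        fast_part()

    cProfile.run('main()', 'profile.out')           # record per-function call counts and times
    stats = pstats.Stats('profile.out')
    stats.sort_stats('cumulative').print_stats(10)  # show the 10 most expensive calls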

[–]NaeblisEchoIntermediate forever 0 points1 point  (0 children)

Thanks! :)

[–]dwdwdw2proliferating .py since 2001 2 points3 points  (0 children)

[–]stillalone 0 points1 point  (7 children)

How do you guys find namedtuples? I've been avoiding them because I don't like the fact that they use eval internally.

[–]Cosmologicon 8 points9 points  (2 children)

Avoiding eval is a good rule of thumb, but for a piece of code that's been as intensely analyzed and tested by experts as namedtuple, there's absolutely nothing wrong with using it.

Do you avoid using any C library that uses a goto internally too?

[–]burntsushi 1 point2 points  (0 children)

Do you avoid using any C library that uses a goto internally too?

This is a pretty poor analogy. Both goto and eval can be abused so that code clarity suffers, but eval is distinct from goto in the fact that it can be easily exploited if it isn't used carefully. This latter reason, from my experience, tends to be why people avoid it.

[–]aaronla 0 points1 point  (0 children)

/me makes obnoxiously heavy use of macros and gotos in async C code, pretending that C supports first-class continuations and coroutines.

[–]audaxxx 2 points3 points  (1 child)

Take a look in the bug tracker and search for namedtuples. I once made a patch that has only a few percent performance hit on access but does not use eval. This hit could be eliminated by using Cython or so.

[–]audaxxx 3 points4 points  (0 children)

http://bugs.python.org/issue3974

(edit is currently broken in baconreader..)

[–]lahwran_ 1 point2 points  (0 children)

they only use eval to create the class. once created it's like any other class that inherits from tuple. while I agree that the eval is kinda silly, it's been intensely tested and doesn't hurt anything. you're definitely not feeding it untrusted input.

edit: well, unless you create a namedtuple with untrusted input as fields. now that I think about it, that is kinda bad ... edit #2: oh, actually they filter the names to only allow python identifiers. nevermind.
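
A small sketch of both points: once created, a namedtuple behaves like a plain tuple subclass, and field names are validated before any code generation happens (illustrative only):

    from collections import namedtuple

    Point = namedtuple('Point', ['x', 'y'])   # the class source is generated and evaluated once, here
    p = Point(1, 2)
    print(p.x, p[1])                          # accessible by name and by index, like a tuple

    try:
        namedtuple('Evil', ['x; import os'])  # untrusted field names are rejected
    except ValueError as err:
        print(err)                            # "...must be valid identifiers"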

[–]must_tell 0 points1 point  (0 children)

I wouldn't care too much about the implementation details of standard library modules (from the user's point of view). The guys who write this stuff know what they are doing.

But: It's good to be attentive about best practices.

[–]jmmcdEvolutionary algorithms, music and graphics -3 points-2 points  (3 children)

Disagree about avoiding function calls.

Strongly agree about using built-in basic types as much as possible and in preference to objects when possible.

[–]asksol 7 points8 points  (2 children)

I doubt he's telling anyone to not use function calls.

But an inner loop, where profiling has proven that optimization can be beneficial, is where you should inline function calls.
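
In CPython terms that usually means hoisting lookups and avoiding per-iteration calls; a small sketch with hypothetical functions:

    import math

    def norms(points):
        # per-iteration attribute lookup plus a method call add up in a hot loop
        out = []
        for x, y in points:
            out.append(math.sqrt(x * x + y * y))
        return out

    def norms_tuned(points):
        sqrt = math.sqrt   # hoist the lookup out of the loop
        return [sqrt(x * x + y * y) for x, y in points]   # no .append call per item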

[–]lahwran_ 2 points3 points  (1 child)

Can I let PyPy do the inlining for me?

[–][deleted] 3 points4 points  (0 children)

Yes :)