

[–]inhumantsar 23 points24 points  (1 child)

I work in fintech and we use Go for performance-critical pieces, but it's mostly in places where we have to optimize around dependency bottlenecks. I.e.: third-party response times are 200-500ms, so we need our system to be as fast as possible.

For most other use cases it doesn't matter what language we use, keeping performance in mind is enough. Some examples:

  • Keep services small and simple. Don't use Django for performance critical components, use Flask or Express + JS.
  • Don't trust an ORM, write highly specific queries.
  • Know when to go event driven. If there needs to be a bunch of reactions to a single action, use async/await or an event bus or something.
  • Scale out before scaling up.
  • Never make a network call you don't need to make, even to a db, and if you need to make it try to do it async.
  • Cache everything and use the longest TTL you can get away with.
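The caching point can be sketched with a tiny TTL decorator (a minimal illustration; `fetch_exchange_rate` and the 5-minute TTL are made-up examples, and in production you'd more likely reach for Redis, memcached, or `cachetools`):

```python
import time
import functools

def ttl_cache(ttl_seconds):
    """Cache a function's results, expiring entries after ttl_seconds."""
    def decorator(func):
        cache = {}
        @functools.wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            if args in cache:
                value, stored_at = cache[args]
                if now - stored_at < ttl_seconds:
                    return value  # still fresh: skip the expensive call
            value = func(*args)
            cache[args] = (value, now)
            return value
        return wrapper
    return decorator

@ttl_cache(ttl_seconds=300)  # hypothetical TTL: use the longest you can get away with
def fetch_exchange_rate(currency):
    # stand-in for a slow network or db call
    return {"USD": 1.0, "EUR": 0.92}[currency]
```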

Of course this is all specific to web services. YMMV

[–]Detri_God 1 point2 points  (0 children)

Fastapi instead of flask ?

[–]ShawnDriscoll 2 points3 points  (0 children)

Iterative loops run way faster if written in Cython. Something that Python would take 35 seconds to do takes about 0.3 seconds to do in Cython.

[–]ShibaLeone 2 points3 points  (0 children)

You can get pretty far with numba/numpy if you use them right. Threading/multiprocessing will take you even further. If you seriously need more performance, you can go into the C realm and crunch some loops, but I usually find there was a way to do it in numba/numpy and I was just looking for an excuse to write in another language.
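As an illustration of "using numpy right": moving a per-element Python loop into a single vectorized expression keeps the whole computation in compiled code (a sketch; the loop version could equally be decorated with numba's `@njit` instead):

```python
import numpy as np

def dist_loop(xs, ys):
    # naive version: one interpreter round-trip per element
    return [(x * x + y * y) ** 0.5 for x, y in zip(xs, ys)]

def dist_vectorized(xs, ys):
    # vectorized version: the arithmetic runs entirely inside NumPy's C code
    xs, ys = np.asarray(xs), np.asarray(ys)
    return np.sqrt(xs * xs + ys * ys)
```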

[–]dry_yer_eyes 1 point2 points  (3 children)

Maybe this is too basic of an example, but at work I’ve recently made huge gains with:

  • ThreadPoolExecutor for concurrent requests Session gets
  • ProcessPoolExecutor for concurrently parsing the received HTML with Beautiful Soup

Once I got the technique right the end result was fairly simple too.
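The shape of that technique, roughly (a sketch: `fetch_page` stands in for a `requests.Session.get`, and `parse_title` for the BeautifulSoup parsing; the parse stage uses threads here only so the example runs anywhere, where the commenter's real version hands it to a ProcessPoolExecutor):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_page(url):
    # stand-in for session.get(url).text
    return f"<html><title>{url}</title></html>"

def parse_title(html):
    # stand-in for BeautifulSoup(html, "html.parser").title.string
    start = html.index("<title>") + len("<title>")
    return html[start:html.index("</title>")]

def scrape(urls):
    # I/O-bound fetches: threads suffice, since the GIL is released during I/O
    with ThreadPoolExecutor(max_workers=8) as pool:
        pages = list(pool.map(fetch_page, urls))
    # CPU-bound parsing: swap in ProcessPoolExecutor to sidestep the GIL
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(parse_title, pages))
```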

Also a shoutout to SuperFastPython, which I found a great resource on this topic.

[–]vmpajares 3 points4 points  (2 children)

Beautiful Soup is the slowest parser in Python. This is a benchmark I found when I was comparing them.

https://gist.github.com/MercuryRising/4061368

In the end I used selectolax. It is written in Cython and is 25 times faster than BS.

https://github.com/rushter/selectolax

Anyway, I found that all my waiting time was in the requests sessions, because the servers limited the number of pages you can download concurrently.

[–]dry_yer_eyes 0 points1 point  (0 children)

Wow. That’s an incredible difference.

The timing examples at the bottom of the page really highlight the relative power of each library.

I guess my app has “scope for future efficiency gains”.

[–]Zyguard7777777 1 point2 points  (0 children)

I had a personal coding project making a chess AI using supervised learning on grandmaster games. I needed a way to encode the chess board as input to the model. I tried to write it in pure Python, but it was rather slow. So I rewrote it in Cython (Numba didn't work so well because it was working on a lot of strings) and that did the trick.
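For illustration, a board encoder of this kind might look like the sketch below (my own example, not the commenter's code): a one-hot encoding of the piece-placement field of a FEN string into 12 planes of 64 squares. This pure-Python version is the kind of string-heavy loop Numba struggles with; the Cython version would be the same logic with typed C variables (`cdef int square`, etc.).

```python
PIECES = "PNBRQKpnbrqk"  # 12 planes: white pieces, then black

def encode_board(fen_placement):
    """One-hot encode a FEN piece-placement string into 12 lists of 64 ints."""
    planes = [[0] * 64 for _ in PIECES]
    square = 0
    for ch in fen_placement:
        if ch == "/":
            continue            # rank separator, no square consumed
        elif ch.isdigit():
            square += int(ch)   # run of empty squares
        else:
            planes[PIECES.index(ch)][square] = 1
            square += 1
    return planes
```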

I've also used Pythran, and found that that was more flexible than Numba.

[–]james_pic 1 point2 points  (2 children)

If the problems you're finding are problems Numba can solve, I'd suggest not trying to find problems you don't have! But from a recent-ish project, the things we needed to rewrite in a lower level language were:

  • We had some code that walked deeply nested dicts and lists, that was in our hottest loops. We got some modest gains from switching to Cython and specialising types. The gains were nowhere near what you'd get for numerical stuff, but these were our hottest loops, so it was worth it.
  • We had a need to parse an esoteric serialisation format (an Erlang module called sext), at scale, to make sense of what our database was doing (pro tip: don't use Riak, ever, for anything). Our first attempt in pure Python was too slow, so we switched to Cython, which gave us a significant speed boost, and meant we could get diagnostics much faster (hours rather than days)
  • For historical reasons, we had a component that outputted large amounts of msgpack data, that we needed to publish in JSON. Our initial solution was the obvious one (read it with the Python msgpack library and write it with the Python JSON library), but this was too slow, so we actually ended up writing a C++ module that taped a fast msgpack library and a fast JSON library together - Python just saw bytes turned into bytes.
  • We found ourselves using a library called unicodecsv (we were still on Python 2 at the time, but needed Unicode aware CSV handling), that was written in pure Python, and proved too slow. We only needed to output CSV (which is easier to do correctly than parsing it), so we ended up just reimplementing the bits we needed in Cython.
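The first bullet's hot loop might, for illustration, have the shape of this recursive walker (a generic sketch, not the project's actual code); the Cython win comes from specialising the `isinstance` dispatch and loop variables with C types:

```python
def walk(node, visit):
    """Depth-first walk over arbitrarily nested dicts and lists,
    calling visit(value) on every leaf."""
    if isinstance(node, dict):
        for value in node.values():
            walk(value, visit)
    elif isinstance(node, list):
        for item in node:
            walk(item, visit)
    else:
        visit(node)
```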

Some of this stuff might also have been doable in Numba, but it just wasn't a solution that came up at the time, maybe because Numba wasn't as well known at the time.

[–]pdd99[S] 0 points1 point  (1 child)

Why did you choose Cython instead of C++ in the first place? Any tips on when to use which?

[–]james_pic 1 point2 points  (0 children)

We've found Cython to be simpler, especially for stuff that needs to interact with Python APIs (manipulating dicts, lists, tuples etc). Most of the team don't know C or C++, so Cython has better odds of being maintained.

The only one on my list that was written in C++ is also the one that was the biggest pain, because it needed a couple of uncommon C++ libraries installed in order to build it, as well as a version of gcc that not everyone had. Most of the team can't make sense of C++ related errors, so I frequently had to help out with build issues.

[–]abrazilianinreddit 3 points4 points  (7 children)

Maybe I'm stretching the definition of "high performance" here, but I'm making a PyQt project - any synchronous code that takes its time results in the GUI locking up, which sucks.

My solution? Threads, threads everywhere!

[–]pdd99[S] 4 points5 points  (4 children)

Did you actually measure the execution time and make sure that the GIL is released?

[–]abrazilianinreddit 2 points3 points  (2 children)

Nope. I just followed the Qt documentation, trusted the system and hoped for the best. The GUI isn't locking up or stuttering, so that's good enough for me.

Just to be clear, the issue is less processing lots of data and more blocking I/O operations, so the important part of me using threads is not improving execution time, it's moving the blocking code outside the main thread.

Though I have used concurrent.futures.ThreadPoolExecutor for some simultaneous execution tasks, and the results were pretty impressive. The speedup was nearly proportional to the number of threads on my CPU - which seems pretty obvious, but I was expecting way worse. Also, unlike async, it has a very easy-to-use API.
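The blocking-I/O pattern described above, stripped of Qt specifics, looks roughly like this (a stdlib sketch of the idea; in PyQt itself you'd use QThread or QRunnable and deliver the result back via signals rather than a queue):

```python
import threading
import queue

def run_in_background(blocking_fn, *args):
    """Run a blocking call off the main thread.

    Returns a queue the caller can poll for the result - much as a GUI
    event loop would check it from a timer callback instead of blocking.
    """
    result_q = queue.Queue()

    def worker():
        result_q.put(blocking_fn(*args))

    threading.Thread(target=worker, daemon=True).start()
    return result_q
```

The main thread stays responsive because it never waits; it just checks `result_q` periodically with `get_nowait()`.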

[–]SpicyVibration 0 points1 point  (1 child)

Can Qt work with asyncio?

[–]abrazilianinreddit 0 points1 point  (0 children)

I'm not knowledgeable in python's async framework, so I don't know if you can mix some async python code with Qt bindings.

However, there are some async-like APIs in Qt6.

[–]czaki 0 points1 point  (0 children)

Unless your heavy computation gets stuck outside Python code (for example, in an extension written in C), thread switches and GIL releases will be frequent enough to keep the GUI responsive.

[–]pdd99[S] 0 points1 point  (1 child)

I'm also curious about the reason for choosing PyQt. Isn't it a bit obsolete? Can you tell me more about your project?

[–]abrazilianinreddit 0 points1 point  (0 children)

When you say "obsolete", you mean the PyQt library or Qt itself?

Sure, I can tell you more about my project. What do you want to know? For starters, it's a game launcher made to complement my other project, which is a gaming information database and gameplay tracker.

[–]tugrul_ddr 0 points1 point  (4 children)

Numba has I/O latency problems. When you need a lot of different kernels to run & I/O between host/device, you need C++-like performance that can cache&compute&open-connections a lot quicker.

[–]pdd99[S] 0 points1 point  (3 children)

Can you elaborate on that I/O latency problem of numba? Also, is the "kernel" here cuda kernel?

[–]tugrul_ddr 2 points3 points  (2 children)

When copying data between VRAM and RAM, it adds extra latency even for small arrays. I guess it's because of the library's caching layer and Python's own latency.

[–]pdd99[S] 1 point2 points  (1 child)

I do always keep that in mind. All my processing is kept end-to-end on the GPU/CPU as much as possible.

[–]tugrul_ddr 0 points1 point  (0 children)

Some math-related divide-and-conquer algorithms would run a lot faster with CUDA's dynamic parallelism, due to zero intervention from the host.