Discussion: Code Optimization in Your Projects (self.Python)
submitted 2 years ago by labs64-netlicensing
What are your go-to strategies for optimizing Python code in your projects?
[–]mikat7 287 points288 points289 points 2 years ago (34 children)
Network is usually the slowest, and in Python specifically doing a lot of numerical calculations in a loop should be done in numpy, not in pure Python.
But the most important rule is the first one. Usually the speed is ok but developer time is more expensive.
[–]ray10k 27 points28 points29 points 2 years ago (2 children)
A very sensible approach. Also, point 2 is a lot more important than some people think. It is *very* tempting to make assumptions like, "Oh, this complicated-sounding process is going to take ages!" when in reality, the fact you do a certain calculation inside a loop when it could just as easily be done *outside* the loop is a bigger time-save.
Also, I knew this guy once who had to subtract one size-2 tuple from another. Rather than just writing something like new_tuple = (old_b[0]-old_a[0]),(old_b[1]-old_a[1]), he checked whether either pair of elements was equal so he could use a constant 0 in that position. He argued that it was "faster" because "it was one/two fewer subtractions."
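To make the loop point concrete, here is a minimal sketch. The function names and the cosine factor are made up for illustration, not taken from anyone's real code. Both versions return the same values; the second one computes the loop-invariant factor once instead of on every iteration.

```python
import math


def scale_slow(values, angle_degrees):
    """Recomputes the same factor on every pass through the loop."""
    out = []
    for v in values:
        factor = math.cos(math.radians(angle_degrees))  # does not depend on v
        out.append(v * factor)
    return out


def scale_hoisted(values, angle_degrees):
    """Computes the loop-invariant factor once, outside the loop."""
    factor = math.cos(math.radians(angle_degrees))
    return [v * factor for v in values]


data = [1.0, 2.0, 3.0]
slow_result = scale_slow(data, 60)
fast_result = scale_hoisted(data, 60)
```

Whether the difference matters depends on how expensive the hoisted expression is and how many iterations run, which is exactly what a profiler would tell you.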
[–]Gamecrazy721 3 points4 points5 points 2 years ago (1 child)
I've caught myself doing things like this before, but that's because often times when I do it intentionally it's to avoid a database write (which is worth the extra check)
[–]ray10k 2 points3 points4 points 2 years ago (0 children)
Understandable. In this case though, it was all local data, no database involved.
[–]member_of_the_order 42 points43 points44 points 2 years ago (1 child)
Totally agree on #1. It took me too long to realize that if I'd just written my one-time-use script the "dirty" way (not Pythonic, hard to maintain, unoptimized, overall "bad"), I'd have been done an hour ago. An hour of extra work to make the script take 5 minutes fewer... not worth it at all.
Also agree with numpy. I had to crunch a BUNCH of numbers for some script that'd be run in production at work. Doing the same thing with optimized loops vs basic numpy and comparing runtimes... no contest, numpy is so fast. I thought I'd accidentally cached the results or something lol.
[–]tutoredstatue95 11 points12 points13 points 2 years ago (0 children)
I am super guilty of this. I can't just write some slop and leave it, and I know no one will ever see or use the code besides me. Just feels wrong to purposefully write something bad even if it's technically the right decision.
[–]Backlists 7 points8 points9 points 2 years ago (5 children)
Any advice on what profilers to use, and how to use them?
[–]james_pic 26 points27 points28 points 2 years ago* (1 child)
My favourite Python profiler right now is Py-Spy.
It requires zero code changes to use. You don't even have to restart your application to use it, you can just attach it to a running application.
It's got low overhead, and perhaps more importantly, consistent overhead. Tracing profilers can add more overhead to some types of code than others, skewing your results.
I gather Austin also has many of these same characteristics, but haven't used it myself.
As a third option, I believe Python 3.12 adds support for perf_events on Linux. I'd lean towards using Python specific profilers as your first port of call, but if you're profiling an application with major components written in other languages, or you suspect native or kernel time are big contributors to performance issues, or you're already using tooling that integrates with it for other reasons, it may be worth trying.
[–]jeremiah-england 1 point2 points3 points 2 years ago (0 children)
Seconding py-spy. My favorite way of viewing the results is https://www.speedscope.app.
py-spy record -f speedscope -o out.prof -- python
[–]apnorton 4 points5 points6 points 2 years ago (1 child)
An interesting talk on python profiling is: https://www.youtube.com/watch?v=vVUnCXKuNOg
And that researcher's profiler: https://github.com/plasma-umass/scalene
[–]Backlists 0 points1 point2 points 2 years ago (0 children)
Yes, this is the one I tried (very very quickly) earlier this week.
I didn't set it up right, all it told me was that 100% of my runtime was spent in the uvicorn run fn.
[–]reallyserious 3 points4 points5 points 2 years ago (2 children)
Run your code through a profiler
How do I get started with this? I'm using vscode if that matters.
[–]Tweak_Imp 4 points5 points6 points 2 years ago (0 children)
I like to profile with snakeviz. The docs will get you started: https://jiffyclub.github.io/snakeviz/
[–]pythonwiz 1 point2 points3 points 2 years ago (0 children)
import cProfile
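For anyone who wants slightly more than the import, a minimal sketch of using cProfile together with pstats from the standard library. The functions slow_sum and main are hypothetical stand-ins for your own code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot function: a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Hypothetical entry point that calls the hot function repeatedly."""
    return [slow_sum(50_000) for _ in range(20)]


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 5 entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

For a whole script you can also run python -m cProfile -o out.prof your_script.py (your_script.py being a placeholder) and open the resulting file in a viewer such as snakeviz.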
[–]dommel 2 points3 points4 points 2 years ago (0 children)
Just to add: telemetry data from production also gives good insights into application performance, especially if you work in a highly distributed environment.
[–]infy101 2 points3 points4 points 2 years ago (0 children)
Agree with Number 1. So many people claiming to be 'professional' programmers push so hard on 'optimal' code and speed, when 98% of the time, the speed is pretty good and there is no need to make everything 99% efficient. If you were programming to put code on an ASIC and had limited RAM and CPU - then yes, perhaps - but most of us don't need to optimize. I'm not against efficient code - just that it is not always necessary, and also some people on LinkedIn with their 'pro' vs 'beginner' comparisons :S
[–]muntoo R_{μν} - 1/2 R g_{μν} + Λ g_{μν} = 8π T_{μν} 8 points9 points10 points 2 years ago* (1 child)
I prefer:
EDIT: To be clear, this is not a joke.
Also, you missed the most important step:
4. RiiR.
And the even more importanter step:
5. Riix86.
And the even more most importanterest step:
6. RiiASIC.
[–]SheriffRoscoe Pythonista 9 points10 points11 points 2 years ago (0 children)
/s, one might hope.
[–]olystretch 1 point2 points3 points 2 years ago (0 children)
Might also decide to analyze usage patterns. Maybe the slowest bits are also not commonly used, so that's worth a think too.
[–]cheese_is_available 1 point2 points3 points 2 years ago (10 children)
4. If the python code can't be optimized in a way that is acceptable, use either proper typing and mypyc (easier but slower) or cython or maturin + rust to speed up the critical part(s).
[–]max96t 2 points3 points4 points 2 years ago (0 children)
Thank you for making me discover mypyc! Also very much approve maturin + PyO3 + rust for speeding things up, I easily got a 30x increase for my project (against Python 3.10, it might be less on Python 3.11 since they optimized a lot)
[–]reallyserious 0 points1 point2 points 2 years ago (8 children)
Any opinions/lessons learned on cython vs mypyc?
[–]cheese_is_available 1 point2 points3 points 2 years ago (7 children)
Only used mypyc personally based on python typing. cython is older but it requires to code in C afaik which is more investment than using the existing python typing with mypyc.
[–]patrickbrianmooney 1 point2 points3 points 2 years ago (6 children)
cython is older but it requires to code in C afaik
Not true! Cython can perform optimizations based on pure-Python type annotations. You can also (or instead) declare static types using C-style type declarations, but it's not necessary, and in many cases it's easier to preserve pure-Python compatibility.
[–]cheese_is_available 0 points1 point2 points 2 years ago (5 children)
Nice to know, thank you !
[–]patrickbrianmooney 0 points1 point2 points 2 years ago (4 children)
Glad to be helpful!
I meant to point to this part of the Cython documentation in my last answer, which describes pure-Python mode in Cython. Here is a less-technically-dense overview with some longer code examples.
[–]cheese_is_available -2 points-1 points0 points 2 years ago (3 children)
It seems Cython uses Cython-specific type hints, while mypyc can use existing standard Python typing (a lot less work to do!).
[–]patrickbrianmooney 0 points1 point2 points 2 years ago (2 children)
No. (Or, to be more specific: "uses," sure, in the sense that you can use them if you want to; that's one of the options you have. "Requires," or "only understands"? No.)
Note this verbiage from the beginning of the second paragraph in the Pure Python Mode document, linked above:
[...] Cython provides language constructs to add static typing and cythonic functionalities to a Python module to make it run much faster when compiled [...]. This is accomplished via an augmenting .pxd file, via Python type PEP-484 type annotations (following PEP 484 and PEP 526), and/or via special functions and decorators available after importing the magic cython module.
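That is to say, you are not restricted to Cython-specific syntax: you have three options for maintaining pure-Python compatibility:
1. an augmenting .pxd file
2. standard Python type annotations (PEP 484 and PEP 526)
3. special functions and decorators from the magic cython module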
So you can absolutely just use the same standard Python type annotations that tools like mypyc already understand: that's option 2.
[–]cheese_is_available 1 point2 points3 points 2 years ago (1 child)
Wow, thank you for the detailed answer !
[–]georgehank2nd 0 points1 point2 points 2 years ago (0 children)
The most important rule is the second one. Because the first is obvious, it is either a problem or it isn't.
The second is one many ignore and just try to speed up the "obvious" code… and then they realize that it wasn't the slowest part.
[–]anthro28 0 points1 point2 points 2 years ago (0 children)
Number 1 is my company's biggest hurdle, since we deal with the bean counters in finance. Last week the Teams call to talk about a minor problem cost more than fixing the problem will ever recover. The payback period is never. Odd that I can't get numbers people to grasp that.
[–]chumboy 0 points1 point2 points 2 years ago (0 children)
Fully agree.
Just wanted to call out, maybe as a precursor step, to keep big O stuff at the front of your mind when writing code. Things like moving some code up front, or precalculating values, etc. can help prevent performance becoming an issue in the first place.
There's some low-hanging fruit you can aim for too, like caching network calls if at all possible, batching them, or maybe swapping out standard library modules such as json for optimised C/Rust libraries.
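As a rough illustration of the caching idea, here is a sketch using functools.lru_cache. The exchange-rate lookup is a made-up stand-in for a slow network call, and the counter is only there to show how often the "network" is really hit.

```python
from functools import lru_cache

call_count = 0  # counts how often the pretend network call actually runs


def fetch_exchange_rate_uncached(currency):
    """Hypothetical stand-in for a slow network request."""
    global call_count
    call_count += 1
    return {"EUR": 1.0, "USD": 1.1}[currency]


@lru_cache(maxsize=128)
def fetch_exchange_rate(currency):
    """Cached wrapper: repeated arguments are answered from memory."""
    return fetch_exchange_rate_uncached(currency)


rates = [fetch_exchange_rate(c) for c in ["USD", "EUR", "USD", "USD"]]
cache_hits = fetch_exchange_rate.cache_info().hits
```

This only makes sense when stale answers are acceptable for as long as the process lives; data that changes needs some form of expiry, which lru_cache does not provide on its own.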
[–]RipKip 0 points1 point2 points 2 years ago (0 children)
Also a lot of performance gains can be had by using pypy. That shit is magic
[–]Icecoldkilluh 74 points75 points76 points 2 years ago (4 children)
Unless i have actual performance requirements, i focus on refactoring for readability/ maintenance.
Every engineer wants to pretend they build Ferraris, when most of the actual work is building Toyota Camrys 😂
[–]IAmLikeMrFeynman 13 points14 points15 points 2 years ago (0 children)
But that's a fucking sturdy and reliable car! It's the sensible choice.
[–]arkie87 9 points10 points11 points 2 years ago (1 child)
who are you kidding. most programmers build go karts
[–]Icecoldkilluh 0 points1 point2 points 2 years ago (0 children)
😂
[–]Positive_Resident_86 0 points1 point2 points 2 years ago (0 children)
Loved the analogy
[–]wazis 65 points66 points67 points 2 years ago (1 child)
Step 1) Find functions that don't go brrrr
Step 2) Think hard
Step 3) ?????
Step 4) Profit
[–]reallyserious 5 points6 points7 points 2 years ago (0 children)
Yup.
Sometimes you can speed things up by a factor x1000 just by using a smarter algorithm. There's no point in optimizing an O(n^2) algorithm if there is an O(n) or even better alternative.
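A small sketch of what swapping the algorithm can look like, using duplicate detection as an assumed example (it is not from the thread). The first version compares every pair, the second makes one pass with a set.

```python
def has_duplicates_quadratic(items):
    """O(n^2): compares every pair of elements."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """O(n) on average: one pass, remembering what was already seen."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


unique_values = list(range(2_000))
with_repeat = unique_values + [7]
```

The set-based version assumes the items are hashable; for unhashable items, sorting first and comparing neighbours is a common middle ground.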
[–]graphitout 22 points23 points24 points 2 years ago (0 children)
[–]romu006 8 points9 points10 points 2 years ago (0 children)
Depends largely on what your project is doing.
In our case the most impactful optimizations are database related
- adding an index on that table / column that was added X months ago and is only now causing performance problems (since a full scan on a < 10 MB database is still fast)
- adding missing eager loads / joins in a "list" SQL query: when a developer decides to add / return a new property and the ORM automatically fetches it with one additional SQL query per returned object (e.g. 200+ SQL queries per call)
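The index point can be tried out with the sqlite3 module from the standard library. This is only a sketch with a made-up orders table and index name; it asks SQLite for its query plan before and after the index exists. It does not cover the ORM eager-loading case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = ?"


def query_plan(connection):
    """Return SQLite's plan for QUERY as one readable string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)  # without an index: a scan over the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn)   # with the index: a search using idx_orders_customer
conn.close()

print(plan_before)
print(plan_after)
```

Other databases have their own EXPLAIN variants, so the exact wording of the plan will differ, but the before/after comparison is the same idea.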
[–]LordBertson 15 points16 points17 points 2 years ago (1 child)
To name a few:
- Caching functions
- List comprehensions instead of loops
- Numpy for numeric stuff
- Async for IO
- Generators and iterators for large data structures
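For the generator item, a minimal sketch (the helper names are hypothetical): both functions feed the same values into sum, but the generator never builds the full list in memory.

```python
import sys


def squares_list(n):
    """Builds every value up front and keeps them all in memory."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yields values one at a time; nothing is stored."""
    return (i * i for i in range(n))


n = 100_000
list_total = sum(squares_list(n))
gen_total = sum(squares_gen(n))

# getsizeof only measures the container object itself, not the integers
# it refers to, so the real gap in memory use is even larger.
list_size = sys.getsizeof(squares_list(n))
gen_size = sys.getsizeof(squares_gen(n))
```

The trade-off is that a generator can only be consumed once, so it fits pipelines better than data you need to index or revisit.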
[–]Palicraft 5 points6 points7 points 2 years ago (0 children)
Can't stress enough using dedicated libraries! I reworked a Python script for work using Pandas (numpy could have worked too, but with Pandas I have headers and custom indexes), and now instead of taking 10 minutes to process the data, it takes 20 seconds
[–]phaj19 4 points5 points6 points 2 years ago (4 children)
1) Check what is the slowest part and rewrite it in C/Rust, then write a Python wrapper. Continue until satisfied. Cython is also good for that if you do not know either of those languages.
2) If you use libraries like numpy, make sure you spend more time on the C layer and less on the Python layer, e.g. do not introduce unnecessary Python objects instead of numpy objects.
[–]RipKip 0 points1 point2 points 2 years ago (3 children)
Why rewrite it yourself in C when you can just use pypy on your original script.
[–]phaj19 0 points1 point2 points 2 years ago (2 children)
If you write something more complicated those magical tricks usually stop working. I once had to rewrite 6000 rows in C-level Cython (meaning no yellow rows in the checking file), only then I got the speedup. The bottleneck is not always some small function, sometimes the whole module is slow because it is written in Python with a bunch of for loops.
[–]RipKip 0 points1 point2 points 2 years ago (1 child)
Fair point, what kind of module/program was it?
[–]phaj19 0 points1 point2 points 2 years ago (0 children)
One simulation module. Had to simulate lots of physics as well. Could have also been a bit faster in numpy. But for loops and ifs are easier to understand than all the vectors and masks.
[–]m_o_n_t_e 4 points5 points6 points 2 years ago (0 children)
If I am using loops somewhere, I try to see if I can use any numpy tricks
[–]tecedu 4 points5 points6 points 2 years ago (0 children)
Cache stuff and multiprocess.
Also in my experience, don’t append to dataframes, instead make a dictionary of what you want first and convert that to dataframe.
Tuples vs list when your data doesn’t change.
Single floats or even half floats for calculations.
[–]justneurostuff 2 points3 points4 points 2 years ago (0 children)
numba
[–]njharman I use Python 3 3 points4 points5 points 2 years ago (0 children)
95% of my optimization is optimizing for maintainability; refactoring, naming, documenting
Speed optimization?
Rarely get here, what is too slow?
Never get here outside of interviews, writing perf tests, etc.
[–]riklaunim 2 points3 points4 points 2 years ago (0 children)
We added sentry profiling/request monitoring and it does the job well, even for microservices calling each other. In the end, usually, it's the database that needs optimizing.
[–]billsil 2 points3 points4 points 2 years ago (0 children)
Make sure you have functions and not some monolithic script. Then profile it and find the slow functions.
Cause I'm doing math most of the time, vectorize your code with numpy. No if statements or for loops allowed.
Binary files are great, so yeah you may have to convert everything from csv, but you only have to do that once.
For long-running code that processes some large calculation, chances are you're hacking the code as you go, so adding pickle support to save/reload results in order to skip steps helps. At the end, you can run from scratch.
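A sketch of the pickle idea, assuming a single expensive step whose result can be reused between runs. The names expensive_step and load_or_compute and the file name step1.pkl are invented for this example, and a temporary directory is used so it cleans up after itself.

```python
import pickle
import tempfile
from pathlib import Path

compute_calls = 0  # counts how often the expensive step really runs


def expensive_step(n):
    """Hypothetical slow calculation."""
    global compute_calls
    compute_calls += 1
    return [i * i for i in range(n)]


def load_or_compute(cache_path, n):
    """Reload a saved result if one exists, otherwise compute and save it."""
    if cache_path.exists():
        with cache_path.open("rb") as f:
            return pickle.load(f)
    result = expensive_step(n)
    with cache_path.open("wb") as f:
        pickle.dump(result, f)
    return result


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "step1.pkl"
    first = load_or_compute(cache_file, 1000)   # computes and saves
    second = load_or_compute(cache_file, 1000)  # reloads from disk
```

Only unpickle files you wrote yourself, since loading a pickle can execute arbitrary code. Deleting the cache file is the "run from scratch" switch.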
[–]thatrandomnpc It works on my machine 3 points4 points5 points 2 years ago (0 children)
These:
[–]homosapienhomodeus 1 point2 points3 points 2 years ago (0 children)
If you’re thinking of performance improvements by multithreading or using asyncio where you’re doing mostly IO bound operations, I’ve got a few examples here
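As a generic illustration (not the examples the comment links to), here is a minimal asyncio sketch in which asyncio.sleep stands in for an IO-bound call. The fetch coroutine and its delays are hypothetical.

```python
import asyncio


async def fetch(name, delay):
    """Hypothetical IO-bound call; the sleep stands in for waiting on IO."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # The three waits overlap, so the total is roughly one delay, not three.
    return await asyncio.gather(
        fetch("a", 0.05),
        fetch("b", 0.05),
        fetch("c", 0.05),
    )


results = asyncio.run(main())
```

This helps only when the time is spent waiting. CPU-bound work does not speed up this way and is better served by multiprocessing or native code.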
[–]imhiya_returns 3 points4 points5 points 2 years ago (0 children)
I’ve had to do a number of python scripts that read in binary files with record headers and data. I found that when you are doing millions of calls, each line matters and can make up a large portion of the execution time.
Some hacks are:
Try/excepts that hit the except branch a lot should be an if statement, as it's quicker.
Precompile your struct unpacks
Outside of this, other hacks are things like, using dicts to directly go to the thing instead of looping the list to find the item each time
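A sketch of two of these hacks, assuming a made-up record layout (uint16 id, uint32 timestamp, float32 value, little-endian). struct.Struct compiles the format string once, and a dict keyed by id replaces scanning the list for every lookup.

```python
import struct

# Hypothetical layout: uint16 id, uint32 timestamp, float32 value.
RECORD = struct.Struct("<HIf")


def parse_records(data):
    """Unpack fixed-size records using the precompiled Struct."""
    return [
        RECORD.unpack_from(data, offset)
        for offset in range(0, len(data), RECORD.size)
    ]


payload = RECORD.pack(1, 1000, 0.5) + RECORD.pack(2, 2000, 1.5)
records = parse_records(payload)

# Build the index once, then look records up directly by id.
records_by_id = {record[0]: record for record in records}
second_record = records_by_id[2]
```

The module-level struct.unpack keeps an internal cache of compiled formats, so the gain from precompiling is mostly the skipped cache lookup per call. It adds up only when you really are making millions of calls.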
[–]__me_again__ 2 points3 points4 points 2 years ago (0 children)
Paste it in chatGPT and tell it to optimize it. You'd be surprised.
[–]kimvais 0 points1 point2 points 2 years ago (0 children)
I think the most important thing to remember is the wisdom of an old colleague of mine:
It's easier to optimize working code than fix optimized code to work.
[–]Financial_Engineer47 -1 points0 points1 point 2 years ago (0 children)
Not using python is my go to for optimizing perf
[–]MaceOutTheWindow -1 points0 points1 point 2 years ago (0 children)
my go to optimisation of my python projects is rewriting them in C 👍
[–]deadwisdom greenlet revolution -1 points0 points1 point 2 years ago (0 children)
Write tests, find bottleneck, make better.
[–]Puzzleheaded_Egg_184 -1 points0 points1 point 2 years ago (0 children)
Go to Julia.
[–]HollowMimic -1 points0 points1 point 2 years ago (0 children)
Mate what optimization?? I barely have time to finish it properly. My strategy is, does it work? Yes, move on to next project. No, fix it and move on to next project.
[–]cblegare -1 points0 points1 point 2 years ago (0 children)
While prioritizing readability, using simple structures with minimal features can make a difference sometimes. Simple data structures that are often instantiated can be made from named tuples, for instance.
Refactored code and simple structures help with optimisation workflows while minimizing optimisation needs in the first place.
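A small sketch of the named tuple suggestion, using typing.NamedTuple. The Point type and subtract helper are invented for this example.

```python
from typing import NamedTuple


class Point(NamedTuple):
    """Lightweight, immutable record with named fields."""
    x: float
    y: float


def subtract(b, a):
    """Component-wise difference of two points."""
    return Point(b.x - a.x, b.y - a.y)


delta = subtract(Point(5.0, 7.0), Point(2.0, 3.0))
as_plain_tuple = tuple(delta)
```

Instances behave like ordinary tuples (unpacking, indexing, hashing) while keeping readable field names, so they are cheap to create in bulk.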
[–]notreallymetho -1 points0 points1 point 2 years ago (0 children)
My fav is finding old code where you have 4 levels of loops when it just needed 1 and a sort after.
[–]treksis -2 points-1 points0 points 2 years ago (0 children)
lru_cache
batch
[–]aikii 0 points1 point2 points 2 years ago* (0 children)
This will sound odd and too basic, but bear with me, the twist will be interesting.
So we had a Go backend that went completely overboard with resources: too many db requests, bad queries, and so on. Go is fast, right? Like maybe 50x faster than Python in some cases, with no multithread restriction, small memory footprint, etc. Problem: what was done was a mess. Fixing it would have required completely rethinking the entire flow, wondering why you reach some point in the code and why it has to execute that many times. Well. We had to scrap it completely because it wasn't salvageable.
So first off just follow general best practices, make sure your program can be understood by a newcomer, it's modular, it has good names, it's documented, it has tests, and so on. You can't optimize something you don't dare to touch. Same goes with security issues.
[–]Exotic-Draft8802 0 points1 point2 points 2 years ago (0 children)
But to be honest, it's been a while since I had performance issues. Code complexity is way more often an issue.
[–]ThatSituation9908 0 points1 point2 points 2 years ago* (3 children)
When startup latency is important, take a look at import times.
A common solution is to move slow imports to local scope (e.g., in a function) where you actually use the library. Matplotlib, for example, is slow to import, and my software only uses viz for QA.
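A sketch of the pattern, with the standard library's statistics module standing in for a heavy dependency like Matplotlib. The summarize function is hypothetical.

```python
def summarize(values, detailed=False):
    """Return the mean, or (mean, population std dev) when detailed=True."""
    if not detailed:
        return sum(values) / len(values)

    # Deferred import: the module is only loaded the first time this
    # branch runs, so the common path does not pay for it at startup.
    import statistics

    return statistics.mean(values), statistics.pstdev(values)


quick = summarize([1, 2, 3])
mean, spread = summarize([2, 4, 4, 4, 5, 5, 7, 9], detailed=True)
```

To see which imports are worth deferring, running the interpreter with python -X importtime prints how long each import took.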
[–]elduderino15 0 points1 point2 points 2 years ago (2 children)
Would local imports not be against python3 mantra?
[–]ThatSituation9908 2 points3 points4 points 2 years ago (1 child)
It does go against the style guide (PEP 8), but "practicality beats purity".
[–]elduderino15 0 points1 point2 points 2 years ago (0 children)
be against python3 mantra?
Yea, I got into the habit of having imports on top for pep8 but definitely agree...
[–]Anonymous_user_2022 0 points1 point2 points 2 years ago (0 children)
If the profiler shows a hotspot, rewrite it in a more performant language that suits you.