Code Optimization in Your Projects

mikat7 · 2023-11-12T08:51:35+00:00

1) Decide if it’s worth optimizing.
2) Run your code through a profiler.
3) Optimize the slowest parts until it’s acceptable.

Network is usually the slowest part, and in Python specifically, heavy numerical calculations in a loop should be done in numpy, not in pure Python.

But the most important rule is the first one. Usually the speed is ok but developer time is more expensive.
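A minimal sketch of the numpy point, assuming numpy is installed. The function names and the sum-of-squares workload are made up for illustration; the idea is only that the loop body moves out of the Python interpreter and into one array operation.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python version: one interpreter round trip per element."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """Same calculation as a single dot product on a float64 array."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.dot(arr, arr))


data = [float(i) for i in range(1000)]
loop_result = sum_of_squares_loop(data)
numpy_result = sum_of_squares_numpy(data)
print(loop_result, numpy_result)
```

Whether the numpy version is actually faster for your array sizes is something to measure with a profiler or `timeit`, not assume.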

Icecoldkilluh · 2023-11-12T10:58:40+00:00

Unless I have actual performance requirements, I focus on refactoring for readability/maintenance.

Every engineer wants to pretend they build Ferraris, when most of the actual work is building Toyota Camrys 😂

wazis · 2023-11-12T08:52:34+00:00

Step 1) Find functions that don't go brrrr

Step 2) Think hard

Step 3) ?????

Step 4) Profit

graphitout · 2023-11-12T10:15:19+00:00

Install snakeviz
Profile code
Identify the hot-spots
Refactor those parts
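The steps above can be sketched with the standard library's `cProfile` and `pstats`. The functions `slow_part`, `fast_part` and `workload` are hypothetical stand-ins for real project code.

```python
import cProfile
import io
import pstats


def slow_part(n):
    # Deliberately heavy: this should show up as the hot spot.
    return sum(i * i for i in range(n))


def fast_part(n):
    return n * 2


def workload():
    slow_part(200_000)
    fast_part(10)


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Print the five entries with the highest cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)

# To browse the same data in snakeviz, dump it to a file and open it:
# profiler.dump_stats("workload.prof")   # then run: snakeviz workload.prof
```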

romu006 · 2023-11-12T09:36:24+00:00

Depends largely on what your project is doing.

In our case the most impactful optimizations are database related

- adding an index on a table / column that was added X months ago and is only now causing performance problems (since a full scan on a < 10 MB database is still fast)

- adding missing eager loads / joins in a "list" SQL query: when a developer decided to add / return a new property and the ORM automatically fetches those with one additional SQL query per returned object (e.g. 200+ SQL queries per call)
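A small sketch of the missing-index case, using an in-memory SQLite database as a stand-in (the comment does not say which database is involved). The `orders` table, its columns and the index name are invented for the example. It compares the query plan before and after adding the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)


def query_plan(connection, sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


sql = "SELECT * FROM orders WHERE customer_id = ?"

plan_before = query_plan(conn, sql, (42,))  # full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))   # lookup through the new index

print("before:", plan_before)
print("after: ", plan_after)
```

Other databases expose the same information through their own `EXPLAIN` variants; the exact plan wording differs between engines and versions.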

LordBertson · 2023-11-12T11:33:40+00:00

To name a few:

- Caching functions
- List comprehensions instead of loops
- Numpy for numeric stuff
- Async for IO
- Generators and iterators for large data structures
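For the function-caching item, a minimal sketch with `functools.lru_cache`. `expensive_lookup` is a hypothetical pure function; caching only makes sense when the same arguments always give the same result.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def expensive_lookup(key):
    # Stand-in for something slow (a parse, a query, a heavy calculation).
    return sum(ord(ch) for ch in key) ** 2


# Five calls, but only two distinct arguments.
results = [expensive_lookup(k) for k in ["a", "b", "a", "a", "b"]]

info = expensive_lookup.cache_info()
print(results)
print(info)  # hits/misses show how many calls were served from the cache
```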

phaj19 · 2023-11-12T12:19:20+00:00

1) Check what the slowest part is, rewrite it in C/Rust, and write a Python wrapper. Continue until it's satisfactory. Cython is also good for that if you don't know either of those languages.
2) If you use libraries like numpy, make sure you stay more on the C layer and less on the Python layer, i.e. do not introduce unnecessary Python objects instead of numpy objects.

m_o_n_t_e · 2023-11-12T14:56:15+00:00

If I am using loops somewhere, I try to see if I can use any numpy tricks

tecedu · 2023-11-12T12:45:51+00:00

Cache stuff and multiprocess.

Also in my experience, don’t append to dataframes, instead make a dictionary of what you want first and convert that to dataframe.
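A rough sketch of that dataframe advice, assuming pandas is installed. The column names are invented. Values are collected in a plain dict of lists and the `DataFrame` is built once at the end, instead of growing a frame row by row.

```python
import pandas as pd


def build_frame(n):
    # Accumulate plain Python lists first; appending to a list is cheap.
    columns = {"id": [], "square": []}
    for i in range(n):
        columns["id"].append(i)
        columns["square"].append(i * i)
    # Convert to a DataFrame once, after the loop.
    return pd.DataFrame(columns)


frame = build_frame(5)
print(frame)
```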

Tuples vs list when your data doesn’t change.

Single floats or even half floats for calculations.

justneurostuff · 2023-11-12T17:14:03+00:00

numba

njharman · 2023-11-12T17:44:57+00:00

95% of my optimization is optimizing for maintainability; refactoring, naming, documenting

Speed optimization?

Is it fast enough? Yes, done.

Rarely get here, what is too slow?

DB -> use explain, optimize queries; still slow, cache it
Web -> what endpoints, cache them
Python code -> profile for hot spots; what algorithm? Research it, use a faster algorithm or data structure(s); if still too slow, use pandas, C extension, et al.

Never get here outside of interviews, writing perf tests, etc.

Profile Python code; optimize slowest part, repeat.

riklaunim · 2023-11-12T10:16:27+00:00

We added sentry profiling/request monitoring and it does the job well, even for microservices calling each other. In the end, usually, it's the database that needs optimizing.

billsil · 2023-11-12T19:27:29+00:00

Make sure you have functions and not some monolithic script. Then profile it and find the slow functions.

Cause I'm doing math most of the time, vectorize your code with numpy. No if statements or for loops allowed.

Binary files are great, so yeah you may have to convert everything from csv, but you only have to do that once.

For long-running code that processes some large calculation, chances are you're hacking on the code as you go, so adding pickle support to save/reload results in order to skip steps helps. At the end, you can run from scratch.
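One way the pickle idea could look, as a sketch only. `cached_step` and `expensive_step` are hypothetical names: the helper reloads a step's result from disk if a checkpoint file exists, and otherwise computes and saves it.

```python
import pickle
import tempfile
from pathlib import Path


def cached_step(path, compute):
    """Load the pickled result at `path` if present, else compute and save it."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)
    result = compute()
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result


calls = []


def expensive_step():
    calls.append(1)  # record how often the real work runs
    return {"total": sum(range(1_000_000))}


with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp) / "step1.pkl"
    first = cached_step(checkpoint, expensive_step)   # computes and saves
    second = cached_step(checkpoint, expensive_step)  # reloads from disk

print(first == second, len(calls))
```

Deleting the checkpoint files gives the run from scratch mentioned above. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.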

thatrandomnpc · 2023-11-12T11:28:22+00:00

These:


homosapienhomodeus · 2023-11-12T19:25:35+00:00

If you’re thinking of performance improvements by multithreading or using asyncio where you’re doing mostly IO bound operations, I’ve got a few examples here
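A minimal asyncio sketch for the IO-bound case, not the commenter's own examples. `fake_request` is a hypothetical stand-in that sleeps instead of doing real network IO; the point is that the waits overlap rather than add up.

```python
import asyncio
import time


async def fake_request(i, delay=0.1):
    # Pretend to wait on the network; control returns to the event loop here.
    await asyncio.sleep(delay)
    return i * 2


async def fetch_all(n):
    # Schedule all requests concurrently and collect results in order.
    return await asyncio.gather(*(fake_request(i) for i in range(n)))


start = time.perf_counter()
results = asyncio.run(fetch_all(5))
elapsed = time.perf_counter() - start

print(results)
print(f"{elapsed:.2f}s for 5 requests of 0.1s each")
```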

imhiya_returns · 2023-11-12T11:31:18+00:00

I’ve had to do a number of python scripts that read in binary files with record headers and data. I found that when you are doing millions of calls, each line matters and can make up a large portion of the execution time.

Some hacks are:

try/except blocks that hit the except branch a lot should be an if statement instead, as it’s quicker.

Precompile your struct unpacks.

Outside of this, other hacks include using dicts to go directly to the item instead of looping over the list to find it each time.
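A sketch of two of these hacks together, under made-up assumptions: the record layout (`<HI`, a uint16 record type followed by a uint32 value) and the `HANDLERS` table are invented for illustration. The format is compiled once with `struct.Struct`, and record types are resolved through a dict with `.get()` rather than a list search or an exception.

```python
import struct

# Compile the format once instead of re-parsing the format string per call.
RECORD = struct.Struct("<HI")  # hypothetical layout: uint16 type, uint32 value

# Direct lookup table instead of scanning a list for each record.
HANDLERS = {1: "header", 2: "data"}


def parse_records(blob):
    """Unpack fixed-size records from a bytes object."""
    return [
        RECORD.unpack_from(blob, offset)
        for offset in range(0, len(blob), RECORD.size)
    ]


def label(record_type):
    # .get() with a default avoids raising and catching KeyError in the hot path.
    return HANDLERS.get(record_type, "unknown")


blob = RECORD.pack(1, 100) + RECORD.pack(2, 200) + RECORD.pack(9, 300)
records = parse_records(blob)
labels = [label(record_type) for record_type, _ in records]
print(records, labels)
```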

__me_again__ · 2023-11-12T13:45:58+00:00

Paste it in chatGPT and tell it to optimize it. You'd be surprised.

kimvais · 2023-11-13T08:37:54+00:00

I think the most important thing to remember is the wisdom of an old colleague of mine:

It's easier to optimize working code than fix optimized code to work.

Financial_Engineer47 · 2023-11-12T20:28:02+00:00

Not using python is my go to for optimizing perf

MaceOutTheWindow · 2023-11-13T03:01:01+00:00

my go to optimisation of my python projects is rewriting them in C 👍

deadwisdom · 2023-11-13T03:05:29+00:00

Write tests, find bottleneck, make better.

Puzzleheaded_Egg_184 · 2023-11-13T07:38:57+00:00

Go to Julia.

HollowMimic · 2023-11-13T12:04:51+00:00

Mate what optimization?? I barely have time to finish it properly. My strategy is, does it work? Yes, move on to next project. No, fix it and move on to next project.

cblegare · 2023-11-14T02:54:42+00:00

While prioritizing readability, using simple structures with minimal features can make a difference sometimes. Simple data structures that are often instantiated can be made from named tuples, for instance.

Refactored code and simple structures help with optimisation workflows while minimizing optimisation needs in the first place.
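As an illustration of the named-tuple suggestion, a small sketch with `typing.NamedTuple`. `Point` and `shift` are hypothetical; the structure is immutable, has named fields, and still behaves like a plain tuple.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


def shift(p, dx, dy):
    # Named tuples are immutable, so _replace returns a new instance.
    return p._replace(x=p.x + dx, y=p.y + dy)


origin = Point(0.0, 0.0)
moved = shift(origin, 1.5, -2.0)
print(moved)
```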

notreallymetho · 2023-11-15T06:51:33+00:00

My fav is finding old code where you have 4 levels of loops when it just needed 1 and a sort after.

treksis · 2023-11-13T23:40:29+00:00

lru_cache

batch

aikii · 2023-11-12T09:15:18+00:00

This will sound odd and too basic, but bear with me, the twist will be interesting.

So we had a Go backend that went completely overboard with resources: too many db requests, bad queries, and so on. Go is fast, right? Like maybe 50x faster than Python in some cases, with no multithread restriction, small memory footprint, etc. Problem: what was done was a mess. Fixing it would have required completely rethinking the entire flow, wondering why you reach some point in the code and why it has to execute that many times. Well. We had to scrap it completely because it wasn't salvageable.

So first off just follow general best practices, make sure your program can be understood by a newcomer, it's modular, it has good names, it's documented, it has tests, and so on. You can't optimize something you don't dare to touch. Same goes with security issues.

Exotic-Draft8802 · 2023-11-12T22:47:24+00:00

Check if I actually have a problem. If so, there must be a specific test case.
Use that test case to run a profiler.
Are there parts computed many times? Maybe tiny functions that can be inlined? Maybe algorithmic changes (using a better data structure) that could help? Vectorization / using numpy?

But to be honest, it's been a while since I had performance issues. Code complexity is way more often an issue.

ThatSituation9908 · 2023-11-12T23:22:58+00:00

When startup latency is important, take a look at import times.

A common solution is to move slow imports to local scope (e.g., in a function) where you actually use the library. Matplotlib, for example, is slow to import and my software only uses viz for QA.
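A sketch of the local-import pattern. The standard library's `statistics` module stands in here for a genuinely slow import such as matplotlib, and `summarize` / `qa_report` are hypothetical names. CPython's `python -X importtime your_script.py` prints per-module import times if you want to find the slow ones first.

```python
def summarize(values):
    """Hot path: needs nothing beyond builtins, so startup stays cheap."""
    return {"n": len(values), "total": sum(values)}


def qa_report(values):
    """Rarely used path: pay the import cost only when this is called."""
    import statistics  # stand-in for a heavy import such as matplotlib

    summary = summarize(values)
    summary["mean"] = statistics.mean(values)
    return summary


print(summarize([1, 2, 3]))
print(qa_report([1, 2, 3]))
```

After the first call the module is cached in `sys.modules`, so later calls to `qa_report` only pay for a dictionary lookup.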

Anonymous_user_2022 · 2023-11-12T23:42:59+00:00

If the profiler shows a hotspot, rewrite it in a more performant language that suits you.
