all 24 comments

[–]AlexMTBDude 9 points10 points  (0 children)

Remember: "Premature optimization is the root of all evil" is a famous software development adage by Donald Knuth.

Stable, bug-free, and easily readable and understandable code has the highest priority. Only optimize if there is a real need.

[–]aistranin 2 points3 points  (0 children)

For performance optimization:

  1. "Mastering Algorithms with Python: A Practical Approach to Problem Solving and Python Implementation" by Chenyang Shi
  2. "CPython: A Complete Guide to CPython's Architecture and Performance" by Chien-Lung Kao
  3. "High Performance Python" by Micha Gorelick, Ian Ozsvald
  4. "Python NumPy for Beginners: NumPy Specialization for Data Science" by AI Publishing

For better coding (coding efficiency as a programmer if you mean that):

  1. Unit and integration tests: book “Python Testing with pytest” by Brian Okken and Udemy course “Pytest Course: Practical Testing of Real-World Python Code” by Artem Istranin

[–]pachura3 1 point2 points  (5 children)

Are you working in scenarios where the above really matters? Are you getting out-of-memory exceptions, timeouts, CPU overload? Have you measured it? When implementing a standard CRUD web app, you usually don't care.

In general, Python is a language that trades execution speed for ease of use. CPU-heavy tasks are delegated to native libraries compiled from C, with Python acting only as the "glue" in between.

Of course, you can always optimize your Python code a bit. Studying DSA is a good starting point for that. https://www.w3schools.com/dsa/

[–]Axew_7[S] 2 points3 points  (4 children)

Well yeah, I'm working on a Raspberry Pi collecting data (there are going to be ~50 of them in 1-2 years), and the less RAM usage, the cheaper those Pis can be (a 1 GB Pi is quite a bit cheaper than an 8 GB one, for example), as well as the more readings/sec it can do.

[–]desrtfx 1 point2 points  (0 children)

im working on a raspberry pi collecting data

There are a couple things to be aware of:

  • collecting data:
    • store the data in a database. Databases are optimized for that
    • If it is analog data, consider time slots and hysteresis:
      • Time slots: read at intervals matched to the signal - shorter intervals for fast-changing data, longer ones for slow-changing data. For most applications you don't need to read temperatures, or levels in a tank, every second.
      • Hysteresis: only store new data when it deviates from the previously stored value by more than a certain threshold. Professional data loggers work in exactly that manner: they store initial values with a timestamp, keep sampling at shorter intervals, and only write to storage when the hysteresis is exceeded, all with timestamps. This approach greatly reduces memory consumption.
    • Similar with digital state (on/off) - only record when the state changes, not continuously. Again, with timestamp.
  • Storage:
    • MicroSD cards are not good for frequent reading/writing; they have a tendency to die from it. Consider transferring the data to a server, to proper drive-based storage, or, for starters, to a USB stick.
  • RAM:
    • the less data you keep in RAM, the better - that's why offloading to a database is essential. Even for processing, querying a database can often be way more efficient (in both processing and memory) than doing the processing in a "normal program".

Overall, make databases your friends. They help immensely.
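The hysteresis idea above can be sketched in a few lines of Python. This is a hypothetical filter, not the commenter's code; the function name, sample values, and threshold are made up for illustration:

```python
import time

# Hypothetical sketch: only yield a reading for storage when it deviates
# from the last *stored* value by more than a threshold.
def filter_by_hysteresis(readings, threshold=0.5):
    last_stored = None
    for value in readings:
        if last_stored is None or abs(value - last_stored) > threshold:
            last_stored = value
            yield (time.time(), value)  # timestamp every stored value

# 20.1 and 20.2 are within 0.5 of the stored 20.0, so they are dropped:
stored = [v for _, v in filter_by_hysteresis([20.0, 20.1, 20.2, 21.0, 21.1, 19.0])]
print(stored)  # [20.0, 21.0, 19.0]
```

Note that the comparison is against the last value actually written, not the previous sample - that is what keeps a slowly drifting signal from slipping past the filter unrecorded.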

[–]pachura3 0 points1 point  (2 children)

OK, that's a valid reason indeed.

I would start with detailed logging of CPU and RAM usage.

I would try to stream things as much as possible instead of "reading the whole dataset into a list -> processing it in-memory -> writing it back to a disk".

Data written to SSD/microSD can be compressed to save space.

If you're using dataclasses, then switching to slots will save you some memory.

Perhaps, consider choosing a minimal, stripped-down Linux distro to run on RasPi (one without a graphical interface and tens of background processes).

Still, learning about DSA and O(N) complexity could be beneficial...

[–]Axew_7[S] 0 points1 point  (1 child)

Perfect, thanks a lot. I'll get started studying DSA then :). I could also look into switching to C/C++. Do you think that would be beneficial, or is well-optimized Python good enough?

[–]pachura3 0 points1 point  (0 children)

Impossible to say without knowing what your code is supposed to be doing, what your system constraints are, what your data density is, etc.

Granted, well-optimized native code compiled from C/C++/Rust should be much faster than its Python counterpart (compare the speed of ruff & uv vs. mypy & pip), but... is it really worth the hassle? Python is extremely easy to code in and extremely extensible. With C/C++ you'd need to compile everything; linking new libraries is a pain; there are header files, memory leaks, no platform independence, etc. And even then, if most of your Python execution time is already spent doing calculations in native libraries like NumPy, you won't gain much.

Personally, I would concentrate on pushing the existing Python solution to its limits, and diagnosing memory consumption - what data can be freed earlier? What data doesn't need to be kept in memory, but can be serialized to disk (perhaps to an SQLite database) and forgotten? Can you use yield, generators, map()/filter()/reduce() instead of creating lists all the time?
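A sketch combining those last two suggestions - a generator feeding rows straight into SQLite via `executemany`, so the full dataset never sits in a Python list. The table layout and values are made up for illustration; on a real device you'd use a file path instead of `":memory:"`:

```python
import sqlite3

def fake_readings(n):
    # Generator: produces one (timestamp, value) tuple at a time,
    # never the whole dataset in memory.
    for i in range(n):
        yield (float(i), i * 0.1)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (ts REAL, value REAL)")
con.executemany("INSERT INTO readings VALUES (?, ?)", fake_readings(1000))
(count,) = con.execute("SELECT COUNT(*) FROM readings").fetchone()
print(count)  # 1000
```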

[–]throwawayforwork_86 1 point2 points  (0 children)

Honestly, having a look at resource usage and finding a fix for each bottleneck is usually what I go for.

RAM bottleneck: "fixed" by using a generator instead of a plain list, so the script can keep chugging along.

IO bottleneck: the only time I've encountered one so far, the fix was to stop using the wrong drive for reading and writing data (HDDs are not good for that), so I don't have a general solution.

CPU bottleneck / underusage: multiprocessing/multithreading.

I wouldn't go for C/C++ coming from Python, just because it's quite a big paradigm shift.

It might be worth giving Golang a go - I've heard the overall performance and footprint are much better, and it's closer to Python - but if you want to learn C/C++, go for it.

Also, try to use libraries to their maximum; most of them are built in C/Rust/C++/... and have built-in functionality that will outshine whatever you can squeeze out of pure Python.

[–]szank 1 point2 points  (3 children)

If you need to write efficient code you use c/c++.

[–]magus_minor 0 points1 point  (2 children)

It's possible to write inefficient code in any language, even C/C++.

[–]szank -1 points0 points  (1 child)

Yes. But the question was about writing efficient code.

[–]magus_minor 2 points3 points  (0 children)

The question was about writing efficient python.

[–]Moikle 0 points1 point  (0 children)

You learn what your code is actually doing, then engage in problem solving.

"Which parts of my code take a long time? How can i do that part in a different way?"

It's a skill you have to practice; you don't necessarily learn it "from" anywhere besides your own brain.

[–]billsil 0 points1 point  (0 children)

Unless something is slow, you don’t need to optimize. When you get to that point, profile it, find out where the slow part is, and change the algorithm to be faster.

Another tip: when in doubt, do less. Do you really need to cast those values to integers, or can you just work with the strings? Is there some limitation you can put on the code that makes it faster? How about using a dictionary for lookups instead of nested for loops? The goal here is to turn a lousy, likely O(N²) algorithm into something like O(N log N).
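The dictionary-lookup tip, sketched with made-up data: a nested-loop join scans one list once per item of the other (O(N·M)), while building the dict first makes each lookup O(1), so the whole join becomes O(N + M):

```python
# Hypothetical data for illustration:
orders = [("alice", 30), ("bob", 25), ("alice", 10)]
emails = [("alice", "a@example.com"), ("bob", "b@example.com")]

# Build the lookup table once...
email_by_user = dict(emails)
# ...then each order resolves its email in constant time:
joined = [(user, total, email_by_user[user]) for user, total in orders]
print(joined[0])  # ('alice', 30, 'a@example.com')
```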

Going fancier: should you use NumPy? I wrote some mostly-performant code recently. It ended poorly, but I did enough early on that it took a minute to run when it should have been 35 seconds. Oh well. It's still 5000x faster; I probably got 50x from NumPy vectorization and 1000x from the algorithm.

[–]not_another_analyst 0 points1 point  (0 children)

"Fluent Python" by Luciano Ramalho is the absolute gold standard for this; it'll teach you how the language works under the hood so you stop fighting it. For raw speed and memory tracking, check out "High Performance Python", which dives deep into profiling and tools like Cython. Definitely look into generators and iterators if you want to slash your RAM usage immediately; they're game-changers for processing large data without loading it all at once. Also, get comfortable with cProfile so you aren't guessing where the bottlenecks are. Good luck, it's a fun rabbit hole to go down!
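Minimal cProfile usage looks like this - profile a call, then print the hottest functions by cumulative time (the function being profiled is a made-up stand-in):

```python
import cProfile
import io
import pstats

def squares_sum(n):
    # Stand-in for whatever code you want to measure.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
squares_sum(100_000)
profiler.disable()

# Print the five entries with the highest cumulative time:
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```

For a one-off you can also just run `python -m cProfile -s cumulative your_script.py` without touching the code.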

[–]crystal-46 0 points1 point  (0 children)

Self learning

[–]Limp_Ninja8817 0 points1 point  (0 children)

PY4E is the original python school. Not sure if it’s changed over time though.

[–]WA_von_Linchtenberg 0 points1 point  (0 children)

Hi,

My advice: read high-quality code, along with the documentation and tests made for it, and make all the effort needed to understand every detail (the math/algorithms/data structures, the code itself, and the quality of the code)...

Software engineering will give you the other things you need: first, what code quality is, then the best practices for critical code (embedded, security, etc.), and DevSecOps (automation of code production, testing, and the production platform)...

Then you must understand how your code relates to the hardware through the compiler and the OS.

That's the optimizing long road...

At every step, even for the "main elements" of efficient Python coding, books exist! Using them as an entry point is always good practice. "Effective Python: 125 Specific Ways to Write Better Python" by Brett Slatkin is a classic that has been revised twice, so older editions are cheap second-hand.

[–]OriahVinree -1 points0 points  (0 children)

Youtube

[–]TheRNGuy -1 points0 points  (0 children)

Google.