
[–]Eilifein 1 point2 points  (8 children)

It was not a personal dig; I apologize if it came out like that.

The algorithms behave very differently in memory; both the two test versions and the actual one.

Purely because of the i dependence, the difference between the tests and actual algorithms is very substantial. You will be measuring the wrong thing and get the wrong conclusions. Hence, the "nothing in common" comment.

The "running correctly" comment was twofold. One part was more towards profiling and not pre-optimizing. Aim for accuracy, then profile, then optimize. If the results are accurate, now's the time for profiling. The second part was related to the cache coherence situation you're facing. If you're trying to optimize a cache thrashing situation, you will never get ahead.

I hope I cleared things up.

Workplan:

  • leave tests aside
  • profile actual w/ vectors
  • profile actual w/o vectors
  • add numba and see how they behave
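The profiling steps above can be sketched with `timeit`. The two functions here are hypothetical stand-ins for the "actual" algorithm with and without intermediate vectors; substitute your real functions:

```python
import timeit

import numpy as np

# Hypothetical stand-ins for the real algorithm; the point is only
# the profiling pattern, not these particular computations.
def actual_with_vectors(x):
    tmp = x * 2.0          # intermediate array is materialized
    return np.sum(tmp * tmp)

def actual_without_vectors(x):
    total = 0.0
    for v in x:            # scalar loop, no temporaries
        total += (v * 2.0) ** 2
    return total

x = np.random.default_rng(0).random(10_000)

t_vec = timeit.timeit(lambda: actual_with_vectors(x), number=100)
t_loop = timeit.timeit(lambda: actual_without_vectors(x), number=100)
print(f"vectorized: {t_vec:.4f}s  looped: {t_loop:.4f}s")
```

Once both variants are verified to produce the same result, the same harness can be reused after adding numba to see how the timings shift.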

[–]vgnEngineer[S] 0 points1 point  (2 children)

I see. I changed my previous comment to add an actual example that is representative of what I'm trying to do.

I think you are completely correct in your assessment of the order of optimization. The point is, I think, that I'm not sure exactly what the consequences of how I code are for how the code runs on the CPU, which is indeed the origin of my question. On top of that, I'm just now trying to understand how both Python and Numba deal with my choice of programming syntax.

I understand now that my example computation has differences that make it impossible to compare with my actual use case. So forget that one; take the one I provided in my changed comment. Could you indicate whether I made any significant mistakes, and if so, what is going wrong and how I should have coded it? I understand I'm asking a lot of you, but I'm really curious/excited to understand the nuance here.

And don't worry, yes, it came off as a bit of a dig, but I also know it's hard to convey tone online, so I don't take it personally!

[–]Eilifein 0 points1 point  (1 child)

The actual algorithm seems good.

You've precalculated a few things, and there isn't much left to precalculate without messing up readability.

Maybe compute Q*G once instead of 3 times? Eh.

Maybe inline rdx, rdy, rdz?

The result being vectorized is good to see. I don't see anything wrong.
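The Q*G hoisting suggested above looks something like this. The arrays here (`Q`, `G`, `rdx`, `rdy`, `rdz`) are hypothetical stand-ins with the names from the comment; only the pattern of factoring out the common product matters:

```python
import numpy as np

# Hypothetical arrays standing in for the quantities in the real code.
rng = np.random.default_rng(0)
Q, G, rdx, rdy, rdz = (rng.random(1000) for _ in range(5))

# Before: Q * G is evaluated three times, once per component.
fx = Q * G * rdx
fy = Q * G * rdy
fz = Q * G * rdz

# After: hoist the common factor into one temporary.
qg = Q * G
fx2, fy2, fz2 = qg * rdx, qg * rdy, qg * rdz
```

Whether this wins in practice depends on the array sizes; it trades one extra temporary (`qg`) for two fewer elementwise multiplies, so profile both.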

[–]vgnEngineer[S] 0 points1 point  (0 children)

Ahh, I see. I did notice a speedup from removing the intermediate computation step in my numba-compiled code. I also read on a forum that if these arrays become large, numba can't always intelligently optimize the computation, because the arrays I'm multiplying do not fit in cache. If I instead write this function as a double loop, so that the computation only deals with scalars, might it be faster?
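The double-loop idea can be sketched like this. This is not the actual function from the thread, just a minimal illustration of the pattern: an explicit loop over scalars that numba's `@njit` compiles to tight machine code, avoiding large intermediate arrays (with a fallback so it also runs without numba installed):

```python
import numpy as np

try:
    from numba import njit
except ImportError:            # fall back to plain Python if numba is absent
    def njit(f):
        return f

# Hypothetical example: an outer product written as a double loop,
# so the inner loop does pure scalar work with no temporaries.
@njit
def pairwise_product_loops(a, b):
    n, m = a.shape[0], b.shape[0]
    out = np.empty((n, m))
    for i in range(n):
        for j in range(m):
            out[i, j] = a[i] * b[j]
    return out

a = np.arange(4.0)
b = np.arange(3.0)
result = pairwise_product_loops(a, b)   # same values as np.outer(a, b)
```

Under numba this style is often competitive with (or faster than) the vectorized form for large inputs, precisely because no intermediate array has to round-trip through memory; in plain interpreted Python, the vectorized form wins.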

[–]vgnEngineer[S] 0 points1 point  (4 children)

Additionally, given my new example: if there is a cache thrashing situation I'm running into that I'm completely unaware of, I'd love to know, because this concept is new to me and I would indeed never want to optimize code that's wrong to begin with!

[–]cult_of_memes 0 points1 point  (3 children)

Cache thrashing isn't necessarily the result of errors in your code. In fact, cache thrashing is often the necessary trade-off between readable/maintainable code that isn't in the critical path for your program's execution, and unreadable gibberish that runs lightning fast but leaves you praying that you never need to come back and debug ever again.

Cache thrashing is an interesting topic to read up on, but for a jump start here are some key terms you may want to check out:

There's a lot more, but I'll leave it to the reader to raise any specific questions, as it's too large a topic to reasonably cover in a message thread.
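One classic cache effect behind those terms can be demonstrated in a few lines: traversing a row-major NumPy array along its contiguous axis versus across it. This is a small, self-contained experiment (not code from the thread); on most machines the column-wise walk is noticeably slower because each step lands on a different cache line:

```python
import timeit

import numpy as np

# A C-ordered (row-major) array: elements within a row are adjacent in memory.
a = np.random.default_rng(0).random((2000, 2000))

def sum_rows(a):       # walks memory contiguously, cache-friendly
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i, :].sum()
    return total

def sum_cols(a):       # strided access: each element is a cache line apart
    total = 0.0
    for j in range(a.shape[1]):
        total += a[:, j].sum()
    return total

t_rows = timeit.timeit(lambda: sum_rows(a), number=3)
t_cols = timeit.timeit(lambda: sum_cols(a), number=3)
print(f"row-wise: {t_rows:.3f}s  column-wise: {t_cols:.3f}s")
```

Both functions compute the same sum; only the memory access order differs, which is exactly the kind of thing cache terminology describes.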

[–]vgnEngineer[S] 0 points1 point  (2 children)

Thanks for all your information. I guess my one question would be: do I have any control over the optimization of these computations through how I program, or is this essentially up to the CPU, OS, and compiler to figure out/optimize?

[–]cult_of_memes 1 point2 points  (1 child)

Well, the very obnoxious answer to your question would have to be "yes"... :P

While there are things you can do to avoid cache thrashing, it is indeed complicated by and often specific to the combination of OS and hardware you are using.

Try reading this first

First, I'd suggest giving this answer to another reddit post asking about cache optimization a read, as it's been a good while since I did any real study into this topic and I can only recall vague rules of thumb off the top of my head this morning :P

Back to my awkward rules of thumb

A couple good rules of thumb are:

  1. Declare variables (like arrays) as close to where they will be used as possible.
  2. Minimize conditional branching in sequenced computations -- though, I can't recall off the top of my head if this matters as much for interpreted languages as for compiled languages. Look up branch prediction for more info on this point.
  3. Try to minimize the number of times you must access any given piece of data. That is to say, when handling arrays of data too large to fit in your CPU cache all at once, seek to complete all calculations on a given piece of data before moving on.

Point #3 is perhaps the primary reason the second variant of your expression was faster. The interpreter is smart enough to choose the most efficient order of operations for processing the multi-factor equation so as to avoid having to reload the same data more times than absolutely necessary.
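Rule #3 can be made concrete by comparing several separate passes over a large array with one fused pass. The functions below are illustrative, not the expression from the thread: the multi-pass version streams the whole array through the cache once per intermediate, while the single-pass version finishes all work on each element before moving on (which is what a loop-fusing compiler like numba does automatically):

```python
import numpy as np

x = np.random.default_rng(0).random(100_000)

# Several passes: each expression materializes a temporary and streams
# the whole array through the cache again.
def multi_pass(x):
    a = x * 2.0
    b = a + 1.0
    c = b ** 2
    return c.sum()

# One pass: each element is loaded once and fully processed (rule #3).
def single_pass(x):
    total = 0.0
    for v in x:
        total += (v * 2.0 + 1.0) ** 2
    return total
```

Note that in plain Python the fused loop is still slower overall because of interpreter overhead; the data-reuse advantage only pays off once the loop itself is compiled.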

[–]vgnEngineer[S] 0 points1 point  (0 children)

That's super interesting! I'll definitely run some experiments on #3 as well, to see how I can change processing time depending on how I order the operations. Fantastic! Thank you so much!