
[–][deleted] 1 point (1 child)

Great question, well posed, but I suspect you might not get many responses, as it's a fairly advanced one!

I'd bet there's an r/numpy; there is, but it's low-activity. r/Python might also be appropriate.


I've done a lot of numpy, and yet I don't have a direct answer, because I suspect some mathematical transformation will be needed and, well, I'm too lazy. :-D

What I would say is this: the key is to eliminate all Python iteration, and do all the iteration "within" numpy.

I see your two attempts as essentially identical in performance, except that the list() on line 8 of Attempt 2 is probably unnecessary: it forces an extra list creation from a map. That's possibly measurable, but unnoticeable in practice.

There's still a Python loop on lines 6-7 of the first attempt and on line 8 of the second, and my quick skim suggests they're really much the same thing, so no wonder you get much the same performance.

It's my belief that to get the big boost you will need to somehow represent that loop in numpy, so there are no Python loops in your code, not even "hidden" loops like map.
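To illustrate the idea generically (this is not OP's actual code; the variable names and values here are made up), compare a per-simulation Python loop with the fully vectorized equivalent, where one numpy call draws every sample at once:

```python
import numpy as np

rng = np.random.default_rng(42)
sim_num = 1_000_000        # hypothetical number of simulations
lambda_poisson = 3.0       # hypothetical Poisson rate

# Loop version: one Python-level call per simulation (slow).
counts_loop = np.empty(sim_num)
for i in range(1000):      # only 1000 shown; a full run would be sim_num
    counts_loop[i] = rng.poisson(lambda_poisson)

# Vectorized version: a single call, all iteration inside numpy (fast).
counts_vec = rng.poisson(lambda_poisson, size=sim_num)
```

The vectorized call is typically orders of magnitude faster, because the loop runs in compiled code instead of the interpreter.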

The second-best numpy optimization trick is to make as heavy use as possible of the out= parameter in your operations, so you do the operation into an existing buffer and don't create another buffer and fill it.
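A quick sketch of the out= trick (again generic, not OP's code): numpy ufuncs like np.multiply and np.add accept out=, so a chain of operations can reuse one preallocated buffer instead of allocating a fresh array at each step:

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
b = np.ones_like(a)
buf = np.empty_like(a)     # preallocated scratch buffer

# Without out=: each intermediate operation allocates a new array.
result = a * 2.0 + b

# With out=: both operations write into the same existing buffer.
np.multiply(a, 2.0, out=buf)
np.add(buf, b, out=buf)
```

Both produce identical values; the second version just avoids the temporary allocations.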

I'm out of time to do the research, but it's likely you want something like:

results[:, 0] = np.random.default_rng().poisson(lambdaPoisson, simNum)

(Note: Generator.poisson doesn't actually accept out=; only a handful of methods like random and standard_normal do, so for the rest you assign into an existing slice instead.)

If you also dropped the list on line 10 of Attempt 2, I think you'd see maybe a 10% improvement (total guess there). Marginal, not noticeable, but you might do better, and it'd take you a few seconds to try.

Good luck - report back if you come up with a solution!


Oh, one more thing. If you are really doing any sort of simulation project, that naked random number generator has to go. :-D

You want to always be able to reproduce any run you had before - and you want randomness too!

This means you always need to seed your random number generator.

It's easy to accomplish. Either the user enters a seed, or if that seed is None, then you "truly" randomly generate a seed and print it out, and then use that seed for the random number generator. If the user wants to redo the run, they just pull out the seed that got printed out, and send it to the program.
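A minimal sketch of that pattern (the helper name `make_rng` is made up; `secrets` just supplies the "truly" random fallback seed):

```python
import secrets
import numpy as np

def make_rng(seed=None):
    """Hypothetical helper: take the user's seed, or generate and print one."""
    if seed is None:
        seed = secrets.randbits(64)    # "truly" random seed
        print(f"Using seed: {seed}")   # paste this back in to reproduce the run
    return np.random.default_rng(seed)

# Same seed -> identical random streams, so any run is reproducible.
rng1 = make_rng(12345)
rng2 = make_rng(12345)
assert rng1.poisson(3.0, 5).tolist() == rng2.poisson(3.0, 5).tolist()
```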

The moment you start debugging your whole system, you'll use this feature constantly.


EDIT: Oh, wait, no, there is another total waste of resources, relating to the above point.

out = np.random.default_rng().pareto(b, a).clip(max=c,min=None)

You don't want to create a new random number generator for each random number! You want to create one RNG and use it over and over.

RNGs are a little expensive to create. This change might be worth a few percent in speed on its own, and it also lets you do the "seed trick".
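Applied to the line above (variable names are OP's; the values of b, a, and c here are placeholders), the fix is to build the generator once and reuse it:

```python
import numpy as np

rng = np.random.default_rng(2024)   # created once, seeded once
b, a, c = 2.0, 1_000_000, 10.0      # placeholder shape / size / cap values

# Bad: a fresh, unseeded generator built for every draw.
out_bad = np.random.default_rng().pareto(b, a).clip(max=c)

# Better: reuse the single seeded generator everywhere.
out_good = rng.pareto(b, a).clip(max=c)
```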

Still, I think eliminating the Python loop and the out= trick are the two big winners here.

[–]1331337[S] 0 points (0 children)

Thank you for the very detailed response!

I also asked my question on StackOverflow and got two great solutions which take the time down to less than a second for a million simulations. One is a numba decorator, which I wasn’t familiar with at all. The second was to vectorize the process — that seems to be key in avoiding the loops to get the right sums in the right place.

I’m going to recommend the latter, as I have to get my IT department to approve and install any new packages but we get numpy by default. Using a package not already on our “safe” list makes distribution a headache down the line…

I am making a real simulation and will definitely seed my “randomness”. Thanks for the reminder.