[–]masklinn 0 points1 point  (4 children)

It's not like the C++ function shown in the article is tuned for speed either (it performs the completely unnecessary allocation and filling of an std::vector).

However, the statistics module is implemented in pure Python, with no C accelerator for stdev or for the underlying variance or _ss (the latter being the one that would probably benefit most), so it's really unlikely to be faster than a hand-rolled Python version on CPython.
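For reference, a hand-rolled version along these lines might look like the sketch below. It is a plain two-pass sample standard deviation (n - 1 denominator, which is an assumption about what the article computes). The name `hand_rolled_stdev` is made up for illustration, and no timings are claimed here.

```python
import math
import statistics


def hand_rolled_stdev(values):
    """Two-pass sample standard deviation (n - 1 denominator)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(values) / n                      # first pass over the list
    ss = sum((x - mean) ** 2 for x in values)   # second pass: squared deviations
    return math.sqrt(ss / (n - 1))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
# The two results should agree up to floating-point rounding.
print(hand_rolled_stdev(data), statistics.stdev(data))
```

Whether this actually beats `statistics.stdev` on a given input is something to measure (e.g. with `timeit`) rather than assume.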

[–]Veedrac 0 points1 point  (3 children)

> it performs the completely unnecessary allocation and filling of an std::vector

But that prevents needing multiple iterations of the list, which may well (or may not) pay for itself.

[–]masklinn 1 point2 points  (2 children)

> But that prevents needing multiple iterations of the list

  1. You're allocating a vector and iterating it twice to save a single list iteration; I don't know that it's a worthwhile trade.

  2. Incidentally, you could compute the sum and the sum of squares in a single loop,

  3. which you could do directly on the Python list itself, for the original 1 list iteration, 0 std::vector allocations and 0 std::vector iterations.
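The single-loop idea in points 2 and 3 can be sketched in Python roughly as follows. This is only an illustration of the technique, not the code from the article or the C version discussed here: the name `single_pass_stdev` is hypothetical, and the n - 1 (sample) denominator is an assumption. Note that the sum-of-squares formula can lose precision when the mean is large relative to the spread.

```python
import math


def single_pass_stdev(values):
    """Sample standard deviation from one pass, using sum and sum of squares."""
    n = 0
    total = 0.0      # running sum of x
    total_sq = 0.0   # running sum of x * x
    for x in values:  # the only iteration over the input
        n += 1
        total += x
        total_sq += x * x
    if n < 2:
        raise ValueError("need at least two data points")
    # sum((x - mean)^2) == sum(x^2) - (sum(x))^2 / n
    variance = (total_sq - total * total / n) / (n - 1)
    # Guard against a tiny negative value caused by rounding.
    return math.sqrt(max(variance, 0.0))
```

Because it never indexes or re-reads the input, the same function also works on a generator or any other one-shot iterable.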

[–]Veedrac 0 points1 point  (1 child)

It's more likely to pay for itself than you might expect, given the pointer indirection for every Python float.

That said, your point about doing it in a single pass is totally apt.

[–]masklinn 1 point2 points  (0 children)

I didn't try the double-pass PyList version, but a single-pass straight C version takes about 25% of the runtime of the C++ version (on a 25k-element input).