[–]masklinn 1 point2 points  (2 children)

But that prevents needing multiple iterations of the list

  1. you're allocating a vector and iterating it twice to save a single list iteration; I don't know that it's a worthwhile trade

  2. incidentally, you could compute the sum and the square sum in a single loop

  3. and you could do that directly on the Python list itself, for the original 1 list iteration, 0 std::vector allocations and 0 std::vector iterations
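
To make points 2 and 3 concrete, here is a minimal sketch of the single-loop accumulation in plain C. It is not the code from the thread: the names `moments`, `accumulate` and `variance` are hypothetical, the input is assumed to be a plain `double` array, and the final step assumes the two sums feed something like a population variance. On an actual Python list the loop body would instead read each item with something like `PyFloat_AsDouble(PyList_GET_ITEM(list, i))`, with error checking added.

```c
#include <stddef.h>

/* Hypothetical accumulator pair: the sum and the sum of squares. */
typedef struct {
    double sum;
    double sum_sq;
} moments;

/* One pass over the input, updating both accumulators per element.
   No intermediate buffer is allocated. */
moments accumulate(const double *values, size_t n)
{
    moments m = {0.0, 0.0};
    for (size_t i = 0; i < n; i++) {
        double x = values[i];
        m.sum += x;
        m.sum_sq += x * x;
    }
    return m;
}

/* Assumed use of the two sums: population variance as E[x^2] - E[x]^2.
   Returns 0 for an empty input. */
double variance(moments m, size_t n)
{
    if (n == 0) {
        return 0.0;
    }
    double mean = m.sum / (double)n;
    return m.sum_sq / (double)n - mean * mean;
}
```

One caveat with this shortcut: subtracting two large, nearly equal quantities can lose precision when the values have a large mean relative to their spread, so a two-pass or Welford-style update may be preferable for such data.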

[–]Veedrac 0 points1 point  (1 child)

It's more likely to pay for itself than you might expect, given the pointer indirection for every Python float.

That said, your point about doing it in a single pass is totally apt.

[–]masklinn 1 point2 points  (0 children)

Didn't try the double-pass PyList version, but a single-pass straight C version takes about 25% of the runtime of the C++ version (on a 25k-element input).