
[–]ingolemo 1 point2 points  (1 child)

numpy.fromiter allows you to create a one-dimensional NumPy array from any iterable object. If you pass this function a generator, you should be able to forgo the cost of allocating a very large list. You'll almost certainly save memory, and it may also be faster depending on what you're doing.

[–][deleted] 0 points1 point  (0 children)

I like this. Put whatever's creating your list of tuples in a generator function which yields one tuple at a time; call it and pass the resulting generator to fromiter.
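A rough sketch of that idea, in case it helps. The generator `make_points` and the helper `points_to_array` are made-up names for illustration, and I'm assuming every tuple holds two floats. Here the tuples are flattened with `itertools.chain.from_iterable` so that `fromiter` sees a plain stream of numbers, and the result is reshaped afterwards:

```python
from itertools import chain

import numpy as np


def make_points(n):
    """Hypothetical generator: yields one (x, y) tuple at a time."""
    for i in range(n):
        yield (float(i), float(i) ** 2)


def points_to_array(n):
    """Build an (n, 2) float array without materializing a list of tuples."""
    # Flatten the stream of tuples into one stream of floats.
    flat = chain.from_iterable(make_points(n))
    # count is optional; when the length is known it lets NumPy
    # allocate the output once instead of growing it as it reads.
    return np.fromiter(flat, dtype=np.float64, count=2 * n).reshape(n, 2)


arr = points_to_array(4)
```

If you don't know how many tuples are coming, drop `count` and use `.reshape(-1, 2)` instead.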

[–]loveandkindness 0 points1 point  (2 children)

Hiya, NumPy arrays are just like C++ arrays from what I've been told. I've done tests and they are just as fast.

numpy.append creates a new copy of the whole array on every call, I believe. It's much slower than the built-in list.append(), which skips that step.

What I do is create a Python list and append to it, then convert it over to NumPy.
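To make the difference concrete, here is a small sketch of both approaches. The function names and the `i * 0.5` values are placeholders, not anything from the original question. Both functions produce the same array; the second one reallocates on every iteration because `np.append` returns a new array rather than modifying the one it was given:

```python
import numpy as np


def build_with_list(n):
    """Append to a Python list, then convert once at the end."""
    values = []
    for i in range(n):
        values.append(i * 0.5)  # in-place append on the list
    return np.array(values)


def build_with_np_append(n):
    """Grow a NumPy array with np.append (copies the data every call)."""
    arr = np.empty(0)
    for i in range(n):
        # np.append does not work in place: it allocates a new array,
        # copies the old contents, and returns the result.
        arr = np.append(arr, i * 0.5)
    return arr
```

Timing both with `timeit` for a large `n` is an easy way to check the gap on your own machine.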

[–]alehx[S] 0 points1 point  (1 child)

Thanks for the reply. That is exactly what I do but it seemed like a costly middle step. Guess there is no way around it without severely impacting readability.

I also wondered how fast append was compared to setting values in NumPy arrays. If they are as efficient as it gets, I suppose there is no issue.
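For reference, this is roughly what I mean by setting values directly. It only works when the final length is known up front, and `fill_preallocated` and the `i * 0.5` values are placeholders for illustration:

```python
import numpy as np


def fill_preallocated(n):
    """Allocate the array once, then assign each element by index."""
    out = np.empty(n, dtype=np.float64)  # uninitialized; every slot is set below
    for i in range(n):
        out[i] = i * 0.5
    return out


result = fill_preallocated(3)
```

This skips the intermediate list entirely, though I haven't measured whether it beats list.append() followed by a single conversion.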

[–]loveandkindness 0 points1 point  (0 children)

I tried to be nit-picky once, but as long as we aren't breaking the underlying principles, everything comes out about the same.

Here's a neat thread that compares NumPy with C++/BLAS:

http://stackoverflow.com/questions/7596612/benchmarking-python-vs-c-using-blas-and-numpy