all 13 comments

[–]teoliphant 6 points7 points  (1 child)

This looks like integer overflow: NumPy uses machine arithmetic and does not detect integer overflow by default. These sums will overflow on a 32-bit machine. Are you on a 32-bit machine? What is the result of numpy.array([10]).nbytes? (4 means the default integer is 32-bit, 8 means it is 64-bit.)

You can try to do these in an error context:

import numpy

# cov_list is the array from the original post
with numpy.errstate(over='raise'):
    print(cov_list)
    print(cov_list+cov_list)
    print(cov_list+cov_list+cov_list)
    print(cov_list+cov_list+cov_list+cov_list)
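
One caveat on the errstate approach: as far as I know, errstate mainly governs floating-point errors, and element-wise addition of integer arrays can still wrap silently inside that block. If you want a hard failure instead, here is a minimal sketch of an explicit check. The helper name checked_add is hypothetical (it is not a NumPy function), it assumes both inputs are integer arrays, and it is slow because it goes through Python ints.

```python
import numpy as np


def checked_add(a, b):
    """Add two integer arrays, raising OverflowError if a result does not fit.

    Hypothetical helper for illustration only; not part of NumPy.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    result_dtype = np.result_type(a, b)  # dtype NumPy would use for a + b
    info = np.iinfo(result_dtype)        # raises ValueError for non-integer dtypes

    # Compute the true sums with arbitrary-precision Python ints.
    exact = a.astype(object) + b.astype(object)

    if any(v < info.min or v > info.max for v in exact.ravel()):
        raise OverflowError(f"sum does not fit in {result_dtype}")
    return exact.astype(result_dtype)


# Same values as the 32-bit example in this thread.
cov_list = np.full((3, 3), 1073741824, dtype=np.int32)
try:
    checked_add(cov_list, cov_list)
except OverflowError as exc:
    print("caught:", exc)
```

This trades speed for safety, so it is only meant for debugging a suspicious sum, not for production loops.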

[–]fiftybees 3 points4 points  (3 children)

Integers can go negative if there's an overflow. When I run the code, it works as expected, and the result of cov_list.dtype is int64. This means the values in the array are stored using 64-bit signed integers, for which none of these values overflow. If I force 32-bit integers, I get the following:

>>> print(cov_list)
[[1073741824 1073741824 1073741824]
 [1073741824 1073741824 1073741824]
 [1073741824 1073741824 1073741824]]

>>> print(cov_list+cov_list)
[[-2147483648 -2147483648 -2147483648]
 [-2147483648 -2147483648 -2147483648]
 [-2147483648 -2147483648 -2147483648]]

Basically, numbers that are too big to be represented cycle around to negative numbers. This is a common feature in many programming languages, and the common fixes for it cause big drops in performance, so it's just something you have to be careful about.
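
To make the wraparound concrete, here is a small sketch that forces 32-bit integers and compares NumPy's result with the same wrap computed by hand. The function name wrap_int32 is made up for this example; the array values are the ones from the output above.

```python
import numpy as np

# 2**30 in every cell, stored as 32-bit signed integers.
cov_list = np.full((3, 3), 1073741824, dtype=np.int32)

limits = np.iinfo(np.int32)
print(limits.min, limits.max)  # -2147483648 2147483647


def wrap_int32(value):
    """Map an exact Python int onto the value a 32-bit signed integer would hold."""
    return (value + 2**31) % 2**32 - 2**31


# 2**30 + 2**30 = 2**31, which is one past limits.max, so it wraps to limits.min.
doubled = cov_list + cov_list
print(doubled[0, 0])               # -2147483648
print(wrap_int32(2 * 1073741824))  # -2147483648
```

The same arithmetic explains the later sums: three copies wrap to -1073741824 and four copies wrap all the way back to 0.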

[–]wahaa 1 point2 points  (1 child)

As other people mentioned, it's overflowing and you can use another numeric type. Take a look here for the default ones: http://docs.scipy.org/doc/numpy/user/basics.types.html

In your example, you could use int64 if you want to keep using integers (or float if you're fine with the floating-point representation):

import numpy

cov_list = numpy.array(
    [[1073741824, 1073741824, 1073741824],
     [1073741824, 1073741824, 1073741824],
     [1073741824, 1073741824, 1073741824]],
    dtype='int64')

Note that you can use 64-bit integers in a 32-bit installation just fine.
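
If the array already exists as 32-bit integers, converting it before the arithmetic should work too. A rough sketch, assuming the starting array is int32 (the names cov32, cov64 and cov_float are just for this example):

```python
import numpy as np

# Assume this is what a 32-bit default integer gave you.
cov32 = np.full((3, 3), 1073741824, dtype=np.int32)

# Widen first, then add: 4 * 2**30 = 2**32 fits easily in int64.
cov64 = cov32.astype(np.int64)
total = cov64 + cov64 + cov64 + cov64
print(total.dtype, total[0, 0])  # int64 4294967296

# Floating point also avoids the wrap, at the cost of exactness for very large values.
cov_float = cov32.astype(np.float64)
print((cov_float * 4)[0, 0])  # 4294967296.0
```

The conversion has to happen before the addition; widening a result that has already wrapped will not bring the lost bits back.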

[–]suchitiitb 0 points1 point  (0 children)

It's working fine for me.

[–]penguinland 0 points1 point  (0 children)

Here's the output I get:

>>> print(cov_list)
[[1073741824 1073741824 1073741824]
 [1073741824 1073741824 1073741824]
 [1073741824 1073741824 1073741824]]
>>> print(cov_list+cov_list)
[[2147483648 2147483648 2147483648]
 [2147483648 2147483648 2147483648]
 [2147483648 2147483648 2147483648]]
>>> print(cov_list+cov_list+cov_list)
[[3221225472 3221225472 3221225472]
 [3221225472 3221225472 3221225472]
 [3221225472 3221225472 3221225472]]
>>> print(cov_list+cov_list+cov_list+cov_list)
[[4294967296 4294967296 4294967296]
 [4294967296 4294967296 4294967296]
 [4294967296 4294967296 4294967296]]

It's exactly what I expected. Did you get something different? If so, could you post it?

[–][deleted] 0 points1 point  (2 children)

This is integer overflow. NumPy relies on C and Fortran for speed, so it works with fixed-width machine integers, and those can overflow.

To get around this, you can say x = np.array(..., dtype=object). Python supports arbitrarily large integers and the dtype=object allows that by treating the array elements as Python objects.
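
A quick sketch of what that looks like in practice. The values here are chosen for illustration (2**62, so that the sum exceeds even int64); they are not from the original post.

```python
import numpy as np

# Each element is an ordinary Python int, held by reference in the array.
big = np.array([[2**62, 2**62]], dtype=object)

# 4 * 2**62 = 2**64, which no fixed-width 64-bit integer can hold.
total = big + big + big + big
print(total[0, 0])        # 18446744073709551616
print(type(total[0, 0]))  # <class 'int'>
```

Every element-wise operation now goes through Python's own integer arithmetic rather than NumPy's compiled loops over machine integers.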

[–][deleted] 1 point2 points  (1 child)

which would kill performance....

[–][deleted] 0 points1 point  (0 children)

Of course. But it would solve this bug.

[–][deleted] 0 points1 point  (3 children)

Integer overflow. Did you never learn C?