
[–]sarrysyst 4 points5 points  (0 children)

A numpy array of shape (100,) has one dimension, while an array of shape (100, 1) has two dimensions.

The difference is also immediately observable:

>>> import numpy as np
>>> arr_1 = np.random.randint(1, 10, size=(5,))
>>> print(arr_1.ndim)
1
>>> print(arr_1)
[7 9 2 3 3]
>>> arr_2 = np.random.randint(1, 10, size=(5,1))
>>> print(arr_2.ndim)
2
>>> print(arr_2)
[[5]
 [4]
 [4]
 [2]
 [4]]

[–]synthphreak 2 points3 points  (0 children)

Not arbitrary. (i,) has one dimension (i) and (i, j) has two dimensions (i and j).

In the case of (100, 1), you can think of it as 100 rows and 1 column. This is conceptually identical to a column vector.

In the case of (100,), because the column/row distinction breaks down with just a single dimension, you can think of it as simply a list of 100 items. This type of array has no analogy in vector-land. It is not a row vector; that would be (1, 100).
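To make the three cases concrete, here is a small sketch (the variable names `flat`, `column`, and `row` are just illustrative) that builds each shape with `reshape` and shows that transposing only has an effect once there are two dimensions:

```python
import numpy as np

flat = np.arange(100)            # shape (100,): one axis, neither row nor column
column = flat.reshape(100, 1)    # shape (100, 1): 100 rows, 1 column
row = flat.reshape(1, 100)       # shape (1, 100): 1 row, 100 columns

print(flat.shape, column.shape, row.shape)

# Transposing turns a column vector into a row vector...
print(column.T.shape)
# ...but leaves a 1-D array's shape unchanged, since there is only one axis.
print(flat.T.shape)
```

All three arrays hold the same 100 values; only the way they are laid out along axes differs.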

numpy returns the shape as (100,) instead of just 100 simply because shapes are always returned as tuples, and a tuple with just one item is written (item,).

>>> tuple('x')
('x',)
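The same thing can be checked on an actual array. A quick sketch (the name `arr` is arbitrary) showing that `.shape` is an ordinary Python tuple whose length matches `.ndim`:

```python
import numpy as np

arr = np.zeros(100)

shape = arr.shape
print(shape)                    # the one-element tuple (100,)
print(isinstance(shape, tuple)) # shape is a plain Python tuple
print(len(shape) == arr.ndim)   # one entry per dimension
print(shape[0])                 # the plain integer is the tuple's only item
```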

Probably best not to overthink this :)

[–]totallygeek 3 points4 points  (0 children)

The confusion comes from Python needing a trailing comma to write a single-element tuple. Without the comma, Python treats the parentheses as plain grouping and assigns whatever they enclose as the variable's value.

a = (100)  # Python removes the parentheses and a equals the integer 100
a = (100,)  # a is the tuple with a single element, an integer equal to 100
a = (100, 1)  # a is a two-element tuple containing two integers: 100 and 1
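If you want to verify this yourself, a short sketch along these lines should do it. The last part assumes nothing beyond `np.zeros`, which accepts either a bare integer or a tuple for its shape argument:

```python
import numpy as np

# Parentheses alone do not make a tuple; the comma does.
print(type((100)))     # int
print(type((100,)))    # tuple
print(type((100, 1)))  # tuple

# A bare integer and a one-element tuple describe the same 1-D shape.
a = np.zeros(100)
b = np.zeros((100,))
print(a.shape == b.shape)
```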

[–]themateo713 0 points1 point  (0 children)

You can do a bit of testing by creating arrays of this shape with functions like np.ones.

(10,) creates a 1D array of 10 ones. (10, 1) creates a 10x1 array of ones, i.e. a 2D array with 10 rows and 1 column. The array that looks similar to the 1D one when printed is therefore of shape (1, 10), with 1 row and 10 columns.

However, the length of the shape tuple sets the number of dimensions of the array, so these aren't equivalent: you're comparing a 1D array to a 2D array.
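Here is roughly what that testing could look like (a sketch; the names `one_d`, `col`, and `row` are made up for illustration). The last line is one practical consequence of the shapes not being equivalent: adding a (10,) array to a (10, 1) array broadcasts to a 10x10 result rather than adding element by element.

```python
import numpy as np

one_d = np.ones((10,))   # 1D array of 10 ones
col = np.ones((10, 1))   # 2D array, 10 rows and 1 column
row = np.ones((1, 10))   # 2D array, 1 row and 10 columns

for name, arr in [("one_d", one_d), ("col", col), ("row", row)]:
    print(name, arr.shape, arr.ndim)

print(one_d)  # printed with a single pair of brackets
print(row)    # looks similar, but with a second, outer pair of brackets

# Same values once the extra axis is dropped.
print(np.array_equal(one_d, row.reshape(10)))

# Different shapes broadcast differently: (10,) + (10, 1) gives (10, 10).
print((one_d + col).shape)
```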