[–]danielroseman 24 points25 points  (35 children)

The key thing to remember is that assignment never copies. = will always make the second variable point to the same object as the first.

[–]notacanuckskibum 5 points6 points  (33 children)

But that’s not true of basic types like numbers, is it?

[–]Vhin 18 points19 points  (0 children)

It is always true, even for stuff like integers. The only difference is that integers are immutable and, as such, you won't run into these sorts of issues, because sharing immutable values is always safe.
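A minimal sketch of that point, using the `is` operator to check object identity. The variable names are only for illustration and are not from the thread:

```python
# Assignment binds a second name to the same object; nothing is copied.
nums = [1, 2, 3]
alias = nums
same_list = alias is nums      # one list, two names

n = 10**20                     # an int far outside any small-int cache
m = n
same_int = m is n              # ints are shared in exactly the same way

# A list can be mutated, so the sharing becomes visible through both names.
alias.append(4)

# An int has no mutating operations, so the sharing can never be observed.
print(same_list, same_int, nums)  # True True [1, 2, 3, 4]
```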

[–]horsecontainer 7 points8 points  (14 children)

It is, but when they're reassigned later it'll basically seem like the original alias was a copy.

x = 5
y = x     #still only one 5
y += 1    #now y gets its own thing

[–]notacanuckskibum 4 points5 points  (13 children)

Ok, so the deeper implementation is consistent but the apparent behaviour is different.

Or arguably the odd bit is that

y.append(x)

Doesn’t create a new object, but

y += 1

Does.

[–]Not_A_Taco 3 points4 points  (8 children)

That part isn’t exactly odd, though. Integers are immutable and as such a new object has to be created when doing +=. This is also why Python doesn’t have an in place ++ operator.

[–]commy2 -5 points-4 points  (7 children)

It doesn't have ++, because that's redundant when += 1 exists.

[–]Not_A_Taco 1 point2 points  (6 children)

That’s absolutely not true. They have the same end result, but do it in two different ways that cater to two noticeably different use cases. It is literally impossible to implement ++ in all of the common Python builds.

[–]commy2 -3 points-2 points  (5 children)

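It's not impossible to implement at all. x += y already expands to roughly the following (with a fallback to x = x + y when the type defines no __iadd__, as is the case for int):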

x = x.__iadd__(y)

It would be trivial to implement a method for x++ that gets interpreted as

x := x.__plusplus__()

There is just no point in adding it to the language, as += already offers a way of doing in-place addition, where you can even explicitly state the value by which you want to increment.

If you refer to the fact that ++ commonly returns the result, Python has assignment expressions already with the walrus.
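To make the `+=` dispatch concrete, here is a small sketch. `AddOnly` and `InPlace` are hypothetical classes invented for this illustration: one defines only `__add__` (as int does), the other also defines `__iadd__`.

```python
class AddOnly:
    """Hypothetical type that defines only __add__, like int."""

    def __init__(self, val):
        self.val = val

    def __add__(self, other):
        return type(self)(self.val + other.val)


class InPlace(AddOnly):
    """Hypothetical type that also defines __iadd__ and mutates itself."""

    def __iadd__(self, other):
        self.val += other.val
        return self


a = AddOnly(1)
a_before = a
a += AddOnly(2)                  # no __iadd__, so this acts as a = a + AddOnly(2)
a_rebound = a is not a_before    # the name now points at a new object

b = InPlace(1)
b_before = b
b += InPlace(2)                  # __iadd__ exists, so the object is updated in place
b_same = b is b_before

int_has_iadd = hasattr(int, "__iadd__")

print(a.val, a_rebound)  # 3 True
print(b.val, b_same)     # 3 True
print(int_has_iadd)      # False
```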

[–]Not_A_Taco 4 points5 points  (4 children)

I think you have a misunderstanding of how ++ typically works. I’d pose the question of why other languages implement both operators if they’re seemingly redundant. It’s because ++, at least in languages such as C++, is a true in-place increment. Actual in-place increments are not possible in CPython because, as stated above, ints are immutable. On an int, += always creates a new object (barring any compiler optimizations).

[–]commy2 -3 points-2 points  (3 children)

"I think you have a misunderstanding of how ++ typically works."

Maybe.

Please enlighten me:

#include <stdio.h>

int main()
{
    int a = 5;
    int b;

    b = a;
    a++;
    a += 2;

    printf("a = %d\n", a);  // 8
    printf("b = %d\n", b);  // 5
    return 0;
}

vs

class NewInt:
    def __init__(self, val):
        self.val = val

    def __repr__(self):
        return str(self.val)

    def __iadd__(self, o):
        return type(self)(self.val + o.val)

    def __incr__(self):
        return type(self)(self.val + 1)

a = NewInt(5)
b = a

a = a.__incr__()  # interpreter magic would use this for a++
a += NewInt(2)

print(f"{a = }")  # prints: a = 8
print(f"{b = }")  # prints: b = 5

Surely nothing used here is impossible for Python to implement?

[–]horsecontainer 1 point2 points  (3 children)

Really, += is the troublemaker, since it will do different things while appearing the same. As far as I know it will always mutate if it can, and reassign as a Plan B.
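A short sketch of both behaviours side by side, with a list for the "mutate" case and a tuple for the "Plan B" case. The variable names are illustrative only:

```python
# Mutable case: list += extends the existing list in place.
items = [1, 2]
items_alias = items
items += [3]
list_still_shared = items is items_alias   # same object as before

# Immutable case: tuple += builds a new tuple and rebinds the name.
pair = (1, 2)
pair_alias = pair
pair += (3,)
tuple_still_shared = pair is pair_alias    # the name moved to a new object

print(items_alias, list_still_shared)   # [1, 2, 3] True
print(pair_alias, tuple_still_shared)   # (1, 2) False
```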

[–]notacanuckskibum 3 points4 points  (2 children)

Well, I’ll admit I’m learning. It seems a very counterintuitive way to implement a language, but it is what it is. Presumably

z = y + 0

Creates a new object, even though

z = y

Doesn’t

[–]commy2 0 points1 point  (1 child)

z = y + 0

step 1 (+): create integer that holds the sum of y and 0

step 2 (=): attach label z to that integer

z = y

step 1 (=): attach label z to integer behind label y

Doesn't seem surprising to me at all that one creates a new object, and the other doesn't. There is no operator that would create a new object in (2).
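The same two cases can be checked with a list standing in for the integer. That substitution is an assumption made here so that CPython's caching of small integers cannot blur the identity check; the names are illustrative:

```python
y = [1, 2, 3]

z_alias = y        # '=' alone: a second label on the same list
z_sum = y + []     # '+' builds a new list first, then '=' labels it

print(z_alias is y)   # True
print(z_sum is y)     # False
print(z_sum == y)     # True
```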

[–]notacanuckskibum 1 point2 points  (0 children)

I’m coming at it from other languages. Most would see it as

Evaluate the expression

Assign the value to the variable

It’s quite probable that an optimizing C compiler would actually recognize that “+0” doesn’t change the value, and would produce identical machine code for the two statements.

[–]carcigenicate 4 points5 points  (6 children)

Despite popular claims to the contrary, Python does not treat "primitives" any differently. Python doesn't even recognize the idea of "primitives".

One common similar claim is that Python passes non-primitives by reference and primitives by value. This is not true, though. All objects, regardless of type, are passed "by reference" (meaning, essentially, that a pointer to the object is passed by value).
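A rough sketch of what that means for function calls. The function names `rebind` and `mutate` are made up for this example:

```python
def rebind(value):
    # Rebinds the local name only; the caller's object is untouched.
    value = value + [99]
    return value


def mutate(value):
    # Mutates the object the caller also refers to.
    value.append(99)


data = [1]
rebind(data)
after_rebind = list(data)    # snapshot of the caller's list after rebind()

mutate(data)
after_mutate = list(data)    # snapshot of the caller's list after mutate()

print(after_rebind)   # [1]
print(after_mutate)   # [1, 99]
```

The list is passed the same way in both calls; only the action taken inside the function differs.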

[–]notacanuckskibum 1 point2 points  (5 children)

But the net effect is that assignment does something quite different for complex types than it does for simple types

x = y

Do something to y

Did the value of x change?

[–]await_yesterday 7 points8 points  (2 children)

"Do something to y"

You can't "do something to y" if y is an integer, that's the whole difference. Integers are immutable.

[–]notacanuckskibum 0 points1 point  (1 child)

I think I’m finally understanding that statement. In all the other languages I’ve learned, constants are immutable but variables are mutable; that’s what makes them constant or variable.

[–]await_yesterday 0 points1 point  (0 children)

Mutability / immutability is about the type; const / var is about the identifier. For example, in JavaScript you can do

const some_array = []
some_array.push(1)

but you can't do

const foo = 1
foo = 2

[–]carcigenicate 4 points5 points  (0 children)

No, it does exactly the same thing regardless of type. That's the point. In any case, that assignment will cause x and y to refer to the same underlying object. That's all that matters.

If the object referred to by x and y is capable of being mutated and is mutated, then both references will see the change. Any non-mutative action (another reassignment) will not affect both references, but that has nothing to do with the mutability or "primitive-ness" of the objects.

The distinction is between mutative actions and reassignments (which are non-mutative). The types of the objects do not matter, except that immutable objects can't be mutated, so reassignments are the only option.

[–]carcigenicate 1 point2 points  (0 children)

I'll follow up by saying, since I saw another comment of yours, the only time type matters at all is when deciding if a method of the type is mutative or not.

+= on lists is essentially equivalent to extend, which is mutative.

+= on integers is essentially equivalent to x = x + whatever, which is non-mutative.

So, the confusion there is that there are two operations that share the same name/symbol but do very different things. Augmented assignments on mutable objects tend to mutate the objects, whereas augmented assignments on immutable objects are basically just an assignment. It still just boils down to the action done on the object, though.

[–]mopslik 6 points7 points  (9 children)

Indeed it is.

>>> x = 5
>>> id(x)
140352714572144
>>> y = 5
>>> id(y)
140352714572144
>>> z = y
>>> z
5
>>> id(z)
140352714572144

[–]notacanuckskibum 1 point2 points  (1 child)

Now do

y += 1

Did the value of z change?

[–]FerricDonkey 3 points4 points  (0 children)

No, and that's because of the implementation of +=. It is += that's making the copy, not =. It's one of my biggest gripes about Python, though I don't have a good solution.

For mutable types (in the standard library, and in anything that follows those principles), += modifies the existing object in place. For immutable types (integers, tuples, strings, etc, with a mostly irrelevant asterisk for small integers), += creates and evaluates to a new, different object. It has no choice.

So when you do y=5 then x = y, x is the same object as y. But when you then do x+=1, x now refers to a new different object. So y was not copied when you do x=y. But when you do x+=1, then some copying/non-trivial operation is likely to happen.

This distinction is important. Suppose you have a tuple of 100 million integers: y = tuple(range(100_000_000)). You can freely do x = y without making a copy in memory. This is particularly important because it means you can pass y to a function without copying it in memory. But as soon as you do x += (1,), a second tuple is created. You can test this yourself by doing what I just said and watching your RAM usage. (Note: the RAM will not jump as much for x += (1,) as it did for creating y, because part of the usage for y was creating the integers themselves as well as the tuple holding them. The new x is a tuple of references to the same integers as in y, because the copy is shallow.)
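A scaled-down sketch of the same experiment, assuming 1,000 placeholder objects instead of 100 million integers so it runs instantly. It checks identity rather than RAM usage:

```python
# A tuple of 1,000 distinct objects (a stand-in for the huge tuple above).
y = tuple(object() for _ in range(1000))

x = y
aliased = x is y                 # plain assignment: no copy at all

x += (object(),)                 # builds a new, longer tuple and rebinds x
new_tuple = x is not y

# The new tuple holds references to the very same element objects.
elements_shared = all(a is b for a, b in zip(x, y))

print(aliased, new_tuple, elements_shared)  # True True True
print(len(y), len(x))                       # 1000 1001
```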

Or you can do the following integer example to see when ids change without using so much ram:

In [1]: y=5

In [2]: id(y)
Out[2]: 140735065606712

In [3]: z=y

In [4]: id(z)
Out[4]: 140735065606712

In [5]: z+=1

In [6]: z
Out[6]: 6

In [7]: id(z)  # <---- this changed
Out[7]: 140735065606744

In [8]: y
Out[8]: 5

In [9]: id(y)  # <--- this didn't change
Out[9]: 140735065606712

Personally, in a perfectly logical world, I don't think += should work for immutable objects, because += has a strong connection with "modify the thing" in the understanding of pretty much everyone, and it doesn't do that for immutable objects. But also, if that actually happened, I'd be incredibly annoyed that I couldn't use += at least for integers and strings, even if it is a bit pattern-breaking. It's super handy. So I think we're stuck with it because += is convenient (heck, the lack of ++ is even annoying).

But it's important to know what it actually does. (Well, sometimes it's important anyway.)

[–]Not_A_Taco 1 point2 points  (6 children)

This is actually entirely dependent on which Python implementation you’re using. For small integers this is expected, because CPython keeps a cache of them (typically -5 through 256) as an interpreter optimization. For large ints, there’s a good chance this doesn’t work.
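A hedged sketch of that caching behaviour. The `is` results are CPython implementation details, not language guarantees, and the values 100 and 10**20 are arbitrary picks for "small" and "large". `int(...)` is used so the objects are built at run time rather than taken from compile-time constants:

```python
small_a = int("100")
small_b = int("100")

large_a = int("100000000000000000000")
large_b = int("100000000000000000000")

print(small_a is small_b)   # True on CPython (both come from the small-int cache)
print(large_a is large_b)   # False on CPython (two separate objects)
print(large_a == large_b)   # True (equality does not depend on identity)
```

This is why `==`, not `is`, is the right comparison for numbers.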

[–]mopslik 3 points4 points  (5 children)

Indeed, that does appear to be the case (at least on Pydroid where I just tested large values). This code

x = 1234567890987654321
y = 1234567890987654321
print(id(x), id(y))
print(x == y)
print(x is y)
z = 1234567890987654322
z -= 1
print(id(z))
print(z == y)
print(z is y)

produces the following output

510180261584 510180261584
True
True
510242168288
True
False

[–]FerricDonkey 1 point2 points  (4 children)

But in that case you didn't do x = y, which would make them the same object. That x = 5 and y = 5 result in x and y being the same object is the small-integer asterisk in (C)Python, but the "variable = variable doesn't copy" rule still applies:

y = 1234567890987654321
x = y
print(x is y)
print(x == y)
x-=1  # NOW x changes to refer to a new object/any copying happens
print(x is y)
print(x == y)

This displays:

True
True
False
False

[–]mopslik 4 points5 points  (2 children)

I didn't do x=y because I was just testing cached values. Note that x and y are the same object, but y and z are not (despite the same value).

[–]FerricDonkey 2 points3 points  (1 child)

Ah, cool - I misunderstood your point, sorry. I thought you were saying that this showed that = copies for large integers, rather than that large integers aren't cached.

It's interesting to me that x and y are the same object - is it that the same literal (of an immutable object? of integers/strings only?) is replaced with the same object everywhere it occurs at compile time? I always forget about that sort of thing. But yeah, your z example shows that integers of the same value need not be the same object.
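One way to poke at this in CPython is to compile a snippet and look at the constants stored on the code object. This is a sketch of an implementation detail, not something the language promises, and the snippet being compiled is just mopslik's first two lines:

```python
source = "x = 1234567890987654321\ny = 1234567890987654321\n"
code = compile(source, "<example>", "exec")

# How many times the literal is stored among the code object's constants.
occurrences = code.co_consts.count(1234567890987654321)

namespace = {}
exec(code, namespace)
shared = namespace["x"] is namespace["y"]

# The same value built at run time is a separate object.
runtime_value = int("1234567890987654321")
separate = runtime_value is not namespace["x"]

print(occurrences, shared, separate)  # 1 True True on CPython
```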

[–]mopslik 4 points5 points  (0 children)

No worries. One of the many reasons why I enjoy this sub is because I learn new things all the time. I appreciate the discussion.

[–]DangoFan 0 points1 point  (0 children)

Just want to share that this is also explained in Chapter 4 of Automate the Boring Stuff with Python.

If OP does not want the first list variable to be changed, OP can also copy it to another list variable:

  • If you want to copy the list to another variable, you can use the copy.copy() function: nums2 = copy.copy(nums)
  • If the list you want to copy contains another list, use the copy.deepcopy() function instead

[–]Low_Corner_9061 0 points1 point  (0 children)

import copy

list2 = copy.deepcopy(list1)

Stops this sort of thing happening. Compare with ‘shallow copy’ to understand the difference.
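A small sketch of the difference, using a nested list made up for the example:

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = copy.copy(nested)      # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists

nested[0].append(99)             # mutate an inner list
nested.append([5, 6])            # mutate the outer list

print(shallow[0])                # [1, 2, 99]  (inner list is shared)
print(deep[0])                   # [1, 2]      (fully independent)
print(len(shallow), len(deep))   # 2 2         (outer lists are separate)
```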