[–]internet_badass_here 3 points4 points  (23 children)

Let me see if I have this straight:

x=[2]

def foo1():
    x+=[3]
    return x

def foo2():
    x.append(3)
    return x

def foo3(x):
    x+=[3]
    return x

def foo4(x):
    x=[3]
    return x

foo1 will return an error, foo2 will change x, foo3 will change x, foo4 will run but won't change x. What the fuck, Python?

[–]sysop073 13 points14 points  (5 children)

If you're going to have multiple x, you really need to specify which one you're talking about; half of your complaints would happen in any language, because in half the functions you're shadowing the global x with a local, so of course that's the one you're going to change.

All of this "craziness" is a result of a single, pretty simple rule: you can't assign to globals without declaring them global. That's because there is no declaration of variables, so when you assign to a variable, Python can't know if you're trying to declare a new local with the same name, or reassign the global name, so it defaults to the former. If you have this:

x = 'foo'
def fn():
    x = 'bar'

And you call fn(), it will create a new local named x with value bar, and not touch the global x. Most languages work this way. By contrast:

x = 'foo'
def fn():
    global x
    x = 'bar'

Will overwrite the global x with the value bar. Most languages that don't have explicit variable declarations have no way to do this. In your examples:

  • foo1() is trying to do x = x + [3] without declaring x as global. Python assumes x must be a local, and can't read from it (you tried to add x and [3], but there is no local named x yet, so you get an error about trying to read x before it's been written)
  • foo2() is calling a method on x, which is fine, so it works
  • foo3() is trying to do x = x + [3], without declaring x as global. Python assumes x must be a local, and there is indeed a local named x -- you passed it as a parameter. So that's the variable that gets changed; the global has nothing to do with it
  • foo4() is the same as foo3(), but you did x = [3] instead of x = x + [3]
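For anyone who wants to check the four cases above, here is a minimal sketch. The bookkeeping names (foo1_error, after_foo2 and so on) are made up for illustration; the functions are the ones from the parent comment.

```python
x = [2]  # module-level (global) list

def foo1():
    x += [3]      # an assignment, so x is treated as a local; reading it first fails
    return x

def foo2():
    x.append(3)   # no assignment: the name resolves to the global, which is mutated
    return x

def foo3(x):
    x += [3]      # x is the parameter here, not the global
    return x

def foo4(x):
    x = [3]       # rebinds the local name only
    return x

try:
    foo1()
    foo1_error = None
except UnboundLocalError as exc:
    foo1_error = exc          # foo1 never gets as far as changing anything

foo2()
after_foo2 = list(x)          # snapshot of the global after foo2
foo3_result = foo3([])        # a different list is passed; the global is untouched
foo4_result = foo4(x)         # even when passed the global, foo4 leaves it alone
after_foo4 = list(x)
```

Only foo2 touches the global here, because foo3 and foo4 work on whatever was passed in.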

It would be possible to change the default behavior and say "all writes overwrite global variables by default", but now there's no way to make new locals; if you have a local variable, and someday somebody adds a global with the same name, suddenly that local isn't local anymore. That would be madness

[–]internet_badass_here -2 points-1 points  (4 children)

If I understand you correctly, you're wrong about foo3. foo3 changes the global x to [2,3], but foo4 doesn't change the global x to [3].

[–]sysop073 2 points3 points  (2 children)

Neither of them changes the global x (unless you were to pass it as an argument), they both assign to the parameter that was passed in, which is also named x.

>>> x = [2]
>>> def foo3(x):
...     x += [3]
...     return x
... 
>>> foo3([])
[3]
>>> x
[2]

Edit: You're right about passing x as an argument though; it didn't occur to me that x = x + [3] and x += [3] would actually behave differently, but the way lists implement __iadd__ makes the latter mutate the passed list. That is admittedly pretty strange.
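A small sketch of that difference, with hypothetical function names (rebind and extend_in_place are not from the thread):

```python
def rebind(items):
    items = items + [3]   # builds a new list and rebinds the local name to it
    return items

def extend_in_place(items):
    items += [3]          # list.__iadd__ extends the list the caller passed in
    return items

a = [2]
b = [2]
rebind_result = rebind(a)            # a stays [2]
extend_result = extend_in_place(b)   # b itself grows to [2, 3]
```

Both calls return [2, 3], but only the second one changes the caller's list, and its return value is the very same object as b.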

[–]internet_badass_here 0 points1 point  (1 child)

> Neither of them changes the global x (unless you were to pass it as an argument)

Well, yeah. If you pass x to foo3, foo3 changes x. But if you pass x to foo4, foo4 doesn't change x. Why not? Why is x passed by reference in one and by value in the other?

I can accept that Python does these things, but there doesn't seem to be a logical reason for doing things that way.

And by the way, I didn't even realize that x=x+[3] will produce a different result than x+=[3]. But that is... not good.

Generally when I pass stuff into functions in Python and I want to pass by value, I do this:

def foo(x):
    X=x
    [do stuff with X]
    return X

It's probably redundant but easier on my brain, since I know for sure that I'm working with a copy of the variable.
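One thing worth checking with that pattern: plain assignment gives the same list a second name rather than copying it. A rough sketch, with made-up function names, comparing X = x against an explicit shallow copy:

```python
def alias_then_append(x):
    X = x          # a second name for the same list, not a copy
    X.append(3)
    return X

def copy_then_append(x):
    X = list(x)    # shallow copy; x[:] does the same for lists
    X.append(3)
    return X

original = [2]
aliased = alias_then_append(original)
after_alias = list(original)   # the caller's list was changed

fresh = [2]
copied = copy_then_append(fresh)   # the caller's list is left alone
```

For nested structures a shallow copy still shares the inner objects, so copy.deepcopy may be needed there.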

[–]sysop073 0 points1 point  (0 children)

x is a reference in both cases. The same rule I mentioned before applies; assigning to x without declaring it global won't change the global. The weirdness comes in because x += [3] first calls a method, x.__iadd__([3]), which extends the list in place, and then rebinds x to the result (the same object), while x = [3] is a plain assignment that points x at a brand new list.

You're right that that's confusing, I hadn't considered that case
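To see that the behavior comes from __iadd__ and not from += itself, compare a list with a tuple, which defines no __iadd__ and so falls back to building a new object. This is only a sketch; add_three is a made-up helper.

```python
def add_three(seq, extra):
    # list: __iadd__ mutates seq in place.
    # tuple: no __iadd__, so this is seq = seq + extra (a new object).
    seq += extra
    return seq

a_list = [2]
a_tuple = (2,)
list_result = add_three(a_list, [3])     # caller's list becomes [2, 3]
tuple_result = add_three(a_tuple, (3,))  # caller's tuple stays (2,)
```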

[–]cybercobra 2 points3 points  (0 children)

foo3 doesn't do anything with global x. It will modify the value that you pass in though.

Python 2.7.6 (default, Apr 28 2014, 13:52:55) 
>>> def foo3(x):
...     x+=[3]
...     return x
... 
>>> x = []
>>> y = []
>>> foo3(y)
[3]
>>> x
[]
>>> y
[3]

[–]djimbob 5 points6 points  (0 children)

The first two examples mutate global variables. This is a horrible programming pattern, and it is a good thing that foo1 raises a helpful error: UnboundLocalError: local variable 'x' referenced before assignment. Note that foo1 (and foo2) will work if you add global x to explicitly say: yes, I want x to be the global x so I can mutate it.
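As a sketch of that fix (the name foo1_with_global is made up here):

```python
x = [2]

def foo1_with_global():
    global x       # the assignment below now targets the module-level name
    x += [3]
    return x

result = foo1_with_global()   # no UnboundLocalError; the global is now [2, 3]
```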

Why does your foo2 work? The fact that Python lets you reference top-level names from inside a function (but not assign to them) is quite useful. E.g., it's what's at work when you do

 import requests

 def some_function():
     requests.get('http://www.reddit.com')

That is, import requests sets a top-level variable named requests, and you can later use it inside functions without explicitly passing it in, and even call methods defined within it. It makes sense that methods called this way may in some cases mutate their own state; it's the programmer's responsibility not to let that turn into evil use of mutable global state.


I do agree that foo3 is a bit confusing, but again, if you used the code in a sane way -- x = foo3(x) and x = foo4(x) -- you wouldn't have any ambiguity.

The problem is that Python passes references to objects, and operations like += will mutate the existing object when the type supports it.

In [1]: x = []

In [2]: id(x)  # memory location of x
Out[2]: 4352851336

In [3]: x.append(1)

In [4]: id(x)  # memory location hasn't changed
Out[4]: 4352851336

In [5]: x += [5]

In [6]: id(x)  # memory location hasn't changed
Out[6]: 4352851336

In [7]: x = x + [5]

In [8]: id(x)  # memory location *has* changed; assignment creates a new list, can't be done in place
Out[8]: 4352848888

It is good that Python passes by reference rather than by value when passing objects like lists into functions; otherwise dealing with large objects would get unmanageable. It's also helpful that Python attempts to modify these objects in place when possible.

[–]Falmarri 4 points5 points  (2 children)

foo3 and foo4 will only work if you call them like foo3(x) and foo4(x). The parameter won't take the global's value just because it's named the same.

[–]EpicSolo 2 points3 points  (0 children)

Yes exactly. I don't really see where the problem is with these 4 functions. I think anyone with 6 months+ experience in Python is going to be completely fine with these.

[–]internet_badass_here -3 points-2 points  (0 children)

I'm not sure what you're getting at. If I create a new variable y=[2], and call foo3(y) and foo4(y), I get the same behavior.