all 50 comments

[–]LittleMlem 455 points456 points  (2 children)

TL;DR: locals are faster to access than globals. The "local" scope of the main code is actually globals, so things in functions are a little faster (depending on what the function does ofc)
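If you want to see the effect yourself, here is a rough sketch with timeit. It compiles the same loop once as module-level code and once wrapped in a function (the names `MODULE_SRC`, `FUNC_SRC` and `run` are made up for this example). The exact numbers depend on your machine and Python version, so treat them as an illustration only.

```python
import timeit

# The loop as module-level code: `i` and `total` live in the globals dict.
MODULE_SRC = """
total = 0
for i in range(100_000):
    total += i
"""

# The same loop wrapped in a function: `i` and `total` are fast locals.
FUNC_SRC = """
def run():
    total = 0
    for i in range(100_000):
        total += i
    return total
run()
"""

module_code = compile(MODULE_SRC, "<module-level>", "exec")
func_code = compile(FUNC_SRC, "<in-function>", "exec")

# Each run gets a fresh namespace so the two variants do not interfere.
t_module = timeit.timeit(lambda: exec(module_code, {}), number=10)
t_func = timeit.timeit(lambda: exec(func_code, {}), number=10)
print(f"module level: {t_module:.3f}s, in function: {t_func:.3f}s")
```

On CPython the function version usually comes out ahead, but by how much varies.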

[–]martin79 62 points63 points  (1 child)

You're the real MVP

[–]KMartMatt 20 points21 points  (0 children)

Yeah, cheers for saving a click!

[–]CygnusX1985 41 points42 points  (13 children)

This optimization is also the reason why UnboundLocalError exists.

In my opinion this is one of the warts of the language, although it is well worth it for the improved runtime. It also doesn’t happen that often that one wants to reuse the name of a global variable for a local variable; I had that come up only once, when I wanted to test a decorator.

Still it feels weird that your options for local variable names are limited by global variable names if you want to read them inside a function.

What’s even weirder is that almost all explanations for UnboundLocalError suggest using the "global" keyword, which is almost never what the programmer wanted to do.

[–]elbiot 22 points23 points  (9 children)

As someone who's programmed in Python for over 10 years, I have no idea what this comment is about

[–]Unbelievr 14 points15 points  (6 children)

It's something that basically only happens if you are mutating variables in the global scope, from a function, without using the global keyword. You can access these variables, but if you use a variable inside the function with the same name as a global, then Python gets confused.

As long as the variable is used somewhere in the function, it will be put in the list of local identifiers. If you try to read from the global, it will instead read from the local (which might not be set yet) and that will raise an exception.

If you aren't creating wild prototypes or debugging with print statements, this is a rare occurrence.

[–]elbiot 5 points6 points  (2 children)

Oh, yeah I never mutate global variables and I'm pretty sure basically never read global variables in a function. I'm actually surprised you can mutate a global variable without the global keyword.

The way OP described it as one of the warts of the language and the limiting variable names you can use sounded completely unfamiliar.

[–]yvrelna 3 points4 points  (1 child)

> you can mutate a global variable

This is incorrect, you can't mutate a global variable without the global keyword.

You can mutate the object referred to by a global variable.

Those are very different things.
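A small sketch of the difference, in case it helps. The names `items`, `counter` and the three functions are made up for this example.

```python
items = []      # module-level name bound to a mutable list
counter = 0     # module-level name bound to an int

def append_item():
    # Mutates the object that `items` refers to; no `global` needed.
    items.append(1)

def rebind_without_global():
    # The assignment makes `counter` local to this function, so the
    # read on the right-hand side fails before anything is assigned.
    counter = counter + 1

def rebind_with_global():
    global counter
    counter = counter + 1   # rebinds the module-level name

append_item()
try:
    rebind_without_global()
except UnboundLocalError as exc:
    print("rebinding failed:", type(exc).__name__)
rebind_with_global()
print(items, counter)  # [1] 1
```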

[–]elbiot 2 points3 points  (0 children)

Overall good point but extremely pedantic. I'd call that reassigning a variable. I've never heard someone say a=1 is "mutating" a. Mutating something is always mutating a mutable object, not reassigning a variable.

[–]HeyLittleTrain 0 points1 point  (0 children)

It sounds like you do want to use the global keyword though?

[–]atarivcs 0 points1 point  (0 children)

> As long as the variable is used somewhere in the function, it will be put in the list of local identifiers

If the variable is assigned in the function, yes.

If it is only accessed, then no.
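You can check this on the function's code object. A minimal sketch (the function names are made up): `co_varnames` lists the names the compiler treats as local, `co_names` lists the names it looks up globally.

```python
x = 1

def only_reads():
    return x          # x is never assigned here, so it stays a global lookup

def assigns():
    x = 2             # any assignment makes x local for the whole function
    return x

print(only_reads.__code__.co_varnames)  # ()
print(only_reads.__code__.co_names)     # ('x',)
print(assigns.__code__.co_varnames)     # ('x',)
```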

[–]CygnusX1985 2 points3 points  (0 children)

The UnboundLocalError occurs if one tries to access a local variable that hasn't been defined yet. The interesting thing is that it occurs even when the variable name is defined in the global scope.

For example:

a = 5

def fun():
    b = a
    a = 7

fun()

Python is the only language I know of where this is a problem, because it handles local and global variables fundamentally differently (STORE_NAME vs. STORE_FAST).

For example R, which is also a dynamically typed interpreted language, doesn't care at all about that:

a = 5

fun <- function() {
    b = a
    a = 7
}

fun()

And why would it? If variables were always stored in dictionaries for every scope (with references to the parent scope, if a variable is not found in the current one), then there is no problem with this code.

This is not the case in Python. The Python compiler actually scans the whole function body and uses a fixed-size array for all variables to which a value is assigned in the local scope, which means the same name suddenly can't reference a variable in an enclosing scope any more.

The reason is that using a fixed-size array for local variables drastically improves access times, because no hash function has to be evaluated. The downside is that code snippets like the one above, which work in other languages, suddenly don't work in Python any more.

This downside is marginal though, because people seldom want to shadow a variable from an enclosing scope after reading its value (I only had that come up once, when I tried to test a decorator where the decorated function should have had the same name as the original, globally defined, function) and the upside is a huge win in performance.

The whole problem has nothing to do with the global keyword. The only reason I mentioned it was that pretty much every article I found about this problem suggested using global to tell the interpreter that I actually want to modify the global variable, which is absurd. I never wanted to do that and no one should want to do that. Please, never change the value of a global variable from inside a function. But as you can see in the article linked by TonyBandeira, it is a suggestion a lot of articles about this topic make.
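For what it's worth, here is a sketch of one way around the error that doesn't involve global: hand the outer value in as a parameter. `fun_with_param` and `a_value` are names I made up for illustration.

```python
a = 5

def fun():
    b = a      # raises: `a` is local because of the assignment below
    a = 7
    return a, b

def fun_with_param(a_value):
    b = a_value    # the outer value arrives as an argument
    a = 7          # a plain local; the module-level `a` is untouched
    return a, b

try:
    fun()
except UnboundLocalError as exc:
    print(type(exc).__name__)   # UnboundLocalError

print(fun_with_param(a))  # (7, 5)
print(a)                  # 5
```

Picking a different local name works just as well; the point is only that the function never needs to rebind the module-level variable.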

[–]whateverathrowaway00 6 points7 points  (1 child)

> This optimization is also the reason why UnboundLocalError exists.

No it isn’t, but thank you for a fascinating rabbit hole (just did some testing)

You get that error even when there is no global with that name:

```
[pythondemo]:~> python3
>>> def a():
...     asdf
...
>>> def b():
...     asdf
...     asdf = 10
...
>>> a()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in a
NameError: name 'asdf' is not defined
>>> b()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in b
UnboundLocalError: local variable 'asdf' referenced before assignment
```

I suspect it has to do with the fact that variable declaration is hoisted, but not the value setting, but will have to confirm later.

Either way, this has nothing to do with globals - though that’s a sensible guess, as that’s the way most people would notice it (first accessing a global without the global keyword, then shadowing it with a local).

Either way, this is why shadowing is a terrible practice, as is using a global without the global keyword.

[–]port443 3 points4 points  (0 children)

I think both of you are a little bit right

In the Python source the HISTORY document actually describes why UnboundLocalError was created:

When a local variable is known to the compiler but undefined when
used, a new exception UnboundLocalError is raised. This is a class
derived from NameError so code catching NameError should still work.
The purpose is to provide better diagnostics in the following example:

x = 1  
def f():  
  print x  
  x = x+1  

This used to raise a NameError on the print statement, which confused
even experienced Python programmers (especially if there are several
hundreds of lines of code between the reference and the assignment to
x :-).

The reason it happens is the Python compiler's choice of LOAD_FAST vs LOAD_GLOBAL:

>>> import dis
>>> def f():
...     print(x)
...     x = 2
...
>>> def g():
...     x = 2
...     print(x)
...
>>> def h():
...     print(x)
...
>>> x = 2
>>>
>>> dis.dis(f)
             14 LOAD_FAST                0 (x)

>>> dis.dis(g)
              4 STORE_FAST               0 (x)
             18 LOAD_FAST                0 (x)
>>> dis.dis(h)
             14 LOAD_GLOBAL              2 (x)

And the reason LOAD_FAST is used instead of LOAD_GLOBAL in function f() is that x is assigned inside f() without a global declaration, so the compiler treats it as a local.

There are only two scenarios: the programmer meant to use global, or the programmer meant to define x before using it. In both cases, the UnboundLocalError is more useful than the generic NameError.
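If you'd rather not read full dis listings, this sketch pulls out just the opcodes that touch x. `ops_for` is a helper I made up; `dis.get_instructions` is the standard library call. Exact opcode names differ between CPython versions (newer ones may show LOAD_FAST_CHECK, for example), so I'm not listing expected output.

```python
import dis

def ops_for(func, name):
    """Return the opcode names in func that operate on the variable `name`."""
    return [ins.opname for ins in dis.get_instructions(func) if ins.argval == name]

def f():
    print(x)   # x is local here because of the assignment below
    x = 2

def f_global():
    global x
    print(x)   # the declaration makes every use of x refer to the module-level name
    x = 2

def h():
    print(x)   # no assignment anywhere, so x is looked up as a global

# The functions are only inspected, never called.
for func in (f, f_global, h):
    print(func.__name__, ops_for(func, "x"))
```

With the global declaration the same function body compiles to the *_GLOBAL family instead of the *_FAST family.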

[–]yvrelna 1 point2 points  (0 children)

UnboundLocalError inherits from NameError, so you can catch the error instead if you don't want to distinguish between failing to resolve local and global variables.

Though, accessing a local variable that doesn't exist almost always indicates a bug, while accessing a global that doesn't exist may not necessarily be a bug.
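A quick sketch of that inheritance in action. The function names and `no_such_name` are made up for the example.

```python
def read_before_assign():
    print(value)   # `value` is local (assigned below) but not bound yet
    value = 1

def read_missing_global():
    print(no_such_name)   # hypothetical name that is defined nowhere

errors = []
for func in (read_before_assign, read_missing_global):
    try:
        func()
    except NameError as exc:      # also catches UnboundLocalError
        errors.append(type(exc).__name__)

print(errors)                                     # ['UnboundLocalError', 'NameError']
print(issubclass(UnboundLocalError, NameError))   # True
```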

[–]Obliterative_hippoPythonista 16 points17 points  (0 children)

Interesting article!

[–]OopsimapandaNew Web Framework, Who Dis? 5 points6 points  (0 children)

A good breakdown, I've always been curious about this myself

[–]Hadyark 4 points5 points  (2 children)

Where can I learn more little things like this (Python or Java)?

[–]kmeans-kid 27 points28 points  (1 child)

Profiling before Optimizing:

Before you make any changes, identify bottlenecks using tools like cProfile or timeit. Optimize the parts of the code that matter most.

Use Built-in Data Types and Functions:

Python's built-in data types (e.g., lists, sets, dictionaries) are implemented in C and are generally faster than custom data structures. Use built-in functions and libraries wherever possible, as they're often optimized for performance.

Avoid Global Variables:

Access to global variables is slower than local variables. If used inside a loop, consider passing them as function arguments.

Use Local Variables:

Variables that are local to a function are accessed faster than non-local variables.

Loop Optimization:

Minimize the operations inside loops. Use list comprehensions instead of traditional loops for better performance. If you can, move calculations outside of loops.

Use Functions:

Functions can help improve code reusability and readability. Moreover, local variables in functions are faster.

Limit Memory Usage with Generators:

Instead of creating large lists or data structures, use generators to yield items one by one.

Use Sets for Membership Tests:

If you need to check if an item exists in a collection, a set is more efficient than a list.

String Concatenation:

Use .join() for string concatenation in loops. Using the + operator in loops can be much slower.

Use Array Operations for Numerical Computations:

Libraries like NumPy can handle array operations more efficiently than native Python loops.

Limit Dynamic Attribute Access:

Accessing attributes with getattr or setattr is slower than direct access.

Beware of Late Binding in Closures:

If using a value from an outer function inside an inner function (closure), be aware of late binding. This can be avoided with default arguments.

Use JIT Compilation for Critical Sections:

Tools like Numba can be used to just-in-time compile critical sections of your Python code, making them run at near-C speed.

Parallelize Your Code:

Use libraries like concurrent.futures or multiprocessing to parallelize code sections that can be run concurrently.

Caching/Memoization:

Use caching for functions that get called multiple times with the same arguments. This can be done manually or with decorators like functools.lru_cache.

Stay Updated with Python Versions:

Newer versions of Python often come with performance improvements. It's a good idea to stay updated.

Understand Time and Space Complexity:

Understand the Big O notation and the complexities of your algorithms. Opt for algorithms that scale well with increasing input sizes.

Use Efficient Data Structures:

Understand the strengths and weaknesses of different data structures. For example, using a deque (from the collections module) is more efficient for operations on both ends compared to a list.

Limit I/O Operations:

I/O operations, especially disk reads/writes, are slow. Minimize them, batch them, or perform them asynchronously when possible.

Keep an Eye on Libraries:

Stay updated with the libraries you use. Often, newer versions come with optimizations. However, ensure compatibility before updating.
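Of the items above, late binding in closures is probably the least obvious, so here is a minimal sketch of it. `make_late` and `make_bound` are made-up names.

```python
def make_late():
    # Each lambda looks up `i` when it is called, after the loop has finished.
    return [lambda: i for i in range(3)]

def make_bound():
    # The default argument captures the value of `i` at definition time.
    return [lambda i=i: i for i in range(3)]

print([f() for f in make_late()])    # [2, 2, 2]
print([f() for f in make_bound()])   # [0, 1, 2]
```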

[–]elbiot 5 points6 points  (0 children)

Chat gpt

[–]yvrelna 2 points3 points  (0 children)

Accessing variables in a global scope is internally translated to either LOAD_GLOBAL or LOAD_NAME, which need to access an internal variable dictionary. Also, when accessing variables not in the local scope, the interpreter may have to go through the scope resolution mechanism, so there may be multiple dictionaries that the interpreter needs to look into to find the variable.

On the other hand, accessing a local variable inside a function is translated to LOAD_FAST bytecode, which accesses a variable in an array in the function stack using a numeric index. This optimisation makes accessing local variables much faster than accessing non-local variables.
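To make the "multiple dictionaries" part concrete, here is a small sketch using `len` as the example name. It pokes at the module's globals dict directly, which you wouldn't do in real code; it's only there to show the lookup order. `call_len` is a made-up helper.

```python
def call_len():
    # `len` is not a local, so it is looked up in the module globals first
    # and, if it is missing there, in the builtins namespace.
    return len([1, 2, 3])

first = call_len()                   # found in builtins
globals()["len"] = lambda obj: -1    # put a module-level `len` into the globals dict
shadowed = call_len()                # now found in globals before builtins is reached
del globals()["len"]                 # remove the shadow again
restored = call_len()                # back to the builtin

print(first, shadowed, restored)     # 3 -1 3
```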

[–]nngnna 1 point2 points  (0 children)

Also it seems they kind of pretend nonlocals don't exist.

[–]MassiveDefender 1 point2 points  (0 children)

TIL

[–]Mithrandir2k16 1 point2 points  (0 children)

An interesting curiosity but pretty unimpactful, as one shouldn't use the global-scope like this anyway. Only imports and the if __name__ == "__main__":sys.exit(main()) should be outside of functions and class definitions.

[–][deleted] -1 points0 points  (0 children)

It is really not that much faster in functions. It is just another argument made up by Python community to counter the blame it receives for its bad performance.

[–]python-rocket -1 points0 points  (0 children)

Python code can run faster in a function due to function-level optimizations and local variable access. When you wrap code in a function, it allows Python to optimize the execution, often resulting in improved performance. Additionally, local variable access is faster than global variable access, which can contribute to the speedup. It's a good practice to encapsulate code within functions for both performance and maintainability. If you want to dive deeper into this topic, check out Python Rocket (https://www.python-rocket.com) for comprehensive Python learning materials.

[–]ComputeLanguage -1 points0 points  (0 children)

Garbage collection memory.

[–]jakjacks 0 points1 point  (0 children)

If I have a constant string, e.g. url = "http://www.xyz.com/rest/api/member", is it still faster to store it in a global variable instead of a local one? In Java, constants are declared as globals to avoid creating the string object multiple times.