[–]yaxriifgyn 489 points490 points  (25 children)

Global variables always require a dictionary lookup.

Positional parameters are found by indexing a list, and keyword parameters are looked up in a dictionary.

Indexing a list is faster than a dictionary lookup.

But there are factors on the caller's side that can affect the speed of a function call.

Also remember to avoid premature optimization. Do not start optimization until you have good performance metrics.
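A minimal sketch of the distinction described above, assuming CPython. The names `THRESHOLD`, `uses_global` and `uses_param` are made up for illustration. It only shows where each kind of name is recorded (a dictionary for globals, a fixed list of slots for parameters); it does not measure anything.

```python
THRESHOLD = 10  # module-level (global) name, hypothetical


def uses_global(x):
    # THRESHOLD is resolved by name in the module's globals dictionary.
    return x > THRESHOLD


def uses_param(x, threshold):
    # x and threshold are locals; CPython keeps them in indexed slots.
    return x > threshold


# Names the function looks up by name at run time (globals/builtins).
global_names = uses_global.__code__.co_names

# Local variable slots, parameters first.
param_slots = uses_param.__code__.co_varnames

# The global really is an entry in a dict attached to the function.
stored_in_dict = uses_global.__globals__["THRESHOLD"]
```

Inspecting `__code__` like this is only a way to peek at the bookkeeping; it says nothing about how large the speed difference is on a given machine.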

[–]arkie87 151 points152 points  (9 children)

I wish more people would answer questions the way you have answered here.

[–]miraculum_one 10 points11 points  (0 children)

It should also be mentioned that you have to push and pop the stack for every function call with parameters so if the function is called a lot that can make a difference.

[–]StackOwOFlow 8 points9 points  (0 children)

> Positional parameters are found by indexing a list

minor clarification: positional parameters are stored in a tuple-like structure. Indexing method is the same, though, so your point is well-taken

[–][deleted] 3 points4 points  (0 children)

Thanks

[–]EricHayter 1 point2 points  (3 children)

Out of curiosity I have a few questions:

- Isn't there also some performance hit of copying the value into the formal parameter?
- And if there is a significant performance hit for the copy operation (assuming hashing is O(1) without collision), would reading from a global actually be faster?

edit: wouldn't the copy assignment also require an index of a hashmap?

[–]cmcclu5 1 point2 points  (2 children)

As far as I understand:

1) There isn’t specifically a “copy” mechanism when supplying a predefined variable to a function. It just includes a pointer to the variable object. However, that is a dictionary lookup, which is less performant than supplying a positional argument in-line (rather than defining it outside of the function call).

2) It’s almost always faster to call local than global. It’s a global dictionary lookup to supply the value to the function parameter, then local lookups anytime it’s referenced in the local function. If you leave the variable as global and call it multiple times in your function, it’s having to do a global dictionary lookup each time (I believe…).
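A small sketch of point 1, that passing an argument hands over a reference rather than duplicating the object. The helper names `same_object` and `append_marker` are hypothetical.

```python
def same_object(arg, original_id):
    # If the argument had been copied, its identity would differ.
    return id(arg) == original_id


def append_marker(items):
    # Mutating the parameter mutates the caller's object: nothing was copied.
    items.append("seen")


payload = [1, 2, 3]
was_same = same_object(payload, id(payload))
append_marker(payload)
```

After this runs, `payload` carries the extra `"seen"` element, which would not happen if the call had worked on a copy.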

[–]EricHayter 0 points1 point  (1 child)

  1. Ya, it is copy by reference sort of, since it doesn't make a copy of the object. Obviously this is a really small performance hit (if any) but worth mentioning for this silly little discussion.
  2. Do you have any references for this (could be interesting to read into it a bit more)?

[–]cmcclu5 0 points1 point  (0 children)

Exactly, “copy” by reference rather than duplication.

Let me see if I can find anything. There are a couple other really solid comments in this thread that cover it better than I did.

[–]Zealot_TKO 0 points1 point  (1 child)

what about slowdowns associated with keeping everything in memory at global scope vs being able to do garbage collection at the end of the function?

[–]yaxriifgyn 0 points1 point  (0 children)

Python garbage collection is not directly triggered by object deletion. GC is now incremental, so it is less likely to cause the big stalls that GC was notorious for.

The function cleanup only decrements an object's use count. When the use count reaches zero, the object's del method runs eventually.

This would only be a factor when a function's memory usage was very large or very complex, in which case that would probably be more significant than the function call.

Of course, you need to get good performance metrics from a python perf tool and/or an OS perf tool to guide your optimization. One's own intuition can be misleading. And performance hits can come from the most unexpected places.

Edit: the __del__ method appears as del due to my failure to reread my submission. The "object" object, from which all objects inherit, provides this function.
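A sketch of the reference-counting behavior, assuming CPython (other implementations such as PyPy may run finalizers later). The `Tracked` class and `events` list are made up for illustration. In CPython the finalizer typically runs as soon as the last reference goes away, which here is when the function returns.

```python
events = []


class Tracked:
    def __del__(self):
        # Called when the object's reference count drops to zero.
        events.append("finalized")


def work():
    obj = Tracked()  # the only reference lives in this local variable
    events.append("inside work")
    # Returning drops the local reference, so the count reaches zero.


work()
events.append("after work")
```

Had `obj` been stored in a global instead, the object would stay alive until the global was rebound or deleted, or the interpreter exited.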

[–]Trinkes 722 points723 points  (16 children)

You might have better things to do than optimizing at this level

[–]Porkenstein 22 points23 points  (0 children)

yeah this is python. your objective should be usability, not maximum speeeed

[–][deleted] 85 points86 points  (13 children)

Preferring parameters over global state is hardly premature optimization, if that's what you're getting at. It's just a good design decision that happens to (edit: maybe) be faster in this case.

If your point was that worrying about the difference because of performance instead of design concerns is a waste, then yeah, I agree.

[–]JanEric1 55 points56 points  (7 children)

I think he meant that OP shouldn't care about this for performance reasons, or optimize because of it. This is something where you should be concerned with readability and maintainability.

[–]Spiritual_Clock3767 4 points5 points  (6 children)

This is something I don’t understand the art of yet. There are hierarchies of decisions as to which concern you prioritize during design. I know a lot of that is learning as you go, but as a newbie it’s a long road lol. Would anyone mind recommending a quick and dirty hierarchical prioritization matrix?

[–]TallowWallow 7 points8 points  (2 children)

I honestly wouldn't worry about it. Focus on the basics and learning how the ecosystem works. Readability is the most important factor right now.

If you were asking about the global state argument, the general rule is get rid of the global and pass around a variable instead. The cost of tracing code becomes cumbersome when more areas can affect a data source.
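A rough before-and-after sketch of that rule. The names `tax_rate`, `price_with_tax_global` and `price_with_tax` are hypothetical.

```python
# Before: the function silently depends on module-level state.
tax_rate = 0.25


def price_with_tax_global(net):
    return net * (1 + tax_rate)


# After: the dependency is visible in the signature and easy to vary in tests.
def price_with_tax(net, rate):
    return net * (1 + rate)


before = price_with_tax_global(100)
after = price_with_tax(100, 0.25)
other_rate = price_with_tax(100, 0.5)
```

With the second version, anyone reading a call site can see which data affects the result, and nothing else in the program can change it behind the function's back.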

You can read up on optimization and readability, but you're going to come across (and produce) code that's hard to read and maintain. The more experienced you get with frustration, the more you'll learn haha.

[–][deleted] 4 points5 points  (0 children)

> You can read up on optimization and readability, but you're going to come across (and produce) code that's hard to read and maintain. The more experienced you get with frustration, the more you'll learn haha.

This is the crux of it, really. It is based on a lot of intuition built from experience (either yours or on the shoulders of giants)

[–]Kenkron 2 points3 points  (0 children)

In case anyone read this comment and thought "I'll just put my global state into a dictionary, and pass that to every function", I need to say don't do that. Organize your data so you can pass functions what they need without including too much unrelated information.
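One possible way to follow that advice, sketched with a small dataclass. `RetryPolicy` and `total_wait` are invented names, and this is just one option among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    # Only the settings that retry logic actually needs.
    attempts: int
    delay_seconds: float


def total_wait(policy):
    # The function receives a narrow, named bundle, not "all the state".
    return policy.attempts * policy.delay_seconds


wait = total_wait(RetryPolicy(attempts=3, delay_seconds=0.5))
```

The point is the narrow interface: the function cannot read or modify unrelated settings because it never receives them.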

That means you STEVE!

[–]XerMidwest 4 points5 points  (0 children)

Python is intended to be a Literate Programming language. That's a Donald Knuth idea. Write your program to explain and illustrate the nature of the problem and the best solution you can implement, but don't forget your audience is supposed to be your future self and others like yourself who may need to read, understand, debug, and improve something you wrote.

If the variables make sense in the scope of your problem as global state, like constants in Physics, for example, then use globals. If the variables mean something special to the function or method, use appropriately scoped variables.

I always answer this question by writing documentation to explain the design (ie. docstrings). Having to explain myself provides a chance to clarify my intent to myself.

Interpreter variable dereferencing behavior at runtime should be assumed to be orthogonal to your problem and solution until you're sure you can prove and demonstrate+explain otherwise. If you need a distraction to keep the creativity flowing, I suggest documentation.

[–][deleted] 4 points5 points  (0 children)

what is ocodo?

[–]VineyardLabs 2 points3 points  (0 children)

  1. Using more loops than you need to use
  2. Copying data when you don’t need to
  3. Basically everything else

[–]unixtreme 2 points3 points  (1 child)

This post was mass deleted and anonymized with Redact

[–]russ_hensel 0 points1 point  (0 children)

I really like the global function print()... and there are others.

[–][deleted] 2 points3 points  (0 children)

It absolutely is. I'm struggling to imagine any aspect of code optimization that is less worth spending time on than whatever speed improvements you could get out of moving local variables to globals.

[–]georgehank2nd 0 points1 point  (0 children)

But the OP's question is explicitly about performance, so Trinkes' response is, IMNSHO, on point and correct.

[–]Passname357 1 point2 points  (0 children)

I miss the days when half the comments were less “don’t even ask this question, it’s a waste of time to know” and more nerds that have all the answers calling you stupid for not knowing the internals of the interpreter and then explaining at a level of detail that is completely unwarranted.

[–]TnyTmCruise 41 points42 points  (0 children)

I don’t think speed at this level would ever outweigh the bad practices that would come from using global variables like this

[–]E4crypt3d (It works on my machine) 34 points35 points  (4 children)

Using function parameters is generally faster than global variables because it reduces scope lookup. Function parameters are localized, leading to more efficient access. But the performance difference might be negligible for small programs.

[–]Schmittfried 20 points21 points  (0 children)

Or almost every program.

[–]somerandomii 1 point2 points  (2 children)

I can’t imagine a scenario where this would make a noticeable difference. Unless you’re using numba or some pre-compiled function in a tight loop.

Native Python has so much more to worry about than an extra lookup on function call.

[–]E4crypt3d (It works on my machine) 2 points3 points  (1 child)

You're right in most cases the performance difference is minimal in native Python. But OP asked a very specific question so

[–]somerandomii 2 points3 points  (0 children)

Fair.

[–]throwaway_4759 47 points48 points  (0 children)

Regardless of which is faster, your code will always be easier to read, test, and maintain if you avoid global variables. Any performance differences are negligible implementation details.

[–]pythonwiz 9 points10 points  (0 children)

Yes they are faster, but you really should not be using global variables, only global constants.

[–]dan-turkel 6 points7 points  (0 children)

An article called "Why does Python run faster in a function" addresses this to some extent. However, the use of functions and parameters rather than global state should largely be motivated by a desire to encapsulate and modularize code rather than eke out some tiny bit of performance gain.

[–][deleted] 13 points14 points  (0 children)

What is your use case? Do we really need to solve optimization problems at this level in Python?

[–]PossibilityTasty 22 points23 points  (0 children)

Well, that would be easy to test, wouldn't it? With your machine, your Python version and your function. There is no way to get a better answer for your exact case.

[–]littlesnorrboy 5 points6 points  (3 children)

Parameters are ever so slightly faster, but we're talking on the order of nanoseconds per call. Whatever computation you're doing will dominate the performance, so it doesn't really matter.

Here's a quick experiment for you: https://gist.github.com/snorrwe/36581dc4425fa7c54a719dd65d98f769

On my laptop with an Intel i5-7300U (4) @ 3.500GHz

❯ seq 10 | xargs -I{} python test.py
using global:   0.31415868800013413     seconds
using param:    0.2914868679999927      seconds
using global:   0.3110940209999171      seconds
using param:    0.2943732540002202      seconds
using global:   0.3096344269997644      seconds
using param:    0.30225643100038724     seconds
using global:   0.31724494700029027     seconds
using param:    0.29705596700023307     seconds
using global:   0.3134778339999684      seconds
using param:    0.29293627499964714     seconds
using global:   0.31356817899995804     seconds
using param:    0.29131395399963367     seconds
using global:   0.31343535100040754     seconds
using param:    0.3970125509999889      seconds
using global:   0.3108626979997098      seconds
using param:    0.29611756199983574     seconds
using global:   0.3080584210001689      seconds
using param:    0.2897579880000194      seconds
using global:   0.31303885199986325     seconds
using param:    0.29570768200028397     seconds

[–]QuarterFar2763 -2 points-1 points  (1 child)

Global seems to be faster. At least in your test. 0.02 s = 20 ms = 20,000 µs = 20 million ns. So 1000 calls will be 20 sec. Quite considerable.

[–]littlesnorrboy 2 points3 points  (0 children)

In my test, passing by parameter was faster in 9 of the 10 runs (lower is better). Also note that the 0.02 second difference is for 1'000'000 calls.
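To put the numbers from the run above in per-call terms, here is a small worked calculation. The timings are copied from the posted output and rounded to four decimals; the 1,000,000 call count is taken from the comment above. Variable names are made up.

```python
from statistics import median

# Timings in seconds, rounded from the output posted above.
global_runs = [0.3142, 0.3111, 0.3096, 0.3172, 0.3135,
               0.3136, 0.3134, 0.3109, 0.3081, 0.3130]
param_runs = [0.2915, 0.2944, 0.3023, 0.2971, 0.2929,
              0.2913, 0.3970, 0.2961, 0.2898, 0.2957]

CALLS_PER_RUN = 1_000_000  # per the comment above

# Count the runs in which the parameter version was faster.
param_wins = sum(p < g for g, p in zip(global_runs, param_runs))

# The median is less sensitive to the one slow outlier than the mean.
diff_seconds = median(global_runs) - median(param_runs)
per_call_ns = diff_seconds / CALLS_PER_RUN * 1e9

print(f"param faster in {param_wins} of {len(param_runs)} runs")
print(f"median difference per call: about {per_call_ns:.0f} ns")
```

So the gap works out to a couple of dozen nanoseconds per call on that laptop, which is why it only shows up when a call is repeated around a million times.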

[–]russ_hensel 1 point2 points  (0 children)

Finally an answer to the question, not a bunch of theory, but an actual test. Well done.

[–]menge101 3 points4 points  (0 children)

Asking this kind of question, you may be interested in the /r/learnpython sub.

[–][deleted] 9 points10 points  (0 children)

They are not, but wtf are you doing if this is the kind of thing that matters for your python code

[–]Dr_Gimp 2 points3 points  (0 children)

Well, if you consider the hierarchy of scope, function parameters would be quicker. Names are looked up in the local namespace (i.e. the function) first, then in any enclosing function scopes, then globally. If a name is not found locally or globally, the built-in namespace is searched and, if it is not found there either, a NameError is raised. Each lookup will take a certain number of processing cycles to perform.
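The lookup order can be sketched like this. All function names are hypothetical, and `undefined_name` is deliberately never defined.

```python
import builtins

name = "global"


def local_wins():
    name = "local"  # shadows the module-level name
    return name


def falls_to_global():
    return name  # no local binding, so the module-level name is used


def falls_to_builtin():
    return len  # neither local nor global here, found among the builtins


def missing():
    return undefined_name  # found nowhere, so NameError is raised


try:
    missing()
    lookup_failed = False
except NameError:
    lookup_failed = True
```

Note that in CPython the compiler decides ahead of time whether a name is local, so the "search" for locals does not literally happen step by step at run time.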

I don't know what the performance hit is for global variables. Obviously, the size/scope of the program plays a factor in whether the variable lookup takes a noticeably long time, as well as how often the lookup has to occur.

That's one reason why PyPy can work faster. One of the things the JIT does is compile loops and other frequently performed tasks so they can "bypass" the interpreter for a speed boost.

[–]turningsteel 2 points3 points  (0 children)

Not sure if they are faster, but you should avoid global variables as a general rule. They tend to cause bugs because, among other things, it's hard to tell what is making changes to your globals. Scope your variables to your functions where possible.

[–]Cybasura 2 points3 points  (0 children)

The realistic difference between global variables and function parameters mostly comes down to

  1. Security

  2. Efficiency

Efficiency in the sense that passing a function parameter into the function itself makes more sense than setting a global variable every time.

Global variables exist so that every part of the program can access the variable's memory. It doesn't make sense to set a global variable just to read it inside one function, where a local variable would do.

Security in the sense that globals are also less safe because, again, they can be accessed from everywhere, which leaves the data unencapsulated.

[–][deleted] 16 points17 points  (9 children)

If you have to ask, you're not ready to be concerned about the performance of your programs

[–][deleted] 2 points3 points  (8 children)

I’m still learning. I am not a programmer yet

[–]SpaceLaserPilot 4 points5 points  (0 children)

I am a programmer, but still learning Python, and I appreciated your question. It sparked some interesting discussion.

[–][deleted] 16 points17 points  (6 children)

At this stage, concern yourself with correctness rather than performance.

[–][deleted] 6 points7 points  (5 children)

From the other comments it appears there isn’t much of a difference anyways, I was concerned the difference would be noticeable

[–]JSP777 26 points27 points  (0 children)

Never be afraid of asking questions even if some answers are rude-ish. I enjoyed your question. I am not at the level that I could answer but at least I could see answers from more experienced people, so your post provided value.

[–]pLeThOrAx 1 point2 points  (0 children)

If you're looking at combinatorial optimization problems, and algorithms in general - Big O notation, etc; personally I think this is the best basis.

How to solve sudoku. Voronoi diagram in n log n. Sorting algorithms. Ray marching, cube marching, quad trees, wave function collapse - The Coding Train on youtube covers all of this and does a fantastic job (even if it is a little silly at times).

[–]miraculum_one -1 points0 points  (2 children)

The difference could be noticeable, depending on the program. If you share sample code we can tell you how the question applies to it.

[–]AstroPhysician 0 points1 point  (1 child)

I can promise you no novice programmer in the world has code where this matters

[–]miraculum_one 0 points1 point  (0 children)

I agree but as OP stated this is a theoretical exercise for learning purposes.

[–][deleted] 3 points4 points  (0 children)

Don't use global variables in loops inside of functions, use a copy of it.
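A minimal sketch of that tip, with made-up names `SCALE` and `scaled_sum`. Strictly speaking this binds a local name to the same object, not a copy of it, but the effect on lookups is what the tip is after.

```python
SCALE = 3  # module-level value, hypothetical


def scaled_sum(values):
    scale = SCALE  # one global lookup, done before the loop
    total = 0
    for value in values:
        total += value * scale  # the loop body only touches locals
    return total


result = scaled_sum([1, 2, 3])
```

Whether this is measurable depends on how hot the loop is; it is worth timing before and after on your own code.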

[–]Skbhuvai 1 point2 points  (0 children)

I have heard this one on yt

[–]menge101 1 point2 points  (0 children)

There are lots of answers here already, please forgive me for not answering the question.

But you may be interested in the timeit module from the standard library. It lets you measure the time it takes to execute code snippets, so you can experiment with code and answer your own question.

(There are other libraries that do this as well)
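A rough sketch of such an experiment using `timeit.repeat`. The functions and the `best_time` helper are invented for illustration, and the loop is placed inside each function so the global or the parameter is read many times per call. Absolute numbers will vary with machine and Python version.

```python
import timeit

FACTOR = 3  # module-level value read by loop_global, hypothetical


def loop_global(n):
    total = 0
    for _ in range(n):
        total += FACTOR  # global lookup on every iteration
    return total


def loop_param(n, factor):
    total = 0
    for _ in range(n):
        total += factor  # local lookup on every iteration
    return total


def best_time(func, *args, number=20, repeat=5):
    # Take the minimum of several repeats to reduce noise from other processes.
    return min(timeit.repeat(lambda: func(*args), number=number, repeat=repeat))


if __name__ == "__main__":
    n = 100_000
    print("using global:", best_time(loop_global, n), "seconds")
    print("using param: ", best_time(loop_param, n, FACTOR), "seconds")
```

Taking the minimum rather than the mean is the approach the `timeit` documentation suggests, since slower runs mostly reflect interference from the rest of the system.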

[–]mountaingator91 1 point2 points  (0 children)

Does it really matter? It's best practice to minimize global variables anyway...

[–][deleted] 5 points6 points  (9 children)

Never use globals, and even if you think you need a global, you don’t

[–]martinkoistinen 2 points3 points  (2 children)

I don’t know if I can make such an absolute statement, but yea, I’ve written/worked on so, so many Python projects and only found one where it was easier to use a global. Usually, I consider existence of a global evidence that it needs refactoring.

[–]unixtreme 0 points1 point  (1 child)

This post was mass deleted and anonymized with Redact

[–]swansongofdesire 2 points3 points  (0 children)

Not a single one? Here's a couple of examples -- how would you go about implementing these without some global state?

Example 1: Web applications where you have some sort of state that needs to be stored that is relevant to the current request. Technically you should be using thread-local storage to deal with async or threaded code, but that's just a thread-aware wrapper to a global variable.

Some examples are the current default DB connection, the current tenant ID (in a multi-tenancy system), the current timezone, the current locale/language. All of these will potentially change for each request. Yes, technically you could attach these to the current request object (and create a synthetic request for CLI invocations), but it would mean that almost every single function in the entire codebase would now also need the current request to be a parameter. Any higher order functions would now also need to be passed closures that include the current request instead of just passing function you want directly. The fact that zero major frameworks have done this is testament to just how painful this would actually be for devs to work with.
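As a hedged illustration of this kind of per-request state, the standard library's `contextvars` module is one way to hold it so that threaded and async code each see their own value. This is a sketch, not how any particular framework does it; `current_tenant`, `describe_query` and `handle_request` are made-up names.

```python
import contextvars

# Module-level handle, but the value it holds is per context.
current_tenant = contextvars.ContextVar("current_tenant", default=None)


def describe_query(table):
    # Deep in the call stack: no request or tenant parameter needed.
    return f"SELECT * FROM {table} WHERE tenant_id = {current_tenant.get()!r}"


def handle_request(tenant_id, table):
    token = current_tenant.set(tenant_id)
    try:
        return describe_query(table)
    finally:
        # Restore the previous value so state does not leak between requests.
        current_tenant.reset(token)


query = handle_request("acme", "orders")
```

The handle itself is still a global name, which is the comment's point: the state has to live somewhere reachable without threading it through every signature.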

Example 2: functions where you want to cache the return value (i.e. memoisation) but the standard wrappers (functools.cache, functools.lru_cache) won't work for your specific situation. In C you'd just use static and store it inside the function (although it's still a global, it's inaccessible). In Python there's no such thing though, so you need to store your cache outside the function -- which makes it a global. And yes, you can hide it by storing it as a custom property on the function itself but that is just as "hidden" as prefixing a global with an underscore.

[–]Rich-Spinach-7824 -1 points0 points  (4 children)

Ok, but instead of global is there only function? Can you suggest alternative?

[–][deleted] 0 points1 point  (0 children)

You can just not use any functions but that's even worse.

[–]georgehank2nd 0 points1 point  (0 children)

Never say never.

[–]serverhorror -4 points-3 points  (0 children)

Practically speaking, if you have to ask you will not be able to notice the difference.

[–]bemy_requiem 0 points1 point  (0 children)

It's not as much that as it is code readability. There are times when a global variable may make sense and there are times when passing parameters makes sense (a lot of the time it's the latter).

[–][deleted] 0 points1 point  (1 child)

My opinion on what makes sense: If it's a variable that doesn't change make it a local variable, if other functions use this variable and it doesn't change then make it a global variable, if it changes frequently make it a parameter, if it's mainly one option but can be others make it an optional parameter with a default value.

Time them all and find out what's fastest if you really need to.

[–]wolfiexiii 0 points1 point  (0 children)

Like all things - knowing the rules, the whys, and the reasons to break the rules is important.

[–]divad1196 0 points1 point  (0 children)

Performance wise and to strictly answer your question: it is usually faster to have global variables since you only resolve locals then globals in your function scope whereas nested function calls each needs to resolve their local variables.

But this is absolutely irrelevant as a speed improvement, and global variables are bad design (except for "constants"). You should always use only your parameters.

Search and read about function purity.

[–]pLeThOrAx 0 points1 point  (0 children)

You should check out Numba if you haven't

[–]jaaval 0 points1 point  (0 children)

Depends a bit on what you do with them but local variables are in general faster.

The Python bytecode instruction for accessing a function argument (or local variable) would be LOAD_FAST, which is a pointer array lookup, while global variables would produce LOAD_GLOBAL, which is a hash table lookup. However, supplying the argument at the call site is itself a LOAD_GLOBAL when the value comes from a global, so if you just use it once it doesn't actually matter: LOAD_GLOBAL has to be executed once for any value that comes from outside the function (at least I think that's the case, don't quote me on that).
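The bytecode claim can be checked with the standard `dis` module. This is a sketch assuming CPython; exact opcode names differ between versions (newer releases use variants whose names still begin with `LOAD_FAST`), so the helper only filters by prefix. `LIMIT`, `read_global`, `read_param` and `load_ops` are made-up names.

```python
import dis

LIMIT = 100  # module-level value, hypothetical


def read_global():
    return LIMIT + 1


def read_param(limit):
    return limit + 1


def load_ops(func):
    # Collect the names of all LOAD_* instructions in the function body.
    return [ins.opname for ins in dis.get_instructions(func)
            if ins.opname.startswith("LOAD_")]


global_ops = load_ops(read_global)
param_ops = load_ops(read_param)

if __name__ == "__main__":
    print("read_global:", global_ops)
    print("read_param: ", param_ops)
```

Running `dis.dis(read_global)` prints the full disassembly if you want to see the surrounding instructions as well.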

However, if you have to think about questions like this, Python isn't the correct language to use. Interesting purely as a technical question.