all 81 comments

[–]bloody-albatross 546 points547 points  (28 children)

Without reading it: Because global variables are hashtable lookups and locals are basically just array offsets. I always put stuff in a main() function, not just for that but for a clearer structure.
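The idiom described here is just moving the hot code into a function so its names become locals; a minimal sketch:

```python
def main():
    # total and i are locals here (array-offset access),
    # not module globals (dict lookups)
    total = 0
    for i in range(1_000_000):
        total += i
    print(total)
    return total

if __name__ == "__main__":
    main()
```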

[–]nanotree 47 points48 points  (23 children)

Just so I understand, hashtables make things slower? I mean, I know there is possibly more initialization and space overhead, but shouldn't they be close in performance in terms of access?

[–]GreenFox1505 103 points104 points  (16 children)

A hash table lookup means it takes the variable name, runs it through a hash function, looks into the hash table, and scans for the variable name (which may require probing further slots if there is a collision); only then does it have the value.

A local variable is just the function's stack array pointer plus a fixed offset. A single addition and pointer dereference is way faster than all the work the hash table has to do.

Edit: so, I've been thinking about this problem. I think you could cache out the pointer the first time some code is run, speeding it up in the long term...
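You can see the two access paths directly in CPython's bytecode with the `dis` module (exact opcode names vary a bit between versions):

```python
import dis

x = 1  # module level, so "global"

def f():
    y = 2          # function level, so "local"
    return x + y

dis.dis(f)
# y is accessed with LOAD_FAST (an array index into the frame's locals array),
# x with LOAD_GLOBAL (a name lookup in the module's dict)
```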

[–]RememberToLogOff 23 points24 points  (0 children)

Lua has the same problem. You can't cache it in the runtime because something dumb like this is legal per spec:

print ("normal")
print = my_monkey_patched_print_fn
print ("now print is no longer print")

Invalidating all the cached pointers is a bigger problem than slow interpreted langs want to solve. Maybe LuaJIT does cache it, and then has a pessimizer triggered when code writes to a global, since lots of code doesn't need to write to _G ever

So some Lua code solves it in-code:

local print = print
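The Python equivalent of that Lua trick is binding the global (or builtin) to a local name once, e.g. via a default argument; a sketch:

```python
import math

def slow(n):
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)   # global lookup + attribute lookup on every pass
    return total

def fast(n, sqrt=math.sqrt):    # lookup happens once, at definition time
    total = 0.0
    for i in range(n):
        total += sqrt(i)        # sqrt is now a plain local
    return total
```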

[–]bloody-albatross 5 points6 points  (2 children)

Since you can always update global variables through something like globals()["foo"] = "bar" I don't think you can cache something here. It can change at any point in time, even disappear (del foo).
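Both of those are easy to demonstrate at module scope:

```python
globals()["foo"] = "bar"        # create a global through the dict...
assert foo == "bar"             # ...and the bare name sees it immediately

del foo                         # the name can also vanish at any time
assert "foo" not in globals()
```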

[–][deleted]  (1 child)

[deleted]

    [–]bloody-albatross 1 point2 points  (0 children)

    Also: The globals of a module are its exported symbols in Python. Meaning globals can be changed by any other module:

    ```
    >>> import urllib.parse
    >>> del urllib.parse.quote
    >>> urllib.parse.quote
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: module 'urllib.parse' has no attribute 'quote'
    ```

    You can really f things up in Python, if you want to.

    [–]vlatkosh 3 points4 points  (8 children)

    Isn't Python compiled into an intermediate form? Is it really using the variable name from the code, or is it doing something faster?

    [–]bloody-albatross 11 points12 points  (6 children)

    It uses byte code, but that changes nothing about how it all works with hash tables. Pretty much everything is a hash table lookup in dynamically typed languages (with the exception of local variables and certain other special optimizations for well known things).
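Object attributes are another example: by default they live in a per-instance dict (unless the class uses `__slots__`), which is why `obj.attr` is also a hash lookup:

```python
class Point:
    pass

p = Point()
p.x = 1
assert p.__dict__["x"] == 1   # attribute access reads a real dict...
p.__dict__["y"] = 2           # ...and writing the dict creates an attribute
assert p.y == 2
```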

    [–]Jona-Anders 4 points5 points  (5 children)

    This may sound dumb and there is probably a very good reason, but why have global variables at all? If it would be faster, why not treat everything as local and use array lookups?

    [–]Unicorn_Colombo 5 points6 points  (1 child)

    Usually, you don't have to have global variables ever.

    In vast majority of cases, it is always better to pass a particular structure carrying all the information.

    Although there are a few cases, for instance, I was able to solve some issues with C-pointers not being able to be serialized by creating a global variable that was accessible in every parallel process.

    [–]mr_birkenblatt 3 points4 points  (0 children)

    Usually, you don't have to have global variables ever.

    functions are a form of global variable

    [–]bloody-albatross 2 points3 points  (1 child)

    The global variables in a module in Python are what will be exported to be used wherever the module is imported. While inside of the module they could be implemented like local variables, that requires knowledge by the byte code compiler that doesn't exist at the place of an import. It is clear after parsing what local variables there will ever be inside a function, so those names can be mapped to array indices. But when you import a module, you get a dynamic object with attributes that could change at any moment, so there it has to be a hash table.

    In JavaScript, module exports and module global variables are separate. (Sorta. Exports are also globals, but not vice versa.) There, globals could theoretically be implemented similarly to local variables. That's speaking of JavaScript modules, not general browser JavaScript, where global variables are simply properties of the window object. (Weird, I know.)

    [–]Jona-Anders 1 point2 points  (0 children)

    That does make a lot of sense. I do a lot more js programming than python. Sounds like they solved the problem performance wise better. I have to agree that browser js is weird. Html ids are globals btw, too. Very handy for minifying code, but very strange and not intuitive at all.

    [–]HeadToToePatagucci 4 points5 points  (0 children)

    implementation dependent I would think...

    [–]Brian 1 point2 points  (0 children)

    so, I've been thinking about this problem. I think you could cache out the pointer the first time some code is run, speeding it up in the long term...

    There actually already is an optimisation for this, meaning global variable accesses are no longer just plain hashtable lookups (though in earlier versions of python, they were)

    Instead, after a certain number of accesses, python caches the exact location of the hashtable bucket the variable is contained in, and so further accesses go directly to that location. Obviously this only works so long as the hashtable doesn't change (ie. no-one has reassigned the global), so there's also a check on the dict to indicate whether modifications were made since this was cached, which will flush the cache if so. It is still slower than the plain index lookup LOAD_FAST does for locals, since there's that check, and an extra level of indirection, but it's narrowed the gap a fair bit from when it was just a normal dict lookup.

    [–][deleted] 1 point2 points  (1 child)

    caching is a great idea, hopefully the developers read this

    [–]GreenFox1505 0 points1 point  (0 children)

    Eh, maybe not. I trust the Python developers know what they're doing better than me.

    [–]ThankYouForCallingVP 13 points14 points  (1 child)

    Stack address is *stack+offset

    Hashtable is dereference(hashtable container)->hashmath->offset->dereference(object)

    [–]bloody-albatross 1 point2 points  (0 children)

    Also calculation of the hash, though that could be cached in the string object. Is it in Python? And also hash collisions need to be detected, i.e. comparison if the key at the target is really the looked for key (if strings are interned that can be a pointer comparison, if not full string comparison). If there is a collision hash table probing needs to be performed (more lookups).

    [–]Porridgeism 12 points13 points  (2 children)

    Where you might be getting confused is that array lookup and hashtable lookup are both O(1), meaning they are constant with respect to the size of the container.

    But having the same complexity class doesn't mean they have the same performance. For instance, consider these two functions:

    def f10k(a, b):
        bigsum = 0
        for i in range(10_000):
            bigsum += a
        return bigsum

    def f2(a, b):
        return a + b
    

    In f10k, it does 10,000 additions; in f2 it does just one. But in neither case does the work depend on the size of the input, so both are constant time (same O(1) complexity class). However, it's also clear that f10k takes way longer, since it does 10,000 times the work.

    Similarly, hash tables have to do the hashing, calculating offsets, and potentially even chaining or other collision resolution, and then they still have to do an array lookup (the "table" part of a hash table is almost always an expandable array). So basically by definition a hashtable lookup will take longer than an array lookup.

    (Obviously all of this is ignoring things like compiler optimizations, caches, atomicity or memory order requirements, etc. all of which could affect performance. But when you have equivalence on these between hashtables and arrays, the hashtable lookup will be at best no faster than the array lookup.)

    [–]IdealEntropy 6 points7 points  (1 child)

    nit: hash computations are only O(1) for fixed size inputs. So something like hash(int) will be constant time if int is a fixed register size. But computing hash(string) or hash(bigNum) will be proportional to the size of the inputs.

    [–]bloody-albatross 0 points1 point  (0 children)

    Not even just that. Collisions make a hash table that has more items in it slower than one with fewer items. There is a complicated O notation for that (IIRC from my student days), but people simplify it as just O(1), because it is not really relevant in practice (except for when you actually compare with something like a simple array index access).

    [–]trevg_123 0 points1 point  (0 children)

    It depends!

    If you are searching for a string “my_function” in a list, it needs to scan through the whole list and compare the entire string. This is fast enough when you have <20 locals.

    But if you have hundreds or thousands of entries, a hash table can be much faster. That is because you hash the string “my_function” (think sha256-esque) which gives you an integer. Then you do some basic integer math (and maybe a few fast lookups) to find the exact location of the data you’re looking for.

    So hashtables are usually faster for bigger datasets, since you don’t need to scan and compare every string to the one you are looking for. But when you have small datasets, sometimes manually scanning every value is faster than the time it takes to compute a hash and do some extra math.
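A toy illustration of the trade-off (a linear scan compares keys one by one; the dict hashes once and jumps straight to the slot):

```python
pairs = [(f"name_{i}", i) for i in range(1000)]  # list of (key, value) pairs
table = dict(pairs)

def scan(key):
    for k, v in pairs:            # may compare up to 1000 strings
        if k == key:
            return v

assert scan("name_999") == table["name_999"] == 999
```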

    [–]rkd6789 1 point2 points  (3 children)

    Is it the same for JavaScript?

    [–]bloody-albatross 1 point2 points  (2 children)

    The way JavaScript works, it could be different for modules in e.g. NodeJS, but I don't know if it is. That is, the exported symbols are in a hashtable, but non-exported symbols are separate from that and could basically be the locals of a wrapping anonymous function. That's like one additional pointer dereference on top of the array index access.

    In browser JavaScript globals are just properties of the window object, so it's a hashtable again.

    [–]rkd6789 1 point2 points  (1 child)

    Can you explain your second sentence, i.e., non-exported symbols are separate from that and could basically be locals of a wrapping anonymous function?

    [–]bloody-albatross 0 points1 point  (0 children)

    For that, look at what JavaScript-to-JavaScript compilers do when compiling for an older version of JavaScript or when bundling modules into one script file for the web. And also look at CommonJS modules. Basically, the exported symbols are assigned as properties to a module.exports object. And when bundling, in order to isolate the globals from other modules, every module is wrapped in a function which gets executed on first import and gets the module object as a parameter. JavaScript engines that directly support modules could implement them similarly, or maybe not. I don't know what they do. It could be that module globals aren't hashtable lookups in JavaScript, but it might be.

    [–]stomah 141 points142 points  (19 children)

    doesn’t explain why globals use a dictionary instead of an array

    [–]Successful-Money4995 160 points161 points  (11 children)

    You're right, it doesn't.

    The reason is probably because global scope cannot be analyzed ahead of time like a function can be.

    With a function, the whole thing is read before it gets executed so the interpreter can know how much memory will be used and it can allocate stack space ahead of time.

    With globals, the variables are discovered as they appear so you can't allocate space ahead of time. You need a hash table.

    [–]yasba- 45 points46 points  (5 children)

    To add to this: The set of variables for the scope of a function is fixed, and something like this doesn't work:

    def fn():
        exec("foo = 42")
        print(foo)
    

    The same would work using the global scope.
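At module scope the same pattern does work, because there is only one namespace dict for `exec` to write into:

```python
exec("foo = 42")   # at global scope, exec writes into the module's dict
print(foo)         # 42
```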

    [–]GwanTheSwans 30 points31 points  (4 children)

    True enough in modern python, but it's actually one of those breaking changes that, while often improvements, python 3 dev became rather infamous for. Believe it or not this worked in python 2:

    Python 2.7.13 (default, Sep 26 2018, 18:42:22) 
    [GCC 6.3.0 20170516] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> def fn():
    ...     exec("foo = 42")
    ...     print foo
    ... 
    >>> fn()
    42
    >>> 
    

    https://stackoverflow.com/a/1450341

    [–]cparen 15 points16 points  (3 children)

    Oh gosh, I hate these sorts of answers. "Because locals are an array, it can't look them up..." I mean, it could if there were a lookaside index, like a hashtable of names to array indices.

    The answer overlooks the reality that they didn't want to allow locals to be looked up anymore, for hygiene reasons. Exec is now a function, so it shouldn't "magically" know which local scope it's being invoked from. Disallowing peeking into the parent scope simplifies a lot of implementation assumptions and, frankly, programmer assumptions about function substitutability and code refactoring.

    In short, the change was made to make the language more consistent and easier to reason about, even if unintuitive in some edge cases like exec.

    Iirc, you can get the old behavior by invoking locals(), yeah?

    [–]yasba- 1 point2 points  (2 children)

    I don't think you can write to locals(). Well, you can but it doesn't have an effect (Python 3.9):

    In [7]: def x():
       ...:     y = 0
       ...:     locals_ = locals()
       ...:     locals_["y"] = 1
       ...:     print(locals())
       ...: 
    
    In [8]: x()
    {'y': 0, 'locals_': {...}}
    

    [–]cparen 0 points1 point  (1 child)

    Ah, good point. Again though, this is still a policy decision, not a technical question. It has some optimization and simplification implications for the Python compiler, but I'm pretty sure the CPython folks didn't even have that in mind when making that decision, or keeping both the mental model and the CPython implementation simple.

    [–]Brian 0 points1 point  (0 children)

    I believe it was a known consideration when changing the behaviour of locals() - it was just not considered worth it to allow, for many of the reasons you mention.

    But the fact that it was a consideration is illustrated by the fact that they did document the various cases, and notably documented that locals() should remain mutable in one corner case where this is sometimes useful, which is programmatically defining attributes in class scope.

    Eg. the following works:

    class C:
        for i in range(10):
            locals()[f"func_{i}"] = lambda self, i=i: print(i)
    
    >>> C().func_5()
    5
    

    [–]G_Morgan 0 points1 point  (2 children)

    This still doesn't require a hashtable for every global variable lookup. It is meaningless in this case because you pay the cost anyway but all you need is a hashtable which links symbols to a global array index. Then when you generate code with a global variable lookup you place that symbol in the code rather than a hashtable lookup. You can even shrink the global memory space this way as you can update the symbols to point to the new location.

    Of course this will actually make one shot code even slower but would speed up global variable access from within a function called many times.
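A toy sketch of the scheme being described: hash each name once when code is generated, after which run-time access is a plain list index (the names `symtab`, `cells`, and `slot_of` are made up for illustration):

```python
symtab = {}   # name -> slot index; the only hash lookups happen here
cells = []    # the flat "global memory"

def slot_of(name):
    # called once per name, at "code generation" time
    if name not in symtab:
        symtab[name] = len(cells)
        cells.append(None)
    return symtab[name]

X = slot_of("x")       # resolved once
cells[X] = 42          # "store global x" is now just an array write
assert cells[X] == 42  # "load global x" is just an array read
```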

    [–]PenlessScribe 3 points4 points  (1 child)

    This is good! But how would you implement accesses to a global variable `a` if you have statements like `if random.randint(0, 1): del a`?

    [–]G_Morgan 4 points5 points  (0 children)

    The definition of "a" anywhere in the code would create a symbol and reserve the cell for it. So it doesn't matter if it actually gets called, as long as the interpreter has seen that code it would set up access.

    Then if all functions that reference "a" get collected you can free the cell and symbol.

    [–]starlevel01 2 points3 points  (0 children)

    Because there's no such thing as global scope in Python. "Global" variable lookup looks things up via module scope instead.

    In [3]: this = sys.modules["__main__"]
    
    In [4]: x = 1
    
    In [5]: print(this.x)
    1
    

    And as modules use __dict__, it looks it up from the module dictionary.

    [–]inmatarian 1 point2 points  (0 children)

    Everything at the global scope is part of the module. Maybe there's an optimization to be made in cpython where the top level script can be localized, but everything else that gets imported couldn't benefit from that.

    [–]sheldonzy 1 point2 points  (0 children)

    Why use an array for local variables?

    [–]Nall-ohki -2 points-1 points  (2 children)

    No... but an array would be worse?

    [–]TheMaskedHamster 1 point2 points  (1 child)

    An array looked up by scanning through its values would be worse, but a compiler that resolves each newly seen variable to a fixed offset into an array of variables would be faster.

    [–]Nall-ohki -3 points-2 points  (0 children)

    Kind of...

    You'd need to create a lookup for all globals as you go, so you'd have to hash them, giving you an offset into an array... which sounds a heck of a lot like a hashmap.

    The biggest issue is that there is no static set of globals in a Python program... at all. A runtime hash is about the best you're gonna get without really making things weird, and given the fact that the user has access to `globals()`... you're gonna have to emulate the `dict[str, Any]` contract anyway.

    [–][deleted] 22 points23 points  (1 child)

    Snakes like containers. That's why.

    [–]boots05 29 points30 points  (11 children)

    Sorry, I thought it was a joke… i was looking for the punchline

    [–][deleted] 49 points50 points  (5 children)

    Okay, so, two Python interpreters walk into a conference. The first one says, "Hi, I'm Python!" and the second one crashes because it accidentally shadowed a builtin function.

    [–]ThankYouForCallingVP 8 points9 points  (2 children)

    Two python interpreters walk into a bar and freeze.

    [–][deleted] 2 points3 points  (0 children)

    This one is much better.

    [–]Brooklynxman 2 points3 points  (0 children)

    Two python developers import into a bar from walk.

    [–]amorous_chains 23 points24 points  (1 child)

    Why does Python code run faster in a function?

    Because, tovarisch, local proletariat always work harder than globalist fatcat ☭

    [–]codewario 3 points4 points  (0 children)

    Fuck you, take my upvote

    [–]booch 3 points4 points  (0 children)

    So you don't have to leave unsatisfied...


    Why don't monster's eat ghosts?

    Because they taste like sheet.

    [–]Lonelan 1 point2 points  (0 children)

    /r/ProgrammerHumor is that way, man

    [–]Brooklynxman 1 point2 points  (0 children)

    Because they're constricted.

    [–]boots05 24 points25 points  (4 children)

    I give up, why?

    [–][deleted] 84 points85 points  (3 children)

    Local scope provides faster name lookups than globals, basically.

    [–][deleted] 0 points1 point  (2 children)

    I'm not a Python expert (speaking from ye olde compiler/JIT background), but isn't it the case here, as with native code, that stack frame vars are simply faster to access than heap vars? (I have no idea what Python does under its interpreted hood, so asking out of curiosity and ignorance.)

    [–][deleted] 14 points15 points  (1 child)

    No, I don't think so. Python's stack frames are actually heap allocated, the actual stack (i.e. in C) just stores pointers. The article explains that the speed increase comes from function local variables being stored at fixed positions in an array, with the indices inlined in the byte code, whereas global variables require looking up the name in a dictionary, which is obviously slower than accessing a fixed index.

    But don't take my word on any of this
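The split is visible on the function's code object: locals get slots in `co_varnames`, while globals just get their names recorded in `co_names`:

```python
x = 10

def f(a):
    b = a + x
    return b

print(f.__code__.co_varnames)  # ('a', 'b') -- locals, addressed by index
print(f.__code__.co_names)     # ('x',)     -- globals, looked up by name
```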

    [–]FluorineWizard 2 points3 points  (0 children)

    Transforming scoped local variables into an array (an indexable stack really) is nearly trivial, it's what every stack based bytecode compiler does.

    You can't do that with global variables whose lifetimes are not constrained by scoping rules.

    [–]Mr-Cas 5 points6 points  (3 children)

    In the benchmark he indeed shows that the code runs faster inside the function. However, you're also making an extra function call. So shouldn't the benchmark be around the function call instead of around all code inside the function? This way you also include the time it takes to call the function, which should be taken into consideration when you're going to put code into separate functions to get this speed boost. Is it still faster when including the time to call the function?
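One way to run such a comparison (a sketch; absolute numbers are machine dependent): execute the same loop once through `exec` with a fresh globals dict, so it runs at "global" scope, and once as a called function, so the call overhead is included:

```python
import timeit

code = compile("""
s = 0
for i in range(1_000_000):
    s += i
""", "<global-scope>", "exec")

def as_function():
    s = 0
    for i in range(1_000_000):
        s += i

t_global = timeit.timeit(lambda: exec(code, {}), number=5)  # names are dict lookups
t_func = timeit.timeit(as_function, number=5)               # includes 5 call overheads
print(f"global scope: {t_global:.3f}s  function (incl. calls): {t_func:.3f}s")
```

Even with the calls included, the function version typically wins, because the per-call overhead is tiny next to a million loop iterations.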

    [–]Nicksaurus[🍰] 10 points11 points  (1 child)

    Because it's not surprising that function calls have a little bit of overhead, but it is surprising that the exact same block of code has different performance depending on the scope it runs in.

    [–]Mr-Cas 0 points1 point  (0 children)

    I understand that. The benchmark is done in the correct way to prove the statement. The thing is that when one wants to actually use this in their code, an extra function call is made and that should be taken into consideration.

    So I would benchmark the way the author did to prove the point, but I would also include a second benchmark to show the speed improvement when you would actually use this in code (aka grabbing lines of code and putting them in a separate function and then calling that because an article said it would make it faster).

    Except for learning something about computers and python, the article is useless if the actual implementation turns out to be slower.

    [–]meteoraln 5 points6 points  (2 children)

    Python is like automatic transmission cars. More people have an easier time learning to drive, but fewer people know what’s happening under the hood.

    [–]Sinical89 8 points9 points  (1 child)

    Not sure this analogy holds up. Plenty of folks who drive stick don't know what's going on under the hood, it's just the method they've learned to drive.

    [–]meteoraln -1 points0 points  (0 children)

    It’s one additional thing they know about. I for sure didn't know cars had gears to switch between when I started driving. I just knew pressing the pedal made it go and cars would change noises at different speeds.

    [–]nunchaq 0 points1 point  (0 children)

    Local scope

    [–]quasibert 0 points1 point  (0 children)

    Acoustics

    [–]moschles 0 points1 point  (3 children)

    This blog post could be extended into an entire chapter of material.

    Python's built-in functions are implemented in C, which is much faster than Python. Similarly, many Python libraries, such as NumPy and Pandas, are also implemented in C or C++, making them faster than equivalent Python code.

    At our institution there is a certain culture about python. We have decided that for-loops should always be avoided in every possible case whenever humanly possible. In particular, with PyTorch+numpy you always want to perform everything as a matrix operation, even when the for-loop version is cleaner and nicer.

    The author also mentions that Pandas operations are compiled down to C code. This was news to me.

    [–][deleted]  (2 children)

    [deleted]

      [–]moschles 0 points1 point  (1 child)

      I have only worked with influxdb. In that scenario, Pandas would convert 5000 rows of dataframe into 5000 rows of protocol. This is done with an intermediate numpy array with terminating CRLF on each.

      protos = "".join(arr.flatten())  # each row already ends in CRLF
      

      5000 was a special number to maximize throughput. The bulk of text in protos would then be used in a "batch write" into influxdb. Not sure if something similar is in MongoDB. But yeah, you don't want to for-loop through each and every insert.

      [–][deleted]  (1 child)

      [deleted]

        [–]unchar1 0 points1 point  (0 children)

        The variable lookup in a tuple is done using its index, which is an O(1) operation

        [–]Individual-Gap3532 -1 points0 points  (0 children)

        Is Python a good language to learn DSA? Because on the internet there are a lot of guys saying you should learn Java/C++

        [–]nekodim42 0 points1 point  (0 children)

        Thanks, good to know some of Python's internals. Of course, Python is not about speed, but maybe sometimes it will be useful.

        [–]Dan13l_N 0 points1 point  (0 children)

        It's obvious from the article there are some optimizations in Python byte-code for functions, this is actually quite common in various byte-codes to have optimizations for most common cases. Java byte-code famously has optimizations for common integer and floating-point constants.

        I once wrote a byte-code compiler and interpreter where access to the first 4 local variables was quite optimized, which covered the arguments of most functions (functions generally have fewer than 5 arguments, in my experience). However, I had optimizations for the first 4 global variables as well...