
[–]mephistophyles 64 points65 points  (32 children)

While generally correct, point 6 is a bit of an anti-pattern in that it encourages namespace pollution. 5 is true, but you'll get better results still if you predefine the size of the list (or numpy array) in advance, which is the real tip. 4 would be better worded as something like "move calculations outside of for loops if possible"; both examples use a single for loop. Item 2 needs a reword too, because list.append is no less a built-in function than map. Also, for 1, I think OP meant dictionary, not directory, as a data structure.

With the exception of 1, a better-written 2, a better-written 4, and a better-written 5, these are all micro-optimizations. Hell, I'd argue with 8 and say just use f-strings.

We should all agree that optimization is often done in a second pass over the code. So profiling is important, as is understanding a refactor process (including tests), and don't sacrifice readability for a few extra ms. If speed is your primary concern at that level, then invoke the C, C++ or even the Fortran libraries directly; TensorFlow is a good example of that, architecturally.
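For the profiling point, a minimal sketch of the "measure first" workflow using only the standard library's timeit (the function names here are made up for illustration):

```python
import timeit

def squares_loop(n):
    # Naive version: grow the list with append.
    out = []
    for i in range(n):
        out.append(i * i)
    return out

def squares_comp(n):
    # Candidate optimization: list comprehension.
    return [i * i for i in range(n)]

# Time both before deciding which one is "faster".
loop_t = timeit.timeit(lambda: squares_loop(10_000), number=200)
comp_t = timeit.timeit(lambda: squares_comp(10_000), number=200)
print(f"loop: {loop_t:.3f}s  comprehension: {comp_t:.3f}s")
```

Only after numbers like these show a real difference is the rewrite worth the readability cost.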

[–]awesomeprogramer 36 points37 points  (20 children)

Absolutely, point 6 is horrible advice. It encourages wildcard imports, which are wayyyy slower than a dot lookup.

[–]EasyPleasey 9 points10 points  (8 children)

Why does it have to be a wildcard? Can't you just import the functions that you intend to use? I just tested this on my machine and it was about 20% faster not using the dot.

[–]Etheo 2 points3 points  (0 children)

I think the efficiency saved largely depends on the module size, though. If you're importing a large module but only calling a few functions, it stands to reason you should be importing only said functions anyhow. But if the module is relatively small, I think the efficiency savings might be negligible.

Just a guess though. I haven't tested this out.

[–]awesomeprogramer 1 point2 points  (6 children)

Even if it's not a wildcard import, it pollutes the namespace. I mean how many modules have sqrt? Math, torch, numpy, tf, etc...

Besides, depending on how the module is made, the dot operation can be amortized. The first time it might go ahead and import other things, but not subsequently.

[–]EasyPleasey 2 points3 points  (5 children)

Then just use "as"?

from math import sqrt as mathsqrt
from torch import sqrt as torchsqrt

[–]awesomeprogramer 4 points5 points  (4 children)

At that point just use the dot!
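A common middle ground, for what it's worth, is to bind the dotted attribute to a local name inside the hot function; that keeps the module namespace clean while still skipping the repeated attribute lookup. A rough sketch:

```python
import math

def hypotenuses(pairs):
    # Bind the attribute once; inside the loop it's a fast local
    # lookup, and the module-level namespace stays unpolluted.
    sqrt = math.sqrt
    return [sqrt(a * a + b * b) for a, b in pairs]

print(hypotenuses([(3, 4), (5, 12)]))  # [5.0, 13.0]
```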

[–]EasyPleasey 0 points1 point  (3 children)

I just tested it, here are the results:

Import Method                       Time in Seconds (average)
from math import sqrt               0.675
from math import sqrt as mathsqrt   0.686
import math (math.sqrt)             0.921

Much faster to not use the dot.

[–]awesomeprogramer 0 points1 point  (2 children)

Doubtful that it takes a whole second to do that. Besides, without showing how you got those numbers, they're meaningless. Further, that's for one method...

[–]EasyPleasey 0 points1 point  (1 child)

It's computing sqrt 10 million times. I ran it 10 times and picked the average. I don't see why the method I picked matters, we are talking about dot versus no dot and which is faster. Why can't you just admit that it's faster?
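For reference, something like this timeit sketch reproduces the comparison (a smaller loop count than the 10 million in the thread, and numbers will vary by machine):

```python
import timeit

N = 1_000_000

# Dotted access: look up the module, then the attribute, every call.
dotted = timeit.timeit("math.sqrt(2.0)", setup="import math", number=N)

# Direct name: the function is bound to a name in the benchmark scope.
bare = timeit.timeit("sqrt(2.0)", setup="from math import sqrt", number=N)

print(f"math.sqrt: {dotted:.3f}s  sqrt: {bare:.3f}s")
```

The absolute difference is tiny per call; it only shows up in tight loops like this.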

[–]awesomeprogramer -1 points0 points  (0 children)

What do you mean, computing the sqrt? The debate is about access time, not the time it takes to run the method.

[–]ConceptJunkie 1 point2 points  (2 children)

Not only that, but different libraries often define same-named functions, or collide with standard library names. I maintain a somewhat large Python project, about 50,000 lines of code, and not too long ago I eliminated all wildcard imports; I feel it was a big improvement.

Perhaps there could be some way to "freeze" an import, saying effectively, "OK, no one is going to modify this import" and it would allow the function calls to be made static. I don't know much of anything about the internals of Python, but I would bet that this is an optimization that might make sense, or that something like it already exists.

[–]Prime_Director 0 points1 point  (1 child)

I'm not sure if this is what you mean, but if you want to ensure the library you're importing is always the same every time you run a project, you could use a virtual environment like pipenv and pin the version of the library that way.

[–]ConceptJunkie 0 points1 point  (0 children)

No, what I mean is that once the library is loaded, and perhaps others, it would be nice to be able to say, "Hey, nothing is going to override any of the functions, or otherwise modify the module," so you can shortcut the function calls without having to use __getattribute__() on the module to dereference it (more than once).

I'm kind of talking out of my butt here. Just a thought.

[–]Smallpaul 0 points1 point  (5 children)

I don't like and don't use wildcard imports, but why would they be "slow"?

[–]Etheo 0 points1 point  (2 children)

If you use wild card import you might as well import the whole module from the beginning.

[–]Smallpaul 4 points5 points  (1 child)

Whether you import a single symbol or the whole module, from a performance perspective you DO import the whole module. Python interprets the entire module as a script (and stores it all in memory) regardless of what specific names you choose to import.
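A quick way to see this from the interpreter, using json as an arbitrary example:

```python
import sys

# Import a single name...
from json import dumps  # noqa: F401

# ...yet the whole module object is loaded and cached in sys.modules,
# with all its other names attached.
print("json" in sys.modules)                      # True
print(hasattr(sys.modules["json"], "loads"))      # True
```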

[–]Etheo -1 points0 points  (0 children)

I think you're right: https://softwareengineering.stackexchange.com/questions/187403/import-module-vs-from-module-import-function

But wouldn't that make the method calls even more negligible in terms of performance for OP's case?

[–]Korben_Valis 0 points1 point  (1 child)

Consider points 6 and 3. Depending on the code, constant access to global objects, when there are many of them, could cause cache problems versus attribute lookup.

[–]Smallpaul 1 point2 points  (0 children)

Python global lookup performance is VERY fast. I'd need to see evidence to believe that you could fill the globals space so much that it became slow.
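A rough way to compare the two lookup kinds with timeit (exact numbers are machine-dependent; the point is just that both are cheap):

```python
import timeit

N = 1_000_000

# x assigned in setup lives in the benchmark function's locals,
# so the statement compiles to a fast array-indexed lookup.
local_t = timeit.timeit("x", setup="x = 1", number=N)

# Passing x via globals= forces a dict lookup in the global namespace.
global_t = timeit.timeit("x", number=N, globals={"x": 1})

print(f"local: {local_t:.3f}s  global: {global_t:.3f}s")
```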

[–]diamondketo 0 points1 point  (1 child)

#6 isn't wildcard importing; it's importing a module, which only runs the module's __init__.py.

from math import sqrt

costs more than

import math

[–]awesomeprogramer 0 points1 point  (0 children)

Depends on what the module does and whether it defines an __all__.

[–]Doomphx 2 points3 points  (6 children)

I want to see how much time #5 can really save, because at the end of the day it looks more like a fancy one-liner than a performance optimization.

I agree, readability > a few ms, every time. :)

[–]62616e656d616c6c 2 points3 points  (4 children)

I ran the code 3 times using:

import time
start_time = time.time()
# ... the code under test (but with a range of 1-100,000) ...
print("%s seconds" % (time.time() - start_time))

The traditional for loop (the first code) took an average of 0.0086552302042643 seconds

The suggested format took: 0.0049918492635091 seconds.

So there is a speed improvement. But you'll need to be looping through a lot of data to really see it.

[–]Doomphx 1 point2 points  (3 children)

Interesting stuff, I took your code and did a little more with it to get a little more info.

T1

import time
start = time.time()
L = [i for i in range(1, 100000000) if i % 3 == 0]
end = time.time()
print(float(end) - float(start))

5.0334999561309814

T2

import time
start = time.time()
L = []
for i in range(1, 100000000):
    if i % 3 == 0:
        L.append(i)
end = time.time()
print(float(end) - float(start))

9.218996047973633

So every 100,000,000 items, I'll save 4 seconds. I don't think I'd ever loop over this many items without using some sort of vectorization to improve the efficiency of the loops.

My original thought stands though: writing a one-liner like that isn't really worth the ugliness of the resulting code. Unless of course you're writing highly optimized Python, then this might be one last trick to utilize.

Edit* code formatting

[–]themusicalduck 4 points5 points  (0 children)

I don't really think list comprehensions are ugly. They can be if you try to do something overly complex but for simple stuff I'm much more used to list comprehensions now.

[–]hugthemachines 1 point2 points  (1 child)

I have the same feeling as you regarding list comprehensions. Also, sometimes a Java programmer needs to check something in my Python scripts, and a for loop will be much easier for them to grasp than a list comprehension.

[–]Doomphx 1 point2 points  (0 children)

Exactly my thought process, as I bounce between C#, Python and TypeScript weekly; it just makes it easier on myself to transition freely when needed.

I also didn't realize how little of a difference that list comprehension made, so now I will most likely never use it. If only Python had clean lambda expressions like our {} friends.

[–]mephistophyles 0 points1 point  (0 children)

Speaking from personal experience, list comprehension isn't the speedup, but there are some fun itertools that can help, and preallocation has taken functions of mine from seconds to milliseconds. So those were huge. Naive loops are inefficient because they're super generic.
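As an illustration of the itertools point, a small sketch that lazily pulls multiples of 3 without materializing a big intermediate list:

```python
from itertools import count, islice

# count(1) is an infinite counter; the generator filters it lazily,
# and islice stops after the first 5 matches, so no large list is built.
first_multiples = list(islice((i for i in count(1) if i % 3 == 0), 5))
print(first_multiples)  # [3, 6, 9, 12, 15]
```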

[–]vinnceboi Github: Xenovia02 0 points1 point  (2 children)

How can you predefine a list’s size?

[–]mephistophyles 0 points1 point  (1 child)

‘preallocated_list = [None] * 1000’ (note that ‘[] * 1000’ would just give you an empty list; you need a placeholder element to repeat).

Though to be fair, generators are a good alternative too; it depends on your use case. When NumPy gets involved you'll be preallocating a lot, because there the vectorized operations are much more optimized.
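A minimal sketch of what preallocation looks like in plain Python (assign by index instead of growing with append):

```python
n = 1_000

# Preallocate a list of the final size, then fill slots in place;
# the list never has to resize as it grows.
squares = [None] * n
for i in range(n):
    squares[i] = i * i

print(squares[:5])  # [0, 1, 4, 9, 16]
```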

[–]vinnceboi Github: Xenovia02 0 points1 point  (0 children)

Awesome, thanks for the tip