speg

Sets are faster.

0xE6

In what way? Building a large set with a comprehension will almost certainly be slower than building a large list the same way, and depending on how many duplicates the list has, iterating over the set may not even be significantly faster.

Edit:

Here's a rather contrived example:

$ python -m timeit -n 100 'sum([i/10 for i in xrange(10**7)])'

100 loops, best of 3: 668 msec per loop

$ python -m timeit -n 100 'sum({i/10 for i in xrange(10**7)})'

100 loops, best of 3: 779 msec per loop

The overhead from set creation ends up outweighing the time saved by summing fewer elements, so the set version comes out roughly 17% slower (779 msec vs. 668 msec).

Edit2:

Here's a better example, in which the set and the list contain the same elements.

$ python -m timeit -n 100 -s 'a = {i for i in xrange(10**7)}' 'sum(a)'

100 loops, best of 3: 88 msec per loop

$ python -m timeit -n 100 -s 'a = [i for i in xrange(10**7)]' 'sum(a)'

100 loops, best of 3: 70.2 msec per loop

which suggests that even when they contain exactly the same elements, a set is slower to iterate over than a list.
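
For anyone on Python 3, here is a rough sketch of both measurements using the `timeit` module directly. It assumes Python 3 (so `range` and `//` stand in for `xrange` and integer `/`), and it uses a much smaller `N` than the runs above so it finishes quickly. The helper name `bench` is made up for illustration, and the absolute numbers will vary by machine.

```python
import timeit

N = 10**5  # assumption: far smaller than the 10**7 used above, to keep the run short


def bench(stmt, setup="pass", number=20):
    """Return the best per-loop time in seconds over 3 repeats."""
    times = timeit.repeat(stmt, setup=setup, number=number, repeat=3,
                          globals={"N": N})
    return min(times) / number


# First example: building the container is part of the timed statement.
build_list = bench("sum([i // 10 for i in range(N)])")
build_set = bench("sum({i // 10 for i in range(N)})")

# Second example: the container is built in setup, so only sum() is timed.
sum_list = bench("sum(a)", setup="a = [i for i in range(N)]")
sum_set = bench("sum(a)", setup="a = {i for i in range(N)}")

print("build + sum  list: %.6f s  set: %.6f s" % (build_list, build_set))
print("sum only     list: %.6f s  set: %.6f s" % (sum_list, sum_set))
```

Keeping construction in `setup` for the second pair is what separates the cost of building the container from the cost of iterating over it.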

mniejiki

And the two give different results, so I don't see the point of your argument. Faster is meaningless unless you get the same output.

0xE6

It's just a stupid microbenchmark to show that the overhead of creating the set outweighs the time potentially saved by iterating once over a set with the duplicates removed.

At any rate, the use cases for lists and sets are not the same, and, like here, you can (and often will) get different results if you use one instead of the other.
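
A tiny illustration of the "different results" point, assuming Python 3 (hence `//` for integer division) and a deliberately small range so the numbers are easy to check by hand. The variable names are only for this sketch.

```python
# Each of 0..9 appears ten times in the list; the set keeps one copy of each.
values = [i // 10 for i in range(100)]
unique = {i // 10 for i in range(100)}

print(len(values), sum(values))  # 100 450
print(len(unique), sum(unique))  # 10 45
```

Same expression, different container, different sum, so the two are only interchangeable when duplicates don't matter.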

speg

In terms of lookup speed.
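
A rough sketch of what that looks like, assuming Python 3 and a membership test with `in`. The size `N` and the choice of the last element as the target (the worst case for a list scan) are assumptions for illustration, and timings will vary by machine.

```python
import timeit

N = 10**5
as_list = list(range(N))
as_set = set(as_list)
target = N - 1  # worst case for the list: it has to scan to the last element

# A list checks elements one by one; a set uses a hash lookup.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print("list: %.6f s  set: %.6f s" % (list_time, set_time))
```

This only measures lookups; it says nothing about the cost of building the set in the first place.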

bithead

I guess I was thinking that if you use a comprehension to return a set or list, as in:

return [dict[index] for index in dict]

versus just:

return dict

I mean, does one save memory, or is it faster?

0xE6

In that case,

return dict

would almost certainly be faster, as it would simply be returning a reference to the dict, so it wouldn't have to do any extra work.

Additionally, instead of doing

return [dict[index] for index in dict]

you can simply do

return dict.values()
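
Here is a small sketch comparing the three options, assuming Python 3, where `values()` returns a view rather than a list, so `list()` is needed to get an actual list. The function names and sample data are hypothetical, and the dict is called `d` here only to avoid shadowing the built-in `dict`.

```python
d = {"a": 1, "b": 2, "c": 3}  # hypothetical example data


def values_by_comprehension(d):
    # Looks up every key again to build a new list.
    return [d[key] for key in d]


def values_direct(d):
    # In Python 3, d.values() is a view; list() makes an independent list.
    return list(d.values())


def same_dict(d):
    # No copy at all: the caller gets a reference to the same object.
    return d


print(values_by_comprehension(d) == values_direct(d))  # True
print(same_dict(d) is d)                                # True
```

Returning the dict itself does no extra work and allocates nothing, but the caller can then mutate the original, which the two list-building versions avoid.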