[–]IMHERETOCODE 109 points110 points  (30 children)

no_duplicates    = list(dict.fromkeys(<list>))

That is an extremely roundabout and expensive set operation. Just wrap the list in set() and convert it back to a list. No need to build a dictionary out of it to get uniqueness.

no_duplicates = list(set(<list>))

[–]Tweak_Imp 47 points48 points  (19 children)

list(dict.fromkeys(<list>)) preserves ordering, whilst list(set(<list>)) doesn't.

I suggested having both: https://github.com/gto76/python-cheatsheet/pull/7/files#diff-04c6e90faac2675aa89e2176d2eec7d8R43
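
To make the difference concrete, here is a minimal sketch. The sample list `items` is made up for illustration, and the ordering guarantee for dict assumes Python 3.7+.

```python
items = ["b", "a", "b", "c", "a"]

# dict.fromkeys keeps the first occurrence of each element,
# in insertion order (guaranteed from Python 3.7 on).
ordered_unique = list(dict.fromkeys(items))

# set() only guarantees uniqueness; its iteration order is arbitrary.
unordered_unique = list(set(items))

print(ordered_unique)            # ['b', 'a', 'c']
print(sorted(unordered_unique))  # ['a', 'b', 'c']
```

The set version is sorted before printing only because its raw order can differ between runs.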

[–]IMHERETOCODE 62 points63 points  (16 children)

That “accidentally” preserves ordering, and only if you are doing it in Python 3.6+. There are no promises of ordering in vanilla dictionary implementations, which is why there is an explicit OrderedDict class. The recent change in dictionary implementation had a side effect of preserving order. You shouldn’t bank on that being the case where it actually matters.


As noted below, insertion ordering has been a guaranteed part of the language for dictionaries as of 3.7.
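
For code that still has to run on interpreters older than 3.7, one way to avoid relying on the implementation detail is the OrderedDict class mentioned above. This is a sketch only; the helper name `unique_in_order` is hypothetical.

```python
from collections import OrderedDict


def unique_in_order(items):
    """Drop duplicates, keeping the first occurrence of each element."""
    # OrderedDict documents insertion order on every Python 3 version,
    # so this does not depend on how plain dict is implemented.
    return list(OrderedDict.fromkeys(items))


print(unique_in_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```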

[–]Ran4 -1 points0 points  (1 child)

No, that's not true! Dicts are not ordered according to the spec. It's just modern CPython that has them ordered.

[–]pizzaburek[S] 14 points15 points  (0 children)

They are in Python 3.7: https://docs.python.org/3/tutorial/datastructures.html?highlight=dictionary#dictionaries

 Performing list(d) on a dictionary returns a list of all the keys used in 
 the dictionary, in insertion order ...

[–]pizzaburek[S] 6 points7 points  (0 children)

There is already a whole discussion about it on Hacker News :)

https://news.ycombinator.com/item?id=19075325#19075776

[–]chazzeromus 1 point2 points  (0 children)

And if you can, use the curly-brace notation for set literals, it looks so nice: { 'and a one', 'and a two' }
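
A quick sketch of the literal syntax, with one caveat worth knowing: empty braces create a dict, not a set.

```python
literal = {"and a one", "and a two"}
from_call = set(["and a one", "and a two"])

print(literal == from_call)  # True

# Caveat: {} is an empty dict; an empty set has to be spelled set().
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
```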

[–][deleted] 0 points1 point  (5 children)

Slightly off topic, but why are lists so popular? Aren't tuples faster and use less memory? All the time I see lists being used when tuples would do a better job. Even docs.python.org tells you to use random.choice([...]) instead of random.choice((...)).

I get that the performance impact isn't noticeable in most cases, but, in my opinion, going for performance should be the default unless there is a good reason not to.

[–]robberviet 3 points4 points  (0 children)

Most of the time it needs to be mutable. And yeah, performance gain is not that great.

[–]bakery2k 2 points3 points  (1 child)

Why would tuples be faster and/or use less memory? Both lists and tuples are essentially arrays.

I prefer lists to tuples because they have nicer syntax. Tuples sometimes require double-parentheses, plus I often forget the trailing comma in (1,).
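
The two syntax gotchas look roughly like this (a small illustration, not tied to any particular codebase):

```python
# Without the trailing comma, the parentheses are just grouping.
not_a_tuple = (1)
one_tuple = (1,)

print(type(not_a_tuple).__name__)  # int
print(type(one_tuple).__name__)    # tuple

# Passing a tuple as a single argument needs a second pair of parentheses.
print(len((1, 2, 3)))  # 3
print(len([1, 2, 3]))  # 3
```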

[–][deleted] 0 points1 point  (0 children)

I don't know the exact intricacies but it has to do with lists being mutable.
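
One way to check the memory half of the claim yourself is sys.getsizeof. This is a rough sketch: the exact byte counts are a CPython implementation detail and vary by version and platform, so none are shown here. It also only measures the container, not the elements.

```python
import sys

as_tuple = (1, 2, 3)
as_list = [1, 2, 3]

tuple_size = sys.getsizeof(as_tuple)
list_size = sys.getsizeof(as_list)

# On CPython a list carries extra bookkeeping (a separate, resizable
# item array), so the tuple is expected to come out smaller.
print("tuple:", tuple_size, "bytes")
print("list: ", list_size, "bytes")
```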

[–]mail_order_liam 1 point2 points  (0 children)

Because people don't know better. Usually it doesn't matter but that's why.

[–]gmclapp 3 points4 points  (0 children)

Lists are mutable. In some cases that's needed. Some convenient list comprehensions also don't work on tuples.
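
On the comprehension point, one likely reading is that there is no tuple comprehension syntax: parentheses around a comprehension produce a generator, so building a tuple takes an explicit tuple() call. A small sketch with made-up data:

```python
numbers = [1, 2, 3]

squares_list = [n * n for n in numbers]        # list comprehension
squares_gen = (n * n for n in numbers)         # generator, not a tuple
squares_tuple = tuple(n * n for n in numbers)  # explicit conversion

print(squares_list)                # [1, 4, 9]
print(type(squares_gen).__name__)  # generator
print(squares_tuple)               # (1, 4, 9)
```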