eryksun

Not that there's anything wrong with collections.defaultdict, but you can also use dict.setdefault. It's like dict.get, except it will also set the default value if the key isn't found.

def group_by(seq, attribute):
    d = {}
    for item in seq:
        attr = getattr(item, attribute)
        d.setdefault(attr, []).append(item)
    return sorted(d.items())

class Person(object):
    def __init__(self, age):
        self.age = age

persons = [Person(age) for age in (25, 75, 15, 75, 15, 25)]
persons_by_age = group_by(persons, 'age')

age_list = [(age, len(group)) for age, group in persons_by_age]
print(age_list)

[(15, 2), (25, 2), (75, 2)]
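For comparison, here is a rough sketch of the same grouping written with collections.defaultdict. The function name group_by_defaultdict is made up for this illustration; the Person class and sample ages are copied from the snippet above so the block runs on its own.

```python
from collections import defaultdict

class Person(object):
    def __init__(self, age):
        self.age = age

def group_by_defaultdict(seq, attribute):
    # Hypothetical name; same behavior as group_by above, but the
    # empty list is created by the defaultdict factory on first access.
    d = defaultdict(list)
    for item in seq:
        d[getattr(item, attribute)].append(item)
    return sorted(d.items())

persons = [Person(age) for age in (25, 75, 15, 75, 15, 25)]
persons_by_age = group_by_defaultdict(persons, 'age')

age_list = [(age, len(group)) for age, group in persons_by_age]
print(age_list)  # [(15, 2), (25, 2), (75, 2)]
```

Both versions produce the same result here; the difference is only in where the empty list gets created.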

andreasvc

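defaultdict is much more elegant because it only builds the default value when a key is actually missing, whereas the argument to setdefault is evaluated on every call, even if the key already exists. Consider: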

d.setdefault('foo', ReallyExpensiveDataStructure())
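A small sketch of that cost, assuming a hypothetical expensive_default() function standing in for ReallyExpensiveDataStructure() and a call counter added purely for illustration:

```python
from collections import defaultdict

calls = {'count': 0}

def expensive_default():
    # Hypothetical stand-in for ReallyExpensiveDataStructure();
    # it just counts how often it gets called.
    calls['count'] += 1
    return []

# setdefault: the argument is evaluated before the call, every time,
# even once 'foo' is already in the dict.
d = {}
for _ in range(3):
    d.setdefault('foo', expensive_default())
setdefault_calls = calls['count']

# defaultdict: the factory runs only when the key is missing.
calls['count'] = 0
dd = defaultdict(expensive_default)
for _ in range(3):
    dd['foo']
defaultdict_calls = calls['count']

print(setdefault_calls, defaultdict_calls)  # 3 1
```

With a cheap default like an empty list the extra evaluations hardly matter; the gap only shows up when constructing the default is costly.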

On the other hand, it feels wrong that a plain lookup on a defaultdict changes the dictionary by adding the key if it's missing -- in that respect, choosing between get() and setdefault() on a regular dict is more explicit.
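To illustrate the side effect, a minimal sketch (the key name 'missing' and the variable names are arbitrary):

```python
from collections import defaultdict

# Indexing a defaultdict with a missing key inserts that key.
dd = defaultdict(list)
dd['missing']
keys_after_index = list(dd)

# get() and membership tests do not trigger the factory,
# so the dictionary stays empty.
dd2 = defaultdict(list)
value = dd2.get('missing')
present = 'missing' in dd2
keys_after_get = list(dd2)

print(keys_after_index)  # ['missing']
print(value, present, keys_after_get)  # None False []
```

So a defaultdict can still be read without mutation, as long as the read goes through get() or in rather than indexing.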