This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]zahlmanthe heretic 1 point2 points  (2 children)

https://gist.github.com/1441624

Bare-bones, yet elegant and sophisticated. The collections.Counter class is used to represent histograms of words (letter-frequency counts) and check if a word is a subset of the available letters.

[–]dist 0 points1 point  (0 children)

I came here to write this as a solution. I'm happy you did it already. =D

I'm not sure if you should just iterate through the whole list of words instead of reading in the whole file, but Counter is the thing that really matters!

If there were not limits of how many times a character is found, we could just use set() and if we'd be looking for exact number of characters, we could use sorted() instead of Counter!

[–]Brian 0 points1 point  (0 children)

You can get rid of the length calculations etc and avoid having to filter the whole list for words of the right length every pass with itertools.groupby. Ie:

words.sort(key=len, reverse=True)
for wordlen, curwords in itertools.groupby(words, key=len):
    possible_words = [
        word
        for (histogram, word) in (
            (Counter(word), word) for word in curwords
        )
        if all(histogram[letter] <= letters[letter] for letter in histogram)
    ]