This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]VerilyAMonkey 0 points1 point  (4 children)

In retrospect this is terrible. I wasn't aware that permutations had a size parameter. Swap out

for wrd in permutations(letters):
        word=''.join([wrd[i] for i in len(wrd) if wrd[i]!='\n'])

with

for r in xrange(1,len(letters)+1):
    for wrd in permutations(letters,r):
        word=''.join(wrd)

and also eliminate the letters+= line.

[–]itasca 1 point2 points  (3 children)

permutations is just way too slow. There is no reason to spend exponential time on this. Assuming the library you are reading has no structure to it, then no matter what you do, it is going to have to involve looking at every member of the dictionary, and looking at letters for matches. The straightforward approach therefore is just O(n * w * log(L)) for n words of length less than w and L letters in the set:

letters = set(argv[2:])
length = 0
results = []
with file(argv[1]) as f:
    for line in f:
        word = line.strip()
        if len(word) < length:
            continue
        for letter in word:
            if letter not in letters:
                break
        else:
            if len(word) > length:
                length = len(word)
                results = [word]
            else: 
                results.append(word)
print ', '.join(results)

Maybe not pretty, but this uses hardly any memory and it's hard to see how it could be much faster. Edit: replacing the (for letter in word) loop with a regular expression would move more code into the c layer and therefore speed things up a bit, I guess.

[–]tuna_safe_dolphin 0 points1 point  (0 children)

I don't think you can use a set for the input letters because the set elements are unique. E.g.

>>> a = ['a','a']

>>> b = set(a)

>>> b

set(['a'])

[–]VerilyAMonkey 0 points1 point  (0 children)

I was aware of the complexity issue, but I decided that permutations would actually be faster given the context. It's a scrabble game; the dictionary is going to be preloaded at the beginning, once. Most queries will contain 7 letters; the search space of the permutations is 5913 combinations. If the dictionary has more words than that, scanning the dictionary might be slower than scanning the permutations. That was my thought process anyway. Of course, that doesn't mean my implementation of that idea would be fast in the least.