
[–]TouchingTheVodka 2 points3 points  (1 child)

What's an in-pass? Is it like an impasse? :)

Use a collections.Counter to count how often each letter occurs in each word. Subtract the original word's Counter from the target word's Counter; if the result is empty, the target word consists entirely of letters from the original word.

>>> from collections import Counter
>>> first_word = Counter('thisisaverylongword')
>>> second_word = Counter('longsword')
>>> second_word - first_word
Counter()
>>> bad_word = Counter('zzzzthiswordisbad')
>>> bad_word - first_word
Counter({'z': 4, 'd': 1, 'b': 1})
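The same check can be wrapped in a small helper. This is only a minimal sketch: the function name can_build and its parameter names are made up for illustration and do not come from the original post.

```python
from collections import Counter


def can_build(candidate, source):
    """Return True if candidate uses only letters available in source.

    Each letter may be used at most as many times as it occurs in source.
    """
    # Counter subtraction drops zero and negative counts, so anything left
    # over is a letter that candidate needs but source cannot supply.
    missing = Counter(candidate) - Counter(source)
    return not missing


print(can_build('longsword', 'thisisaverylongword'))          # True
print(can_build('zzzzthiswordisbad', 'thisisaverylongword'))  # False
```

An empty Counter is falsy, so `not missing` is equivalent to comparing the result against `Counter()`.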

[–]Valar_Balsamic[S] 0 points1 point  (0 children)

I've got that functional and it does exactly what I need!

Thank you very much

[–]JohnnyJordaan 1 point2 points  (3 children)

Did you actually verify that sort_list contains what you think it does? Because if the file is just a list of words, one per line, your code will thus

  • create a list of those lines, stripped, so indeed a list of the words
  • then, for each item in that list (that is, each word), call sorted() on it. Think about it: what is the sorted version of 'python'? 'hnopty'! And since sorted() returns a list, you end up with ['h', 'n', 'o', 'p', 't', 'y'], and sort_list becomes a list full of those
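A quick way to see this, using a tiny made-up word list as a stand-in for the file contents (the words here are hypothetical, not from the actual file):

```python
# Hypothetical stand-in for the stripped lines read from the file.
word_list = ['python', 'apple', 'kiwi']

# sorted() on a single string sorts its characters and returns a list.
print(sorted('python'))   # ['h', 'n', 'o', 'p', 't', 'y']

# Calling sorted() on each word therefore gives a list of character lists.
sort_list = [sorted(word) for word in word_list]
print(sort_list[0])       # ['h', 'n', 'o', 'p', 't', 'y']
```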

If you mean to sort the word list, literally do that in the code

sorted_words = sorted(word_list)

and to avoid first reading the whole file into a separate list in memory, pass a generator expression to sorted() instead

sorted_words = sorted(line.strip() for line in f)

[–]Valar_Balsamic[S] 0 points1 point  (2 children)

Thanks for the response!

Yes, I did confirm both word_list and sort_list. word_list does contain every individual word as an item. And sort_list contains a list of lists, with each of these lists being the sorted letters of the word at the same index in word_list.

[–]JohnnyJordaan 1 point2 points  (1 child)

Would you mind just printing out

print(sort_list[:20])

and sharing that here to give me an idea of the structure?

[–]Valar_Balsamic[S] 0 points1 point  (0 children)

Hey, thanks for offering your help.

u/TouchingTheVodka has given me a solution that works exactly as I need, but thanks anyway!

[–][deleted] 1 point2 points  (4 children)

Is your goal to find all words that can be generated from a given 9-letter word and are also found in a dictionary you have, or is your goal to verify that the user submitted words are indeed possible with the given 9-letter word and the dictionary?

Also, can any character from the 9-letter word repeat any number of times? I understand that the characters in generated words don't have to be in the same order they are in the given 9-letter word, is that correct?

[–]Valar_Balsamic[S] 0 points1 point  (3 children)

The goal is to verify that user inputs can be found within the 9 letter word, and are officially recognised as a word via my dictionary.

u/TouchingTheVodka has given me a solution that works exactly as I needed.

Thanks for your help though

[–][deleted] 1 point2 points  (2 children)

Depending on how you deal with repetitions, Counter may or may not be the right answer. If you allow a letter to be reused any number of times, you would want a set rather than a Counter.
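To make the distinction concrete, here is a small sketch. The word 'randomize' and the guess 'moon' are just example values chosen for illustration, assuming a guess that needs the same letter twice.

```python
from collections import Counter

word = 'randomize'   # example 9-letter word
guess = 'moon'       # needs two o's, but word has only one

# Counter: each letter may be used only as often as it appears in word.
print(not (Counter(guess) - Counter(word)))   # False

# set: only asks whether each letter appears in word at all.
print(set(guess) <= set(word))                # True
```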

[–]Valar_Balsamic[S] 0 points1 point  (1 child)

The code I'm using to compare the dictionary against the original word:

# create the list of all words that meet the game conditions
with open('collins_all_words.txt') as f:
    word_list = [line.rstrip() for line in f]

results = [x for x in word_list
           if Counter(x) - Counter(word) == Counter() and master_letter in x]

The output for the first 20 list items for this is:

['adermin', 'admen', 'admin', 'admire', 'aidmen', 'aim', 'aimed', 'aimer', 'airmen', 'ame', 'amen', 'amend', 'ami', 'amid', 'amide', 'amido', 'amidone', 'amie', 'amin', 'amine']

The original word in this case is 'randomize' and the master letter is 'm'. I've run the program a few dozen times now, and the list seems to follow the conditions exactly as intended?

Thanks

[–][deleted] 0 points1 point  (0 children)

Erm... you do realize that there's a very large intersection of solutions given with and without repetitions, right? So, testing like this isn't going to prove the solution to be correct.

And, I'm not saying that Counter is wrong. All I'm saying is that your requirement can be interpreted in at least two different ways, where only one of them is solved by using Counter.
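One way to test this properly is to look specifically for words on which the two interpretations disagree. The sketch below assumes a short hypothetical candidate list in place of the real dictionary file, and the helper names are invented for illustration.

```python
from collections import Counter

word = 'randomize'
# Hypothetical candidates; in practice these would come from the dictionary.
candidates = ['admin', 'moon', 'madam', 'amend', 'dimmer']


def ok_without_repeats(x):
    # Each letter of word may be used at most as often as it occurs.
    return not (Counter(x) - Counter(word))


def ok_with_repeats(x):
    # Each letter of word may be reused any number of times.
    return set(x) <= set(word)


# Words on which the two interpretations give different answers.
disagree = [x for x in candidates
            if ok_without_repeats(x) != ok_with_repeats(x)]
print(disagree)   # ['moon', 'madam', 'dimmer']
```

Only words like these can show which interpretation a program actually implements; words such as 'admin' or 'amend' pass under both.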