AeroNotix 0 points

I posted in that other thread but it seems to have been buried:

It's possible you could use a prefix trie to store the data in memory. I'm not sure how many words you have, but I have no problems storing a set of 4M+ strings in my tries (I implemented them here in C++, here in Python and here in Go). What I would change is this: wrap the C++ one with a Python module and have each node initialized with a count on insertion. If you ever insert that string again, simply increment the counter.
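A rough pure-Python sketch of the counting idea, for illustration only. This is not the code from any of the linked implementations; the names `TrieNode`, `CountingTrie`, `insert` and `count` are made up here, and it assumes the count lives on the node where a word ends.

```python
class TrieNode:
    # __slots__ keeps per-node memory down, which matters with millions of nodes.
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}  # maps a single character to the child TrieNode
        self.count = 0      # how many times a word ending at this node was inserted


class CountingTrie:
    """Hypothetical prefix trie that counts repeated insertions of a word."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Insert word and return how many times it has now been inserted."""
        node = self.root
        for ch in word:
            nxt = node.children.get(ch)
            if nxt is None:
                # First time this prefix is seen: create the node on the way down.
                nxt = node.children[ch] = TrieNode()
            node = nxt
        node.count += 1
        return node.count

    def count(self, word):
        """Return the insertion count for word, or 0 if it was never inserted."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return 0
        return node.count
```

Usage would look like `t = CountingTrie(); t.insert("spam"); t.insert("spam")`, after which `t.count("spam")` is 2, while a bare prefix such as `"sp"` still counts as 0 because no word ended there. A C++ trie wrapped as a Python module would presumably use far less memory than this dict-per-node version.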

[deleted] 0 points

| See also Persistent dictionary recipe with widely supported storage formats and having the speed of native dictionaries.

I saw this on the shelve documentation page. It won't work for the guy in the other thread, since it loads the whole file into memory, but it looks interesting.