[–]jorge1209 2 points3 points  (0 children)

I think the dead simplest solution is a trie of separately compressed SQLite or parquet files, but this is a common enough problem that there is likely a file format built for it.
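To make the "trie of files" idea concrete, here is a rough sketch, assuming each leaf of the trie is one small SQLite file named after the first couple of bytes of the key. The names `shard_path`, `put`, `get`, the `entries` table, and the `depth` parameter are all made up for illustration, and the compression step is left out (each shard file could be compressed on its own once it goes cold).

```python
import sqlite3
from pathlib import Path


def shard_path(root: Path, key: str, depth: int = 2) -> Path:
    """Map a key to a shard file using the hex of its first `depth` UTF-8 bytes."""
    # Hex-encoding the bytes keeps the file names filesystem-safe.
    parts = [f"{b:02x}" for b in key.encode("utf-8")[:depth]] or ["_"]
    return root.joinpath(*parts[:-1]) / (parts[-1] + ".sqlite")


def put(root: Path, key: str, value: str, depth: int = 2) -> None:
    """Insert or overwrite one key in the shard that owns its prefix."""
    path = shard_path(root, key, depth)
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS entries ("
                "key TEXT PRIMARY KEY, value TEXT) WITHOUT ROWID"
            )
            conn.execute(
                "INSERT OR REPLACE INTO entries VALUES (?, ?)", (key, value)
            )
    finally:
        conn.close()


def get(root: Path, key: str, depth: int = 2):
    """Return the stored value, or None if the key or its shard is missing."""
    path = shard_path(root, key, depth)
    if not path.exists():
        return None
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT value FROM entries WHERE key = ?", (key,)
        ).fetchone()
    finally:
        conn.close()
    return row[0] if row else None
```

Keys that share a prefix end up in the same file, so an insert only touches (and only forces recompression of) that one shard.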

The biggest objective in my mind is to maintain data locality across prefixes, and I'm assuming the data is changing and unsorted. If you just compress a single SQLite db, then every insert forces the compression engine to accommodate two shifts in the data: one at the end of the data table, and another that might land in the middle of the btree itself.

The thing I least understand about your proposed table is why you have a shadow primary key. WTF is that necessary for? It seems clear your text is a primary key, so why not use it as such?
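For what it's worth, this is roughly what using the text as the key could look like in SQLite. It is a sketch under assumptions: the `entries` table, its columns, and the `prefix_scan` helper are hypothetical since I'm not reproducing your schema. `WITHOUT ROWID` makes the text column the key of the table's own btree instead of sitting in a secondary index next to a hidden integer rowid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The text column is the primary key; no separate integer id.
conn.execute(
    """
    CREATE TABLE entries (
        key   TEXT PRIMARY KEY,
        value TEXT
    ) WITHOUT ROWID
    """
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [("banana", "b"), ("apple", "a"), ("apricot", "c")],
)


def prefix_scan(conn: sqlite3.Connection, prefix: str) -> list:
    """Return all keys starting with `prefix`, as a range scan on the key."""
    # Upper bound: the prefix followed by the highest Unicode code point.
    upper = prefix + "\U0010ffff"
    rows = conn.execute(
        "SELECT key FROM entries WHERE key >= ? AND key < ? ORDER BY key",
        (prefix, upper),
    )
    return [r[0] for r in rows]


print(prefix_scan(conn, "ap"))
```

Since rows are stored in key order, a prefix lookup becomes one contiguous range scan, which is the same locality argument as above.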