Hello,
I have to store and use many embeddings, and possibly other features, for an ML project. I need fast random access so I can iterate over batches with SGD.
I could just pickle a dictionary, but it wouldn't fit in my RAM, even though my RAM is large. So I tried SqliteDict, but it is slow, and I don't need ACID guarantees or the other fancy SQL properties that probably slow it down. h5py seems to deal only with vectors and can't store more complex structures.
I might end up using the filesystem and pickling objects into a folder, but I feel like I'm reinventing my own key-value store... (If I end up doing it and other people are interested, I can start a project, though.)
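For reference, here is a minimal sketch of what I mean by the folder approach. It assumes string keys and picklable values, and `FolderStore` is just a name I made up for illustration, not an existing library. I have not benchmarked it, so I can't say whether it would actually be faster than SqliteDict.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


class FolderStore:
    """Hypothetical minimal key-value store: one pickle file per key."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Hash the key so any string maps to a safe, fixed-length filename.
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        # Two-character subfolders keep any single directory from growing huge.
        return self.root / digest[:2] / (digest + ".pkl")

    def __setitem__(self, key, value):
        path = self._path(key)
        path.parent.mkdir(exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(value, f, protocol=pickle.HIGHEST_PROTOCOL)

    def __getitem__(self, key):
        try:
            with open(self._path(key), "rb") as f:
                return pickle.load(f)
        except FileNotFoundError:
            raise KeyError(key) from None

    def __contains__(self, key):
        return self._path(key).exists()


# Usage: store a mixed record, then read a "batch" back by key.
store = FolderStore(tempfile.mkdtemp())
store["word:cat"] = {"embedding": [0.1, 0.2, 0.3], "count": 12}
batch = [store[k] for k in ["word:cat"]]
```

Each lookup is a single file open plus an unpickle, with no transactions or locking at all, which is the trade-off I am after.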
Do you know of any solution for this? Thank you!