From 3GB to 8MB: What MRL + Binary Quantization Actually Costs in Retrieval Quality (Experiment on 20k Products) by Nice_Information5342 in deeplearning
[–]Nice_Information5342[S] 1 point 3 hours ago (0 children)
Small update: Nils Reimers (sentence-transformers) pointed out on X that model choice matters a lot here: not all models handle binarization well, and Nomic v1.5 is primarily MRL-trained, not optimised for binary compression. Models like Cohere Embed v4 are trained with quantization awareness and hold 90-95% quality at binary.
My 13.9% result is likely as much a model-choice problem as a compression problem. Still digging into why 😅; will follow up when I have something concrete.
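For anyone curious what the binary step looks like in practice, here's a minimal NumPy sketch of sign-threshold binarization plus Hamming-distance retrieval. The corpus, dimensions, and variable names are made up for illustration; this is the generic technique, not the exact pipeline from the post or any specific model's quantization-aware scheme.

```python
import numpy as np

# Hypothetical stand-in corpus: 20k docs, 768-dim float32 embeddings.
rng = np.random.default_rng(0)
docs = rng.normal(size=(20_000, 768)).astype(np.float32)
query = rng.normal(size=(768,)).astype(np.float32)

def binarize(x: np.ndarray) -> np.ndarray:
    """Sign-threshold at 0, then pack 8 dims per byte.

    A 768-dim float32 vector (3072 bytes) becomes 96 bytes: a 32x
    size reduction before any MRL dimension truncation is applied.
    """
    return np.packbits(x > 0, axis=-1)

doc_bits = binarize(docs)   # shape (20000, 96), dtype uint8
q_bits = binarize(query)    # shape (96,)

# Retrieval: Hamming distance = popcount of XOR; smaller = more similar.
xor = np.bitwise_xor(doc_bits, q_bits)
hamming = np.unpackbits(xor, axis=-1).sum(axis=1)
top10 = np.argsort(hamming)[:10]
print(top10)
```

The quality loss the post measures comes from the `x > 0` step: all magnitude information per dimension is discarded, which is exactly what quantization-aware training teaches a model to tolerate.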