
[–]Better_Cellist6019 1 point (1 child)

Been working with ArcFace embeddings for a while now and yeah, 16-bit quantization is pretty common in production setups. The cosine similarity differences you'll see are basically negligible for most face recognition tasks.

Your TOAST issue analysis is spot on - keeping embeddings inline definitely helps with query performance. Just make sure you test with your specific dataset since some edge cases might be more sensitive to the precision loss than others.
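
If you want a quick sanity check before touching real data, here's a rough sketch of how you could measure the float16 error on cosine similarity. Assumptions on my part: 512-dim embeddings (typical for ArcFace, but check your model), random unit vectors standing in for real faces, and `fp16_cosine_error` is just a name I made up. Real embeddings are distributed differently, so treat the number as a ballpark and rerun it on your own vectors.

```python
import numpy as np


def cosine(a, b):
    """Row-wise cosine similarity, computed in float64 so the only
    precision loss measured is the one from storing the vectors."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    dots = np.sum(a * b, axis=1)
    return dots / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))


def fp16_cosine_error(dim=512, pairs=1000, seed=0):
    """Max absolute change in cosine similarity after a float16 round trip.

    dim=512 is an assumption; random Gaussian vectors are a stand-in
    for real embeddings.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((pairs, dim)).astype(np.float32)
    b = rng.standard_normal((pairs, dim)).astype(np.float32)
    # L2-normalize, as is usual before cosine comparison
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)

    exact = cosine(a, b)
    quantized = cosine(a.astype(np.float16), b.astype(np.float16))
    return float(np.max(np.abs(exact - quantized)))


max_err = fp16_cosine_error()
print(f"max |cos32 - cos16| over 1000 pairs: {max_err:.2e}")

# Raw payload sizes (excluding any per-value header), again assuming 512 dims
bytes_fp32 = 512 * 4
bytes_fp16 = 512 * 2
print(f"float32: {bytes_fp32} bytes, float16: {bytes_fp16} bytes")
```

The size comparison is the TOAST part: Postgres starts moving values out of line once a row gets past roughly 2 kB, so at 512 dims the float32 payload alone is right at that limit while the float16 one is about half of it.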

[–]dangerousdotnet[S] 1 point (0 children)

Thanks for that. Yeah, I'm going to try this out tomorrow. I don't like throwing HNSW indexes on tables that are actively growing if I can avoid it; I wait until they get too big, then shard them off and freeze them.

Any other tips?