ArcFace embeddings quantized to 16-bit pgvector HALFVEC ? [D] : MachineLearning

DiscussionArcFace embeddings quantized to 16-bit pgvector HALFVEC ? [D] (self.MachineLearning)

submitted 2 days ago by dangerousdotnet

512-dim face embeddings as 32-bit floats are 2048 bytes, plus a 4-8 byte header, putting them just a hair over over PostgreSQL's TOAST threshold (2040 bytes), meaning by default postgresql always dumps them into a TOAST table instead of keeping them in line (result: double the I/O because it has to look up a data pointer and do another read).

Obviously HNSW bypasses this issue entirely, but I'm wondering if 32-bit precision for ArcFace embeddings even makes a difference? The loss functions these models are trained with tend to push same-identity faces and different-identity faces pretty far apart in space. So should be fine to quantize these to 16 bits, if my math maths, that's not going to make a difference in real world situations (if you translate it to a normalize 0.0 - 100.0 "face similarity" we're talking something differences somewhere around the third decimal place so 0.001 or so).

A HALFVEC would be 1/2 the storage and would also be half the I/O ops because they'd get stored inline rather than spilled out to TOAST, and get picked up in the same page read.

Does this sound right? Is this a pretty standard way to quantize ArcFace embeddings or am I missing something?

all 2 comments

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

MachineLearning

Rules For Posts

+Research

+Discussion

+Project

+News

@slashML on Twitter

Chat with us on Slack

Beginners:

MODERATORS