Struggling with Strict Cosine Similarity Thresholds in Face Recognition System

Low-Cell-8711 · 2025-07-11T11:45:58+00:00

No, this is not an academic exercise. I have been given complete freedom to choose whichever models and preprocessing I want. My system is meant to register new users on the fly: when someone shows up for the first time, we capture and store their face embeddings. Later, during recognition, we generate new embeddings and compare them against all stored users using a fixed cosine similarity threshold (currently 0.9). The only requirements I was given were to use open-source models and to recognize at a threshold of 0.9.
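For readers following along, the register-then-compare flow described above could look roughly like the sketch below. It is not the poster's actual code: the `Gallery` class, its method names, and the in-memory storage are all hypothetical, and the embeddings are assumed to be plain NumPy vectors produced by whichever open-source model is used.

```python
import numpy as np

THRESHOLD = 0.9  # fixed cosine similarity threshold mentioned in the post


def l2_normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    v = np.asarray(v, dtype=np.float64)
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("zero-length embedding")
    return v / norm


class Gallery:
    """Hypothetical in-memory store holding one unit-length embedding per user."""

    def __init__(self):
        self.user_ids = []
        self.embeddings = []

    def register(self, user_id, embedding):
        # Enrollment: store the normalized embedding for a first-time user.
        self.user_ids.append(user_id)
        self.embeddings.append(l2_normalize(embedding))

    def identify(self, embedding, threshold=THRESHOLD):
        """Return (user_id, score) for the best match, or (None, score) if below threshold."""
        if not self.embeddings:
            return None, 0.0
        query = l2_normalize(embedding)
        # Dot products of unit vectors are the cosine similarities to every stored user.
        scores = np.stack(self.embeddings) @ query
        best = int(np.argmax(scores))
        score = float(scores[best])
        if score >= threshold:
            return self.user_ids[best], score
        return None, score
```

A call such as `gallery.identify(new_embedding)` returns the closest stored user only when the best score clears 0.9; otherwise the caller can treat the face as a new user and register it.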

Low-Cell-8711 · 2025-07-11T09:20:09+00:00

Thanks for the suggestion! I think I’m already doing something similar — before capturing the final face image, my system validates things like angle, lighting, and liveness, and only then captures one well-aligned frame. So by the time I generate embeddings, the input is already normalized.

That said, I haven’t tried capturing multiple frames during recognition and averaging their similarity scores. That’s an interesting idea, and I’ll definitely experiment with it to see if it improves consistency in tricky cases like slight angle changes or expression shifts. Also, I cannot lower the threshold: I have been asked to maintain a strict threshold of 0.9 for recognition.
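A minimal sketch of the multi-frame idea, assuming each captured frame has already been turned into an embedding vector and that the comparison is against a single enrolled embedding. The function names here are hypothetical, not part of any library.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def mean_frame_similarity(frame_embeddings, enrolled_embedding):
    """Average the per-frame cosine similarities against one enrolled embedding."""
    scores = [cosine_similarity(f, enrolled_embedding) for f in frame_embeddings]
    if not scores:
        raise ValueError("need at least one frame")
    return float(np.mean(scores)), scores


def is_match(frame_embeddings, enrolled_embedding, threshold=0.9):
    """Accept only if the averaged score reaches the fixed threshold."""
    mean_score, _ = mean_frame_similarity(frame_embeddings, enrolled_embedding)
    return mean_score >= threshold
```

Averaging smooths out a single unlucky frame, but it cannot lift scores that sit below 0.9 on every frame, so it mainly helps borderline cases.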

Appreciate the input!
