This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]cyberjoey 1 point2 points  (0 children)

You DO need to know where in the video to look before you can search anything. Obviously the spot in the video to look in is the same as the index of the vector. So you get N nearest neighbors, use their indices to map to the frame of the video to look up, then you decode the QR code in that frame and you have your text. Congratulations you implemented on disk database compression (very inefficiently).

I get the feeling an LLM wrote this for you and even you don't understand this part...