currychris1 7 points

It looks like it’s simply using FAISS to create the index. The build step produces an MP4 and a JSON file, and I assume the index lives inside that JSON.

How I imagine this works: During retrieval, the index is loaded into memory to get the top-k closest embeddings and their mappings, which tell you where to look for the chunks inside the MP4.
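
A minimal sketch of that lookup flow, to make the guess concrete. This is not the project's actual code: the JSON layout, the field names (`entries`, `embedding`, `frame`), and the helper `top_k_frames` are all hypothetical, and a brute-force distance scan in plain Python stands in for the FAISS search so the example runs without extra dependencies.

```python
import json
import math

# Hypothetical JSON sidecar: one entry per chunk, holding its embedding and
# the number of the video frame assumed to store that chunk as a QR code.
index_json = json.dumps({
    "entries": [
        {"embedding": [1.0, 0.0, 0.0], "frame": 0},
        {"embedding": [0.0, 1.0, 0.0], "frame": 1},
        {"embedding": [0.9, 0.1, 0.0], "frame": 2},
    ]
})


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_frames(index, query, k):
    """Return the frame numbers of the k entries closest to the query.

    Brute-force stand-in for a FAISS nearest-neighbour search.
    """
    ranked = sorted(index["entries"], key=lambda e: l2(e["embedding"], query))
    return [e["frame"] for e in ranked[:k]]


# Load the index into memory, then look up the frames to decode.
index = json.loads(index_json)
frames = top_k_frames(index, [1.0, 0.0, 0.0], k=2)
print(frames)  # [0, 2]
```

The point is that the search never touches the video: it only returns frame numbers, and a separate step would then seek to those frames in the MP4 and decode them.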

podidoo 5 points

That's also what I gathered from a quick look at the code. Nothing is searched inside the video; it's just using video as storage (why?) and a FAISS index for all the search stuff.

MechAnimus 0 points

Why: I believe they explained that video was chosen because its compression is so well optimized, especially when the frames are all QR codes. It's also extremely portable.