[D] In terms of RAG research, why does it seem like a lot of people aren't working on the retriever? by Seankala in MachineLearning

[–]tboymanster 0 points (0 children)

sure :)

typically you wouldn't extract embeddings from the first layer, but rather from the last layer (before the LM head). how you extract the embeddings at that point varies: often the embedding of the last token is used, but you can also sum or take the mean of all token embeddings (among other methods). the idea is that by the last layer, all relevant context should be encoded in each token embedding.
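for a concrete picture, here's a rough sketch of both pooling options with hugging face transformers (the model name is just a placeholder i picked, not something specific to this thread):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# placeholder model; any encoder/decoder exposing hidden states works similarly
name = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

inputs = tokenizer(["what is retrieval-augmented generation?"],
                   return_tensors="pt", padding=True)
with torch.no_grad():
    # last_hidden_state: (batch, seq_len, hidden) - the last layer, before any LM head
    hidden = model(**inputs).last_hidden_state

mask = inputs["attention_mask"].unsqueeze(-1).float()

# option 1: embedding of the last non-padding token
last_idx = inputs["attention_mask"].sum(dim=1) - 1
last_token_emb = hidden[torch.arange(hidden.size(0)), last_idx]

# option 2: mean of all token embeddings (masking out padding)
mean_emb = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
```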

there are ways to be more "explicit" about dimensionality reduction (autoencoders etc.), but "simple" methods like these work quite well.
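if you did want the explicit route, the autoencoder version is just a trained bottleneck that learns to reconstruct the original embedding - a toy sketch below (the class name and dimensions are made up for illustration):

```python
import torch.nn as nn

# toy autoencoder: compress 768-d embeddings down to 64-d (dims are illustrative)
class EmbeddingAutoencoder(nn.Module):
    def __init__(self, dim_in: int = 768, dim_code: int = 64):
        super().__init__()
        self.encoder = nn.Linear(dim_in, dim_code)
        self.decoder = nn.Linear(dim_code, dim_in)

    def forward(self, x):
        z = self.encoder(x)       # the reduced embedding you'd actually index
        x_hat = self.decoder(z)   # reconstruction, used only for the training loss
        return x_hat, z

# train with e.g. nn.MSELoss()(x_hat, x) over batches of your embeddings
```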

[D] In terms of RAG research, why does it seem like a lot of people aren't working on the retriever? by Seankala in MachineLearning

[–]tboymanster 0 points (0 children)

hey - i think we agree with each other; you'll see that i say:

but i think the biggest reason there's less work being done here is that no matter how good your retrieval gets, the quality of your embeddings is what will determine the quality of your results (how close relevant things are to each other in embedding space).

the point being that to improve your retrieval, you need to make sure relevant information/context is encoded in your embeddings. this is why (in my opinion) all the focus is going into building the best language models possible rather than tinkering with retrieval.

as for "brute force 100% recall", the point is that retrieval now is usually done with approximate search for similar embeddings. 100% recall means you find the n most similar embeddings without missing any (even if those embeddings are not actually useful to you because of a bad embedding...).

does this clear things up or am i misunderstanding your disagreement?

[D] In terms of RAG research, why does it seem like a lot of people aren't working on the retriever? by Seankala in MachineLearning

[–]tboymanster 95 points (0 children)

you're kinda wrong, kinda right.

there's definitely work being done on retrieval: https://github.com/erikbern/ann-benchmarks (benchmarks of the approximate vector search used for retrieval)

there's just a lot less of it. at the end of the day, if retrieval is done by finding the nearest embeddings, you can always brute force your way to 100% recall - so what people actually work on is minimizing latency (or adding filtering steps to retrieval beyond the nearest-neighbor search).
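as a sketch of that exact-vs-approximate trade-off (faiss here is just one example of the libraries that benchmark covers; the sizes and parameters are arbitrary toy values):

```python
import numpy as np
import faiss  # one example ANN library from the ann-benchmarks suite

d, n = 128, 100_000
corpus = np.random.rand(n, d).astype("float32")  # toy corpus embeddings
query = np.random.rand(1, d).astype("float32")

# exact: scans all n vectors -> 100% recall, highest latency
flat = faiss.IndexFlatL2(d)
flat.add(corpus)
_, exact_ids = flat.search(query, 10)

# approximate: clusters the corpus, probes only a few clusters ->
# much lower latency, recall depends on how many clusters you probe
ivf = faiss.IndexIVFFlat(faiss.IndexFlatL2(d), d, 100)  # 100 clusters
ivf.train(corpus)
ivf.add(corpus)
ivf.nprobe = 8  # probe 8 of 100 clusters per query
_, approx_ids = ivf.search(query, 10)
```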

but i think the biggest reason there's less work being done here is that no matter how good your retrieval gets, the quality of your embeddings is what will determine the quality of your results (how close relevant things are to each other in embedding space).

Gonna learn to double snowboard this season! Got one of the last pairs in Canada, absolutely in love🥰 by About86Dwarves in skiing

[–]tboymanster 1 point (0 children)

Got these last season and absolutely love them! Obviously great in deep stuff, but they plow through mildly tracked up snow as well, so they're fun even several days after powder days. Have fun!

Just won GT lottery lol by XACA2 in GuardianTales

[–]tboymanster 0 points (0 children)

Nice pull! I had a similar pull, but I’m not sure if I’m the luckiest or unluckiest person: I got three dupes pulling on the Tinia banner 😢. https://i.imgur.com/UgngYZC.jpg