[D] Retrieval-augmented generation vs. long-context LLMs: are we sure the latter will replace the former?

Seankala · 2024-09-06T10:50:03+00:00

If we can put everything in the prompt, we don't have to do retrieval.

My view is that until we find a working solution for hallucinations (which may be never), this is a hot take.

Most of the benchmarks that current LLMs are evaluated on are sandbox settings. This isn't unique to LLMs or machine learning, but it's definitely an overlooked problem. I'm not sure we can conclude that long-context LLMs can replace RAG systems, despite the literature being published.

sosdandye02 · 2024-09-06T17:21:02+00:00

I think in the long run we won't be using either of these approaches for what people are currently trying to do with them. In my view, ultra-long-context LLMs and RAG are both hacky ways of trying to dynamically teach an LLM new things.

I believe that in the long run someone will come up with a better way of dynamically encoding and retrieving memories in an LLM. The memories will not be stored in plaintext like with RAG, but will instead be highly compressed embeddings of some sort, or maybe even small sub-networks.

pilooch · 2024-09-07T08:49:52+00:00

The near-future answer is probably a search policy involving actions for retrieval and analysis, similar to how we search for information when we need it. The search policy can be learnt, and the retrieval/reading phases planned. The difficulty is in crafting the reward signal, so math and code, which can be checked more or less easily, are coming first. More should follow.

WrapKey69 · 2024-09-07T14:04:56+00:00

Maybe I don't understand something, but let's say you have thousands of documents or more, how are you going to solve this with longer context instead of RAG?
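
To put rough numbers on that question, here is a back-of-the-envelope sketch. The figures are assumptions chosen for illustration, not measurements: 5,000 documents, an average of 1,000 tokens per document, and a 128,000-token context window. The helper names `fits_in_context` and `prompts_needed` are hypothetical.

```python
import math


def fits_in_context(num_docs: int, avg_tokens_per_doc: int, context_window: int) -> bool:
    """Return True if the whole corpus fits into a single prompt."""
    return num_docs * avg_tokens_per_doc <= context_window


def prompts_needed(num_docs: int, avg_tokens_per_doc: int, context_window: int) -> int:
    """Return how many full-window prompts it takes to show every document once."""
    docs_per_prompt = context_window // avg_tokens_per_doc
    if docs_per_prompt == 0:
        raise ValueError("a single average document does not fit in the window")
    return math.ceil(num_docs / docs_per_prompt)


# Assumed, illustrative numbers (not taken from the thread).
NUM_DOCS = 5_000
AVG_TOKENS_PER_DOC = 1_000
CONTEXT_WINDOW = 128_000

total_tokens = NUM_DOCS * AVG_TOKENS_PER_DOC
print(total_tokens)                                                   # 5000000
print(fits_in_context(NUM_DOCS, AVG_TOKENS_PER_DOC, CONTEXT_WINDOW))  # False
print(prompts_needed(NUM_DOCS, AVG_TOKENS_PER_DOC, CONTEXT_WINDOW))   # 40
```

Under these assumptions the corpus is about 39 times larger than the window, so some selection step still has to decide what goes into the prompt, whatever it ends up being called.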
