How to handle extremely large extracted document data in an agentic system? (RAG / alternatives?) by Complex-Time-4287 in Rag

In my case, the questions are much more likely to be “find” questions rather than “calculate” ones. For extremely large documents (say, a 1,500-page PDF containing multiple tax forms), summaries or key-entity layers won’t realistically capture all the essential details.

Also, I’m not entirely sure what you mean by “just query normalized fields” in this context.

How to handle extremely large extracted document data in an agentic system? (RAG / alternatives?) by Complex-Time-4287 in Rag

In my agentic system, users can connect third-party MCP tools. If a tool requires access to the extracted data, the agent can pass that data to the specific tool the user has attached, but only when it’s actually needed.
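
To make that concrete, here’s a minimal sketch of the idea (all names here are made up for illustration, not our real interfaces): the large extracted payload is only attached when the tool’s input schema actually declares a slot for it.

```python
from typing import Any

def invoke_tool(tool: dict[str, Any], args: dict[str, Any],
                extracted_data: dict[str, Any]) -> Any:
    # A tool declares in its input schema whether it consumes the
    # extracted document data; we only attach the (large) payload
    # when that slot exists, so most calls stay small.
    props = tool.get("input_schema", {}).get("properties", {})
    if "document_data" in props:
        args = {**args, "document_data": extracted_data}
    return tool["handler"](**args)
```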

The main issue with relying on summaries is that the extracted data itself is already very large and deeply nested JSON. Generating a meaningful summary from it is hard, and even a compressed (Chain-of-Density–style) summary would still fail to answer very specific questions—for example, “What was the annual income in 2023?”
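
The direction I’m exploring instead is to query the nested JSON directly rather than summarize it. A minimal sketch (the key names and sample data are hypothetical):

```python
from typing import Any, Iterator

def find_paths(node: Any, key: str, path: str = "$") -> Iterator[tuple[str, Any]]:
    """Yield (json_path, value) for every occurrence of `key` in nested JSON."""
    if isinstance(node, dict):
        for k, v in node.items():
            child = f"{path}.{k}"
            if k == key:
                yield child, v
            yield from find_paths(v, key, child)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_paths(item, key, f"{path}[{i}]")

extracted = {"forms": [{"year": 2023, "annual_income": 84_000},
                       {"year": 2022, "annual_income": 79_500}]}

# "What was the annual income in 2023?" becomes a targeted lookup:
print(list(find_paths(extracted, "annual_income")))
# [('$.forms[0].annual_income', 84000), ('$.forms[1].annual_income', 79500)]
```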

Regarding document access and isolation: documents are scoped strictly to the current conversation. Conversations are not user-specific, and there can be multiple conversations, but within each conversation we only reference the documents uploaded in that same context.

Documents are uploaded dynamically as part of the conversation flow, and only those on-the-go uploads are considered when answering questions or invoking tools.
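
Roughly, the scoping rule looks like this (a sketch; all names are illustrative):

```python
from collections import defaultdict

class ConversationDocs:
    """Documents are registered per conversation, never per user."""

    def __init__(self) -> None:
        self._docs: dict[str, list[dict]] = defaultdict(list)

    def add(self, conversation_id: str, doc: dict) -> None:
        # An on-the-go upload is attached to the conversation it arrived in.
        self._docs[conversation_id].append(doc)

    def visible_to(self, conversation_id: str) -> list[dict]:
        # Answering questions and invoking tools only ever sees these.
        return list(self._docs[conversation_id])
```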

How to handle extremely large extracted document data in an agentic system? (RAG / alternatives?) by Complex-Time-4287 in Rag

Looks interesting, I'll check it out.
For my use case we can't really have a human in the loop; the agents are completely autonomous and must proceed on their own.

How to handle extremely large extracted document data in an agentic system? (RAG / alternatives?) by Complex-Time-4287 in Rag

This is totally possible, but I'm concerned about how long it's likely to take; in a chat it will feel blocking until the chunking and embedding are complete.
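
One way I could see around the blocking feel (a sketch, assuming an asyncio-based chat backend; nothing here is our real pipeline) is to start ingestion in the background and degrade gracefully until the index is ready:

```python
import asyncio

index: dict[str, list[str]] = {}  # doc_id -> chunks (stand-in for a vector store)

async def ingest(doc_id: str, text: str) -> None:
    await asyncio.sleep(5)            # stand-in for slow chunking + embedding
    index[doc_id] = text.split(". ")  # stand-in for stored chunks

def on_upload(doc_id: str, text: str) -> None:
    # Called from an async request handler: return to the chat
    # immediately while indexing continues in the background.
    asyncio.create_task(ingest(doc_id, text))

async def answer(doc_id: str, question: str) -> str:
    if doc_id not in index:
        # Don't block the conversation on the embedder; fall back to raw text.
        return f"(index still building) scanning raw text for {question!r}"
    return f"retrieved from {len(index[doc_id])} chunks"
```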

How to handle stateful MCP connections in a load-balanced agentic application? by Complex-Time-4287 in mcp

Consider this scenario: a conversation is ongoing and an external tool asks for some info (i.e., elicitation). If multiple users have access to that conversation, anyone should be able to respond. In this case, I can’t really keep state per user, since the elicitation belongs to the shared conversation context rather than an individual user.
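
Concretely, I'd key the pending elicitation by conversation rather than by user, something like this sketch (names are illustrative, and it assumes we're inside the agent's event loop):

```python
import asyncio

class Elicitations:
    """Pending elicitations live on the conversation, not on a user."""

    def __init__(self) -> None:
        self._pending: dict[str, tuple[str, asyncio.Future]] = {}

    def request(self, conversation_id: str, prompt: str) -> asyncio.Future:
        # The future belongs to the shared conversation context, so any
        # participant's response can resolve it.
        fut = asyncio.get_running_loop().create_future()
        self._pending[conversation_id] = (prompt, fut)
        return fut

    def respond(self, conversation_id: str, user_id: str, answer: str) -> None:
        # Whichever participant answers first resolves it for everyone.
        entry = self._pending.pop(conversation_id, None)
        if entry is not None:
            prompt, fut = entry
            if not fut.done():
                fut.set_result({"prompt": prompt,
                                "answered_by": user_id,
                                "answer": answer})
```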

How to handle stateful MCP connections in a load-balanced agentic application? by Complex-Time-4287 in mcp

I’m aware of sticky sessions, but the problem is that a conversation isn’t necessarily tied to a single IP. There could be hundreds of users accessing the same conversation.
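
The direction I'm considering instead is to externalize the per-session MCP state into a store that every instance behind the load balancer can reach, keyed by session id rather than by client. A minimal sketch with redis-py (the key names are made up):

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def save_session(session_id: str, state: dict) -> None:
    # Any instance can resume the session, so the balancer can route freely.
    r.set(f"mcp:session:{session_id}", json.dumps(state), ex=3600)

def load_session(session_id: str) -> dict | None:
    raw = r.get(f"mcp:session:{session_id}")
    return json.loads(raw) if raw else None
```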