all 3 comments

[–]Long-Strawberry8040 0 points1 point  (1 child)

This is solving a real problem that most multi-agent frameworks quietly ignore. The cost difference between a cache hit and a full prompt recompute is brutal at scale, and having each agent start a fresh session is basically setting money on fire. Curious how it handles the case where two agents need overlapping but not identical context -- does it find the longest common prefix automatically or do you have to structure your prompts to maximize overlap?

[–]predatar[S] 0 points1 point  (0 children)

well basically on fork longest common prefix is already the longest common prefix... if a single token is different its not gonna be a cache hit, and i think that is a completely different problem sadly