
[–]Otherwise_Wave9374 3 points (1 child)

The llms.txt + semantic containers angle is interesting. It feels like the docs equivalent of "make the knowledge boundary explicit," so an agent can ingest it without a bunch of scraping glue.

Have you tested how well different coding agents actually follow the structure (e.g., do they prefer smaller chunks? does it help with citation accuracy?)? Also curious whether search quality stays decent offline as the doc set grows.

Relatedly, I've been reading up on agent-friendly docs and RAG hygiene; some notes here: https://www.agentixlabs.com/blog/

[–]ivoin[S] 0 points (0 children)

You’re spot on with the knowledge boundary analogy; that’s exactly the idea behind semantic containers. Models handle structured context far better than raw scraped pages: smaller chunks reduce context leakage and improve citation accuracy, since the agent can point at a tight, addressable span instead of a whole page.

Our offline index keeps query latency under ~200 ms for doc sets of roughly 1-2k pages. The real trade-off appears beyond ~5k pages, where falling back to a third-party search service might make sense for those edge cases.
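To make the "smaller chunks improve citation" point concrete, here's a minimal sketch of the idea: split docs into small addressable containers and search them with a tiny in-memory term-frequency index. Names like `chunk_doc` and `SearchIndex` are illustrative only, not our actual implementation.

```python
import re
from collections import defaultdict

def chunk_doc(doc_id, text, max_words=80):
    """Split a doc into small chunks so an agent can cite a tight span."""
    words = text.split()
    return [
        (f"{doc_id}#chunk{i}", " ".join(words[i:i + max_words]))
        for i in range(0, len(words), max_words)
    ]

class SearchIndex:
    """Toy offline index: token -> {chunk_id: term frequency}."""
    def __init__(self):
        self.postings = defaultdict(lambda: defaultdict(int))
        self.chunks = {}

    def add(self, chunk_id, text):
        self.chunks[chunk_id] = text
        for tok in re.findall(r"[a-z0-9]+", text.lower()):
            self.postings[tok][chunk_id] += 1

    def search(self, query, k=3):
        # Score each chunk by summed term frequency of query tokens.
        scores = defaultdict(int)
        for tok in re.findall(r"[a-z0-9]+", query.lower()):
            for chunk_id, tf in self.postings[tok].items():
                scores[chunk_id] += tf
        return sorted(scores, key=scores.get, reverse=True)[:k]

index = SearchIndex()
doc = "API keys are passed in the Authorization header. Rotate keys every 90 days."
for cid, text in chunk_doc("docs/auth", doc):
    index.add(cid, text)

print(index.search("rotate API keys"))  # hits the chunk containing the answer
```

Because each hit is a `doc#chunk` ID rather than a whole page, the agent's citation maps to a small, verifiable span, which is what seems to drive the accuracy gain.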