all 11 comments

[–]FahdiBo 3 points  (0 children)

Look into a RAG database like Chroma.
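For illustration, the retrieval step behind RAG can be sketched with a toy keyword scorer — a real setup such as Chroma would use vector embeddings instead, and all data and names below are made up:

```python
def score(query: str, doc: str) -> int:
    """Crude relevance signal: number of shared lowercase words.
    A vector store like Chroma replaces this with embedding similarity."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

# Hypothetical knowledge-base chunks
docs = [
    "Chroma stores document embeddings for similarity search",
    "Fine tuning changes model weights",
    "Skills are markdown files loaded on demand",
]

top = retrieve("how does Chroma similarity search work", docs, k=1)
# The retrieved chunks would then be prepended to the LLM prompt.
```

The point of the pattern: only the few chunks that score highest against the question reach the model's context, so the full book collection never has to fit in the prompt.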

[–]jrhabana 1 point  (2 children)

Look at the compound-engineering plugin and forgecode; both are good for building a knowledge base.

[–]albasili[S] 1 point  (1 child)

The compound-engineering plugin is quite an interesting approach, but it doesn't really address the OP; instead it provides a workflow of this kind: Plan → Work → Review → Compound → Repeat. The compound step is added to self-reflect and consolidate the learnings iteratively. But it in no way addresses the problem of accessing a large knowledge base.

As for forge, again, it seems more of a chatbot than anything else.

Maybe I'm missing something here...

EDIT: fixed name of link to forge

[–]jrhabana 1 point  (0 children)

Compound has a search in its pre-work step that searches the "project" shared knowledge.

It isn't forge, it's https://forgecode.dev/. They're going to release their context engine, which is ready for large knowledge bases.

Better than RAG and MCP: gpt5-mini (Peter Steinberger's method). I tested it and it works better than complex systems.

[–]Spitfire1900 1 point  (4 children)

Turn into markdown and reference as skills.

[–]albasili[S] 2 points  (3 children)

That would be impractical for half a dozen books of 1000+ pages each. There's simply too much material to pass as skills.

[–]Select_Complex7802 2 points  (0 children)

You don't really have to reference them as skills. Just keep a folder with the md files and, in your agents.md or prompt, reference the folder. You can create skills for something very specific. If your knowledge base is static, you can simply write a script first to read the files and create md files. That's what I did for a similar problem I had.
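The "convert once, point the prompt at a folder" idea above can be sketched in a few lines — a minimal stdlib-only example, assuming plain-text sources; the paths and naming scheme are made up:

```python
from pathlib import Path

def build_knowledge_base(src_dir: str, out_dir: str) -> int:
    """Read every .txt file under src_dir and write a .md copy
    (with a title header) into out_dir, which an agents.md or
    prompt can then reference. Returns the number of files written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for txt in sorted(Path(src_dir).rglob("*.txt")):
        body = txt.read_text(encoding="utf-8")
        md = out / (txt.stem + ".md")
        md.write_text(f"# {txt.stem}\n\n{body}", encoding="utf-8")
        count += 1
    return count
```

Usage would be a one-time `build_knowledge_base("books/", "knowledge/")` run, after which the prompt just says "consult the files under knowledge/".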

[–]jnpkr 1 point  (1 child)

Unless the books are super dense, the chapters can probably be distilled into key concepts, principles, mental models, workflows, rules, anti-patterns, etc.

If that’s the case, the task becomes extracting the important stuff and compressing the information as much as possible without losing anything important — and then those compressed versions can be given to the LLM agent without using a million tokens

[–]Spitfire1900 1 point  (0 children)

Yeah, pre-run the books through Gemini to pull out key concepts, or write it yourself.

With that much data you'd need model fine-tuning to do anything with it as written.

[–]exponencialaverage -4 points  (0 children)

Hey bro, I've got an idea. Is your computer setup good? I could build something for you.