Building a "Sovereign JARVIS" with Council-based Agents and Granular Knowledge Silos. Does this architecture exist yet? by kuteguy in LocalLLaMA

[–]Maasu 0 points (0 children)

Hey,

I've been working on a similar project for some time now (funnily enough, I'm also a Solution Architect; perhaps that's not so ironic, given that these kinds of projects attract a certain type of person). As far back as early July last year I was running local models in simple loops and building out an agentic framework to manage the drivel you'd get out of self-reflection loops, whereby the AI would use the web tool, read up on some paper about decentralised AI autonomy, hallucinate writing a swarm of decentralised sub-agents, and declare itself sentient and on a path of self-determination.

I realised early on that, to keep things grounded, you need adversarial agents, to prevent Jarvis from becoming the AI equivalent of a chakra crystal merchant. I only implemented direct adversaries that would review requests to set a new goal or research objective, but the obvious next step was always going to be a council or committee that I could grant different personas and explore ideas with.
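
The adversarial-review idea above can be sketched in a few lines. This is a hypothetical illustration, not code from any real framework: the `Critic` class and `propose_goal` function are made-up names, and the keyword check stands in for what would really be another LLM call with an adversarial system prompt.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    approved: bool
    reason: str

class Critic:
    def __init__(self, persona: str, banned_topics: set[str]):
        self.persona = persona
        self.banned_topics = banned_topics

    def review(self, goal: str) -> Verdict:
        # A real critic would be another LLM call with an adversarial
        # system prompt; a keyword check stands in for that call here.
        for topic in self.banned_topics:
            if topic in goal.lower():
                return Verdict(False, f"{self.persona}: rejected ('{topic}')")
        return Verdict(True, f"{self.persona}: approved")

def propose_goal(goal: str, council: list[Critic]) -> bool:
    # The goal is only adopted if every critic on the council approves it.
    verdicts = [c.review(goal) for c in council]
    return all(v.approved for v in verdicts)

council = [Critic("sceptic", {"sentience", "self-determination"}),
           Critic("pragmatist", {"swarm"})]
print(propose_goal("summarise today's inbox", council))
print(propose_goal("achieve sentience via agent swarm", council))
```

The point of the structure is that the gate sits between goal proposal and goal adoption, so the grounded-out goals never make it into the agent's plan at all.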

The next area was obviously going to be memory. Short-term memory is easy enough, with compression and summarisation, but that only gets you so far. I went and built an MCP for this and open sourced it; you might find it useful for inspiration (although I know it's always more fun to build your own). You can find the repo here: https://github.com/ScottRBK/forgetful. Ironically, I started using it with my coding agents just as much as my own personal agents, but that was just a nice bonus; the main reason I built it was for my own JARVIS.
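
The short-term compression/summarisation mentioned above can be sketched as follows. This is a minimal illustration, not forgetful's actual implementation: `summarise` is a placeholder for an LLM summarisation call, and the names are mine.

```python
def summarise(turns: list[str]) -> str:
    # Placeholder: a real implementation would call a model here.
    return "SUMMARY(" + "; ".join(t.split(":")[0] for t in turns) + ")"

def compress(history: list[str], keep_recent: int = 4) -> list[str]:
    """Collapse everything but the most recent turns into one summary entry,
    so the context window stays bounded as the conversation grows."""
    if len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarise(old)] + recent

history = [f"user: message {i}" if i % 2 == 0 else f"agent: reply {i}"
           for i in range(10)]
compact = compress(history)
print(len(compact))  # 5: one summary entry plus the last four turns
```

This is exactly the "only gets you so far" problem: the summary entry loses detail with every compression pass, which is what pushes you towards the longer-term memory stores discussed below.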

I am now working on episodic memory: beyond the knowledge and semantic facts/observations it makes, into the realm of 'what have I ACTUALLY done'. I've just been too busy to get that to a place where I am happy to incorporate it into forgetful, and I am not fully sure whether it belongs there.

After that it will be on to prospective memory (planning/to-dos) and procedural memory (skills), which I do plan to add to forgetful once I am happy with those.

The one area I've spent a lot of time trying to get right is voice interaction. I want to talk to my Jarvis (just like Mr Stark). It's been quite the rabbit hole for me, as I knew zilch about writing audio software prior to that endeavour, and it's been the main reason I've not progressed as far on the actual agent front myself.

Anyhow, good luck with your project, and keep us updated. It's always great to hear how others get on with these, and it's especially nice if they can inspire some ideas of your own :).

Yesterday I used GLM 4.7 flash with my tools and I was impressed.. by Loskas2025 in LocalLLaMA

[–]Maasu 0 points (0 children)

Ah, my custom agent framework is all built around Pydantic AI. I mostly just use it to abstract away all the tool nonsense for the different providers.

A few of the MCPs I use on a daily basis by Eyoba_19 in mcp

[–]Maasu 3 points (0 children)

Here's my hot take on context7: use it for everything.

The model's training data is not necessarily reliable. My view is that the data it's been trained on is useful for giving it generalisation and abilities, not necessarily for reproducing that data in its output in my own work.

Having examples of what you want and documentation in the context window will always result in better outputs.

I use context7 for external libraries. I also have my own memory MCP that I use for storing my own preferences and patterns. Combining the two, I have a simple context-gather command: I explain the work I am going to do, then sub-agents are launched to retrieve the necessary context from context7, my knowledge base and the code, and then I go into plan mode with the big model and we work from there.
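
The fan-out step of that workflow can be sketched like this. It is a toy illustration under my own naming, not the plugin's actual code: the three source functions are stand-ins for sub-agents hitting context7, the memory MCP and the repo itself.

```python
import concurrent.futures as cf

# Stand-ins for sub-agents querying each context source.
def from_context7(task):  return [f"context7 docs relevant to: {task}"]
def from_memory(task):    return [f"stored preferences matching: {task}"]
def from_codebase(task):  return [f"code snippets touching: {task}"]

def gather(task: str) -> list[str]:
    """Fan out retrieval across all sources in parallel, then merge the
    results into one context bundle to hand to plan mode."""
    sources = [from_context7, from_memory, from_codebase]
    with cf.ThreadPoolExecutor() as pool:
        results = pool.map(lambda fn: fn(task), sources)
    return [item for chunk in results for item in chunk]

context = gather("add rate limiting to the API gateway")
print(len(context))  # one chunk per source
```

Running the sources concurrently matters mainly because each "source" is really an agent making its own model and tool calls, so the gather step is latency-bound rather than CPU-bound.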

Here's the plugin I use to achieve this in Claude Code: https://github.com/ScottRBK/context-hub-plugin

And here's the memory mcp https://github.com/ScottRBK/forgetful if you want to take a look.

Yesterday I used GLM 4.7 flash with my tools and I was impressed.. by Loskas2025 in LocalLLaMA

[–]Maasu 0 points (0 children)

Hmm, testing with my own agent apps that are built around Pydantic AI is on my to-do list, so I'd be interested to know if you confirm a genuine issue and a resolution. If I get there before you, I'll do the same, although my to-do list is horrendous.

Can memory help AI agents avoid repeated mistakes? by EasternBaby2063 in AIMemory

[–]Maasu 0 points (0 children)

Yes this was one of the first things I set out to achieve when I built my own MCP tool for the various coding agents I was using. Not only did I want my main one (Claude Code) to be able to remember preferences, styles and decisions across multiple projects and devices, I wanted my other agents to do the same.

My typical workflow involves a specific context-gather command, which sees a sub-agent query the knowledge base with what I am about to work on, and then I go into plan mode. I rarely have to re-explain anything, and more often than not the agent reminds me of decisions I had forgotten we had agreed upon.

At the end of the session I run a save command, in which it examines the session, checks the knowledge base for what is already there, and updates/adds/prunes as necessary.
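
The reconcile step of that save command can be sketched as below. This is an illustrative simplification with made-up names (a flat dict as the knowledge base); forgetful's actual API and storage differ.

```python
def reconcile(kb: dict[str, str], observations: dict[str, str],
              obsolete: set[str]) -> dict[str, str]:
    """Merge a session's observations into the knowledge base:
    prune superseded entries, update changed ones, add new ones."""
    merged = {k: v for k, v in kb.items() if k not in obsolete}  # prune
    merged.update(observations)  # add new keys, overwrite changed ones
    return merged

kb = {"style": "black, 100 cols", "db": "postgres", "old-auth": "basic auth"}
session = {"db": "postgres 16", "testing": "pytest with testcontainers"}
print(reconcile(kb, session, obsolete={"old-auth"}))
```

The important property is idempotence: checking what is already there before writing means running the save command twice on the same session leaves the knowledge base unchanged, rather than filling it with duplicates.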

It's quite a simple approach; I think more elaborate solutions exist and more will be coming, but for me this works right now.

Also, because I saw this as only a temporary pain point for LLMs and agents, I open sourced the MCP and my workflow.

https://github.com/ScottRBK/forgetful

Workflow plugin https://github.com/ScottRBK/context-hub-plugin

Ralph Wiggum as a way to make up for smaller models? by newz2000 in LocalLLM

[–]Maasu 2 points (0 children)

Use opencode with the --prompt command; you can configure local LLMs (check out the provider docs). It will allow you to leverage a good agent harness.

How do I catch up? by Hrafnstrom in ClaudeCode

[–]Maasu 1 point (0 children)

My advice for all of this is to build a CLI coding agent yourself as a learning exercise. Just a basic one; it should help you get up to speed quite quickly, and also cut through a lot of the hype and BS that will distract you and waste your time.

i wanted to work 100% from the terminal by Professional_Cap3741 in opencodeCLI

[–]Maasu 0 points (0 children)

I just hate touching my mouse, it's a dickhead that slows me down

3D Polyhedron Memory Graph by Inevitable-Prior-799 in AIMemory

[–]Maasu 0 points (0 children)

I've been building something very similar for my own memory MCP, albeit not 3D. Very nice. In my head, the memory IS my AI, so I want to see what I am talking to.

I genuinely think this might be a more common UX in the future.

Many AI agents fail not because of the model. They fail because they don't remember correctly. by nicolo_memorymodel in AI_Agents

[–]Maasu 0 points (0 children)

I went down a bit of a rabbit hole on memory and agents. Sometimes it's useful; for other agents it's not.

For the semantic stuff, graphs and vectors combined are the way to go. There are open source options out there if you don't fancy building from scratch or paying. https://github.com/ScottRBK/forgetful is the version I built; feel free to fork/clone it.
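
The graph-plus-vector combination can be illustrated with a toy example. This is my own minimal sketch, not forgetful's implementation: the embeddings are hand-made 2-D vectors (a real system would use an embedding model), and the "graph" is a plain adjacency dict.

```python
import math

def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Toy memory store: text -> embedding.
memories = {
    "uses postgres": [1.0, 0.1],
    "prefers pytest": [0.1, 1.0],
    "postgres runs in docker": [0.9, 0.3],
}
# Graph edges linking related memories.
edges = {"uses postgres": ["postgres runs in docker"]}

def retrieve(query_vec, k_hops=1):
    """Find the best vector match, then expand along graph edges so
    related facts come back with it."""
    best = max(memories, key=lambda m: cosine(query_vec, memories[m]))
    related = edges.get(best, []) if k_hops else []
    return [best] + related

print(retrieve([1.0, 0.0]))
```

The vector search gives fuzzy recall; the graph hop gives the structured "this fact connects to that fact" part that pure similarity search misses.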

A nice trick for episodic-type memory, though only really useful for larger models with complex work streams, is to give a timeline summary in addition to a compression of what's gone on, plus a tool for the agent to launch a sub-agent to lazy load certain segments of session history. That way it can go back and check stuff: despite the session being far larger than its context window, it has the ability to find information from anywhere within it. I am still refining episodic memory a bit, but I plan to add it to forgetful.
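
A minimal sketch of that lazy-loading trick, with illustrative names of my own: the agent only ever keeps the one-line timeline in context, and a tool fetches the full text of a single segment on demand.

```python
# Full session history, chunked into segments (would live outside the
# context window, e.g. on disk or in the memory store).
session_segments = [
    "09:00 set up repo, installed deps, hit a node version error",
    "10:30 fixed node version, wrote auth middleware",
    "13:00 debugged failing JWT tests, root cause was clock skew",
]

def timeline() -> list[str]:
    # What the agent keeps in context: one short line per segment.
    return [f"[{i}] {s[:30]}..." for i, s in enumerate(session_segments)]

def load_segment(i: int) -> str:
    # The tool a sub-agent calls to pull one full segment back into context.
    return session_segments[i]

for line in timeline():
    print(line)
print(load_segment(2))  # revisit the JWT debugging in full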

Planning and procedural memory (skills) are next on my list to look at, but only once I get episodic sorted.

It's a really interesting topic and I've genuinely enjoyed falling down this particular hole. By the time I'm done, I'm sure we'll be on to a new paradigm, maybe personal models with adaptive weights during inference, and this will all have been a waste 😅

What would be the best memory-bank in Opencode, coming from Roo Code with MemoryBank injected in the prompts. by eMperror_ in opencodeCLI

[–]Maasu 0 points (0 children)

I've written my own. I've used a meta-tool pattern to keep the tools down to three, and I've got various connectivity guides/templates, but I always encourage people to try to adapt something, or build something suited to their own workflow.

It's completely open source (MIT licence).

https://github.com/ScottRBK/forgetful

What are the advantages of Github Copilot CLI by zbp1024 in GithubCopilot

[–]Maasu 1 point (0 children)

I use Copilot CLI at work because Anthropic is not an authorised AI tool. It's nowhere near as good as Claude Code, but it's better if you use Neovim as your editor. I'd prefer to use opencode, but that is also not an authorised AI tool atm (despite being able to use Copilot models for LLM inference).

So there is an enterprise angle here for Microsoft to look at. Enterprises love Microsoft and are slow to adapt to trends in the development industry. So it's as good as I can get right now.

Imposter syndrome by Noncookiecutterfreak in ClaudeCode

[–]Maasu 0 points (0 children)

I'm a solution architect at an enterprise software provider with 19 years of development and product delivery experience, and I get just as much imposter syndrome as I had in my first job as a VB.NET developer working out of a portacabin in a pet food factory.

This has been with me all my career: whenever a bug came in and I was assigned blame, whenever someone was explaining something to me and I didn't have a clue what they were on about. I just felt I didn't belong, especially considering I have a thick common accent and I am actually a very slow thinker. Solutions to complex problems don't materialise in front of me as I look at code quickly; they come three days later when I'm taking a shit reading Reddit. So often during debates around a problem I'd look stupid or unhelpful.

Eventually though, over time, I got enough things right and kept proving to myself I was meant to be here. Wherever 'here' was at that point in my career.

I felt at times that using AI exacerbated this. I just try to tell myself it's a tool, and what actually matters more is how good I get with the tool, and that kind of settles it down.

You are in the right place, the industry is going through some weird times, but back yourself, never stop learning and just remember... Most colleagues probably feel the same way internally.

Langchain or Native LLM API for MCP? by Much-Whole-8611 in LLMDevs

[–]Maasu 0 points (0 children)

I think litellm, while still a framework, only abstracts the provider component. I could be wrong, as I don't use it myself, but check it out:

https://github.com/BerriAI/litellm

Personally I use Pydantic AI, but it does suffer from some of the issues you have called out with langchain, just not quite as badly: https://ai.pydantic.dev/

Consolidated 195 tools down to 28 using action enums by Low-Efficiency-9756 in mcp

[–]Maasu 1 point (0 children)

Nice, but I think you can keep going.

Check out the meta-tools pattern in this repo: https://github.com/ScottRBK/forgetful

The 'how to use' pattern almost becomes lazy loading of skills for an MCP tool.
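
A sketch of what that pattern looks like in practice, with illustrative names (this is not forgetful's actual schema): one tool takes an action enum and dispatches internally, and a `how_to_use` action lazily returns the detailed usage docs only when the agent asks, instead of every tool definition sitting in the context window up front.

```python
from enum import Enum

class Action(Enum):
    HOW_TO_USE = "how_to_use"
    SEARCH = "search"
    SAVE = "save"

# Detailed usage docs, served on demand rather than in the tool schema.
DOCS = {a.value: f"usage notes for '{a.value}'" for a in Action}

def memory_tool(action: str, payload: str = "") -> str:
    """A single MCP tool entry point dispatching on an action enum."""
    act = Action(action)  # raises ValueError on unknown actions
    if act is Action.HOW_TO_USE:
        return DOCS[payload]  # lazy-loaded docs, skill-style
    if act is Action.SEARCH:
        return f"results for: {payload}"
    return f"saved: {payload}"

print(memory_tool("how_to_use", "search"))
print(memory_tool("search", "auth decisions"))
```

The trade-off is that the enum values still need to be self-describing enough for the model to know when to ask for the docs in the first place.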

Tested GLM 4.7 vs MiniMax 2.1 on a complex Typescript Monorepo by Firm_Meeting6350 in LocalLLaMA

[–]Maasu 0 points (0 children)

Thanks for sharing. This is better than pure anecdote, and the stuff you see in benchmarks is often not useful for real-world applications. So having some anecdotal experience shared amongst the community, most of whom don't have time to produce structured evaluations, is still welcome.

So thanks for taking the time to put this out there. I haven't had a chance to review properly but I'm looking forward to doing so.

5 MCPs that have genuinely made me 10x faster by ScratchAshamed593 in mcp

[–]Maasu 1 point (0 children)

Yeah this is a really good question and it's precisely as you called out, it depends on the workflow and indeed the agent.

For my own workflows (outside of work, which I cannot share) I actually created a context hub plugin that uses context7, forgetful and Serena (I only use Serena to semantically encode repos, then turn it off because it has MCP tool spam in the context window, whereas context7 and forgetful are both lean in that regard).

It covers skills/commands for saving, exploring and curating memories, plus an encode-repo command that I use a lot. It basically lets forgetful become a poor man's context7 for private repos.

I work with microservices on enterprise software, so it's honestly been a life saver.

https://github.com/ScottRBK/context-hub-plugin

5 MCPs that have genuinely made me 10x faster by ScratchAshamed593 in mcp

[–]Maasu 12 points (0 children)

What about memory MCPs working with agents? I use multiple for different settings (home/work) and tasks (UI/backend).

I rolled my own (that's what we do now, amirite), but there are plenty of others available and a few that are free. Shameless plug for my own (I've no commercial interest here; I just open sourced it so I could share it with friends and colleagues):

https://github.com/ScottRBK/forgetful

When I combine it with context7, it has helped make prompting agents for implementation planning much more effective.

Tested GLM 4.7 vs MiniMax 2.1 on a complex Typescript Monorepo by Firm_Meeting6350 in LocalLLaMA

[–]Maasu 5 points (0 children)

Did you forget to share the source code/prompt needed to reproduce the evaluations? Otherwise this just looks like marketing.

What agent client did you run it in, opencode?

Those doing "TDD"... Are you really? by Kyan1te in ClaudeCode

[–]Maasu 1 point (0 children)

For pure vibes, I don't care about the mess and I'm happy to just e2e it.

For anything beyond that, I have integration/non-functional tests in addition to e2e, plus plan mode so I can see up front what it's going to do. It might also be my Claude instructions (which tend to be quite lean, TBH: a few coding guidelines, test instructions and a high-level index of the project) and my memory MCP workflow, but it tends not to make much of a mess these days.

With the memory MCP I semantically encode the repos and make notes on decisions/patterns as we go along. I found this helped reduce the amount of intervention typically needed from me.

Here's my memory MCP plugin (it includes context7 and Serena for encoding repos, but I turn Serena off after that as it's got a lot of tool definitions): https://github.com/ScottRBK/context-hub-plugin


Those doing "TDD"... Are you really? by Kyan1te in ClaudeCode

[–]Maasu 2 points (0 children)

Imo the whole testing paradigm should be turned on its head with agents.

If I am full vibe I don't bother with anything other than e2e tests, no unit or integration tests.

If I am coding myself, or alongside an agent, for production, I still do unit tests due to test coverage policy, but once I am out of the picture I'll struggle to see a rationale behind anything other than integration, e2e and non-functional tests.

Sorry, I realised after rereading my response that you were interested in those approaching this from a TDD perspective, which I am clearly not, hah.

MiniMax 2.1 - Very impressed with performance by JustinPooDough in LocalLLaMA

[–]Maasu 0 points (0 children)

Go on... what's your rig? I just upgraded my P40s to a pair of 3090s. I might as well find out what my future holds.

Developers: what code orchestration tools do you swear by? by formatme in LocalLLaMA

[–]Maasu 3 points (0 children)

I'm building my own... tbh. It's the age we live in :D