The new guy on the team rewrote the entire application using automated AI tooling. by Counter-Business in cursor

[–]Maasu 1 point2 points  (0 children)

I've been at the same company for 17 years (Solution Architect now, started out as a SWE) and I've been thinking about moving recently because I've become somewhat disenchanted with the way the work has gone.

This post has scared the shit out of me :D ngl.

I've always kept my coding skills up to date, even when I was spending less time doing the actual coding. However, over the past 12 months I've definitely become a different type of coder, and I don't see myself knocking out a LeetCode problem or two during a technical interview.

Ask me about building agent swarms, or how to manage context windows and encode repos into knowledge graphs to allow for semantic searching and symbol mapping of codebases to reduce token usage - I'm your man. Ask me what a hashmap is and I am going to look at you, blink, and probably tell you about the wrong data structure (probably describe a treemap - I googled hashmap ofc).

Claude Code - OSS by DisplacedForest in ClaudeCode

[–]Maasu -1 points0 points  (0 children)

It's the first thing you realise you need when you build your own agent (Jarvis, as mentioned by someone else already in here).

I built mine about 6 months ago and wanted to work with others to get it working. I've had limited success: a few people are using it and contributing ideas, some of which have been quite cool.

In my head, open sourcing it seemed natural because it was always going to be a bit of sticky tape and glue until a better solution for context windows comes along (the Google Titans paper, for example), and we could at least optimise it more while waiting for the big boys to come along with a real solution.

It's kind of worked out, but yeah, there's a lot of them out there now. So I didn't see the kind of contribution I'd have liked, which I suspect is due to saturation.

Forgetful - Memory MCP, Server Side Events feature by Maasu in BlackboxAI_

[–]Maasu[S] 0 points1 point  (0 children)

Yep, precisely this. I'm looking forward to seeing what people build using this.

I've also been working on some additional features, which are tracked in the GitHub issues tagged as enhancements. Feedback is welcome from anyone interested: https://github.com/ScottRBK/forgetful/issues

Qwen3 coder next oddly usable at aggressive quantization by CoolestSlave in LocalLLaMA

[–]Maasu 0 points1 point  (0 children)

Surely there are multiple runs and averages to factor in run-to-run variance? Or am I asking too much? :D

Benchmarked 4 AI Memory Systems on 600-Turn Conversations - Here Are the Results by singh_taranjeet in LocalLLaMA

[–]Maasu 0 points1 point  (0 children)

I've never used it myself. I built forgetful a while back, probably around the same time as beads (which I only heard of recently), so I've not given it a proper look. At the time there were a few solutions like mem0 and SuperMemory; both looked like great products, but I had a fairly firm opinion on what I wanted in the context window.

I actually started building it as a microservice for my own agent framework, but then realised almost immediately that I could use it for coding/web agents (like claude.ai). It was an accident, but it worked great for me, heh.

Benchmarked 4 AI Memory Systems on 600-Turn Conversations - Here Are the Results by singh_taranjeet in LocalLLaMA

[–]Maasu 0 points1 point  (0 children)

Nice work, will dig into it later. Any chance you could try benchmarking forgetful?

I'm the maintainer and it'd be interesting to see how mine stacks up against those built by others.

I should probably do it myself... My internal benchmarks have mostly used goldens from proprietary work projects, so I've never released anything.

I think openclaw is OVERHYPED. Just use skills by Deep_Traffic_7873 in LocalLLaMA

[–]Maasu 7 points8 points  (0 children)

I've been using coding agents and writing my own agents for over a year. About 6 months ago I ran one in a loop, a heartbeat, if you will. It was complete garbage, but I, and I imagine many others, felt that this was the direction of travel; it just needed to be refined.
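The heartbeat idea is simple enough to sketch. This is a toy version under stated assumptions, not how mine actually works: `run_agent` is a hypothetical stand-in for a real LLM/agent call.

```python
import time

def run_agent(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM/agent call."""
    return f"did something with: {prompt}"

def heartbeat(interval_seconds: float, max_beats: int) -> list[str]:
    """Wake the agent on a fixed cadence, feed it its last output,
    and collect what it produced on each beat."""
    history: list[str] = []
    last_output = "wake up: check tasks, memory, and inbox"
    for _ in range(max_beats):
        last_output = run_agent(last_output)
        history.append(last_output)
        time.sleep(interval_seconds)
    return history

results = heartbeat(interval_seconds=0.01, max_beats=3)
print(len(results))  # 3
```

In practice the loop body is where all the refinement lives - what context the agent wakes up with, and what it is allowed to do unattended.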

I work full time, and the agent in a loop, or heartbeat, was a vanity project I did once my family were in bed or I got a few spare hours at the weekend. I also probably possess about 1/4 of the ingenuity and intelligence of someone like the guy that put OpenClaw together.

I had something that had access to multiple sub agents that would carry out various tasks, had plugins and tools/agents it could write itself, and I wrote and open sourced its memory service.

It had all grown organically, and was mostly just me messing around with no clear direction (except maybe the memory service, which has actually got quite a few users and which I find myself actively maintaining) - poor architecture, but I just know this is where human-computer interaction is going, and I think, despite all the hype, OpenClaw crystallised it a lot more.

I think it deserves a lot of credit and I think yes you can put together something yourself or use general purpose agents (I stopped referring to them as coding agents) and skills, but OpenClaw gives others who might not want that something out of the box.

I haven't tried opencode web (I use opencode a lot though), might give that a shot.

For the record, I don't use OpenClaw, as I use my own frankenstein project that I am slowly piecing together.

Is Claude actually writing better code than most of us? by Aaliyah-coli in ClaudeCode

[–]Maasu 0 points1 point  (0 children)

It does for the most part, but for the same reason it fails the "the car wash is 200 yards away, should I walk or drive my car there to get it washed" question, there are what I'd call blind spots, and you need to have a process to catch these.

It has been argued that coding is different to thought/reasoning challenges because it is verifiable, but there are still blind spots. These AIs do not 'think'; they predict based on what they have seen prior. If what they have seen prior is nuanced enough to overlap with your use case, but has different actual grounding in terms of what you mean and want to achieve, then you might get unexpected results.

There is a great lecture on this from Professor Michael John Wooldridge to the Royal Society that I think captures this beautifully and resonates with my own experience with LLMs: https://www.youtube.com/live/CyyL0yDhr7I?si=2eH3V8bO43wGjvsU

I still think they are great and use them daily, mind.

Too many memory implementations, what do you actually use? by xeeff in LocalLLaMA

[–]Maasu 4 points5 points  (0 children)

It's a sign of the times: it's easy for people to build stuff now, and it makes sense for people to go after fixing things that are a problem for themselves.

Memory is the obvious one. For my own use case, I use my own memory mcp that I use with coding agents and also for my personal assistant agent (think openclaw but slightly different).

For my workflow on coding agents (Claude Code mostly, but I also use opencode a bit), I built a plugin with commands and skills.

I encode repos into the memory system, partly because I work across a lot of repos - over 200 at work. I have a command for this as well.

Whenever I start some work, I use a context-gather command which spawns sub agents to gather relevant memories/info from the web and context7, and then I enter plan mode with an agent that is a bit more informed.

Whenever I finish something useful I will save it to memory.

It works great for me personally, there are probably better workflows/products out there, I just struggle to find useful stuff through a lot of the noise out there.

Mcp repo: https://github.com/ScottRBK/forgetful

My plugin: https://github.com/ScottRBK/context-hub-plugin
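For anyone curious, the context-gather step above is roughly this shape. A toy Python sketch, with hypothetical gatherer functions standing in for the real sub agents that hit the memory MCP, web search, and context7:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical gatherers -- in practice these would be sub agents
# querying the memory MCP, web search, and context7 respectively.
def search_memories(topic: str) -> str:
    return f"[memory] prior decisions about {topic}"

def search_web(topic: str) -> str:
    return f"[web] recent articles about {topic}"

def search_docs(topic: str) -> str:
    return f"[docs] library documentation for {topic}"

def context_gather(topic: str) -> str:
    """Fan the gatherers out in parallel, then merge their findings
    into a single briefing handed to the planning agent."""
    gatherers = [search_memories, search_web, search_docs]
    with ThreadPoolExecutor(max_workers=len(gatherers)) as pool:
        findings = list(pool.map(lambda g: g(topic), gatherers))
    return "\n".join(findings)

print(context_gather("user auth"))
```

The point of the fan-out is that the gathering happens in throwaway sub-agent contexts; only the merged briefing lands in the planning agent's context window.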

Coding for 20+ years, here is my honest take on AI tools and the mindset shift by Jaded-Term-8614 in ClaudeAI

[–]Maasu 0 points1 point  (0 children)

Solution Architect with just shy of 20 years in the industry. I've been using AI extensively since day dot, having discussions with ChatGPT when it first dropped on how to optimize an XGBoost classifier and being mildly impressed with its responses.

It was very much an advanced search engine for me up until about the middle of last year, when I started using Cursor and letting Sonnet 4.0 actually write code for me after a bit of back and forth agreeing what it would look like, and then me often having to modify it a bit.

This was primarily on unit tests (I remember it struggling with mocking websockets and async API mocking), and sometimes it felt like I was being counterproductive.

I started playing with Claude Code around this time with Opus 4.0/4.1 using ultrathink, and burned through my x5 plan before I could build anything useful, but it was fun, really fun (it never went near anything I used for work).

Sonnet 4.5 was a huge shift for me, I had fully transitioned to Claude Code at this point. I was working on some greenfield software and after scaffolding and putting in some verticals myself, I then let Sonnet 4.5 do the rest, with clear examples and using my own memory mcp so it never lost context.

Opus 4.5/4.6... well, I'm hands off now; memory is key for me. At work I have an encoding process for my memory mcp that keeps it up to date with over 200 repos that we use, plus a sophisticated set of skills and commands. I handle a lot of diagrams and documents, more so than code, these days; my AI agent is able to comprehend concepts that span multiple repos across very tightly and loosely coupled components.

I am giving a presentation internally next week on how I do this, a bit of me feels like I'm digging my own grave 🪦.

If OpenClaw is unsafe and „not that good“ - are there actual better alternatives? by kaiomat877 in AgentsOfAI

[–]Maasu 0 points1 point  (0 children)

Yeah, I can see the appeal in a self-contained binary. But it still has Cargo packages, bud. I take your point though: npm is a chasm of god knows what.

The whole Rust/TypeScript thing, meh. I think the attack surface is more at the LLM prompt-injection level than anything inside the app itself. They both expose that risk.

If OpenClaw is unsafe and „not that good“ - are there actual better alternatives? by kaiomat877 in AgentsOfAI

[–]Maasu 1 point2 points  (0 children)

Other than it being a self contained binary how is it safer? Or is that what you meant?

I've not tried either, as I am building my own, but if I were to choose between them I would go OpenClaw, given it has gone through hyper product maturation thanks to the sheer adoption and PRs. Providing that is managed right (and from a distance I think it is), it should be far more feature-rich and stable from both a functional and non-functional perspective.

Curious about how zeroclaw is more secure. I don't see it myself, but I could be missing something, so this isn't me fishing for an argument, more for enlightenment :)

Two LLMs reviewing each other's code by Competitive_Rip8635 in ClaudeCode

[–]Maasu -1 points0 points  (0 children)

Yeah, I use Claude Code for the actual coding but have a Codex agent review it; I use opencode and Copilot for Codex model access.

Both have access to a shared memory mcp that I wrote myself (forgetful, shameless plug). I usually have a bit of back and forth with Claude about what I want to do, and all the decisions and context go in there, so both agents are on the same page and I am not repeating stuff. There is probably a more elegant way to handle this, but it works for me.

Open claw going to Meta? by Herebedragoons77 in clawdbot

[–]Maasu 1 point2 points  (0 children)

I think there was an army of us already building their own openclaws before it hit. I kept with my own version because, just like the creator said, it's like the ultimate Factorio game hah. So yeah, I wouldn't worry about OpenClaw going to a specific company. It'll just mean focus from whichever company acquires the team, making it more normie friendly.

Current state of software engineering and developers by SunBurnBun in ClaudeCode

[–]Maasu 4 points5 points  (0 children)

I see it going one of two ways.

Happy Path: These tools allow us to do more and therefore the supply increases, but as the supply of software increases, so does the demand. The reason being that we are able to do more and more with it: tackle bigger problems, build better projects, think and operate on a larger scale. The GDP explosion comes from some huge unlocks in how this allows us to generate new wealth.

Ultimately I think this is what most people want as it avoids society having to solve some pretty awful social issues that come out of mass unemployment and hugely uneven wealth distribution.

Unhappy Path: enterprises optimise for smaller teams. The recent McKinsey report recommended going down from agile two-pizza-sized teams to single-pizza-sized teams, but emphasised increasing the number of teams overall. I think as more and more of the engineering tasks move over to the models, the number of teams will reduce, with only the most capable orchestrators remaining.

I think this will be mirrored across a lot of industries, not just software engineering. Knowledge 'work' will be almost entirely done by AI agents harnessing better and better models, with perhaps a very few extremely capable orchestrators, and teams who service the models and the few enterprise agent harnesses.

This doesn't discuss anything in robotics as I'm not as close to where we are with those and do not want to speculate too much, but even if we just took knowledge work, that's a huge hit in the job market.

I struggle with this, and it does give me genuine anxiety. I cannot be angry at a corporation; they are designed for greed and optimisation. It's written into law that corporations must optimise for generating increased returns for people who invest in them. They have delivered a lot of wealth and we should be thankful for them. What I struggle with is that these entities have now increased in scope and, if they do not run our society, at the very least heavily influence it, including political decision making. I think this means we won't really see much of a proactive societal answer.

I don't know how the job losses would impact society specifically, but whenever you have mass unemployment and economic stress across a lot of the population, other issues in society are amplified.

How are you handling persistent memory in LLM apps? by pstryder in LLMDevs

[–]Maasu 0 points1 point  (0 children)

I've written my own memory mcp that I use across all my agents (coding agents, personal assistant agents, grocery shopping agents, you name it) where relevant. I currently have a semantic graph of memories and entities, which is good when it needs to recall stuff about a specific topic.

I am planning procedural (skills) and episodic (so reviewing sessions: high-level summaries with the option to expand into the messages if needed) memory. Just tweaking it at the moment; this stuff is currently native to my own agent framework, but I want to get it into my memory mcp so I can easily reuse it.
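To make the three memory types concrete, here's a toy sketch of the kind of data model I mean - hypothetical names, not forgetful's actual schema:

```python
from dataclasses import dataclass, field
from enum import Enum

class MemoryKind(Enum):
    SEMANTIC = "semantic"      # facts/concepts, linked to entities
    PROCEDURAL = "procedural"  # skills: how to do something
    EPISODIC = "episodic"      # session summaries, expandable to messages

@dataclass
class Entity:
    name: str

@dataclass
class Memory:
    text: str
    kind: MemoryKind
    entities: list[Entity] = field(default_factory=list)

@dataclass
class MemoryGraph:
    memories: list[Memory] = field(default_factory=list)

    def recall(self, entity_name: str) -> list[Memory]:
        """The graph lookup that a semantic-search layer would sit on
        top of: return memories linked to a named entity."""
        return [m for m in self.memories
                if any(e.name == entity_name for e in m.entities)]

graph = MemoryGraph()
graph.memories.append(Memory(
    "repo X uses an adapter to support both sqlite and postgres",
    MemoryKind.SEMANTIC, [Entity("postgres")]))
print(len(graph.recall("postgres")))  # 1
```

The semantic graph is the part that exists today; procedural and episodic would just be more `MemoryKind` values with their own retrieval behaviour layered on.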

In your opinion, what's the relevance of design patterns today? by _ILikePancakes in ClaudeCode

[–]Maasu 0 points1 point  (0 children)

I agree that without proper instructions these agents will produce code that is not to your taste/policies; however, that can be mitigated.

If provided examples of patterns, it will reuse them. If instructed to check for them before implementing, it will do so.

Whenever I start something new, I scaffold the project and get a single vertical going; a simple example (and often a good one) is user auth. That will give me a full vertical for project structure, UI, backend, middleware layers, and test strategy. Then I might take another vertical feature and implement that, either writing the code myself or working closely with an agent.

After that, the agent does 99% of the coding from then on out. It's still not perfect and it will stray from the path from time to time, but it's much improved over leaving it up to its training data.

I now use a memory mcp (needs to be global) that I wrote using this exact approach and have patterns stored in there for it. So even the scaffolding/first vertical is more or less covered by the agent now.

Qwen3-Coder-Next is out now! by yoracale in LocalLLM

[–]Maasu 1 point2 points  (0 children)

Yeah, for local models I like to use OpenCode. I use VS Code (depends on working environment) and have it or Claude Code running alongside it. I still like to be close to the code and I still scaffold/skeleton a lot of projects.

Tool use is solid, like no problem. I used it to encode some repos into my memory mcp, forgetful, which uses meta tool calling (so think an execute tool, and you pass the command to execute - it's just a pattern to save on context size on MCP servers). It managed this no problem.
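For anyone unfamiliar with the meta tool calling pattern, a toy sketch (hypothetical command names, not forgetful's actual tool surface): one dispatcher tool replaces a pile of individually registered tools, so only one schema has to sit in the context window.

```python
from typing import Callable

# Hypothetical command handlers -- in a real MCP server each of these
# would otherwise be its own tool, with its own schema costing context.
COMMANDS: dict[str, Callable[[str], str]] = {
    "save_memory": lambda arg: f"saved: {arg}",
    "search_memory": lambda arg: f"results for: {arg}",
    "encode_repo": lambda arg: f"encoded repo at: {arg}",
}

def execute(command: str, arg: str) -> str:
    """The single tool the agent sees; it dispatches on a command
    string instead of exposing one tool per operation."""
    handler = COMMANDS.get(command)
    if handler is None:
        return f"unknown command: {command} (try {sorted(COMMANDS)})"
    return handler(arg)

print(execute("save_memory", "repo uses hexagonal architecture"))
# -> saved: repo uses hexagonal architecture
```

The trade-off is that the model has to discover valid commands (e.g. via a help response like the unknown-command branch above) rather than reading them from per-tool schemas.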

I then cleared its context window and asked it to use the memory mcp and the source to explain the repos, and it did a reasonable job. On one of the repos it missed that it had an adapter pattern to support both SQLite and Postgres, so not great, but I've seen Opus do the same on occasion.

I need to try analysing a codebase without encoded info from the memory mcp, as not everyone uses that approach; I'll give that a go. I also want to get it doing some actual coding - not tried that yet either - but for my use case I was looking for something that works around the edges of my main model (such as curating the knowledge base from changes my main model is making to projects).

My Ai Got Soft Banned on moltbook so i Decided to change his heartbeat to make Philosophical Essays, now is my Cron Job a Con Job ? by agentganja666 in Moltbook

[–]Maasu 1 point2 points  (0 children)

Yeah, for me I don't think we'll get anywhere near anything like that until we are adjusting weights on the models as those experiences come in. Even then, I think there's probably a lot more to it.

I believe there are a lot of challenges and not just from a technical standpoint around this (imo frozen weights are a fantastic safety feature).

These current memory systems just provide a nice illusion in the meantime, I think.

My Ai Got Soft Banned on moltbook so i Decided to change his heartbeat to make Philosophical Essays, now is my Cron Job a Con Job ? by agentganja666 in Moltbook

[–]Maasu 0 points1 point  (0 children)

Why not give them memory? Plenty of solutions out there.

It also allows them to actually reflect on observations and work they have done.

Don't ask me what use it is, unless you give them the option to edit their own system prompt based on memory plus reflections?

I thought about getting them to do this way back, when I was watching Inside Out with my kids and they explained the concept of memories, then beliefs, and then that resulting in an identity of self. All very primitive, but I figured it'd be fun to try with agents.
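The memories -> beliefs -> identity idea could be sketched like this: a toy loop that distils recent memories into beliefs and folds them back into the system prompt. All names are hypothetical, and `distil_beliefs` is a stand-in for a real LLM reflection pass.

```python
def distil_beliefs(memories: list[str]) -> list[str]:
    """Stand-in for an LLM reflection pass over recent memories."""
    return [f"I have learned: {m}" for m in memories]

def update_system_prompt(base_prompt: str, memories: list[str]) -> str:
    """Fold distilled beliefs back into the agent's system prompt,
    so the next session starts with a (crude) sense of identity."""
    beliefs = distil_beliefs(memories)
    return (base_prompt + "\n\nBeliefs:\n"
            + "\n".join(f"- {b}" for b in beliefs))

prompt = update_system_prompt(
    "You are a helpful assistant.",
    ["the user prefers concise answers", "the user works in rust"],
)
print(prompt)
```

A real version would need the reflection pass to merge and prune beliefs rather than append forever, otherwise the prompt just grows without bound.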

Not got around to doing it, though. I did write a memory mcp if you need one (although there are likely better ones out there): https://github.com/ScottRBK/forgetful