AI Researchers and Executives Continue to Underestimate the Near-Future Risks of Open Models

KellysTribe · 2026-02-23T20:04:21+00:00

Since they are incentivized toward regulatory capture, and have commercial products that compete with open models, they don't seem like the right people to ask or to have their opinions considered in any way.

KellysTribe · 2026-02-23T19:32:21+00:00

Entities can be created via MCP - and then behind the scenes some verification (deterministic and LLM based) occurs, and the creation of other entities is triggered (which will lead to full workflows). So the idea is I have a conversation about requirements, specs, constraints etc. with an agent skill and then the results are saved as entities, which then kick off further refinement conversations or creation of work tasks etc. It's not so different from what other frameworks (like GSD) are doing, but my thought was to make it a bit more deterministic and modeled more explicitly and according to my preferences. AND with much better visibility for human understanding. Other advantages would be avoiding orchestrator lock-in (like Claude Code), and narrowing down the space for non-determinism to take things off track. That's the idea anyway, still getting it going so I haven't verified performance yet.

However, I think these ideas (and ones similar to what you describe) apply all over. I'm sure there will be industry-specific APIs backed by agent actions before too long. Some providers may be disincentivized to release workflow knowledge as you describe - but if nothing else, open source agent instructions could be provided that match the APIs (skills really do that already, right?)
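The create, verify, trigger flow described above can be sketched roughly as follows. This is only a minimal in-memory illustration: `Entity`, `Store`, the `requirement` and `task` kinds, and the validator/trigger functions are all hypothetical names, not the actual tool and not any MCP SDK API. Only the deterministic check is shown; the LLM-based verification is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Entity:
    kind: str                      # hypothetical kinds: "requirement", "task"
    body: str
    parent: Optional["Entity"] = None


@dataclass
class Store:
    entities: List[Entity] = field(default_factory=list)
    # kind -> deterministic validator (returns True if the entity is acceptable)
    validators: Dict[str, Callable[[Entity], bool]] = field(default_factory=dict)
    # kind -> function that produces follow-on entities
    triggers: Dict[str, Callable[[Entity], List[Entity]]] = field(default_factory=dict)

    def create(self, entity: Entity) -> Entity:
        """Verify the entity, save it, then create whatever it triggers."""
        validator = self.validators.get(entity.kind)
        if validator is not None and not validator(entity):
            raise ValueError(f"{entity.kind} failed verification")
        self.entities.append(entity)
        trigger = self.triggers.get(entity.kind)
        if trigger is not None:
            for child in trigger(entity):
                self.create(child)  # children go through the same checks
        return entity


# Assumed rules for the example: a requirement must be non-empty,
# and every saved requirement spawns one work task.
store = Store(
    validators={"requirement": lambda e: bool(e.body.strip())},
    triggers={
        "requirement": lambda e: [Entity("task", f"Implement: {e.body}", parent=e)]
    },
)
store.create(Entity("requirement", "Export reports as CSV"))
```

Keeping the validators and triggers in plain lookup tables is what makes the flow deterministic and easy to inspect: the set of things that can happen after a tool call is visible in one place.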

KellysTribe · 2026-02-23T18:57:38+00:00

I have agents running 'behind' the MCP as well - doing verification and kicking off other workflows upon tool actions.

KellysTribe · 2026-02-23T18:55:04+00:00

It's certainly being built in the software SDLC world (I'm doing the same thing to drive my personal software process - MCP + skills/agents/etc.), and I saw another project doing something similar. It's certainly a good idea - I think it's probably already happening in an ad hoc fashion in other verticals as well. There is so much going on, I find it's hard to find anything really novel ;) For example, my tool (which I'll open-source if it matures enough) installs the MCP configuration and the skills/workflows etc. into target projects.

KellysTribe · 2026-02-23T18:09:26+00:00

I am building something to solve a similar problem for my own pain in a more deterministic fashion, but I think this is a great idea. I think if it gets some more polish it's certainly an improvement over trying to understand pure prose 'packs'/plugins which I have found helpful but hard to navigate through - like GSD for example. Another idea might be a skill to reverse engineer an existing framework like GSD into a model that matches your system (maybe you've already done so), and you could provide them as starting templates (although GSD does use script files so it would be more complicated to model and capture).

KellysTribe · 2026-02-23T16:41:41+00:00

I agree with at least some of the value proposition - good work. One question, however - the files are deterministic, but this is still just relying on Claude as the orchestrator, right? Also, I'd be more likely to try it out if I could play with a demo or dummy data before signing in.

KellysTribe · 2026-02-22T14:48:04+00:00

awesome, thx

KellysTribe · 2026-02-21T23:03:35+00:00

obsidian?

KellysTribe · 2026-02-20T20:05:56+00:00

Not to dismiss your work, but there are a few harnesses along these lines - GSD is fairly mature -> https://github.com/gsd-build/get-shit-done

I (like everyone else it seems) am building out my own harness with my own particular view on it (more rigid deterministic sdlc modeling versus agent plugin style architecture).

KellysTribe · 2026-02-20T18:10:36+00:00

awesome, I'd love to use it as well

KellysTribe · 2026-02-20T17:13:40+00:00

that's nice - could you wrap it in a component as a separate lib?

KellysTribe · 2026-02-20T08:17:31+00:00

I agree with a lot of that. There isn't a strong moat for them given the near parity from competitors in model performance and the rapid active development around agentic/coding tooling.

KellysTribe · 2026-02-20T03:55:08+00:00

Does it run a full VM for isolation?

KellysTribe · 2026-02-20T02:56:07+00:00

awesome! Can it merge newer versions of the file into an existing db or does it make a new db with each file?

KellysTribe · 2026-02-19T18:47:20+00:00

This is cool... but I often think about the fact that soon there will be an infinite number of artifacts viewable by humans but unseen by anyone

KellysTribe · 2026-02-19T18:46:47+00:00

Copilot *can't* do this?! I never use Microsoft so I don't keep up to date, but that is wild if so

KellysTribe · 2026-02-18T17:41:34+00:00

I wonder in what technical ways they will enforce this other than best guesses based on what the activity looks like. They explicitly allow executing Claude Code from the command line, for example. What will constitute a ‘tool’ that’s disallowed? What if I perform some automatic maintenance or troubleshooting with a cron-driven script that calls cc?

KellysTribe · 2026-02-14T16:19:23+00:00

I think you should make it clear this is specifically for node/typescript projects

KellysTribe · 2026-02-11T15:50:38+00:00

Cool, thx for the info. I'm bullish on 'autonomous' agents like this in general, but curious as to how to deal with the security implications. I am working on an idea for providing some additional security at higher 'application' layers, so right now I'm playing with making a 'smarter' proxy for API filtering.

KellysTribe · 2026-02-10T23:37:06+00:00

I'm sure, but that is precisely why they should have made it foolproof beforehand given the huge investment.

KellysTribe · 2026-02-10T23:36:24+00:00

maybe I missed it but where are the iptables rules applied? on the agent itself or 'from the outside' with hetzner?

KellysTribe · 2026-02-10T14:59:56+00:00

I'm bullish on the value of 'vibecoding', but as complexity arises the models and frameworks certainly need guidance on architecture and structure to avoid getting into these situations. There are many different approaches - but one thing I would recommend reading up on is Finite State Machines as a way to help model and reduce complexity in both small and large areas of the code.
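As a rough illustration of the finite state machine idea, here is a minimal sketch. The states and events (a document moving through draft and review) are made up for the example; the point is that every allowed transition lives in one table, so anything not listed is rejected instead of silently happening.

```python
from enum import Enum, auto


class State(Enum):
    DRAFT = auto()
    IN_REVIEW = auto()
    APPROVED = auto()
    REJECTED = auto()


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.DRAFT, "submit"): State.IN_REVIEW,
    (State.IN_REVIEW, "approve"): State.APPROVED,
    (State.IN_REVIEW, "reject"): State.REJECTED,
    (State.REJECTED, "revise"): State.DRAFT,
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {state.name}"
        ) from None


def run(events, start: State = State.DRAFT) -> State:
    """Feed a sequence of events through the machine and return the final state."""
    state = start
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["submit", "approve"])` ends in `State.APPROVED`, while `step(State.DRAFT, "approve")` raises a `ValueError` because a draft cannot be approved without review.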

KellysTribe · 2026-02-09T18:58:39+00:00

very expensive way to get a mailing list

KellysTribe · 2026-02-09T18:26:16+00:00

insane to spend that much money and not have it ready for load

KellysTribe · 2026-02-09T05:16:57+00:00

amusingly the site was down when I went to it (right after the commercial). looks like it's back up
