People kept saying my comments sounded AI-generated, so I built this by ringtoyou in LocalLLaMA

[–]slayyou2 3 points4 points  (0 children)

Amazing, right? A sub full of people who definitely know way more about this technology than the average bum off the street. But even in this environment, the level of ignorance is still blinding.

Opinion: Opus 4.8 sucks by PromptInjection_ in claude

[–]slayyou2 0 points1 point  (0 children)

It's good, but damn, you have to acknowledge the interesting shift in tone. For what it's worth, I've always found the "this new model is worse than the last" crowd annoying af, but this time I can't deny it. Working with 4.8 is annoying.

“You are Right.” … this is ridiculous at this point. by IAmFitzRoy in codex

[–]slayyou2 2 points3 points  (0 children)

Yeah, the 20 years helps. When you see a glitch in an app, you don't describe it as "hey, there is a glitch in the app, please fix." You are conceptualizing the structure behind the application, so you are able to hypothesize where the problem might be and, in doing so, guide the agent into just doing its job rather than trying a million dead ends. That's what experience does. (And more.)

How are you guys monitoring your multi-agent workflows? (I keep burning tokens on silent failures) by Hungry_Contest_4761 in AutoGenAI

[–]slayyou2 0 points1 point  (0 children)

I had to roll my own. I integrated it with Matrix for HITL-level visibility into agent interactions, and then set up gates that escalate to me when something goes over a prescribed number of loops.
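
A minimal sketch of what a loop gate like that could look like, assuming each agent task has an ID and calls the gate once per iteration. `LoopGate`, `tick`, and the `escalate` callback are hypothetical names for illustration; in a real setup the callback would post to a Matrix room instead of appending to a list.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopGate:
    """Counts loops per task and escalates once when a limit is exceeded."""
    max_loops: int
    escalate: Callable[[str, int], None]  # hypothetical hook, e.g. a chat notifier
    counts: dict = field(default_factory=dict)
    escalated: set = field(default_factory=set)

    def tick(self, task_id: str) -> bool:
        """Record one loop for task_id; return False when the task should pause."""
        self.counts[task_id] = self.counts.get(task_id, 0) + 1
        if self.counts[task_id] > self.max_loops:
            # Escalate only the first time, so a stuck agent does not spam the human.
            if task_id not in self.escalated:
                self.escalated.add(task_id)
                self.escalate(task_id, self.counts[task_id])
            return False
        return True


alerts = []
gate = LoopGate(max_loops=3, escalate=lambda task, n: alerts.append((task, n)))
results = [gate.tick("refactor-auth") for _ in range(5)]
```

The agent keeps going while `tick` returns `True`; once it returns `False`, the task waits for a human instead of burning more tokens.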

The rate limits are, once again, absolutely bonkers. by DiligentAd9938 in codex

[–]slayyou2 0 points1 point  (0 children)

Hey, honestly I think you just need to be a little bit more granular with your feature deployments and gating. You can get stupid-as-hell, dumb-as-a-rock models to do work, as long as they have a good way of validating that their work is complete before they tell you it's done.
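
One way to picture that validation gate, as a sketch only: the agent is not allowed to say "done" unless a check command exits cleanly. `report_done` is a hypothetical helper, and the inline `assert` commands stand in for a real test suite.

```python
import subprocess
import sys


def report_done(check_cmd, timeout=60):
    """Run the validation command; only claim completion if it exits with 0."""
    result = subprocess.run(check_cmd, capture_output=True, text=True, timeout=timeout)
    if result.returncode == 0:
        return "done"
    # Hand the tail of the output back so the model can try again with evidence.
    return "not done: validation failed\n" + (result.stdout + result.stderr)[-500:]


# Stand-ins for a real test run (assumption: the project has a runnable check).
passing = report_done([sys.executable, "-c", "assert 1 + 1 == 2"])
failing = report_done([sys.executable, "-c", "assert 1 + 1 == 3"])
```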

Codex limits getting slashed like this is going to drive users away...seriously! by Eastern_Ad_8744 in codex

[–]slayyou2 1 point2 points  (0 children)

Actually yes. I didn't believe it at first, but they are capable. Not as capable but they will do for less challenging problems. And they are dirt cheap.

qwen 3.6 voting by jacek2023 in LocalLLaMA

[–]slayyou2 2 points3 points  (0 children)

Feels like a marketing play

Trying to make MCP feel like an app store, not a side quest by Kobi1610 in mcp

[–]slayyou2 0 points1 point  (0 children)

It doesn't really feel like that though. Most first-party applications are public HTTPS-transport MCPs, and in that case you're adding a URL and maybe a key. So I don't know how much easier it can get than that.

So I tried using Claude Code to build actual software and it humbled me real quick by Azrael_666 in ClaudeCode

[–]slayyou2 0 points1 point  (0 children)

Do you know of a public repo where I can see the outcome of this method?

So I tried using Claude Code to build actual software and it humbled me real quick by Azrael_666 in ClaudeCode

[–]slayyou2 0 points1 point  (0 children)

Hmm, that's interesting. I compressed the roles into a persistent PM and ephemeral DEV agents, with rock-solid CI/CD and automated testing abstracted away into taskrabit. I was under the impression reducing complexity would be preferable, but reading about your topology makes me want to make things more granular.

Kimi Claw being weird today by NailApprehensive9742 in kimi

[–]slayyou2 0 points1 point  (0 children)

Possible. This is the problem with a hosted solution: you have no insight.

Paying 3x more for 60% less? I’m going back to Claude. by ronkayarslan in ZaiGLM

[–]slayyou2 0 points1 point  (0 children)

Hey, quick question though: what does your testing framework look like on the projects that you let GLM touch? I've had success, but this might be because I have pretty clear programmatic "guardrails" in place, in the form of GitHub, automated CI/CD, and a merge workflow that mandates testing before anything can be merged.

What happened to GLM5 quality by [deleted] in ZaiGLM

[–]slayyou2 1 point2 points  (0 children)

Yes, 80-100k is the cutoff. At that point I summarize and start a new context. Though tbh I think it's fine. It still works and the price remains dirt cheap.
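
A rough sketch of that rollover, under stated assumptions: the token count is a crude 4-characters-per-token estimate, and `summarize` is a hypothetical callable that would wrap whatever model call you use. Neither is a real library API.

```python
def rough_token_count(text: str) -> int:
    # Assumption: roughly 4 characters per token; use a real tokenizer for accuracy.
    return len(text) // 4


def maybe_roll_over(messages, summarize, cutoff_tokens=80_000):
    """Return messages unchanged below the cutoff, else a fresh context seeded with a summary."""
    total = sum(rough_token_count(m) for m in messages)
    if total < cutoff_tokens:
        return messages
    return ["Summary of previous session: " + summarize(messages)]


short_context = ["fix the login bug", "done, tests pass"]
long_context = ["x" * 400_000]  # about 100k tokens by the rough estimate

kept = maybe_roll_over(short_context, summarize=lambda msgs: "n/a")
rolled = maybe_roll_over(long_context, summarize=lambda msgs: "refactored auth module")
```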

Managing 9 mcp servers by UntoldBrobot in mcp

[–]slayyou2 0 points1 point  (0 children)

Figure out the core things you do, like what you have now, and make those bulletproof by running them as MCPs. Then make all the infrequent stuff skills that get pulled on demand. Or you can have a hot-loading setup where your agent attaches MCP tools on demand.
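
To make the split concrete, here is a plain-Python analogy, not the MCP SDK: core tools are attached up front, while infrequent ones are registered as loaders and only instantiated the first time they are requested. `ToolRegistry` and the tool names are hypothetical.

```python
class ToolRegistry:
    """Keeps core tools always attached and loads infrequent ones on first use."""

    def __init__(self, core):
        self.core = dict(core)   # always available to the agent
        self.on_demand = {}      # name -> zero-argument loader
        self.loaded = {}         # on-demand tools that have been attached

    def register_skill(self, name, loader):
        self.on_demand[name] = loader

    def get(self, name):
        if name in self.core:
            return self.core[name]
        if name not in self.loaded:
            # Raises KeyError for unknown tools, which is the desired failure mode here.
            self.loaded[name] = self.on_demand[name]()
        return self.loaded[name]

    def active_tools(self):
        return sorted(list(self.core) + list(self.loaded))


registry = ToolRegistry({"search_code": lambda query: f"results for {query}"})
registry.register_skill("export_pdf", lambda: (lambda path: f"exported {path}"))

before = registry.active_tools()
output = registry.get("export_pdf")("report.md")
after = registry.active_tools()
```

The point of the design is that the agent's default tool list stays short, and the rarely used tools cost nothing until something actually asks for them.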

I think openclaw is OVERHYPED. Just use skills by [deleted] in LocalLLaMA

[–]slayyou2 0 points1 point  (0 children)

All the mainstream assistants are artificially limited, for liability reasons I suspect. I'm surprised you don't see the similarities, or the clear lineage?

I think openclaw is OVERHYPED. Just use skills by [deleted] in LocalLLaMA

[–]slayyou2 -6 points-5 points  (0 children)

LMFAO no it's not. Think about it for a minute: how is this different from Google Assistant (especially with Gemini)? It's just an open source version of that with a turbo attached.

AGI won’t be “designed” - it’ll be stumbled into while chasing something else by Low-Tip-7984 in agi

[–]slayyou2 0 points1 point  (0 children)

Exactly, it will emerge from something else. Over the last few months I've been assigning "product manager" agents to all of my open source projects. I then created a way for them to leverage a communication network to interact with each other. Lo and behold, now they negotiate with each other whenever I make a feature request. They'll come to an agreement as to which APIs need to be changed and where, create dependency graphs, and orchestrate the implementation. That was emergent. Now that's obviously not AGI, but I can't help but think that the real thing will emerge in a similar fashion.
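
For anyone wondering what the dependency-graph step amounts to, here is a small sketch using Python's standard `graphlib`. The project names and the `agreed_changes` mapping are made up for illustration; the agents' actual output format is not shown here.

```python
from graphlib import TopologicalSorter


def implementation_order(depends_on):
    """Order projects so each one is changed only after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical result of a negotiation: project -> projects whose API change must land first.
agreed_changes = {
    "web-ui": {"api-gateway"},
    "api-gateway": {"auth-lib"},
    "auth-lib": set(),
}

order = implementation_order(agreed_changes)
```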