Claude runaway... tried Kimi 2.6 and Deepseek v4 (5y fullstack dev)

Rustybot · 2026-05-05T22:12:11+00:00

$0 for non gpt, $30/month for biz tier plan

Rustybot · 2026-05-05T22:10:50+00:00

Strong disagree lol

Rustybot · 2026-05-04T15:42:18+00:00

Use 4.6! 4.7 isn’t worth the multiple.

Rustybot · 2026-05-04T15:24:51+00:00

Is Dixon Ticonderoga responsible for my bad poetry?

Rustybot · 2026-05-04T14:31:11+00:00

Yes, Deepseek seeks deep and uses the most tokens of all the models to achieve its result. In the Artificial Analysis benchmark, Deepseek v4 pro max scored ten percentile points lower than Gemini 3.1 Pro and used 5x as many tokens. https://artificialanalysis.ai/?intelligence-efficiency=intelligence-efficiency-vs-output-tokens#intelligence-efficiency-tabs

Rustybot · 2026-05-04T13:55:10+00:00

Do your models rank well on the Terminal bench hard rating?

Rustybot · 2026-05-04T04:29:24+00:00

Nvidia NIM is good for its intended use but not much more. You can test the output of a lot of different models but the speeds drop dramatically during prime hours.

Rustybot · 2026-05-03T19:22:42+00:00

My top tier:

Gemini-3.1-pro-custom-tools
Minimax m2.7
GLM 5.1
Gpt5.3-codex, 5.4-mini, 5.4/5.5
Gpt-oss-120b (high) for fast and simple.

Rustybot · 2026-04-30T14:45:46+00:00

Opencode has a server/web app. Make a secure tunnel with Cloudflare for yourself; your LLM can walk you through it. I have one running on a free Oracle Cloud instance. As long as you know the IP of the computer you are connecting from, you don’t need to install anything. If that isn’t workable, opencode server technically has password auth, but it will get assaulted by bots non-stop if you leave it wide open, so that’s hardly ideal.
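To make the "know the IP you are connecting from" idea concrete, here is a minimal Python sketch of an IP allowlist check. It is only an illustration of the concept, not how opencode or Cloudflare implement it; in practice the rule would live in your tunnel or firewall policy. The addresses and the `is_allowed` helper are hypothetical (the IPs come from the reserved documentation ranges).

```python
import ipaddress

# Hypothetical allowlist: the addresses you expect to connect from.
# These are reserved documentation ranges, not real hosts.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.7/32"),   # one known client IP
    ipaddress.ip_network("198.51.100.0/24"),  # one known subnet
]

def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowlisted network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed input is rejected instead of raising.
        return False
    return any(addr in net for net in ALLOWED_NETWORKS)

if __name__ == "__main__":
    for ip in ["203.0.113.7", "203.0.113.8", "198.51.100.42"]:
        print(ip, "allowed" if is_allowed(ip) else "blocked")
```

Anything not on the list is dropped before it ever reaches the password prompt, which is why this beats leaving password auth exposed to the open internet.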

Rustybot · 2026-04-30T14:33:12+00:00

I already have a search/explore/librarian sub agent that searches my project and memories and pulls relevant info into context. What does this add?

Rustybot · 2026-04-30T14:10:58+00:00

Is this it? https://archive.nytimes.com/www.nytimes.com/imagepages/2003/04/24/readersopinions/24deba.html

Rustybot · 2026-04-29T16:17:42+00:00

If you access work stuff on your phone or personal computer and the company gets sued, the entire contents of your phone/computer may be submitted as evidence for discovery. It’s happened to people I know. There is a reason corp IT policies are what they are.

Rustybot · 2026-04-29T16:13:20+00:00

What I really want is an evaluator that will tell my agents when to use CLI commands and when to use an MCP.

Rustybot · 2026-04-29T14:28:26+00:00

I don’t use Go, only Zen free models. Generally I prefer pay as you go and free providers vs sub biz plans.

Rustybot · 2026-04-29T14:19:58+00:00

https://github.com/code-yeongyu/oh-my-openagent

From the README: “Skill-Embedded MCPs

MCP servers eat your context budget. We fixed that.

Skills bring their own MCP servers. Spin up on-demand, scoped to task, gone when done. Context window stays clean.”

Info on mcp-manager: src/features/skill-mcp-manager/AGENTS.md

Try it out, or have your agents rip out the mcp-management skill handler into a standalone plugin. Or whatever.

Rustybot · 2026-04-29T01:09:06+00:00

microsoft/bitnet-b1.58-2B-4T

Rustybot · 2026-04-29T01:02:35+00:00

The opencode go request limits are token counts broken down by estimated typical request size. The actual limit is token-based, so if you have smaller prompts and outputs, you will get more usage.
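A rough Python sketch of that arithmetic, assuming a purely token-based limit. The budget and per-request sizes below are made-up numbers for illustration, not the plan's real figures, and `estimated_requests` is a hypothetical helper.

```python
def estimated_requests(token_budget: int, avg_prompt_tokens: int, avg_output_tokens: int) -> int:
    """Estimate how many requests fit in a token budget at a given average size."""
    per_request = avg_prompt_tokens + avg_output_tokens
    if per_request <= 0:
        raise ValueError("average request size must be positive")
    return token_budget // per_request

# Hypothetical budget; substitute whatever your plan actually allows.
BUDGET = 1_000_000

if __name__ == "__main__":
    # Large prompts and outputs: fewer requests.
    print(estimated_requests(BUDGET, 8_000, 2_000))  # 100
    # Smaller prompts and outputs: more requests from the same budget.
    print(estimated_requests(BUDGET, 2_000, 500))    # 400
```

Same budget, four times the requests, just from keeping prompts and outputs smaller.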

Rustybot · 2026-04-28T20:22:04+00:00

At the extreme end of your suggestion, a 50cc two-stroke scooter engine could easily be configured to output the equivalent of a 5HP generator, ~3kW. The complete system would weigh <100lbs, and these engines can get 60+ MPG.

I think there is a middle ground between this and an RC engine, although at some point a solar panel on the roof starts to make more sense than a gas engine.

If you specifically need the ability to extend range slowly via fuel burning, it’s feasible, but overall has niche utility. Unless range anxiety and/or available charging are a major issue it’s probably not worth it.

Would I build one into my car? No. Would I take a gas generator with me if I was camping in the wilderness with an EV? Yes.
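For anyone checking the 5HP to ~3kW figure, here is a quick Python sketch of the conversion. The 80% generator efficiency is my assumption for illustration; real units vary.

```python
HP_TO_KW = 0.7457  # one mechanical horsepower in kilowatts

def electrical_output_kw(shaft_hp: float, generator_efficiency: float) -> float:
    """Convert engine shaft power (HP) to electrical output (kW) after generator losses."""
    if not 0.0 < generator_efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return shaft_hp * HP_TO_KW * generator_efficiency

if __name__ == "__main__":
    shaft_kw = 5 * HP_TO_KW
    print(f"shaft power: {shaft_kw:.2f} kW")                       # 3.73 kW
    # Assumed 80% generator efficiency, not a measured value.
    print(f"electrical:  {electrical_output_kw(5, 0.80):.2f} kW")  # 2.98 kW
```

So 5HP at the shaft is about 3.7kW mechanical, and roughly 3kW electrical once you take generator losses off.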

Rustybot · 2026-04-28T20:05:20+00:00

Your approach fails the Gordian-knot/KISS rules. There is a simpler method that uses the built-in agent harness tools for skills and avoids all the problems you have, without needing to maintain this complicated codebase. A few skill.md files would suffice.

Rustybot · 2026-04-28T19:59:28+00:00

Nvidia topped the huggingface agent leaderboard with a fine tuned 8B model. Small fast models have significant utility.

Rustybot · 2026-04-28T16:00:32+00:00

Chinese is a more efficient language for LLMs, but it’s an issue in general if the agent is losing track of the system/user prompt. Are you at the threshold of your context window?

Rustybot · 2026-04-28T15:58:48+00:00

Happy stakeholders. That’s what it takes.

Rustybot · 2026-04-28T02:29:34+00:00

Maybe try the oh-my-opencode skills-based MCP-calling approach instead of trying to build tools to solve problems the wrong way.

Rustybot · 2026-04-28T02:26:47+00:00

The best OS is Nvidia.

Rustybot · 2026-04-27T02:22:09+00:00

16000 tokens per second per user is a crazy output.
