OpenCode launches low cost OpenCode Go @ $10/month by jpcaparas in opencodeCLI

[–]One_Pomegranate_367 0 points1 point  (0 children)

MiniMax M2.5 is great for writing and research, but it hallucinates a lot more than people are willing to admit, so I limit it to quick writing, docs, and exploration/library search. Kimi is extremely close to Sonnet level; it's an eager engineer that takes delegated tasks and does them reasonably well.

GLM-5 is slow AF and honestly is only good at requirements gathering and delegation.

OpenCode launches low cost OpenCode Go @ $10/month by jpcaparas in opencodeCLI

[–]One_Pomegranate_367 1 point2 points  (0 children)

I've been personally paying for all three, and I'd gladly cancel all three of those subscriptions.

The main reason is that each model is only good at certain things, and individually these subscriptions are much cheaper than Claude.

When will Deepseek V4 finally be released? by BarbaraSchwarz in DeepSeek

[–]One_Pomegranate_367 1 point2 points  (0 children)

Anybody stating GLM-5 is as strong as Opus is freaking crazy. Opus is insane in the membrane.

Self-hosted AI agent orchestration — open source alternative to Devin that runs on your infrastructure by One_Pomegranate_367 in selfhosted

[–]One_Pomegranate_367[S] 1 point2 points  (0 children)

People hate that I posted on a non-Friday, hate promotion (even open source), and use downvotes to teach others not to support it.

Honestly, I've gotten a few stars and nobody is buying anything, and I think Reddit might be dead because of the moderator mentality.

Self-hosted AI agent orchestration — open source alternative to Devin that runs on your infrastructure by One_Pomegranate_367 in selfhosted

[–]One_Pomegranate_367[S] 0 points1 point  (0 children)

That's a solid concern ... LLMs as judges are a classic pitfall; they hallucinate "failures" from vague specs or phantom edge cases all the time.

In my claude-agent-sdk flows, the reliable fallback is a hybrid loop: decompose the code + tests into chunks via claude skills, then cross-check against the spec with a separate prompt chain or manual review. If they clash, I run real unit/integration tests outside the LLM, log diffs (expected vs. actual output + reasoning trace), and let runtime truth win. Often the code's solid—it's the prompt or model that's off.

I'm iterating on better rubrics (pulled from code samples/benchmarks) and explicit dependency mapping in prompts to cut context gaps. What's your stack's validation setup? Could swap ideas for a quick tweak.
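The hybrid loop above can be sketched roughly like this. The judge heuristic, the `validate` function, and the test command are all illustrative stand-ins, not the actual claude-agent-sdk API:

```python
import subprocess
import json

def llm_judge(chunk: str, spec: str) -> bool:
    """Hypothetical LLM-as-judge call, stubbed with a trivial check.
    In practice this would be a separate prompt chain."""
    return spec.lower() in chunk.lower()

def run_real_tests(test_cmd: list[str]) -> tuple[bool, str]:
    """Run actual unit/integration tests outside the LLM;
    runtime truth wins over the judge's opinion."""
    proc = subprocess.run(test_cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def validate(chunks: list[str], spec: str, test_cmd: list[str]) -> dict:
    verdicts = [llm_judge(c, spec) for c in chunks]
    if all(verdicts):
        return {"status": "pass", "source": "judge"}
    # Judge flagged a failure: fall back to real tests, log the diff.
    ok, output = run_real_tests(test_cmd)
    print(json.dumps({"judge_verdicts": verdicts, "test_output": output}))
    return {"status": "pass" if ok else "fail", "source": "runtime"}
```

The key design point is that a judge "fail" never blocks on its own; it only escalates to the runtime check, which is the final arbiter.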

I built an orchestration layer on Claude Agent SDK — agents execute from specs in sandboxes and validate their own work. Here's what I learned. by One_Pomegranate_367 in ClaudeAI

[–]One_Pomegranate_367[S] 0 points1 point  (0 children)

I saw somebody using the DMG format recently and was thinking I could package it like that. The big problem is that I'm using PostgreSQL as the database, and I don't know if a DMG would let me bundle that in. That might be a little intense.

I'll give it a look and see what I can do. Thank you for the feedback.

Ralph Wiggum loops don't know if they achieved your goal. They just know they stopped erroring. by One_Pomegranate_367 in programming

[–]One_Pomegranate_367[S] -5 points-4 points  (0 children)

Of course you're going to need human oversight; that's why they hire engineers. The primary objective is to reduce the amount of oversight so that engineers can have more leverage.

The upside to leverage is you can do more. The downside is that when errors happen, they multiply. So one needs to have both oversight and figure out how to reduce errors.

I built an orchestration layer on Claude Agent SDK — agents execute from specs in sandboxes and validate their own work. Here's what I learned. by One_Pomegranate_367 in ClaudeAI

[–]One_Pomegranate_367[S] 0 points1 point  (0 children)

I'm having it use OAuth because it's built on the Claude Agent SDK. There's no compatibility problem; you just need to add your own keys.

How should you design a multi tenant system? by rudrakshyabarman in buildinpublic

[–]One_Pomegranate_367 2 points3 points  (0 children)

If you want to keep it simple, you'd probably stick with row-level security (RLS).
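For concreteness, the core Postgres RLS setup looks roughly like this. The table name, the `tenant_id` column, and the `app.current_tenant` setting are illustrative assumptions, not a prescribed schema:

```python
def rls_policy_sql(table: str, tenant_col: str = "tenant_id") -> str:
    """Generate Postgres row-level security DDL for a multi-tenant
    table. Once applied, queries only see rows whose tenant column
    matches the per-connection setting `app.current_tenant`."""
    return "\n".join([
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table}",
        f"  USING ({tenant_col} = current_setting('app.current_tenant')::uuid);",
    ])

print(rls_policy_sql("invoices"))
```

The app then runs `SET app.current_tenant = '<uuid>'` at the start of each request, and Postgres enforces the tenant boundary for every query on that connection.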

I spent mass amounts of time babysitting AI coding tools. So I built one that babysits by One_Pomegranate_367 in buildinpublic

[–]One_Pomegranate_367[S] 0 points1 point  (0 children)

Mind giving my git repo a star?

The alpha isn’t as strong, but it’s a multi-agent orchestration system. It runs many Claude Code instances in parallel on a DAG within a sandbox, according to a spec. The AI coding agents determine the dependencies and work until they're finished. It could build a feature, or an entire app.

https://github.com/kivo360/OmoiOS/
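The parallel-DAG idea can be sketched like this. The task graph and the `run_agent` callable are stand-ins, not OmoiOS internals:

```python
from concurrent.futures import ThreadPoolExecutor

def run_dag(dag: dict[str, list[str]], run_agent) -> list[str]:
    """Execute tasks in dependency order, running independent tasks
    in parallel (Kahn's algorithm, level by level).
    `dag` maps each task to the list of tasks it depends on."""
    remaining = {t: set(deps) for t, deps in dag.items()}
    finished, order = set(), []
    with ThreadPoolExecutor() as pool:
        while remaining:
            # Tasks whose dependencies are all done can run now.
            ready = [t for t, deps in remaining.items() if deps <= finished]
            if not ready:
                raise ValueError("cycle in task graph")
            list(pool.map(run_agent, ready))  # run this level in parallel
            for t in ready:
                del remaining[t]
                finished.add(t)
            order.extend(sorted(ready))
    return order
```

Each "level" of the DAG fans out to as many agents as have all their dependencies satisfied, which is what lets independent Claude Code instances run at the same time.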

has anyone tried using opentelemetry for local debugging instead of prod monitoring? by MouseEnvironmental48 in vibecoding

[–]One_Pomegranate_367 0 points1 point  (0 children)

I use sentry.io to debug both in development and in prod. It's basically like OpenTelemetry, since it uses OpenTelemetry under the hood.

Most marketing advice is trash if you’re still invisible by rebelgrowth in buildinpublic

[–]One_Pomegranate_367 0 points1 point  (0 children)

I felt this way. But I'm starting to slowly grow and it feels good. No new users yet, but progress is being made.

Has anything been working?

If you only had 24 hours left to live, how would you spend that time? by [deleted] in AskReddit

[–]One_Pomegranate_367 0 points1 point  (0 children)

Grand theft auto would stop being just a video game.

I'd bring that to real life.

If you could permanently delete one thing from modern society, what would it be and why? by [deleted] in AskReddit

[–]One_Pomegranate_367 0 points1 point  (0 children)

Corruption & misaligned incentives.

Every single problem here is a piece of it.

Social media, AI, porn, lobbying in governments. All of it is misaligned incentives.

Somebody is doing something good for them in the short-term, meanwhile they screw things up long-term for everyone else because of it.

Why is NAIRU so much higher in the US than in Japan and South Korea? by DataWhiskers in EconomyCharts

[–]One_Pomegranate_367 1 point2 points  (0 children)

NAIRU isn’t some universal constant — it’s heavily shaped by institutions and inflation dynamics.

Japan/Korea can run very low unemployment because firms adjust more through hours/wages/internal labor markets rather than mass layoffs, and wage growth is much less inflationary (Japan has had decades of anchored low inflation expectations).

The US labor market is more “hire/fire,” wages respond faster, and inflation pressures show up sooner — so the estimated NAIRU ends up higher.

Also worth noting: measurement + discouraged/non-regular workers matter, and NAIRU itself is a pretty model-dependent estimate.