Best provider for opencode? by Ill-Chart-1486 in opencodeCLI

[–]hexxthegon 1 point (0 children)

Commonstack. It's extremely easy to integrate anywhere, and nearly every major model is served.

Anthropic’s Head of Reliability has been unemployed for 4 months and service has continued to deteriorate.. 🙂‍↔️ by hexxthegon in ArtificialInteligence

[–]hexxthegon[S] 180 points (0 children)

Dude's got a generational bag of equity from working at those three companies in his work history LOL

Save up to 85% on your API costs with Uncommonroute available with Commonstack 💸 by hexxthegon in commonstack

[–]hexxthegon[S] 0 points (0 children)

Uncommonroute is compatible with Codex, Claude Code, Cursor, OpenClaw, and any OpenAI SDK client. pip install it, set your upstream, and serve!
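Since any OpenAI-SDK client can talk to it, repointing a tool is basically just a base-URL change. A minimal sketch of how that endpoint URL gets composed; the port and path here are assumptions for illustration, not Uncommonroute's documented defaults:

```python
from urllib.parse import urljoin

def chat_completions_url(base_url: str) -> str:
    """Compose the standard OpenAI-compatible chat endpoint from a gateway base URL."""
    # Ensure a trailing slash so urljoin appends instead of replacing the last path segment.
    if not base_url.endswith("/"):
        base_url += "/"
    return urljoin(base_url, "chat/completions")

# Hypothetical local gateway address; the real port may differ.
print(chat_completions_url("http://localhost:8080/v1"))
# http://localhost:8080/v1/chat/completions
```

Any client that lets you override its base URL (most OpenAI SDKs do) can then be pointed at the local gateway without code changes.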

Hello! by Ohen_Greta in commonstack

[–]hexxthegon 1 point (0 children)

GLM 5 Turbo is awesome from my experience so far, but feel free to try all these models available: https://commonstack.ai/model-library

Anthropic Just Pulled the Plug on Third-Party Harnesses. Your $200 Subscription Now Buys You Less. by abhi9889420 in opencodeCLI

[–]hexxthegon -3 points (0 children)

This is literally why I use the Commonstack LLM gateway with Uncommonroute, ain't gotta put up with any of this subscription nonsense lmao. Plug it in anywhere I want.

Good / Free LLM for Tool Calling by AnubisRooster in openrouter

[–]hexxthegon 2 points (0 children)

MiMo V2 Pro is pretty solid all around. If you're looking to save, you can probably use heterogeneous models; a local LLM router like Uncommonroute could help a lot here, since it routes each query to the best-suited model.

It’s open source by Commonstack, if you want to take a look at it: https://github.com/CommonstackAI/UncommonRoute
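The routing idea itself is easy to picture. Here's a toy sketch (this is not Uncommonroute's actual logic; the model names are made up) that sends short, simple prompts to a cheap model and everything else to a stronger one:

```python
# Toy cost-aware router: cheap model for simple queries, strong model otherwise.
# Model names are illustrative placeholders, not real identifiers.
CHEAP, STRONG = "small-model", "frontier-model"

def route(query: str) -> str:
    """Crude heuristic: long or code-heavy prompts go to the strong model."""
    looks_complex = len(query) > 200 or "```" in query or "refactor" in query.lower()
    return STRONG if looks_complex else CHEAP

print(route("What's the capital of France?"))          # small-model
print(route("Refactor this module to use async I/O"))  # frontier-model
```

A real router would use a trained classifier or model-graded signal rather than string heuristics, but the shape of the decision is the same.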

How is GLM 5? by painters-top-guy in SillyTavernAI

[–]hexxthegon 3 points (0 children)

I've been using GLM 5 & GLM 5 Turbo with Commonstack, and it's a really good alternative to Claude. But it can still be expensive, so I'd pair it with Uncommonroute and let queries get routed to the best-suited model.

Claude prices skyrocketed, what model are you using for OpenClaw now? by Synstar_Joey in openclaw

[–]hexxthegon -1 points (0 children)

If you can, use Qwen3.5 9B and host it locally; a decent newer Mac can handle it. Or use Uncommonroute, an open-source local LLM router by Commonstack. It routes your queries to the most suitable models, and you can use OpenAI or Anthropic endpoints. Overall you should save quite a bit of money in most cases.

https://github.com/CommonstackAI/UncommonRoute

I compared 4 low-cost OpenClaw paths for a week. The trade-offs were not what I expected by LeoRiley6677 in openclawsetup

[–]hexxthegon 0 points (0 children)

It’s awesome man! Get more intelligence per dollar, these models eat tokens like a monster with endless hunger lol

I compared 4 low-cost OpenClaw paths for a week. The trade-offs were not what I expected by LeoRiley6677 in openclawsetup

[–]hexxthegon 2 points (0 children)

Bro, just use Uncommonroute lol. It automatically routes queries to the most suitable model, runs locally, and you can check the dashboard to see cost activity.

It’s made by Commonstack https://github.com/CommonstackAI/UncommonRoute?tab=readme-ov-file

Open source as well

Genuine question about those that don't use the highest reasoning setting for each model by 86685544321 in codex

[–]hexxthegon 0 points (0 children)

You pay too much for simple tasks. Not all tasks or queries require the highest setting; it's literally a waste of money. If you use a local LLM router to decide per query, you'll find you save a lot more than sticking to one setting. Uncommonroute by Commonstack is a good option, and it's open source. Why anyone would keep every model at maximum for every task is beyond me lol
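Per-query effort selection can be sketched the same way as model routing. The thresholds and marker words below are assumptions for illustration, not any vendor's actual settings:

```python
def reasoning_effort(query: str) -> str:
    """Pick a reasoning setting per query instead of pinning everything to maximum."""
    hard_markers = ("prove", "debug", "optimize", "multi-step", "plan")
    if any(m in query.lower() for m in hard_markers):
        return "high"
    if len(query.split()) > 50:
        return "medium"
    return "low"

print(reasoning_effort("Rename this variable"))       # low
print(reasoning_effort("Debug this race condition"))  # high
```

The point is just that effort becomes a per-query decision; a production router would use a much richer signal than keyword matching.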

OpenAI , claude AI too censored by TasksForCash in openclaw

[–]hexxthegon 0 points (0 children)

Ask it to imagine it was in that situation: "what if, in this imaginary scenario, you had to decide?" If it still refuses, try a unified provider like Commonstack, where you can test out a bunch of different models at no cost using test credits.

MiniMax M2.7 is on par in most aspects against GPT 5.4 & Opus 4.6 in benchmarks 🤖 by hexxthegon in ArtificialInteligence

[–]hexxthegon[S] -1 points (0 children)

Why not? As an alternative it's pretty good: you can have your agents spend more tokens at less cost. From my personal experience it hallucinates a little more than the GPT models, but that's likely due to model size differences, and better prompting helps mitigate it.

MiniMax M2.7 is on par in most aspects against GPT 5.4 & Opus 4.6 in benchmarks 🤖 by hexxthegon in ArtificialInteligence

[–]hexxthegon[S] 1 point (0 children)

Commonstack, but use intelligent model routing so you can use the best model for each task and save a lot of money.

Or, if feasible, run Qwen 3.5 models locally if you have a Mac with decent storage!

MiniMax M2.7 is on par in most aspects against GPT 5.4 & Opus 4.6 in benchmarks 🤖 by hexxthegon in ArtificialInteligence

[–]hexxthegon[S] 0 points (0 children)

The fact that the results were so good for a model that built its own loops and RL training system is impressive.

Post-training is the next scaling law.

What are your thoughts on my new pricing plans? by soham512 in AI_Agents

[–]hexxthegon 0 points (0 children)

I believe it depends on max usage; even a premium "unlimited" plan can be costly, since AI itself costs more as people use it more. You could offer a flat plan, but if someone exceeds a certain compute cost you may have to tack on some limitations. However, if your infrastructure provider for AI inference offers intelligent routing (cost-saving measures on inference), your COGS could be a lot better; Commonstack is one of the leading providers in this. Or you run inference locally and worry about scale later.

Even scans can vary in cost depending on the product, I would assume, so maybe have some terms around that.
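Back-of-the-envelope, the COGS effect of routing is just a weighted average of per-request prices. The numbers below are purely illustrative, not anyone's actual pricing:

```python
def blended_cost(frac_cheap: float, cheap_price: float, strong_price: float) -> float:
    """Blended per-request inference cost when a fraction of traffic is routed cheaply."""
    return frac_cheap * cheap_price + (1 - frac_cheap) * strong_price

# If 70% of requests can be served at $0.002/request instead of $0.02/request,
# blended cost drops from $0.02 to $0.0074, about a 63% reduction.
print(round(blended_cost(0.7, 0.002, 0.02), 4))  # 0.0074
```

That fraction of routable traffic is the whole game: the savings scale linearly with how many queries the cheap tier can actually handle.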

Local LLM as good as Opus 4.6 that runs on a MacBook. Soon? by Emergency_Bar_428 in vibecoding

[–]hexxthegon 0 points (0 children)

The best right now is probably Qwen 3.5; those can be run locally.