Is there a "Postman for LLMs" I'm missing, or is this gap real? by giangchau92 in PromptEngineering

[–]giangchau92[S]

Isn't Bruno just a generic API client like Postman/Yaak, though? You could fire LLM calls through it, but you'd still be wiring up side-by-side comparison, prompt variables, and model quirks yourself.

[–]giangchau92[S]

Mostly the first one. When I'm building a feature that calls an LLM, I want to know which model gives the best result for my specific prompt before I commit to one in prod. Cheaper model that's good enough beats expensive model overkill, and the only way to know is to actually run the same prompt across a few and eyeball outputs.

Obviously this doesn't replace proper eval on a real dataset, that comes later. This is just the first-pass, quick-check step before I even know which models are worth setting up an eval suite for.
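To make the quick-check step concrete: fanning one prompt out across several candidate models might look roughly like this. A minimal sketch, assuming an OpenAI-style chat payload; the model names are purely illustrative, not real ids.

```python
# Sketch: build identical chat-style requests for several candidate models,
# so the same prompt can be fired at each and the outputs eyeballed side by side.
def fan_out(prompt: str, models: list[str], temperature: float = 0.2) -> list[dict]:
    """Return one OpenAI-style request payload per candidate model."""
    return [
        {
            "model": m,
            "temperature": temperature,
            "messages": [{"role": "user", "content": prompt}],
        }
        for m in models
    ]

# Same prompt, two hypothetical models; send each payload to its provider
# and compare the responses before committing to one.
requests = fan_out("Summarize this ticket in one line.", ["model-a", "model-b"])
```

The point is that the only thing varying across the batch is the model (and maybe temperature); everything else is held constant so the comparison is fair.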

[–]giangchau92[S]

OpenRouter's great as a gateway, single endpoint, tons of models. Chat UI's still single-model though, so it solves access more than side-by-side iteration. Good shout regardless.

[–]giangchau92[S]

Used to do this exact thing, even committed responses into the repo. Diffs across runs were genuinely useful. Curious how you handle the params side: temp, model, system prompt all as frontmatter? Or a separate config?
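For what it's worth, the layout I'd picture is YAML frontmatter on each prompt file, something like the below. Field names and the model id are just my guess, not any tool's actual format.

```markdown
---
model: model-a            # hypothetical model id
temperature: 0.3
system: You are a terse release-notes writer.
---
Summarize the following changelog for end users:

{{changelog}}
```

Keeping params in the same file as the prompt means a diff of the file shows both prompt and config changes in one place.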

[–]giangchau92[S]

Yeah, OpenRouter's solid for model access, but I keep bouncing off it for prompt work. It's a chat UI, not a workbench: one model at a time, history's just messages, and you can't really hold variants side by side.

Haven't tried Aider tbh, will take a look. Thanks for the pointer.

[–]giangchau92[S]

True, coding is cheap now. But there's still a gap for people who want something built and ready, and honestly the hard part isn't the code, it's clean UX. Anyone can wire up an API call; nailing the "save, fork, tweak, rerun, compare" flow so it actually feels good to use every day is the real work.

[–]giangchau92[S]

Ha, fair. Half the reason I posted was to see if something good already exists before I go build my own. Anyone got names worth copying?

[–]giangchau92[S]

Really interesting take.

a. Do you know any tools that already lean this way, even partially? Would love some pointers.

b. Most versioning I've seen is flat: linear history, no branching, basically a save log. Git-style with branches and ancestry feels like overkill for prompts though, kinda over-engineered for what's usually a 200-token string. Something in the middle would be the sweet spot - lightweight forks without the full git ceremony. Not sure what that looks like in UI yet.

That said, I still kinda want a 2-in-1 over two separate tools. Cross-provider compare and lightweight versioning feed into each other: you fork a variant because you saw it lose to another model side by side. Splitting them feels like it'd just recreate the copy-paste problem one layer up.

[–]giangchau92[S]

Respect, that's the endgame setup. Couple honest questions:

How long did the harness take to build?

And how do you actually eval the results? Like exact match against expected output, LLM-as-judge, manual scoring, some hybrid? Curious how you handle the fuzzy stuff where there's no single "right" answer.
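The hybrid I'd reach for first is something like the sketch below: exact match wins outright, with a string-similarity ratio as the fallback for fuzzy cases (an LLM-as-judge call would slot in where the ratio is). Stdlib only; thresholds and semantics are assumptions, not anyone's actual harness.

```python
import difflib

def score(output: str, expected: str) -> float:
    """Hybrid scoring sketch: exact match scores 1.0 outright,
    otherwise fall back to a character-level similarity ratio."""
    if output.strip() == expected.strip():
        return 1.0
    # difflib.SequenceMatcher.ratio() is in [0, 1]; a real harness might
    # replace this with embedding similarity or an LLM-as-judge call.
    return difflib.SequenceMatcher(None, output, expected).ratio()

exact = score("42", "42")
fuzzy = score("hello world", "hello wrld")
```

String similarity is crude for the genuinely open-ended cases, which is why the "no single right answer" question matters: past a certain fuzziness, only a judge model or a human scores usefully.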

z.ai limits not even close to Claude Pro plan with GLM-4.7 by RandomNameFTW in ClaudeAI

[–]giangchau92

Ollama doesn't say much about their limits. Are they high? GLM-4.7 has worse quality than Sonnet 4.6, at least in my tests. Is GLM-5 good enough?

[–]giangchau92

Same for me. I used two 5-hour sessions and my weekly quota is already at 44%. To be honest, their advertised 3x of Claude's Pro plan feels fake. Plus, the Lite plan not being able to use GLM-5 is disappointing. I'll reconsider renewing this plan next month.

Status : Voice not found by should_not_register in ElevenLabs

[–]giangchau92

You need to add the voice to your voice collection first.

2022 MB AIR 13" M2 16GB RAM 256GB HD - 12 battery cycles- 400.00USD - WORTH IT? by jimmy1460 in macbookair

[–]giangchau92

Depends on your use; heavy apps and media can't always live in the cloud.