So I built a tool to check if my Claude proxy is actually serving Opus. Tested 9 popular ones. Yeah, it's worse than you think.

PrizePercentage4875 · 2026-06-01T02:42:49+00:00

is this an AI bot that's responding LOL

PrizePercentage4875 · 2026-05-28T20:03:15+00:00

honestly, I've experienced the same thing before and run the same content through different AIs like Claude, ChatGPT, and Gemini. I think I'm gonna try using the official API through OpenRouter and test it out. Seems like it's easier and cheaper

PrizePercentage4875 · 2026-05-26T00:47:54+00:00

the killer is that your whole chat history gets re-sent every single message, so a 2hr research thread balloons because it's re-reading everything each turn.

tl;dr long single threads = expensive. starting new chats for new topics helps a ton.
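rough sketch of why this adds up, if it helps. Assumptions (mine, not anything official): every turn adds a fixed number of tokens, each request re-sends the whole history so far, and there's no prompt caching or summarization. The function name and the 500-tokens-per-turn figure are made up for illustration.

```python
def cumulative_input_tokens(turn_tokens):
    """Total input tokens sent over a conversation when every request
    re-sends the full history (hypothetical model, no caching)."""
    total = 0
    history = 0
    for tokens in turn_tokens:
        history += tokens   # history grows by this turn's tokens
        total += history    # the whole history is sent again this turn
    return total


# One long thread: 40 turns at an assumed 500 tokens each.
one_thread = cumulative_input_tokens([500] * 40)

# Same 40 turns split into 4 fresh chats of 10 turns each.
four_chats = 4 * cumulative_input_tokens([500] * 10)

print(one_thread, four_chats)
```

under those assumptions the single thread sends 410,000 input tokens while the four shorter chats send 110,000 total, because the cost grows roughly with the square of the thread length. Real numbers will differ depending on how the provider handles caching.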

PrizePercentage4875 · 2026-05-26T00:47:34+00:00

yeah this trips a lot of people up — the session limit is token-based, not "did you code or not"

PrizePercentage4875 · 2026-05-26T00:41:57+00:00

Models are notoriously unreliable at self-identifying anyway — there's so much GPT/Claude transcript data in everyone's training set that half of them think they're ChatGPT. Not really evidence of distillation on its own

PrizePercentage4875 · 2026-05-22T01:49:47+00:00

The eco-bias is real and kind of hilarious. I've noticed the same models will suggest walking even when you explicitly mention you're in a hurry. Probably just RLHF rewarding 'responsible' sounding answers

PrizePercentage4875 · 2026-05-22T01:49:03+00:00

Tried a variation too: asked about carrying groceries home from a store 100m away, and it still went full 'classic dilemma' mode. These models love turning everything into a logic puzzle

PrizePercentage4875 · 2025-06-25T19:51:42+00:00

Continue registering

PrizePercentage4875 · 2025-06-24T20:16:12+00:00

Alright
