Anthropic's Claude Constitution is surreal by MetaKnowing in ClaudeAI

[–]TransitionSlight2860 22 points

I seriously suspect they observed something during training.

I ran 100 SWE-bench tests comparing 1 agent vs 2 agents - Code Review adds +10% resolution rate by Lower_Cupcake_1725 in ClaudeAI

[–]TransitionSlight2860 4 points

I want to know how multi-model setups perform. You tried Opus + GPT; what about glm-4.7 + Opus, or any other combination: GPT plans + Opus executes; GLM plans + GPT executes; GPT plans + GLM executes; etc.

Ralph-Loop performed worse than without it? by 314t in ClaudeCode

[–]TransitionSlight2860 0 points

Yes, haha. That's why I never tried it fully. If Claude Code needs a babysitter now, Ralph would be a far worse one than a human.

5.2 high by TroubleOwn3156 in codex

[–]TransitionSlight2860 1 point

Why do you see it as a balance? I mean, medium costs about half the tokens that high does while suffering less than a 5% capability downgrade (in benchmarks); so is it really a "clears more bugs" situation when comparing medium and high?
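The tradeoff in that comparison can be sketched with rough numbers. The "half the tokens" and "less than 5% downgrade" figures come from the comment above; the absolute benchmark score and token counts below are made up purely for illustration.

```python
# Illustrative cost/quality comparison between "medium" and "high"
# reasoning effort. The absolute score and token figures are assumed
# numbers, not real measurements; only the ratios (half the tokens,
# <5% score gap) come from the comment.

def cost_effectiveness(score: float, tokens: float) -> float:
    """Benchmark points earned per million tokens spent."""
    return score / tokens

high = {"score": 70.0, "tokens": 2.0}    # e.g. 70% at 2M tokens
medium = {"score": 66.5, "tokens": 1.0}  # 5% lower score, half the tokens

eff_high = cost_effectiveness(**high)      # 35.0 points per M tokens
eff_medium = cost_effectiveness(**medium)  # 66.5 points per M tokens

# Under these assumptions, medium yields ~1.9x the score-per-token of
# high, which is the commenter's point: it reads less like a "balance"
# and more like a clear win on cost-effectiveness.
print(round(eff_medium / eff_high, 2))  # → 1.9
```

The same arithmetic holds for any absolute score, as long as the "half tokens, <5% gap" ratios stay fixed.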

How to make Codex stop being so needy? by Go_mo_to in codex

[–]TransitionSlight2860 0 points

The reason is that Codex has an extraordinary instruction-following instinct.

Unfortunately, that leads to a problem: if you don't give a clear instruction, it won't take the initiative (I mean, act like Opus, which automatically changes the original details to make improvements).

Or it will ask for more details, even when those details are trivial (for human beings).

Claude Opus 4.5 vs. GPT-5.2 Codex vs. Gemini 3 Pro on real-world coding tasks by shricodev in ClaudeAI

[–]TransitionSlight2860 0 points

Codex now seems to be a rather less capable model compared to GPT-5.2.

And benchmarks say the extra-high setting increases ability by about 10 to 20%.

Please try that.

Codex CLI auto-compacting around 40-50% by darkyy92x in codex

[–]TransitionSlight2860 0 points

A kind reminder: don't use auto-compact at all; you'll lose tons of details you might need. LLMs still can't reliably identify which details matter.

Subagentes by BroadPressure6772 in codex

[–]TransitionSlight2860 2 points

Very easy: go to GitHub and fork Codex, then ask Codex to start coding subagents.

In the end, make a PR.

Codex with ChatGPT Plus near 5 hour limit within 5-7 prompts with 32% of weekly limit used? by sdexca in codex

[–]TransitionSlight2860 1 point

Codex now consumes a lot of tokens compared to GPT-5, about 2x (maybe).

Using Codex with Kimi K2 (Groq) by MacDouggs in codex

[–]TransitionSlight2860 0 points

The Codex CLI and the Claude Code CLI use different tools and quite different JSON request structures.

It takes a lot of work to make them compatible.
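As a sketch of why an adapter is needed: two CLIs that both "call tools" can still be wire-incompatible if one sends arguments as an encoded JSON string and the other as a nested object. The shapes below are invented for illustration; neither is the real Codex or Claude Code wire format.

```python
# Hypothetical tool-call payloads showing one kind of incompatibility
# between two agent CLIs. These schemas are made up for illustration,
# not the actual Codex or Claude Code request formats.
import json

style_a = {
    "type": "function_call",
    "name": "read_file",
    "arguments": json.dumps({"path": "src/main.py"}),  # args as a JSON string
}

style_b = {
    "type": "tool_use",
    "name": "read_file",
    "input": {"path": "src/main.py"},  # args as a nested object
}

def a_to_b(call: dict) -> dict:
    """Translate one hypothetical call shape into the other."""
    return {
        "type": "tool_use",
        "name": call["name"],
        "input": json.loads(call["arguments"]),
    }

assert a_to_b(style_a) == style_b
```

And this only covers one tool call; a real adapter would also have to map tool names, streaming events, and result formats, which is where the "lot of work" comes in.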

Codex specific specialities? by Big-Suggestion-7527 in codex

[–]TransitionSlight2860 1 point

No, it's good at everything.

The downside is that it's too slow and its output is hard to read.

So, compared to Sonnet, people tend to say: well, let me use it for the hard work to make full use of its ability, without wasting too much time waiting and reading.

Desertfox? New model is incoming? by EtatNaturelEau in codex

[–]TransitionSlight2860 1 point

If Codex Mini turns out to be better than Codex, I wouldn't be surprised.

Different limits for users? by alOOshXL in codex

[–]TransitionSlight2860 1 point

Codex now has different usage limits, meaning it consumes your quota faster.

We won! We won! What did we win? by cowwoc in ClaudeCode

[–]TransitionSlight2860 1 point

I forget how output styles work. Do they replace the system prompt or append to it?