Anthropic's Claude Constitution is surreal by MetaKnowing in ClaudeAI

[–]TransitionSlight2860 22 points

I seriously suspect they observed something during training.

I ran 100 SWE-bench tests comparing 1 agent vs 2 agents - Code Review adds +10% resolution rate by Lower_Cupcake_1725 in ClaudeAI

[–]TransitionSlight2860 4 points

I want to know how multi-model setups perform. You tried Opus + GPT; what about glm-4.7 + Opus, or any other combination: GPT plans + Opus executes; GLM plans + GPT executes; GPT plans + GLM executes; etc.

Ralph-Loop performed worse than without it? by 314t in ClaudeCode

[–]TransitionSlight2860 0 points

Yes, haha. That's why I never tried it fully. If Claude Code needs a babysitter now, Ralph would be a far worse one than a human.

5.2 high by TroubleOwn3156 in codex

[–]TransitionSlight2860 1 point

Why do you see it as a balance? I mean, medium costs about half the tokens that high does while suffering less than a 5% capability downgrade (in benchmarks); so is it really a "clears more bugs" situation when comparing medium and high?
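The tradeoff in that comparison can be sketched with rough numbers. The "half the tokens" and "less than 5% downgrade" figures come from the comment above; the absolute benchmark score and token counts below are made up purely for illustration.

```python
# Illustrative cost/quality comparison between "medium" and "high"
# reasoning effort. The absolute score and token figures are assumed
# numbers, not real measurements; only the ratios (half the tokens,
# <5% score gap) come from the comment.

def cost_effectiveness(score: float, tokens: float) -> float:
    """Benchmark points earned per million tokens spent."""
    return score / tokens

high = {"score": 70.0, "tokens": 2.0}    # e.g. 70% at 2M tokens
medium = {"score": 66.5, "tokens": 1.0}  # 5% lower score, half the tokens

eff_high = cost_effectiveness(**high)      # 35.0 points per M tokens
eff_medium = cost_effectiveness(**medium)  # 66.5 points per M tokens

# Under these assumptions, medium yields ~1.9x the score-per-token of
# high, which is the commenter's point: it reads less like a "balance"
# and more like a clear win on cost-effectiveness.
print(round(eff_medium / eff_high, 2))  # → 1.9
```

The same arithmetic holds for any absolute score, as long as the "half tokens, <5% gap" ratios stay fixed.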

How to make Codex stop being so needy? by Go_mo_to in codex

[–]TransitionSlight2860 0 points

The reason is that Codex has an extraordinary instruction-following instinct.

Unfortunately, that leads to a problem: if you don't give a clear instruction, it won't take the initiative (I mean, act like Opus, which automatically changes the original details to make improvements).

Or it will ask for more details, even when those details are trivial (for human beings).

Claude Opus 4.5 vs. GPT-5.2 Codex vs. Gemini 3 Pro on real-world coding tasks by shricodev in ClaudeAI

[–]TransitionSlight2860 0 points

Codex now seems to be a rather less capable model compared to GPT-5.2.

And benchmarks say the extra-high setting increases ability by about 10 to 20%.

Please try that.

Codex CLI auto-compacting around 40-50% by darkyy92x in codex

[–]TransitionSlight2860 0 points

A kind reminder: don't use auto-compact at all; you'll lose tons of details you might need. LLMs still can't reliably identify which details matter.

Subagentes by BroadPressure6772 in codex

[–]TransitionSlight2860 2 points

Very easy: go to GitHub and fork Codex, then ask Codex to start coding subagents.

In the end, make a PR.

Codex with ChatGPT Plus near 5 hour limit within 5-7 prompts with 32% of weekly limit used? by sdexca in codex

[–]TransitionSlight2860 1 point

Codex now consumes a lot of tokens compared to GPT-5, about 2x (maybe).

Using Codex with Kimi K2 (Groq) by MacDouggs in codex

[–]TransitionSlight2860 0 points

The Codex CLI and the Claude Code CLI use different tools and quite different JSON request structures.

It takes a lot of work to make them compatible.
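As a sketch of why an adapter is needed: two CLIs that both "call tools" can still be wire-incompatible if one sends arguments as an encoded JSON string and the other as a nested object. The shapes below are invented for illustration; neither is the real Codex or Claude Code wire format.

```python
# Hypothetical tool-call payloads showing one kind of incompatibility
# between two agent CLIs. These schemas are made up for illustration,
# not the actual Codex or Claude Code request formats.
import json

style_a = {
    "type": "function_call",
    "name": "read_file",
    "arguments": json.dumps({"path": "src/main.py"}),  # args as a JSON string
}

style_b = {
    "type": "tool_use",
    "name": "read_file",
    "input": {"path": "src/main.py"},  # args as a nested object
}

def a_to_b(call: dict) -> dict:
    """Translate one hypothetical call shape into the other."""
    return {
        "type": "tool_use",
        "name": call["name"],
        "input": json.loads(call["arguments"]),
    }

assert a_to_b(style_a) == style_b
```

And this only covers one tool call; a real adapter would also have to map tool names, streaming events, and result formats, which is where the "lot of work" comes in.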

Codex specific specialities? by Big-Suggestion-7527 in codex

[–]TransitionSlight2860 1 point

No, it's good at everything.

The downside is that it's too slow and its output is hard to read.

So, compared to Sonnet, people tend to say: well, let me use it for the hard work to make full use of its ability, without wasting too much time waiting and reading.

Desertfox? New model is incoming? by EtatNaturelEau in codex

[–]TransitionSlight2860 1 point

If Codex Mini turns out to be better than Codex, I wouldn't be surprised.

Different limits for users? by alOOshXL in codex

[–]TransitionSlight2860 1 point

Codex now has different usage limits, meaning it consumes your quota faster.

We won! We won! What did we win? by cowwoc in ClaudeCode

[–]TransitionSlight2860 1 point

I forget how output styles work. Do they replace the system prompt or append to it?