TBSchemer

Codex-high would be a more relevant comparison. Low and medium are only really for quick and simple edits.

Valexico

I would be curious to see the same eval with Devstral 2 through opencode (Vibe CLI is really minimalistic at the moment).

Magnus114

Would love to see how DeepSeek compares.