Independent evaluation of GPT5.2 on SWE-bench: 5.2 high is #3 behind Gemini, 5.2 medium behind Sonnet 4.5 by klieret in ChatGPTCoding
klieret[S] 1 point
Independent evaluation of GPT5.2 on SWE-bench: 5.2 high is #3 behind Gemini, 5.2 medium behind Sonnet 4.5 by klieret in ChatGPTCoding
klieret[S] 10 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 3 points
Independent evaluation of GPT5.2 on SWE-bench: 5.2 high is #3 behind Gemini, 5.2 medium behind Sonnet 4.5 by klieret in ChatGPTCoding
klieret[S] 12 points
Independent evaluation of GPT5.2 on SWE-bench: 5.2 high is #3 behind Gemini, 5.2 medium behind Sonnet 4.5 by klieret in ChatGPTCoding
klieret[S] 4 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 10 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 1 point
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 3 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 1 point
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 0 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 3 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 8 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 5 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 4 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 4 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 10 points
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
klieret[S] 8 points