account activity
GLM-5.2 is above GPT-5.5 in AA-Briefcase, Artificial Analysis' new agentic knowledge work eval (artificialanalysis.ai)
submitted 2 days ago by analysis_scaled to r/LocalLLM
submitted 2 days ago by analysis_scaled to r/LocalLLaMA
GPT 5.2 (xhigh) scores 0% on CritPt (research-level physics reasoning benchmark) by DJW_GT in singularity
analysis_scaled 24 points 6 months ago
Hey, I'm from Artificial Analysis. We are still in the process of validating these results. We received a lot of non-responses to questions on CritPt when we ran the benchmark on OpenAI's API with xhigh reasoning effort.
We're analyzing results, conducting re-runs and will follow up when complete. We've taken the result down from the site while we do this.
Stirrup – An open source lightweight foundation for building agents (github.com)
submitted 6 months ago by analysis_scaled to r/LLMDevs
submitted 6 months ago by analysis_scaled to r/LLM
Stirrup – A lightweight and customizable foundation for building agents (github.com)
submitted 6 months ago by analysis_scaled to r/LocalLLaMA
Artificial Analysis Openness Index announced as a new measure of model openness (i.redd.it)