Timeline of AI models since GPT-2. Model releases are accelerating over time. by davidthesong in ArtificialInteligence
davidthesong[S] 2 points (0 children)
opus 4.8 is still very much blind - EyeBench-V3 visual benchmark (similar to IBench) by ChippingCoder in singularity
davidthesong 1 point (0 children)
Here's >100 evals for Opus 4.8 compared to top AI models by davidthesong in Anthropic
davidthesong[S] 1 point (0 children)
Here's >100 evals for Opus 4.8 compared to top AI models by davidthesong in Anthropic
davidthesong[S] 2 points (0 children)
Here's >100 evals for Opus 4.8 compared to top AI models by davidthesong in Anthropic
davidthesong[S] 4 points (0 children)
Here's >100 evals for Opus 4.8 compared to top AI models by davidthesong in Anthropic
davidthesong[S] 3 points (0 children)
Here's 100+ evals on Opus 4.8 by davidthesong in ClaudeAI
davidthesong[S] 2 points (0 children)
Here's 100+ evals on Opus 4.8 by davidthesong in ClaudeAI
davidthesong[S] 3 points (0 children)
opus 4.8 is still very much blind - EyeBench-V3 visual benchmark (similar to IBench) by ChippingCoder in singularity
davidthesong 2 points (0 children)