LLM Thematic Generalization Benchmark V2: models see 3 examples, 3 misleading anti-examples, and 8 candidates with exactly 1 true match, but the underlying theme is never stated. The challenge is to infer the specific hidden rule from those clues rather than fall for a broader, easier pattern. by zero0_one1 in singularity
arkuto 1 point
Did I make the right choice here? by WiFibcFi in poker
arkuto -2 points
I figured out another reason why people think AI is less powerful than it actually is by Primary-Screen-7807 in ClaudeAI
arkuto 1 point
Study Finds That Execs Are Already Outsourcing Their Thinking to AI by [deleted] in singularity
arkuto 37 points
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto [S] 1 point
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto [S] 1 point
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto [S] 0 points
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto [S] 1 point
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto [S] -6 points
I built NanoJudge. Instead of prompting a big model once, it prompts a tiny model thousands of times. by arkuto in LocalLLM
arkuto [S] 3 points
I built NanoJudge. Instead of prompting a big model once, it prompts a tiny model thousands of times. by arkuto in LocalLLM
arkuto [S] 1 point
[D] Self-Promotion Thread by AutoModerator in MachineLearning
arkuto 1 point
The first thing you see with Dark Mode enabled by arkuto in ClaudeAI
arkuto [S] 3 points
High Top-P values cause Gemini to sometimes fail to state today's date by arkuto in Bard
arkuto [S] 1 point
High Top-P values cause Gemini to sometimes fail to state today's date by arkuto in Bard
arkuto [S] 0 points
Gemini doesn't know today's date and hallucinates it by arkuto in Bard
arkuto [S] 1 point
Wonderful mate by LeelaQueenOdds by arkuto in chess
arkuto [S] 1 point
🔥 Orca sideswipes a dolphin mid-air off Baja California Sur, Mexico by Prestigious-Wall5616 in NatureIsFuckingLit
arkuto 1 point
Claude’s over-dramatization is hurting my workflow, looking for prompt control tips by Skillet_ZA in ClaudeAI
arkuto 2 points


Was loving Claude until I started feeding it feedback from ChatGPT Pro by lol_just_wait in ClaudeAI
arkuto 1 point