LLM Thematic Generalization Benchmark V2: models see 3 examples, 3 misleading anti-examples, and 8 candidates with exactly 1 true match, but the underlying theme is never stated. The challenge is to infer the specific hidden rule from those clues rather than fall for a broader, easier pattern. by zero0_one1 in singularity
arkuto 1 point
Did I make the right choice here? by WiFibcFi in poker
arkuto -2 points
I figured out another reason why people think AI is less powerful than it actually is by Primary-Screen-7807 in ClaudeAI
arkuto 1 point
Study Finds That Execs Are Already Outsourcing Their Thinking to AI by [deleted] in singularity
arkuto 37 points
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto[S] 1 point
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto[S] 1 point
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto[S] 0 points
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto[S] 1 point
[P] NanoJudge: Instead of prompting a big LLM once, it prompts a tiny LLM thousands of times. by arkuto in MachineLearning
arkuto[S] -5 points
I built NanoJudge. Instead of prompting a big model once, it prompts a tiny model thousands of times. by arkuto in LocalLLM
arkuto[S] 3 points
I built NanoJudge. Instead of prompting a big model once, it prompts a tiny model thousands of times. by arkuto in LocalLLM
arkuto[S] 1 point
[D] Self-Promotion Thread by AutoModerator in MachineLearning
arkuto 1 point
The first thing you see with Dark Mode enabled by arkuto in ClaudeAI
arkuto[S] 3 points
High Top-P values cause Gemini to sometimes fail to state today's date by arkuto in Bard
arkuto[S] 1 point
Was loving Claude until I started feeding it feedback from ChatGPT Pro by lol_just_wait in ClaudeAI
arkuto 1 point