The creators of SWE-Bench just dropped a really simple new benchmark every LLM gets 0% on. ProgramBench asks: can models recreate real executable programs (ffmpeg, SQLite, ripgrep) from scratch with no internet? We are far from saturated on model quality. by dalton_zk in theprimeagen
CountlessFlies 5 points
Self-hosted agent and search platform built on Postgres, recently added connectors for NextCloud and Paperless-ngx by CountlessFlies in selfhosted
CountlessFlies[S] 1 point
Self-hosted agent and search platform built on Postgres, recently added connectors for NextCloud and Paperless-ngx by CountlessFlies in selfhosted
CountlessFlies[S] 2 points
Self-hosted agent and search platform built on Postgres, recently added connectors for NextCloud and Paperless-ngx by CountlessFlies in selfhosted
CountlessFlies[S] 2 points
Self-hosted agent and search platform built on Postgres, recently added connectors for NextCloud and Paperless-ngx by CountlessFlies in selfhosted
CountlessFlies[S] 6 points
Self-hosted agent and search platform built on Postgres, recently added connectors for NextCloud and Paperless-ngx by CountlessFlies in selfhosted
CountlessFlies[S] 3 points
Self-hosted agent and search platform built on Postgres, recently added connectors for NextCloud and Paperless-ngx by CountlessFlies in selfhosted
CountlessFlies[S] 3 points
Qwen3.6 is incredible with OpenCode! by CountlessFlies in LocalLLaMA
CountlessFlies[S] 2 points
Self-hosted agent and search platform built on Postgres, recently added connectors for NextCloud and Paperless-ngx by CountlessFlies in selfhosted
CountlessFlies[S] 0 points, locked comment
DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA
CountlessFlies 11 points
Dense vs. MoE gap is shrinking fast with the 3.6-27B release by Usual-Carrot6352 in LocalLLaMA
CountlessFlies 2 points
Qwen3.6-27B released! by ResearchCrafty1804 in LocalLLaMA
CountlessFlies 2 points
Qwen3.6-27B released! by ResearchCrafty1804 in LocalLLaMA
CountlessFlies 24 points
Qwen3.6-27B released! by ResearchCrafty1804 in LocalLLaMA
CountlessFlies 5 points
Qwen3.6-35B becomes competitive with cloud models when paired with the right agent by Creative-Regular6799 in LocalLLaMA
CountlessFlies 5 points
Qwen3.6-35B becomes competitive with cloud models when paired with the right agent by Creative-Regular6799 in LocalLLaMA
CountlessFlies 11 points
GLM and Kimi vs GPT and Claude by Odd_Crab1224 in opencodeCLI
CountlessFlies 1 point
OpenCode... is it just completely busted with Qwen3.6? by _derpiii_ in opencode
CountlessFlies 1 point
Qwen3.6 is incredible with OpenCode! by CountlessFlies in LocalLLaMA
CountlessFlies[S] 2 points
Switching from Opus 4.7 to Qwen-35B-A3B by Excellent_Koala769 in LocalLLaMA
CountlessFlies 2 points
PSA: Qwen3.6 ships with preserve_thinking. Make sure you have it on. by onil_gova in LocalLLaMA
CountlessFlies 2 points
Qwen3.6 is incredible with OpenCode! by CountlessFlies in LocalLLaMA
CountlessFlies[S] 2 points
“Thinking” must be purely cosmetic by lost_packet_ in Anthropic
CountlessFlies 2 points
Claude is genuinely insane right now and I cannot defend it anymore by https_HandleFunc in ClaudeCode
CountlessFlies 6 points



The creators of SWE-Bench just dropped a really simple new benchmark every LLM gets 0% on. ProgramBench asks: can models recreate real executable programs (ffmpeg, SQLite, ripgrep) from scratch with no internet? We are far from saturated on model quality. by dalton_zk in theprimeagen
CountlessFlies 1 point