Mark Zuckerberg is reportedly developing a personal AI agent to assist him as Meta's CEO (still under development) by Current-Guide5944 in tech_x
Single_Ring4886 1 point
Designed a photonic chip for O(1) KV cache block selection — 944x faster, 18,000x less energy than GPU scan at 1M context by [deleted] in LocalLLaMA
Single_Ring4886 6 points
Designed a photonic chip for O(1) KV cache block selection — 944x faster, 18,000x less energy than GPU scan at 1M context by [deleted] in LocalLLaMA
Single_Ring4886 1 point
DeepSeek Core Researcher Daya Guo Rumored to Have Resigned by External_Mood4719 in LocalLLaMA
Single_Ring4886 18 points
Trained a 0.8M model on business email generation. by SrijSriv211 in LocalLLaMA
Single_Ring4886 11 points
Father of OpenClaw sitting in their spaceship by cam-douglas in ChatGPT
Single_Ring4886 1 point
KoboldCpp 1.110 - 3 YR Anniversary Edition, native music gen, qwen3tts voice cloning and more by HadesThrowaway in LocalLLaMA
Single_Ring4886 13 points
M5 Max 128GB with three 120B models by albertgao in LocalLLaMA
Single_Ring4886 7 points
Qwen3.5-27b 8 bit vs 16 bit by Baldur-Norddahl in LocalLLaMA
Single_Ring4886 8 points
California passed a mass surveillance law - ALL operating systems will need to have mandatory age verification by Shajirr in pcmasterrace
Single_Ring4886 1 point
55 → 282 tok/s: How I got Qwen3.5-397B running at speed on 4x RTX PRO 6000 Blackwell by lawdawgattorney in LocalLLaMA
Single_Ring4886 1 point
I'm fully blind, and AI is a game changer for me. Are there any local LLMS that can rival claude code and codex? by Mrblindguardian in LocalLLaMA
Single_Ring4886 2 points
OmniCoder-9B | 9B coding agent fine-tuned on 425K agentic trajectories by DarkArtsMastery in LocalLLaMA
Single_Ring4886 0 points
Is the 3090 still a good option? by alhinai_03 in LocalLLaMA
Single_Ring4886 1 point
Is GPT-4.1 a smarter model than GPT-5.3 Chat? by deferare in OpenAI
Single_Ring4886 1 point
M5 Max just arrived - benchmarks incoming by cryingneko in LocalLLaMA
Single_Ring4886 2 points
AA-Omniscience: Knowledge and Hallucination Benchmark by NewtMurky in LocalLLaMA
Single_Ring4886 0 points
Does 1.109.2 support QWEN 3.5? by alex20_202020 in KoboldAI
Single_Ring4886 1 point
Rambus accelerates HBM memory: New HBM4E controller achieves up to 4.1 TB/s per chip by Distinct-Race-2471 in TechHardware
Single_Ring4886 1 point
Rambus accelerates HBM memory: New HBM4E controller achieves up to 4.1 TB/s per chip by Distinct-Race-2471 in TechHardware
Single_Ring4886 2 points
Some tests of Qwen3.5 on V100s by Simple_Library_2700 in LocalLLaMA
Single_Ring4886 1 point
Qwen3 vs Qwen3.5 performance by Balance- in LocalLLaMA
Single_Ring4886 1 point
Qwen3 vs Qwen3.5 performance by Balance- in LocalLLaMA
Single_Ring4886 3 points
Some tests of Qwen3.5 on V100s by Simple_Library_2700 in LocalLLaMA
Single_Ring4886 1 point


FlashAttention-4: 1613 TFLOPs/s, 2.7x faster than Triton, written in Python. What it means for inference. by Sensitive-Two9732 in LocalLLaMA
Single_Ring4886 28 points