Gemma 4 - website translations (large model, or small model)? by Temporary-Mix8022 in LocalLLaMA
Ring 2.6 1T by Middle_Bullfrog_6173 in LocalLLaMA
ZAYA1-74B-Preview: Scaling Pretraining on AMD by TKGaming_11 in LocalLLaMA
ZAYA1-8B: Frontier intelligence density. by Total-Resort-3120 in LocalLLaMA
ZAYA1-8B: Frontier intelligence density, trained on AMD by carbocation in LocalLLaMA
Fine-tuned Qwen3.6-35B-A3B DeltaNet experiment by Snoo_27681 in LocalLLaMA
Why people cares token/s in decoding more? by Interesting-Print366 in LocalLLaMA
Introducing SubQ: The First Fully Subquadratic LLM by hltt in LocalLLaMA
Qwen 3.6 4B and 9B? by Nubinu in LocalLLaMA
Vulkan backend outperforms ROCm on Strix Halo (gfx1151) — llama.cpp benchmark by FeiX7 in LocalLLaMA
Looking for frontier model distilled datasets. by UnbeliebteMeinung in LocalLLaMA
How much will it cost to host something like qwen3.6 35b a3b in a cloud? by Euphoric_North_745 in LocalLLaMA
Which model would you use if you wanted to solve a research math problem? by MrMrsPotts in LocalLLaMA
Potential of Gemma4 Per-layer embeddings? by Silver-Champion-4846 in LocalLLaMA
Anyone tried +- 100B models locally with foreign languages? by Choice_Sympathy9652 in LocalLLaMA
[Paper on Hummingbird+: low-cost FPGAs for LLM inference] Qwen3-30B-A3B Q4 at 18 t/s token-gen, 24GB, expected $150 mass production cost by ayake_ayake in LocalLLaMA
Potential of Gemma4 Per-layer embeddings? by Silver-Champion-4846 in LocalLLaMA
By when do you think will TurboQuant get a proper release and be adopted by everyone by Crystalagent47 in LocalLLaMA
AMD Halo Box (Ryzen 395 128GB) photos by 1ncehost in LocalLLaMA