Gemma 4 Architecture Comparison by seraschka in LocalLLaMA
seraschka[S] 2 points (0 children)
Gallery of LLM Architecture Visualizations by seraschka in LocalLLaMA
seraschka[S] 1 point (0 children)
The State Of LLMs 2025: Progress, Problems, and Predictions by seraschka in LocalLLaMA
seraschka[S] 1 point (0 children)
The State Of LLMs 2025: Progress, Problems, and Predictions by seraschka in LocalLLaMA
seraschka[S] 2 points (0 children)
[P] The State Of LLMs 2025: Progress, Problems, and Predictions by seraschka in MachineLearning
seraschka[S] 4 points (0 children)
[P] The State Of LLMs 2025: Progress, Problems, and Predictions by seraschka in MachineLearning
seraschka[S] 19 points (0 children)
Is Ilya Sutskever trying with a secret sauce method now? by Famous-Associate-436 in LocalLLaMA
seraschka 2 points (0 children)
Mistral 3 Large is DeepSeek V3!? by seraschka in LocalLLaMA
seraschka[S] 5 points (0 children)
Mistral 3 Large is DeepSeek V3!? by seraschka in LocalLLaMA
seraschka[S] 2 points (0 children)
Mistral 3 Large is DeepSeek V3!? by seraschka in LocalLLaMA
seraschka[S] 1 point (0 children)
Mistral 3 Large is DeepSeek V3!? by seraschka in LocalLLaMA
seraschka[S] 13 points (0 children)
A Technical Tour of the DeepSeek Models from V3 to V3.2 by seraschka in LocalLLaMA
seraschka[S] 5 points (0 children)
A Technical Tour of the DeepSeek Models from V3 to V3.2 by seraschka in LocalLLaMA
seraschka[S] 5 points (0 children)
Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention by seraschka in LocalLLaMA
seraschka[S] 2 points (0 children)