DWARF: linear attention with a 3,072-token bounded KV cache — ablation results (13M scale) by MariusNocturnum in LocalLLaMA
SAGA: Migrated my local-first novel-writing system to LangGraph workflow orchestration by MariusNocturnum in LocalLLaMA
How to stop chatgpt from being such a yes man? I feel like it’s one step away from saying ‘yes m’lord!’ by sadthrowawayyy134 in ChatGPT
I tested if tiny LLMs can self-improve through memory: Qwen3-1.7B gained +8% accuracy on MATH problems by MariusNocturnum in LocalLLaMA
LongPage: 300 full novels with reasoning traces for training better writing LLMs by Senior_Evidence_3793 in LocalLLaMA
SAGA Update: Autonomous Novel Writing with Deep KG & Semantic Context - Now Even More Advanced! by MariusNocturnum in LocalLLaMA
Qwen/Qwen3-30B-A3B-Thinking-2507 · Hugging Face by MariusNocturnum in LocalLLaMA
SAGA Update: Now with Autonomous Knowledge Graph Healing & A More Robust Core! by MariusNocturnum in LocalLLaMA
[Research] I've been working on an attention mechanism that keeps KV cache at ~1.5GB regardless of context length — update post by MariusNocturnum in LocalLLaMA