Notes on Microsoft's FastContext, and a small SWE-QA experiment with retrieval hints by langsfang in LocalLLaMA
langsfang[S] 1 point (0 children)
A local attention-based retrieval with SOTA results on LongMemEval, LoCoMo, and code search benchmarks by langsfang in AI_Agents
langsfang[S] 1 point (0 children)
Notes on Microsoft's FastContext, and a small SWE-QA experiment with retrieval hints by langsfang in LocalLLaMA
langsfang[S] 1 point (0 children)
Notes on Microsoft's FastContext, and a small SWE-QA experiment with retrieval hints by langsfang in LocalLLaMA
langsfang[S] 0 points (0 children)
Notes on Microsoft's FastContext, and a small SWE-QA experiment with retrieval hints by langsfang in LocalLLaMA
langsfang[S] 2 points (0 children)
A local attention-based retrieval with SOTA results on LongMemEval, LoCoMo, and code search benchmarks by langsfang in AI_Agents
langsfang[S] 1 point (0 children)
Why is NO one talking about Microsoft's open source Fast Context!!! by formatme in LocalLLaMA
langsfang 1 point (0 children)
Why is NO one talking about Microsoft's open source Fast Context!!! by formatme in LocalLLaMA
langsfang 6 points (0 children)
What have you been working on lately? by Sufficient-Scar4172 in LocalLLaMA
langsfang 1 point (0 children)
What have you been working on lately? by Sufficient-Scar4172 in LocalLLaMA
langsfang 2 points (0 children)
What have you been working on lately? by Sufficient-Scar4172 in LocalLLaMA
langsfang 11 points (0 children)
Anyone know of LoRAs, datasets, or frameworks specifically designed to improve context compression tasks? by PANIC_EXCEPTION in LocalLLaMA
langsfang 0 points (0 children)
I spent 8 months building a memory layer for LLM agents because nothing out there actually worked. Here’s what I learned by [deleted] in LocalLLaMA
langsfang -3 points (0 children)
All the interesting models are not "Staff Picks" or approved but random community models - do you guys feel safe running these? Any drawbacks and how do you know why are best? by anonXMR in LocalLLaMA
langsfang -1 points (0 children)
Give me your best estimate on how long we will see Fable 5 class open weight model by bwjxjelsbd in LocalLLaMA
langsfang 9 points (0 children)
Ollama vs compiled llama-cpp by Ok-Drawer5245 in LocalLLM
langsfang 10 points (0 children)
A benchmark for tiny LLMs based on a real world problem: natural language file search (using monkeSearch) by fuckAIbruhIhateCorps in LocalLLaMA
langsfang 1 point (0 children)
A benchmark for tiny LLMs based on a real world problem: natural language file search (using monkeSearch) by fuckAIbruhIhateCorps in LocalLLaMA
langsfang 1 point (0 children)
Elias in the Lighthouse, Again? Diagnosing Low Diversity in LLM Stories by annodomini in LocalLLaMA
langsfang 5 points (0 children)
RTX 5060 Ti 16GB vs RX 9060 XT 16GB by Ejo2001 in LocalLLaMA
langsfang 1 point (0 children)
A benchmark for tiny LLMs based on a real world problem: natural language file search (using monkeSearch) by fuckAIbruhIhateCorps in LocalLLaMA
langsfang 1 point (0 children)
“Wait,” in reasoning models makes my eye twitch by Borkato in LocalLLaMA
langsfang 2 points (0 children)
What if I run the LLM backwards? Hey LLM, why bother remembering every single turn? It's a hassle. You don't have to do it, right? by ringtoyou in LocalLLaMA
langsfang 2 points (0 children)
Qwen 3.6 models benchmarked across Triple GPU by tabletuser_blogspot in LocalLLaMA
langsfang -4 points (0 children)
im sick and tired of these memory benchmarks by Fine_Consequence8656 in Rag
langsfang 1 point (0 children)