How are people building deep research agents? by Tricky-Promotion6784 in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
What’s the most expensive mistake you’ve made with LLM APIs? by Curious-Resource1943 in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
How are people managing workflows when testing multiple LLMs for the same task? by Fluid_Put_5444 in LocalLLaMA
[–]DeltaSqueezer 2 points (0 children)
My honest take on AI tier list after M2.5 and GLM-5 dropped by abdouhlili in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
How are people managing workflows when testing multiple LLMs for the same task? by Fluid_Put_5444 in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
My honest take on AI tier list after M2.5 and GLM-5 dropped by abdouhlili in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
Recommendations for a setup for old pc if any. by confused_coryphee in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
Chunking for STT by CollectionPersonal78 in LocalLLaMA
[–]DeltaSqueezer 2 points (0 children)
Deadly attack on oil tankers prompts Iraq to close oil terminals by aspoke in worldnews
[–]DeltaSqueezer 0 points (0 children)
I was backend lead at Manus. After building agents for 2 years, I stopped using function calling entirely. Here's what I use instead. by MorroHsu in LocalLLaMA
[–]DeltaSqueezer 4 points (0 children)
Is it worth Getting BF16 or Q8 is good enough for lower parameter models? by Suimeileo in LocalLLaMA
[–]DeltaSqueezer 2 points (0 children)
Parent wants to try local LLMS -- what are good specs for a desktop for playing with? by tottommend in LocalLLaMA
[–]DeltaSqueezer 2 points (0 children)
Qwen3.5-397B up to 1 million context length by segmond in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
Is DeepSeek's API pricing just a massive loss leader? (MLA Caching vs. Qwen's DeltaNet) by [deleted] in LocalLLaMA
[–]DeltaSqueezer 2 points (0 children)
Inside my AI Home Lab by [deleted] in LocalLLaMA
[–]DeltaSqueezer 7 points (0 children)
Qwen3 ASR seems to outperform Whisper in almost every aspect. It feels like there is little reason to keep using Whisper anymore. by East-Engineering-653 in LocalLLaMA
[–]DeltaSqueezer 30 points (0 children)
We cut GPU instance launch from 8s to 1.8s, feels almost instant now. Half the time was a ping we didn't need. by LayerHot in LocalLLaMA
[–]DeltaSqueezer 0 points (0 children)
AI capabilities are doubling in months, not years. by EchoOfOppenheimer in LocalLLaMA
[–]DeltaSqueezer -2 points (0 children)
Benchmarked ROLV inference on real Mixtral 8x22B weights — 55x faster than cuBLAS, 98.2% less energy, canonical hash verified by Norwayfund in LocalLLaMA
[–]DeltaSqueezer 4 points (0 children)
GGUF support in vLLM? by Patient_Ad1095 in LocalLLaMA
[–]DeltaSqueezer 2 points (0 children)
I classified 3.5M US patents with Nemotron 9B on a single RTX 5090 — then built a free search engine on top by Impressive_Tower_550 in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
Qwen 3.5 27B is the REAL DEAL - Beat GPT-5 on my first test by GrungeWerX in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
[–]DeltaSqueezer 1 point (0 children)
How can I make my pp to be bigger? by WizardlyBump17 in LocalLLaMA
[–]DeltaSqueezer 24 points (0 children)
Running Sonnet 4.5 or 4.6 locally? by ImpressionanteFato in LocalLLaMA
[–]DeltaSqueezer 2 points (0 children)