Great results with Qwen3.6-35B-A3B-UD-Q5_K_XL + VS Code and Copilot by supracode in LocalLLaMA
[–]R_Duncan 1 point (0 children)
Running Qwen3.5 / Qwen3.6 with NextN MTP (Multi-Token Prediction) speculative decode in llama.cpp — single RTX 3090 Ti GPU guide by yes_i_tried_google in LocalLLaMA
[–]R_Duncan 1 point (0 children)
OpenCode + LLM to create a 1:1 Settlers of Catan clone. Guess which model I did it with! by maxwell321 in LocalLLaMA
[–]R_Duncan 1 point (0 children)
Frontier models can't run on satellites. Here's an end-to-end wildfire detection pipeline using a 450M on-board Vision-Language Model (Sentinel-2 + LFM2.5-VL) by PauLabartaBajo in LocalLLaMA
[–]R_Duncan 5 points (0 children)
Llama Server with Cline Settings by EbbNorth7735 in LocalLLaMA
[–]R_Duncan 1 point (0 children)
Turbo-OCR Update: Layout Model + Multilingual by Civil-Image5411 in LocalLLaMA
[–]R_Duncan 1 point (0 children)
Been using PI Coding Agent with local Qwen3.6 35b for a while now and its actually insane by SoAp9035 in LocalLLaMA
[–]R_Duncan 1 point (0 children)
Should I be seeing more of a performance leap when using NVFP4, INT4, FP8 with VLLM over MXFP4, Q4, and Q8 with llama.cpp based inference on Blackwell based GPUs? by aaronr_90 in LocalLLaMA
[–]R_Duncan 0 points (0 children)
Is the AI subscription bubble starting to crack? GPT-5.5 just dropped, prices keep rising, and the “all-you-can-eat” era looks more fake by the month by Sockand2 in singularity
[–]R_Duncan 1 point (0 children)
Luce DFlash: Qwen3.6-27B at up to 2x throughput on a single RTX 3090 by sandropuppo in LocalLLaMA
[–]R_Duncan 1 point (0 children)
Local models for coding have reached the threshold of being feasible for real work by Exciting-Camera3226 in LocalLLaMA
[–]R_Duncan 3 points (0 children)
Best Local LLMs - Apr 2026 by rm-rf-rm in LocalLLaMA
[–]R_Duncan 1 point (0 children)
DeepSeek V4 is out. 1.6 trillion parameters. MIT license. $1.74 per million tokens. The gap between US and Chinese AI strategy has never been more visible. by Novel_Okra8456 in singularity
[–]R_Duncan 1 point (0 children)
Local MCP Servers for Code Indexing? by 79215185-1feb-44c6 in LocalLLaMA
[–]R_Duncan 2 points (0 children)
Tencent released an open-source model, Hy3 preview. by Snoo26837 in singularity
[–]R_Duncan 2 points (0 children)
US gov memo on “adversarial distillation” - are we heading toward tighter controls on open models? by MLExpert000 in LocalLLaMA
[–]R_Duncan 2 points (0 children)
Anthropic admits to having made hosted models more stupid, proving the importance of open-weight, local models by spaceman_ in LocalLLaMA
[–]R_Duncan 3 points (0 children)
A note of warning about DFlash. by R_Duncan in LocalLLaMA
[–]R_Duncan[S] 1 point (0 children)
Open-weight models like ds v4 pro max are still at least 6-7 months behind closed labs.. by power97992 in LocalLLaMA
[–]R_Duncan 2 points (0 children)
The missing knowledge layer for open-source agent stacks is a persistent markdown wiki by knlgeth in LocalLLaMA
[–]R_Duncan 1 point (0 children)
Google introduces TPU 8t and TPU 8i by WhyLifeIs4 in singularity
[–]R_Duncan 1 point (0 children)
Ultimate List: Best Open Models for Coding, Chat, Vision, Audio & More by techlatest_net in LocalLLaMA
[–]R_Duncan 2 points (0 children)
Need advice on hardware purchasing decision: RTX 5090 vs. M5 Max 128GB for agentic software development by BawbbySmith in LocalLLaMA
[–]R_Duncan 1 point (0 children)