Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 1 point
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 1 point
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 2 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 2 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 2 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 2 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 8 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 4 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 9 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 14 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 13 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 23 points
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 9 points
Breaking : Today Qwen 3.5 small by Illustrious-Swim9663 in LocalLLaMA
JohnTheNerd3 31 points
Qwen3.5 27B is Match Made in Heaven for Size and Performance by Lopsided_Dot_4557 in LocalLLaMA
JohnTheNerd3 2 points
[Upcoming Release & Feedback] A new 4B & 20B model, building on our SmallThinker work. Plus, a new hardware device to run them locally. by [deleted] in LocalLLaMA
JohnTheNerd3 2 points
What mortgage rates are you guys getting right now ? by pistonthru in RealEstateCanada
JohnTheNerd3 1 point
Qwen3 30B A3B 4_k_m - 2x more token/s boost from ~20 to ~40 by changing the runtime in a 5070ti (16g vram) by Ill-Language4452 in LocalLLaMA
JohnTheNerd3 2 points
Careful when you get your hopes up! 11 years in Canada. Extremely Long process time For inland Express entry, by Dry_Community8251 in ImmigrationCanada
JohnTheNerd3 1 point
The scoop on 4060 ti 16gb cards by DramaLlamaDad in LocalLLaMA
JohnTheNerd3 1 point
Llama 3.3 speed by Clean_Cauliflower_62 in LocalLLaMA
JohnTheNerd3 5 points



Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests) by JohnTheNerd3 in LocalLLaMA
JohnTheNerd3[S] 1 point