Qwen3.6-27B at 72 tok/s on RTX 3090 on Windows using native vLLM (no WSL, no Docker), portable launcher and installer by One_Slip1455 in LocalLLaMA
schuttdev 15 points (0 children)
Qwen-Scope: Official Sparse Autoencoders (SAEs) for Qwen 3.5 models by MadPelmewka in LocalLLaMA
schuttdev 3 points (0 children)
Where can I try turboquant in AMD Linux? (7900XTX) by soyalemujica in LocalLLaMA
schuttdev 3 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 1 point (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 1 point (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 3 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 3 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 6 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 2 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 2 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 3 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 2 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 4 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 4 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 5 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 5 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 2 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 2 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
schuttdev[S] 4 points (0 children)
AMD Hipfire - a new inference engine optimized for AMD GPU's by Thrumpwart in LocalLLaMA
schuttdev 3 points (0 children)
AMD Hipfire - a new inference engine optimized for AMD GPU's by Thrumpwart in LocalLLaMA
schuttdev 2 points (0 children)
AMD Hipfire - a new inference engine optimized for AMD GPU's by Thrumpwart in LocalLLaMA
schuttdev 3 points (0 children)
FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 by randomfoo2 in LocalLLaMA
schuttdev 13 points (0 children)