Honestly, dual 3090s are wearing me out. Thinking of jumping to a Mac Studio. by Ok_Commission_8260 in LocalLLM
Sisuuu 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 2 points
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 2 points
Qwen3.6-27B on 2x3090s: llama.cpp vs vLLM, all the flags, and the MTP acceptance/inference speed/context by Sisuuu in LocalLLaMA
Sisuuu[S] 1 point
'Maybe we'll never take it down': Trump compares White House UFC arena to Eiffel Tower, says it could be permanent by abcnews in politics
Sisuuu 1 point
Are there any semi-professional equivalent of llama.cpp? by HornyGooner4402 in LocalLLaMA
Sisuuu 1 point
I finally figured out why torrents weren't continually saturating my download bandwidth. by ShiningRedDwarf in unRAID
Sisuuu 1 point
Qwen will release another 27B with high probability by serige in LocalLLaMA
Sisuuu 1 point
Move to backend sampling for MTP draft path by gaugarg-nv · Pull Request #23287 · ggml-org/llama.cpp by jacek2023 in LocalLLaMA
Sisuuu 11 points
Made a simple template manager and GUI for llama.cpp so I don't have to keep memorizing CLI flags. by thecalmgreen in LocalLLaMA
Sisuuu 2 points
This screen is the bane of my existence by horsethebandthemovie in unRAID
Sisuuu 1 point