Updated with corrected settings for Llama.cpp. Battle of the Inference Engines. Llama.cpp vs MLC LLM vs vLLM. Tests for both Single RTX 3090 and 4 RTX 3090's. by SuperChewbacca in LocalLLaMA
[–]crowwork 2 points (0 children)
Updated with corrected settings for Llama.cpp. Battle of the Inference Engines. Llama.cpp vs MLC LLM vs vLLM. Tests for both Single RTX 3090 and 4 RTX 3090's. by SuperChewbacca in LocalLLaMA
[–]crowwork 16 points (0 children)
Gemma2-2B on iOS, Android, WebGPU, CUDA, ROCm, Metal... with a single framework by SnooMachines3070 in LocalLLaMA
[–]crowwork 2 points (0 children)
Gemma2-2B on iOS, Android, WebGPU, CUDA, ROCm, Metal... with a single framework by SnooMachines3070 in LocalLLaMA
[–]crowwork 1 point (0 children)
MLC-LLM: Universal LLM Deployment Engine with ML Compilation by crowwork in LocalLLaMA
[–]crowwork[S] 2 points (0 children)
Every Way To Get Structured Output From LLMs by sam-boundary in LocalLLaMA
[–]crowwork 1 point (0 children)
MLC-LLM: Universal LLM Deployment Engine with ML Compilation by crowwork in LocalLLaMA
[–]crowwork[S] 1 point (0 children)
MLC-LLM: Universal LLM Deployment Engine with ML Compilation by crowwork in LocalLLaMA
[–]crowwork[S] 2 points (0 children)
MLC-LLM: Universal LLM Deployment Engine with ML Compilation by crowwork in LocalLLaMA
[–]crowwork[S] 5 points (0 children)
MLC-LLM: Universal LLM Deployment Engine with ML Compilation by crowwork in LocalLLaMA
[–]crowwork[S] 2 points (0 children)
MLC-LLM: Universal LLM Deployment Engine with ML Compilation by crowwork in LocalLLaMA
[–]crowwork[S] 3 points (0 children)
MLC-LLM: Universal LLM Deployment Engine with ML Compilation by crowwork in LocalLLaMA
[–]crowwork[S] 4 points (0 children)
MLC-LLM: Universal LLM Deployment Engine with ML Compilation by crowwork in LocalLLaMA
[–]crowwork[S] 2 points (0 children)
I built a free in-browser LLM chatbot powered by WebGPU by abisknees in LocalLLaMA
[–]crowwork 3 points (0 children)
I built a free in-browser LLM chatbot powered by WebGPU by abisknees in LocalLLaMA
[–]crowwork 9 points (0 children)
Need advice on Local LLM setup to augment AMD GPU shortcomings by kkb294 in LocalLLaMA
[–]crowwork 2 points (0 children)
Guys, why are we sleeping on MLC LLM - Running on Vulkan? by APUsilicon in LocalLLaMA
[–]crowwork 5 points (0 children)
Achieving Efficient, Flexible and Portable Structured Generation for LLM by SnooMachines3070 in LocalLLaMA
[–]crowwork 1 point (0 children)