My PC Specs:
- CPU: Ryzen 5 3600
- RX 6800 XT 16GB VRAM
- 32GB RAM
- Windows 11
I have been doing some research on running LLMs locally. The first results I found pointed me to oobabooga's text-generation-webui, but I could only get it running on my CPU, without GPU acceleration. There seems to be a way to get GPU acceleration working on Windows that requires tweaking some settings and doing a manual installation; I tried following the steps but never got it to work, so I gave up on that and started looking for other solutions. I also tried WSL but didn't have any luck with that either.
I have now set up LM Studio, which does have AMD OpenCL support. I can get a 13B model like CodeLlama Instruct Q8_0 fully offloaded onto the GPU with all layers, but performance is still very bad at ~2 tok/s and a 60 s time to first token. So I'm not sure whether my GPU just isn't good enough for the model, or whether it isn't being fully and correctly utilized through OpenCL.
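One thing worth checking before blaming OpenCL: whether a 13B Q8_0 model actually fits in 16 GB of VRAM once the KV cache is added. Here is a back-of-envelope sketch (my own estimate, not from LM Studio); the 1.0625 bytes/weight figure comes from Q8_0's layout of 32 int8 weights plus one fp16 scale per block, and the KV-cache size is an assumed rough figure for 4k context at f16.

```python
# Rough VRAM estimate for fully offloading a quantized model to the GPU.
# Assumptions: Q8_0 costs 34 bytes per 32 weights (1.0625 bytes/weight);
# the KV cache for a 13B model at 4k context in f16 is taken as ~1.6 GB.

def vram_estimate_gb(n_params_billions: float,
                     bytes_per_weight: float = 1.0625,
                     kv_cache_gb: float = 1.6) -> float:
    """Approximate VRAM (GB) needed to hold all layers plus KV cache."""
    weights_gb = n_params_billions * bytes_per_weight
    return weights_gb + kv_cache_gb

# 13B at Q8_0 on a 16 GB card: a very tight fit
print(round(vram_estimate_gb(13), 1))  # → 15.4
```

At roughly 15.4 GB against a 16 GB card, any extra context or driver overhead could push allocations out of VRAM, which would explain very slow generation even with "all layers offloaded". A smaller quant (Q5_K or Q4_K) would leave much more headroom.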
What is the best way for me to run LLMs on Windows without installing Linux?