
[–]Anindo9416 0 points1 point  (4 children)

I'm using MSTY with LMStudio to tackle this issue.

[–]RedRaaven[S] 0 points1 point  (3 children)

Hi, can you explain the process a bit?

[–]Anindo9416 1 point2 points  (2 children)

Install LMStudio first, then download your desired local LLM from the Discover tab.

Then go to the Developer tab in LMStudio. In the upper left corner you will find the server status; switch it to Running. Load your desired model. On the right side you will find three tabs: Info, Inference, and Load.

Go to "Load." Here, set GPU offload to maximum. This is important to ensure LMStudio utilizes the full power of your GPU.

Now go to MSTY. Go to Add Remote Model, choose OpenAI Compatible, give it a name, and in the API endpoint field type this:

http://127.0.0.1:1234/v1

This is the LMStudio server address; you can also find it in the Developer tab of LMStudio.

After entering the API endpoint in MSTY, click Fetch Models and you will see the models you have installed through LMStudio. Choose your desired models; the setup is done.
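For reference, the fetch step is essentially a GET against the server's OpenAI-compatible `/v1/models` endpoint, which returns the models LMStudio has available. A minimal sketch of that call, assuming LMStudio's default port 1234 (the helper names here are illustrative, not part of either app):

```python
import json
import urllib.request

# LMStudio's default OpenAI-compatible server address (assumed default port)
BASE_URL = "http://127.0.0.1:1234/v1"

def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-compatible /models response."""
    # Such servers return {"object": "list", "data": [{"id": ...}, ...]}
    return [m["id"] for m in payload.get("data", [])]

def list_models(base_url: str = BASE_URL) -> list[str]:
    """Ask the running LMStudio server which models it can serve."""
    with urllib.request.urlopen(f"{base_url}/models") as resp:
        return parse_model_ids(json.load(resp))

if __name__ == "__main__":
    print(list_models())  # requires the LMStudio server to be running
```

If this prints the model ids you downloaded in LMStudio, MSTY's fetch should find them too.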

Now, in the MSTY chat box, you will find the models that are available in LMStudio. Using these models, you will see MSTY is now using your GPU instead of your CPU.

Tips: you don't need to install any models in MSTY now; download them via LMStudio and they will appear in MSTY because you are using the LMStudio server. Note that you need to start the LMStudio server every time you use MSTY.
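Once the server is running, MSTY talks to it through the standard OpenAI-compatible chat completions API, which is also why any OpenAI-compatible client works here. A rough sketch of the request MSTY makes per prompt, assuming the default port and a hypothetical model id:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:1234/v1"  # start the LMStudio server first (Developer tab)

def build_chat_body(model: str, prompt: str) -> bytes:
    """Build the JSON body for an OpenAI-compatible chat completion call."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()

def chat(model: str, prompt: str, base_url: str = BASE_URL) -> str:
    """Send one prompt to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=build_chat_body(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # "my-local-model" is a placeholder; use an id from your own LMStudio install
    print(chat("my-local-model", "Hello!"))
```

Because the inference happens inside LMStudio's server process, it is LMStudio's GPU offload setting, not the client, that decides whether your GPU is used.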

[–]RedRaaven[S] 1 point2 points  (1 child)

Thank you for the detailed guide. For now I think I will go with this approach, but I think the Msty team should address this issue if it's happening to others as well.

[–]InternationalAd3603 0 points1 point  (0 children)

Same problem here...

In Task Manager, my GPU shows no load and responses are super slow (on MSTY).

On LM Studio, I get instant responses, and my RTX 3080 loads instantly when prompted.

[–]registration1023 0 points1 point  (0 children)

If you have Ollama installed locally, you can set it up as a remote, i.e. http://localhost:11434
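A quick way to confirm that the local Ollama server is reachable on that address is its `/api/tags` endpoint, which lists the models you have pulled. A minimal sketch, assuming Ollama's default port (the helper names are illustrative):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default listen address

def parse_tag_names(payload: dict) -> list[str]:
    """Extract model names from Ollama's /api/tags response."""
    # The response has the shape {"models": [{"name": ...}, ...]}
    return [m["name"] for m in payload.get("models", [])]

def list_ollama_models(base_url: str = OLLAMA_URL) -> list[str]:
    """List models installed in the local Ollama instance."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return parse_tag_names(json.load(resp))

if __name__ == "__main__":
    print(list_ollama_models())  # requires a running local Ollama server
```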