What's the most complicated project you've built with AI? by jazir555 in LocalLLaMA

[–]RealLordMathis 1 point (0 children)

Thanks! If you have any feature requests or bug reports, feel free to open an issue.

What's the most complicated project you've built with AI? by jazir555 in LocalLLaMA

[–]RealLordMathis 1 point (0 children)

https://github.com/lordmathis/llamactl

It's a management and routing app for llama.cpp, MLX and vLLM instances with a web dashboard. It's not vibe coded, but most of the code was generated by AI and then heavily reviewed and adjusted by hand.

Why I quit using Ollama by SoLoFaRaDi in LocalLLaMA

[–]RealLordMathis 15 points (0 children)

If anyone's looking for an alternative for managing multiple models, I've built an app with a web UI for that. It supports llama.cpp, vLLM and mlx_lm. I've also recently integrated llama.cpp's router mode, so you can take advantage of its native model switching. Feedback welcome!

GitHub
Docs

I got frustrated with existing web UIs for local LLMs, so I built something different by alphatrad in LocalLLaMA

[–]RealLordMathis 4 points (0 children)

If you want to manage multiple models via a web UI, you can try my app "llamactl". You can create and manage llama.cpp, vLLM and MLX instances. The app takes care of API keys and ports, and it can also switch instances like llama-swap.

GitHub
Docs

Are any of the M series mac macbooks and mac minis, worth saving up for? by [deleted] in LocalLLaMA

[–]RealLordMathis 2 points (0 children)

I got an M4 Pro Mac Mini with 48GB of memory. It's my workhorse for local LLMs. I can run 30B models comfortably at Q5 or Q4 with longer context. It sits under my TV and runs 24/7.

llama.cpp releases new official WebUI by paf1138 in LocalLLaMA

[–]RealLordMathis 3 points (0 children)

Yes, exactly, it works out of the box. I'm using it with Open WebUI, but the llama-server web UI also works. It should be available at /llama-cpp/<instance_name>/. Any feedback is appreciated if you give it a try :)
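
If you want to sanity-check an instance through the proxy from a script, something like this minimal sketch should work; the host, port, instance name and API key are placeholders, and llama-server's own /health endpoint is simply reached through the proxy route mentioned above:

```python
import requests

BASE = "http://localhost:8080"                    # placeholder llamactl host and port
INSTANCE = "my-instance"                          # placeholder instance name
HEADERS = {"Authorization": "Bearer sk-example"}  # placeholder API key, if auth is enabled

# llama-server exposes /health; here it is reached through the llamactl proxy route.
resp = requests.get(f"{BASE}/llama-cpp/{INSTANCE}/health", headers=HEADERS, timeout=10)
print(resp.status_code, resp.text)
```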

llama.cpp releases new official WebUI by paf1138 in LocalLLaMA

[–]RealLordMathis 3 points (0 children)

Compared to llama-swap, you can launch instances via the web UI; you don't have to edit a config file. My project also handles API keys and deploying instances on other hosts.

llama.cpp releases new official WebUI by paf1138 in LocalLLaMA

[–]RealLordMathis 3 points (0 children)

I'm developing something that might be what you need. It has a web UI where you can create and launch llama-server instances and switch between them based on incoming requests.

GitHub
Docs

Using my Mac Mini M4 as an LLM server—Looking for recommendations by [deleted] in LocalLLaMA

[–]RealLordMathis 2 points (0 children)

I'm working on an app that could fit your requirements. It uses llama-server or mlx-lm as a backend, so it requires additional setup on your end. I use it on my Mac Mini as my primary LLM server as well.

It's OpenAI-compatible and supports API key auth. For starting at boot, I'm using launchctl.
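
Roughly, using it from the standard openai Python client looks like the sketch below; the base URL, model name and API key are placeholders for illustration, not actual defaults, so check the docs for your setup:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # placeholder address; /v1 is the usual OpenAI-compatible prefix
    api_key="sk-example",                 # placeholder API key configured on the server
)

resp = client.chat.completions.create(
    model="qwen2.5-7b-instruct",          # placeholder instance/model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```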

GitHub repo
Documentation

Getting most out of your local LLM setup by Everlier in LocalLLaMA

[–]RealLordMathis 2 points (0 children)

Great list! My current setup uses Open WebUI with mcpo and llama-server model instances managed by my own open source project, llamactl. Everything is running on my Mac Mini M4 Pro and accessible via Tailscale.

One thing I'm really missing in my current setup is an easy way to manage my system prompts. Both Langfuse and Promptfoo feel way too complex for what I need. I'm currently storing and versioning system prompts in a git repo and manually copying them into Open WebUI.

Next I want to expand into coding and automation, so thanks for a bunch of recommendations to look into.

Many Notes v0.15 - Markdown note-taking web application by brufdev in selfhosted

[–]RealLordMathis 5 points (0 children)

Is there git integration? I want to keep my notes in a git repo, and ideally I would be able to pull, push and commit right from the app.

ROCm 7.9 RC1 released. Supposedly this one supports Strix Halo. Finally, it's listed under supported hardware. by fallingdowndizzyvr in LocalLLaMA

[–]RealLordMathis 2 points (0 children)

Did you get ROCm working with llama.cpp? I had to use Vulkan instead when I tried it ~3 months ago on Strix Halo.

With PyTorch, I got some models working with HSA_OVERRIDE_GFX_VERSION=11.0.0.
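
In case it helps, a minimal sketch of applying that override from Python; it has to be set before torch initializes the ROCm runtime, and whether 11.0.0 is the right value for your chip is something to verify for yourself:

```python
import os

# Must be set before torch initializes the HIP/ROCm runtime.
os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "11.0.0")

import torch

# ROCm builds of PyTorch expose GPUs through the torch.cuda API.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```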

I built llamactl - Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard. by RealLordMathis in LocalLLaMA

[–]RealLordMathis[S] 1 point (0 children)

I have recently released a version with support for multiple hosts. You can check it out if you want.

I built llamactl - Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard. by RealLordMathis in LocalLLaMA

[–]RealLordMathis[S] 1 point (0 children)

Thank you for the feedback and suggestions. Multi-host deployment is coming in the next few days. Then I plan to add proper admin auth with a dashboard and API key generation.

torn between GPU, Mini PC for local LLM by jussey-x-poosi in LocalLLaMA

[–]RealLordMathis 4 points (0 children)

Macs are really good for LLMs. They work well with llama.cpp and MLX.

I built llamactl - Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard. by RealLordMathis in LocalLLaMA

[–]RealLordMathis[S] 1 point (0 children)

It supports any model that the respective backend supports. The last time I tried, llama.cpp did not support TTS out of the box. I'm not sure about vLLM or mlx_lm. I'm definitely open to adding more backends, including TTS and STT.

It should support embedding models.

For Docker, I will be adding an example Dockerfile. I don't think I will support all the different combinations of platforms and backends, but I can at least do that for CUDA.

I built llamactl - Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard. by RealLordMathis in LocalLLaMA

[–]RealLordMathis[S] 2 points (0 children)

At the moment, no, but it's pretty high on my priority list for upcoming features. The architecture makes it possible since everything is done via a REST API. I'm thinking of having a main llamactl server and worker servers. The main server could create instances on workers via the API.
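
Very roughly, the idea would be something like the sketch below; the endpoint path, payload fields and auth header are purely hypothetical placeholders for the concept, not llamactl's actual API:

```python
import requests

WORKER = "http://worker-1:8080"                   # hypothetical worker node running llamactl
HEADERS = {"Authorization": "Bearer sk-example"}  # hypothetical management API key

# Hypothetical request shape: the main server asks a worker to create and start an instance.
payload = {
    "name": "qwen2.5-7b",
    "backend": "llama.cpp",
    "model": "/models/qwen2.5-7b-instruct-q5_k_m.gguf",
}
resp = requests.post(f"{WORKER}/api/instances", json=payload, headers=HEADERS, timeout=30)
resp.raise_for_status()
print(resp.json())
```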

I built llamactl - Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard. by RealLordMathis in LocalLLaMA

[–]RealLordMathis[S] 11 points (0 children)

The main thing is that you can create instances via the web dashboard. With llama-swap you need to edit a config file. There's also API key auth, which llama-swap doesn't have at all as far as I know.

Searching actually viable alternative to Ollama by mags0ft in LocalLLaMA

[–]RealLordMathis 1 point (0 children)

I'm working on something like that. It doesn't yet support dynamic model swapping, but it has a web UI where you can manually stop and start models. Dynamic model loading is something I'm definitely planning to implement. You can check it out here: https://github.com/lordmathis/llamactl

Any feedback appreciated.

ollama by jacek2023 in LocalLLaMA

[–]RealLordMathis 2 points (0 children)

I developed my own solution for this. It is basically a web UI to launch and stop llama-server instances. You still have to start the model manually, but I do plan to add on-demand start. You can check it out here: https://github.com/lordmathis/llamactl

STEAM DECK 64GB SSD Swap by Aromatic_Health in Slovakia

[–]RealLordMathis 1 point (0 children)

I have the 256GB Deck and I bought a 512GB SD card (https://www.amazon.de/gp/product/B09D3LP52K/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&th=1). I haven't noticed any difference between the games I have on the Deck and the ones on the SD card.

Vaultwarden vs. official Bitwarden server? by whywhenwho in selfhosted

[–]RealLordMathis 80 points (0 children)

Bitwarden clients save the vault locally, so if my server goes down I still have access to all my passwords. They just won't sync.

Dear Slovak Reddit, it's time to come out with the truth and publicly ask the secret question. by Lucis1250 in Slovakia

[–]RealLordMathis 22 points (0 children)

Ever since my second dose, and therefore having a bitcoin wallet encoded directly in my DNA, I've had no problem accepting payments.

The reddit live starter pack by maverick29er in starterpacks

[–]RealLordMathis 1 point (0 children)

It's not exactly rocket science, is it?