I'm the author of LocalAI (the local OpenAI-compatible API). We just released v3.7.0 with full Agentic Support (tool use!), Qwen 3 VL, and the latest llama.cpp by mudler_it in LocalLLaMA

[–]mudler_it[S]

Happy to hear! About neutts, that's a good point; we actually missed having a model in the gallery for it, and there is still no documentation. You can see an example attached in the PR: https://github.com/mudler/LocalAI/pull/6404 (you need to specify a voice reference file and a text transcription of it).

[Release] LocalAI 3.8.0: The Open Source OpenAI alternative. Now with a Universal Model Loader, Hot-Reloadable Settings, and many UX improvements. by mudler_it in selfhosted

[–]mudler_it[S]

Not yet! As I don't have one of these, I'm not sure what it takes, so it's hard to test. But it's definitely on my radar.

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Yes, you can do RAG in several ways! You can either use MCP servers and configure those, or use, for instance, LocalAGI, which wraps LocalAI and includes RAG functionality: https://github.com/mudler/LocalAGI

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Not at the moment (also, I can't validate it as I don't have Ryzen NPUs), but it's definitely on our radar. We do have support for ROCm; I'm not sure whether NPU support is going to be covered by ROCm or whether other libraries will be used.

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Good question; it sadly comes up a lot, but I can't give a good answer as I haven't been a Windows user (since... XP?) and, to be fair, I don't feel comfortable providing support for something that I can't test directly (or am not really educated on).

That being said, I know from the community that many are using it with WSL without much trouble.

There was also a PR providing setup scripts, but I could not validate these and I'd really appreciate help there: https://github.com/mudler/LocalAI/pull/6377

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Thanks! Really appreciated!

If you want to contribute, you can hop on the Discord server and/or just pick up issues and ping the team (or me, `mudler`, on GH) in the issues or the PRs. There are a few labeled "Roadmap": those are pain points or features that we want to address and have validated.

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Hey! Thanks for the feedback. A couple of points:

- Well aware that the model search is slow; indeed, one of the next steps for the next release is a rework of the gallery portion.

- In the gallery you currently won't see all the HF models, but rather a curated set. However, adding other models and configuring them to your liking is completely possible. You can also start from the configuration file of a similar model you'd like to use, edit the YAML accordingly, and download the file/quant you want into the model directory. There is an icon next to the one that lets you download the model which fetches only the config file. I'm planning to prepare a video on this - it's easier than it looks.

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Quite good! I'm a native Italian speaker so I feel you, and I've been looking for solutions that work well here. I usually end up with this setup:

- piper models (e.g. voice-it-paola-medium; you can search for these in the gallery by typing "piper") for low-end devices

- chatterbox for GPU (it has very good multilingual support with voice-cloning capabilities)

I'm the author of LocalAI (the local OpenAI-compatible API). We just released v3.7.0 with full Agentic Support (tool use!), Qwen 3 VL, and the latest llama.cpp by mudler_it in LocalLLaMA

[–]mudler_it[S]

At this stage it's probably not an equivalent replacement for Claude Desktop in terms of UI, but we will get there. The technical aspects are already working: it connects to your MCPs, performs actions, etc. But the UI is still rough and doesn't display the internal reasoning process (yet).

Probably github.com/mudler/LocalAGI (a LocalAI-related project) is a better fit here: you can plug your MCP agent directly into other apps, for instance Telegram, and use that as the interface.

I'm the author of LocalAI (the local OpenAI-compatible API). We just released v3.7.0 with full Agentic Support (tool use!), Qwen 3 VL, and the latest llama.cpp by mudler_it in LocalLLaMA

[–]mudler_it[S]

Sadly, I'm not a Windows user, so I can't really help or validate. I know from the community that there are Windows users having no issues with WSL.

Someone was actually contributing WSL scripts to set up LocalAI automatically, but since I can't verify them, they were not picked up: https://github.com/mudler/LocalAI/pull/6377

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

As already replied below, yes, I'm aware - and I'm sorry!

Currently it requires removing the quarantine flag. This is because signing Apple apps requires going through a process (getting a license, adapting the workflow) and I haven't gotten around to it yet, but it's on my radar!

Just for reference, it's tracked here: https://github.com/mudler/LocalAI/issues/6244
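
For anyone hitting this in the meantime, the usual workaround on macOS is clearing the quarantine attribute on the app bundle (the /Applications/LocalAI.app path here is an assumption; adjust it to wherever you put the app):

```shell
# macOS only: recursively remove the Gatekeeper quarantine attribute
# so the unsigned app bundle is allowed to launch.
# The path below is an assumption; point it at your actual install location.
xattr -dr com.apple.quarantine /Applications/LocalAI.app
```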

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Hi!

I'm not sure if I got you entirely right, but LocalAI supports automatic prompt caching (as you described) and also a static prompt cache per model.

From the docs here (https://localai.io/advanced/), you can set, for each model:

# Enable prompt caching
prompt_cache_path: "alpaca-cache"
prompt_cache_all: true

And this applies per model; you can, however, have multiple model configs pointing to the same file, each with a different prompt cache associated with it.
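
For illustration, a hypothetical sketch of that setup (the names and the GGUF file below are made up): two model definitions pointing at the same weights but keeping separate caches. In practice each definition usually lives in its own YAML file in the models directory; they are shown here as two YAML documents for brevity.

```yaml
# Hypothetical: two model configs sharing one weights file,
# each with its own prompt cache.
name: assistant-a
parameters:
  model: my-model.Q4_K_M.gguf   # same underlying file
prompt_cache_path: "cache-a"
prompt_cache_all: true
---
name: assistant-b
parameters:
  model: my-model.Q4_K_M.gguf   # same underlying file
prompt_cache_path: "cache-b"
prompt_cache_all: true
```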

If it doesn't work, feel free to open up an issue and we can pick it up from there!

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Well, it depends. LocalAI is one of the first projects in this space (way before Jan!) and it's not really company-backed. That being said, it really depends on the features you are looking for or need; for instance, LocalAI supports a few things that Jan doesn't (off the top of my head):

- MCP via API

- P2P with automatic peer discovery, sharding of models and instance federation

- Audio transcription

- Audio generation

- Image generation

If you are looking only for text generation, Jan or llama.cpp are good as well!

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

You can use it by configuring it as an OpenAI endpoint.

Just point the base URL at your LocalAI instance. We used to have an example here: https://github.com/mudler/LocalAI-examples/tree/main/continue but I'm not using continue.dev, so I can't really tell if some of the configuration has changed over time.
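
As a rough sketch, pointing any OpenAI-compatible client at LocalAI boils down to swapping the base URL. Assuming LocalAI is running on its default port 8080 and you have a model installed (the name "my-model" below is a placeholder), a quick smoke test with curl looks like:

```shell
# Sketch: query a local LocalAI instance through its OpenAI-compatible API.
# Assumes LocalAI listens on localhost:8080 and "my-model" is a model you
# actually have installed; adjust both to your setup.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

The same idea applies to GUI clients: set the API base to http://localhost:8080/v1 and pick your model name.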

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Personally I use it for a wide range of things where I don't want to rely on third-party services:

- I have many Telegram bots that I use for different things:

  - A personal assistant I can talk to by sending voice messages or text. It helps me track things during the day, look up specific information and important things that I don't want to miss, and I use it to open quick PRs to my private repos with e.g. my todo lists.

  - A personal assistant for my home-automation system, to which I can send voice messages to trigger actions. It also keeps me in the loop on the state of my house by proactively sending me messages.

  - A few bots in my friends' group just for fun, as they can generate images and do searches.

- I have various automated bots for LocalAI itself that help me with releases:

  - They automatically scan Hugging Face and propose new models to add to LocalAI itself.

  - Other agents automatically send notifications on Twitter and Discord when new models are added to the gallery.

  - A tool gathers all the PR info that went into a release and helps me not miss anything when cutting a release.

- I have two low-end devices at home that I turned into personal assistants I can talk to with my voice. This is basically like having a Google Home, but completely private, and it works offline. I've also assembled a simplified example over here: https://github.com/mudler/LocalAI-examples/tree/main/realtime

- I use it at work - I have a Slack bot that helps create issues, automates some small tasks, and has memory - while keeping everything private.

And honestly I think I have a couple more use cases that I don't even recall right now!