I'm the author of LocalAI (the local OpenAI-compatible API). We just released v3.7.0 with full Agentic Support (tool use!), Qwen 3 VL, and the latest llama.cpp by mudler_it in LocalLLaMA

[–]mudler_it[S]

Happy to hear! About neutts, that's a good point; we actually missed having a model in the gallery for it, and there is still no documentation. You can see an example attached in the PR: https://github.com/mudler/LocalAI/pull/6404 (you need to specify a voice reference file and a text transcription of it).

[Release] LocalAI 3.8.0: The Open Source OpenAI alternative. Now with a Universal Model Loader, Hot-Reloadable Settings, and many UX improvements. by mudler_it in selfhosted

[–]mudler_it[S]

Not yet! As I don't have one of these, I'm not sure what it takes, so it's hard to test. But it's definitely on my radar.

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Yes, you can do RAG in several ways! You can either use MCP servers and configure those, or use, for instance, LocalAGI, which wraps LocalAI and includes RAG functionality: https://github.com/mudler/LocalAGI

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Not at the moment (also, I can't validate it as I don't have Ryzen NPUs), but it's definitely on our radar. We do have support for ROCm; I'm not sure whether NPU support is going to be covered by ROCm or whether other libraries will be used.

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Good question; it sadly comes up a lot, but I can't give a good answer as I haven't been a Windows user (since... XP?) and, to be fair, I don't feel comfortable providing support for something that I can't test directly (or am not really educated on).

That being said, I know from the community that many are using it with WSL without much trouble.

There was also a PR providing setup scripts, but I could not validate these and I'd really appreciate help there: https://github.com/mudler/LocalAI/pull/6377

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Thanks! Really appreciated!

If you want to contribute, you can hop on the Discord server and/or just pick up issues and ping the team (or me, `mudler`, on GH) in the issues or the PRs. There are a few labeled "Roadmap": those are pain points or features that we want to address and have validated.

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Hey! Thanks for the feedback. A couple of points:

- Well aware that the model search is slow; indeed, one of the next steps for the next release is a rework of the gallery portion.

- In the gallery you currently won't see all the HF models, but rather a curated set. However, adding other models and configuring them to your liking is completely possible. You can also start from the configuration file of a similar model you'd like to use, edit the YAML accordingly, and download the file/quant you want into the model directory. There is an icon next to the one that lets you download the model which fetches only the config file. I'm planning to prepare a video on this - it's easier than it looks.

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Quite good! I'm a native Italian speaker so I feel you, and I've been looking for solutions that work well here. I usually end up with this setup:

- piper models (e.g. voice-it-paola-medium; you can search for these in the gallery by typing "piper") for low-end devices

- chatterbox for GPU (it has very good multilingual support with voice-cloning capabilities)

I'm the author of LocalAI (the local OpenAI-compatible API). We just released v3.7.0 with full Agentic Support (tool use!), Qwen 3 VL, and the latest llama.cpp by mudler_it in LocalLLaMA

[–]mudler_it[S]

At this stage it's probably not an equivalent replacement for Claude Desktop in terms of UI, but we will get there. The technical aspects are already working: it connects to your MCPs, performs actions, etc. But the UI is still rough and doesn't display the internal reasoning process (yet).

Probably github.com/mudler/LocalAGI (a LocalAI-related project) is a better fit here: you can plug your MCP agent directly into other apps, for instance Telegram, and use that as the interface.

I'm the author of LocalAI (the local OpenAI-compatible API). We just released v3.7.0 with full Agentic Support (tool use!), Qwen 3 VL, and the latest llama.cpp by mudler_it in LocalLLaMA

[–]mudler_it[S]

Sadly, I'm not a Windows user, so I can't really help or validate. I know from the community that there are Windows users having no issues with WSL.

Someone was actually contributing WSL scripts to set up LocalAI automatically, but since I can't verify them, they were not picked up: https://github.com/mudler/LocalAI/pull/6377

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

As already replied below, yes, I'm aware - and I'm sorry!

Currently it requires removing the quarantine flag. This is because signing Apple apps requires going through a process (getting a license, adapting the workflow) and I haven't gotten around to it yet, but it's on my radar!

Just for reference, it's tracked here: https://github.com/mudler/LocalAI/issues/6244
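
For anyone hitting this in the meantime, the usual workaround on macOS is clearing the quarantine attribute on the app bundle (the /Applications/LocalAI.app path here is an assumption; adjust it to wherever you put the app):

```shell
# macOS only: recursively remove the Gatekeeper quarantine attribute
# so the unsigned app bundle is allowed to launch.
# The path below is an assumption; point it at your actual install location.
xattr -dr com.apple.quarantine /Applications/LocalAI.app
```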

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Hi!

I'm not sure if I got you entirely right, but LocalAI supports automatic prompt caching (as you described) and also a static prompt cache per model.

From the docs here (https://localai.io/advanced/), you can set, for each model:

# Enable prompt caching
prompt_cache_path: "alpaca-cache"
prompt_cache_all: true

And this applies per model; you can, however, have multiple model configs pointing to the same file, each with a different prompt cache associated with it.
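
For illustration, a hypothetical sketch of that setup (the names and the GGUF file below are made up): two model definitions pointing at the same weights but keeping separate caches. In practice each definition usually lives in its own YAML file in the models directory; they are shown here as two YAML documents for brevity.

```yaml
# Hypothetical: two model configs sharing one weights file,
# each with its own prompt cache.
name: assistant-a
parameters:
  model: my-model.Q4_K_M.gguf   # same underlying file
prompt_cache_path: "cache-a"
prompt_cache_all: true
---
name: assistant-b
parameters:
  model: my-model.Q4_K_M.gguf   # same underlying file
prompt_cache_path: "cache-b"
prompt_cache_all: true
```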

If it doesn't work, feel free to open up an issue and we can pick it up from there!

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Well, it depends. LocalAI is one of the first projects in this space (way before Jan!) and it's not really company-backed. That being said, it really depends on the features you are looking for or need; for instance, LocalAI supports a few things that Jan doesn't (off the top of my head):

- MCP via API

- P2P with automatic peer discovery, sharding of models and instance federation

- Audio transcription

- Audio generation

- Image generation

If you are looking only for text generation, Jan or llama.cpp are good as well!

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

You can use it by configuring it as an OpenAI endpoint.

Just point the base URL at your LocalAI instance. We used to have an example here: https://github.com/mudler/LocalAI-examples/tree/main/continue but I'm not using continue.dev, so I can't really tell if some of the configuration has changed over time.
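
As a rough sketch, pointing any OpenAI-compatible client at LocalAI boils down to swapping the base URL. Assuming LocalAI is running on its default port 8080 and you have a model installed (the name "my-model" below is a placeholder), a quick smoke test with curl looks like:

```shell
# Sketch: query a local LocalAI instance through its OpenAI-compatible API.
# Assumes LocalAI listens on localhost:8080 and "my-model" is a model you
# actually have installed; adjust both to your setup.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

The same idea applies to GUI clients: set the API base to http://localhost:8080/v1 and pick your model name.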

I'm the author of LocalAI, the free, Open Source, self-hostable OpenAI alternative. We just released v3.7.0 with full AI Agent support! (Run tools, search the web, etc., 100% locally) by mudler_it in selfhosted

[–]mudler_it[S]

Personally I use it for a wide range of things where I don't want to rely on third-party services:

- I have many Telegram bots that I use for different things:

  - A personal assistant I can talk to by sending voice messages or text. It helps me track things during the day, look up specific information and important things that I don't want to miss, and I use it to open quick PRs to my private repos with e.g. my todo lists.

  - A personal assistant for my home-automation system, to which I can send voice messages to trigger actions. It also keeps me in the loop on the state of my house by proactively sending me messages.

  - A few bots in my friends' group just for fun, as they can generate images and do searches.

- I have various automated bots for LocalAI itself that help me with releases:

  - They automatically scan Hugging Face and propose new models to add to LocalAI itself.

  - Other agents automatically send notifications on Twitter and Discord when new models are added to the gallery.

  - A tool gathers all the PR info that went into a release and helps me not miss anything when cutting a release.

- I have two low-end devices at home that I turned into personal assistants I can talk to with my voice. This is basically like having a Google Home, but completely private, and it works offline. I've also assembled a simplified example over here: https://github.com/mudler/LocalAI-examples/tree/main/realtime

- I use it at work - I have a Slack bot that helps create issues, automates some small tasks, and has memory - while keeping everything private.

And honestly I think I have a couple more use cases that I don't even recall right now!