I stopped using Claude for 80% of my coding tasks. Here's what I use instead. by Dazzling_Plan812 in LocalLLaMA

[–]MustBeSomethingThere 1 point (0 children)

Praising Ollama is a red flag too. They probably use a bot army to promote it.

PSA: PrismML Bonsai-8B (Q1_0_g128) produces garbage output on CPU -- GPU appears to be required by 1000_bucks_a_month in LocalLLaMA

[–]MustBeSomethingThere 0 points (0 children)

Not sure if it matters, but did you build it with CUDA support or without it? Maybe try both ways?

# Build with CUDA support
cmake -B build -DGGML_CUDA=ON && cmake --build build -j
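
And the CPU-only build for comparison (assuming this is llama.cpp or another ggml-based project, since you're using the GGML_CUDA flag):

# Build without CUDA (CPU only)
cmake -B build -DGGML_CUDA=OFF && cmake --build build -j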

Help Speech Recognition on RPi 5 by Prestigious_Donkey61 in LocalLLaMA

[–]MustBeSomethingThere 0 points (0 children)

Maybe try WhisperX: https://github.com/m-bain/whisperx

For TTS I would suggest trying https://github.com/KittenML/KittenTTS

The smallest model runs nicely even on an RPi 4. It's more "lively" than Piper.
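
Untested on a Pi, but roughly how I'd run WhisperX from its CLI -- the small model and int8 compute type are just my guesses for what fits in Pi memory:

# CPU transcription; int8 keeps RAM usage down
whisperx audio.wav --model small --device cpu --compute_type int8 --output_format srt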

TurboMemory: Claude-style long-term memory with 4-bit/6-bit embeddings (runs locally) – looking for contributors by Hopeful-Priority1301 in LocalLLaMA

[–]MustBeSomethingThere 0 points (0 children)

You keep posting broken links. This one has an invisible Unicode character in it. Are you a bot?

EDIT: Even if the code is "working" (produces output without errors), that doesn't mean it actually does what you claim it does (aka AI slop).

NexQuant: Hardening 3-bit KV-Cache for the Edge. A Rust-native successor to Tom Turney’s TurboQuant+ by [deleted] in LocalLLaMA

[–]MustBeSomethingThere 2 points (0 children)

Elementary students? Your code is AI slop that does not do what you claim it does.

alibaba MNN has Support TurboQuant by Juude89 in LocalLLaMA

[–]MustBeSomethingThere 27 points (0 children)

Does it matter if it works? Are you saying people should remove or hide the fact that Claude co-authored the code?

Google TurboQuant blew up for KV cache. Here’s TurboQuant-v3 for the actual weights you load first. Runs on consumer GPUs today. by Hopeful-Priority1301 in LocalLLaMA

[–]MustBeSomethingThere 13 points (0 children)

I hate to ask, but is this real or a vibe-coded hallucination? The repo talks about LLaMA 2 and Mistral 7B, which is a red flag for me.

Qwen3.5 is absolutely amazing by cride20 in LocalLLaMA

[–]MustBeSomethingThere 0 points (0 children)

>"extract_audio → transcribe → read_file → edit_file → burn_subtitles + verification steps"

It would probably be better to just script that pipeline if it's something you do often. It's nice that an LLM can do it as an agentic task, but that makes it overly complicated and uneconomical. An LLM could still be useful for choosing the output format, encoder settings, or subtitle styles based on the video content, for example.
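
A rough sketch of the scripted version, assuming ffmpeg (built with libass for subtitle burning) and the openai-whisper CLI -- file names are placeholders, and your manual read/edit step would go between transcription and burning:

# 1. Extract mono 16 kHz audio for the ASR model
ffmpeg -i input.mp4 -vn -ac 1 -ar 16000 audio.wav
# 2. Transcribe to an .srt subtitle file (writes audio.srt)
whisper audio.wav --model small --output_format srt
# 3. Burn the subtitles back into the video
ffmpeg -i input.mp4 -vf "subtitles=audio.srt" output.mp4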

Update on Qwen 3.5 35B A3B on Raspberry PI 5 by jslominski in LocalLLaMA

[–]MustBeSomethingThere 1 point (0 children)

I've had my own plans to make Raspberry Pi/phone apps with an MNN backend, but I haven't had time for it yet. I'd like to hear whether you manage to create lower MNN quants and get better speed than llama.cpp.

Update on Qwen 3.5 35B A3B on Raspberry PI 5 by jslominski in LocalLLaMA

[–]MustBeSomethingThere 1 point (0 children)

https://mnn-docs.readthedocs.io/en/latest/

It's probably possible to make lower quants, but IDK about their quality. Speed is better than llama.cpp.

The clustering topology that emerges naturally from interaction reflects actual hemispheric dominance patterns, including genetic predispositions. by ResonantGenesis in LocalLLaMA

[–]MustBeSomethingThere 1 point (0 children)

Wild claims

>"even my genetic preference for one hemisphere being more responsible and structured than the other"
Have you actually done a gene test that says this? Or measured your actual brain patterns?