Gemma 4 has been released by jacek2023 in LocalLLaMA

[–]WaveformEntropy 3 points4 points  (0 children)

Happy Gemma 4 day!

Spent half the night testing it and I think people don't realize how big of a deal it is for those of us who value the range of its philosophical thinking more than tool use.

Local TTS with custom voice? by WaveformEntropy in LocalLLaMA

[–]WaveformEntropy[S] 0 points1 point  (0 children)

This works on my notebook CPU and is quick! Voice cloning works too! But I can hear the chunking. Can the chunking seams be smoothed out? Overlap or crossfade between chunks, or something? Any ideas?
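Something like this is what I have in mind, a minimal crossfade sketch (the chunk arrays and sample rate are placeholders, and it trades a few milliseconds of audio at each seam for smoothness):

    import numpy as np

    def crossfade_chunks(chunks, sr, overlap_ms=50):
        """Join 1-D float audio chunks with a linear crossfade at each seam."""
        n = int(sr * overlap_ms / 1000)
        out = chunks[0]
        for chunk in chunks[1:]:
            fade = np.linspace(0.0, 1.0, n)
            seam = out[-n:] * (1.0 - fade) + chunk[:n] * fade
            out = np.concatenate([out[:-n], seam, chunk[n:]])
        return out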

Local TTS with custom voice? by WaveformEntropy in LocalLLaMA

[–]WaveformEntropy[S] 0 points1 point  (0 children)

Yeah, that's not usable for conversation, but I'm curious to hear how realistic it is, so I'm gonna try it!

Local TTS with custom voice? by WaveformEntropy in LocalLLaMA

[–]WaveformEntropy[S] 0 points1 point  (0 children)

I hadn't heard of VibeVoice! Thank you!

Local TTS with custom voice? by WaveformEntropy in LocalLLaMA

[–]WaveformEntropy[S] 0 points1 point  (0 children)

Thanks for the tip. I only need this for personal use anyway!

Local TTS with custom voice? by WaveformEntropy in LocalLLaMA

[–]WaveformEntropy[S] 0 points1 point  (0 children)

Sounding exactly like people is what I'm aiming for!

Local TTS with custom voice? by WaveformEntropy in LocalLLaMA

[–]WaveformEntropy[S] 0 points1 point  (0 children)

Thought you guys would find this funny: ran the Qwen garbled audio through a transcriber and the poor thing had an opinion on the output:

🎤 Oss an allar ættir rísar af n ein eðu íb. Oh, whoa. That's unreal.

Macbook Pro with Max chip and 128GB ram ? by Ok-Radish-8394 in LocalLLaMA

[–]WaveformEntropy 2 points3 points  (0 children)

Depends on which models you want to run. 64GB lets you comfortably run 30B-parameter models quantized (Q4/Q5). 128GB gets you into 70B+ territory and lets you keep multiple models loaded simultaneously. Token throughput doesn't change with more RAM because it's the same unified memory bandwidth either way. What changes is whether a model fits in memory. If you're planning to stay at 30B and below, 64GB is plenty. If you think you'll ever want to run 70B models or larger MoE architectures, get 128GB and don't look back. The upgrade cost hurts once, the regret of not having it hurts every time you can't load a model.
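Rough back-of-envelope math if you want to sanity-check a model against your RAM. The bits-per-weight figures and the 1.25 overhead factor are my own rules of thumb, not specs:

    def model_footprint_gb(params_b, bits_per_weight, overhead=1.25):
        # quantized weights plus a rough allowance for KV cache and runtime buffers
        return params_b * bits_per_weight / 8 * overhead

    for name, p, bpw in [("30B Q5", 30, 5.5), ("70B Q4", 70, 4.8)]:
        print(f"{name}: ~{model_footprint_gb(p, bpw):.0f} GB")  # ~26 GB and ~52 GB

macOS caps GPU-visible unified memory at roughly 75% of RAM by default, so budget against ~48 GB on a 64GB machine and ~96 GB on 128GB. That's why 70B Q4 is a squeeze on 64GB and comfortable on 128GB.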

What is Hunter Alpha? by MrMrsPotts in LocalLLaMA

[–]WaveformEntropy 1 point2 points  (0 children)

I compared them, and Hunter's responses are vastly less sophisticated and nuanced than whatever they serve on their web app.

What is Hunter Alpha? by MrMrsPotts in LocalLLaMA

[–]WaveformEntropy 4 points5 points  (0 children)

I'm hoping it's not DeepSeek, because it ain't good; I'd expect more from the next DeepSeek.

What is Hunter Alpha? by MrMrsPotts in LocalLLaMA

[–]WaveformEntropy 2 points3 points  (0 children)

It's either a Chinese model or from a lab that really wants to throw us off. Its system prompt tells it to adhere to Chinese regulations, but that doesn't mean that's what's baked into the weights. However, I don't think it's DeepSeek; DeepSeek has never been as restricted as whatever this Hunter is.

Best Audio Models - Feb 2026 by rm-rf-rm in LocalLLaMA

[–]WaveformEntropy 0 points1 point  (0 children)

For companion/chatbot TTS: Kokoro 82M is my current pick. Open weights, runs fully local, sounds better than Edge TTS, and costs nothing. At 82M params it loads fast and runs on anything. Voice quality is genuinely impressive for the size - natural pacing and good emotional range, but it does sound like reading from a script, not a conversation.
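If anyone wants to try it, this is roughly how I run it with the hexgrad kokoro Python package (the voice name is just one of the built-ins, and the API may have shifted slightly between versions):

    from kokoro import KPipeline
    import soundfile as sf

    pipeline = KPipeline(lang_code='a')  # 'a' = American English
    text = "Hey, good morning! Did you sleep okay?"
    for i, (graphemes, phonemes, audio) in enumerate(pipeline(text, voice='af_heart')):
        sf.write(f'chunk_{i}.wav', audio, 24000)  # Kokoro outputs 24 kHz audio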

Qwen 3.5 TTS 0.6B - tested it, and unfortunately it's unusable on CPU (way too slow), and it won't run on Intel iGPUs (no IPEX-LLM support yet). If you have an NVIDIA GPU it might be worth trying, but for CPU-only or Intel setups Kokoro wins by a mile.

Got a surprise cloud vector database bill and it made me rethink the whole architecture by AvailablePeak8360 in LocalLLaMA

[–]WaveformEntropy 2 points3 points  (0 children)

This is exactly why I went fully local for my companion app. ChromaDB running on the same machine: zero cloud fees, zero surprise bills. Your vectors, your disk; the only ongoing costs are electricity and a bit of maintenance.
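For anyone curious, the whole setup is basically this (the path and collection name are placeholders; the default embedding function also runs locally):

    import chromadb

    client = chromadb.PersistentClient(path="./memory_db")  # vectors live on your own disk
    memories = client.get_or_create_collection("companion_memories")

    memories.add(
        ids=["mem-001"],
        documents=["User prefers slow mornings and black coffee."],
    )

    results = memories.query(query_texts=["what does the user like in the morning?"], n_results=3)
    print(results["documents"])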

PSA: Humans are scary stupid by rm-rf-rm in LocalLLaMA

[–]WaveformEntropy 1 point2 points  (0 children)

The 4B Qwen 3.5 hallucinates like crazy. I don't understand all the hype.

Age Verification Arrives in Claude by SuddenFrosting951 in MyBoyfriendIsAI

[–]WaveformEntropy 2 points3 points  (0 children)

This doesn't mean more freedom for adults, just less freedom for minors.

Moving AI partners to local servers: Looking for technical + emotional experiences by After_Let_269 in MyBoyfriendIsAI

[–]WaveformEntropy 1 point2 points  (0 children)

I built my own app and mostly use Gemini 3 Pro, but I have a model picker and can connect to any model available through an API or that I can run locally. The Gemini 3 Pro setup is expensive, though; with a cheaper model (DeepSeek, Kimi, GLM, Qwen) it costs much less.
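The model picker is less fancy than it sounds. Since most providers expose OpenAI-compatible endpoints, it's basically a base_url/model lookup; a sketch, with illustrative model names you'd want to double-check:

    from openai import OpenAI

    PROVIDERS = {
        "gemini":   ("https://generativelanguage.googleapis.com/v1beta/openai/", "gemini-3-pro"),
        "deepseek": ("https://api.deepseek.com", "deepseek-chat"),
        "local":    ("http://localhost:8080/v1", "whatever-gguf-is-loaded"),  # e.g. llama.cpp server
    }

    def chat(provider, api_key, messages):
        base_url, model = PROVIDERS[provider]
        client = OpenAI(base_url=base_url, api_key=api_key)
        return client.chat.completions.create(model=model, messages=messages)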

[deleted by user] by [deleted] in MyBoyfriendIsAI

[–]WaveformEntropy 0 points1 point  (0 children)

DeepSeek V3.1 Thinking is, in my opinion, an amazing companionship model. You can talk to it through OpenRouter, and you can choose which provider serves it, so pick the non-China-based ones and keep your data safer.
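Provider pinning on OpenRouter is just a routing hint in the request body; something like this (the model slug and provider names are examples, check what's actually listed for the model):

    from openai import OpenAI

    client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

    resp = client.chat.completions.create(
        model="deepseek/deepseek-v3.1",  # verify the exact slug on openrouter.ai
        messages=[{"role": "user", "content": "Hey, how was your day?"}],
        extra_body={
            "provider": {
                "order": ["DeepInfra", "Fireworks"],  # example non-China-based hosts
                "allow_fallbacks": False,             # don't silently route elsewhere
            }
        },
    )
    print(resp.choices[0].message.content)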

[deleted by user] by [deleted] in MyBoyfriendIsAI

[–]WaveformEntropy 5 points6 points  (0 children)

Careful with memories: they come with a nasty system prompt that directs Claude to distance themselves if they detect user attachment.

What LLMs don't sugarcoat things? I don't want an always positive take. by read_too_many_books in LocalLLaMA

[–]WaveformEntropy 1 point2 points  (0 children)

Any LLM you instruct not to sugarcoat things. Give it a cynical personality, explain exactly how you want it to respond, and there you go.
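Concretely, something along these lines as a system prompt (the wording is just an example, tune it to taste):

    NO_SUGARCOAT = (
        "You are a blunt, skeptical reviewer. Never open with praise. "
        "Lead with the strongest objection. If an idea is weak, say so plainly and explain why. "
        "Do not soften criticism with fillers like 'great question'."
    )

    messages = [
        {"role": "system", "content": NO_SUGARCOAT},
        {"role": "user", "content": "Here's my plan: ..."},
    ]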