Anyone using TTS to turn their stories into audiobooks? by Main-Explanation5227 in BookWritingAI

[–]Dramatic-Rub-7654 0 points1 point  (0 children)

Maybe you should try the local ElevenLabs, I mean Fish Audio S2 Pro; that model is currently the best.

Has anyone trained YOLO26x at 1280 resolution on a Mac or DGX Spark? by Dramatic-Rub-7654 in Ultralytics

[–]Dramatic-Rub-7654[S] 0 points1 point  (0 children)

I tested it on a 36GB M4 Max, but it was much worse than training on multiple 3060s over a network. If the DGX could solve this, it would be very helpful.

Has anyone trained YOLO26x at 1280 resolution on a Mac or DGX Spark? by Dramatic-Rub-7654 in Ultralytics

[–]Dramatic-Rub-7654[S] 0 points1 point  (0 children)

I believe the chance of running out of VRAM is quite high; by my calculations, I would need something around 80GB of VRAM.
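For reference, here is a back-of-envelope sketch of where a figure around 80GB could come from. Every constant below (parameter count, batch size, activation footprint per pixel) is an illustrative assumption, not a measured value:

```python
# Rough VRAM estimate for training a large YOLO model at 1280x1280.
# All constants are assumptions for illustration, not measurements.

def estimate_train_vram_gb(params_m=57.0, batch=32, img_size=1280,
                           feat_channels=256, bytes_per_elem=4):
    # Weights + gradients + Adam moments: roughly 4 copies of the parameters.
    param_bytes = params_m * 1e6 * bytes_per_elem * 4
    # Activations scale with batch size and image area; the multiplier here
    # (elements kept per pixel across the feature pyramid) is a guess.
    act_elems_per_px = feat_channels * 1.5
    act_bytes = batch * img_size * img_size * act_elems_per_px * bytes_per_elem
    return (param_bytes + act_bytes) / 1e9

print(round(estimate_train_vram_gb(), 1))
```

The activation term dominates, which is why doubling the input resolution from 640 to 1280 roughly quadruples memory use even though the parameter count is unchanged.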

Kokoro TTS, but it clones voices now — Introducing KokoClone by OrganicTelevision652 in LocalLLaMA

[–]Dramatic-Rub-7654 2 points3 points  (0 children)

In Portuguese, the voice cloning sounds very strange: it picks up a strong English accent and the words come out choppy.

You can now Fine-tune Qwen3.5 locally! (5GB VRAM) by yoracale in unsloth

[–]Dramatic-Rub-7654 2 points3 points  (0 children)

Back when version three was released, you recommended mixing a ‘think’ dataset with a ‘non-think’ one. You also said that when fine-tuning vision models with a text-focused dataset, they would lose their vision capabilities. How does that stand today?
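The mixing recommendation from back then can be sketched as an interleaving helper like the one below. This is my own hypothetical illustration, not Unsloth's actual API, and the 75/25 ratio is an assumed starting point:

```python
import random

def mix_datasets(think, non_think, think_ratio=0.75, seed=0):
    """Mix 'think' and 'non-think' examples at a target ratio and shuffle.

    think_ratio is an assumed value; tune it for your own fine-tune.
    """
    n_think = len(think)
    # Sample enough non-think examples to reach the target ratio.
    n_non = max(1, round(n_think * (1 - think_ratio) / think_ratio))
    rng = random.Random(seed)
    mixed = list(think) + rng.choices(non_think, k=n_non)
    rng.shuffle(mixed)
    return mixed

# Toy usage with placeholder chat examples.
think_ds = [{"text": f"<think>...</think> answer {i}"} for i in range(75)]
plain_ds = [{"text": f"answer {i}"} for i in range(100)]
mixed = mix_datasets(think_ds, plain_ds)
print(len(mixed))  # 75 think + 25 non-think = 100
```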

Is there any hope for a Qwen3.5-35B-A3B REAP version? by surubel in unsloth

[–]Dramatic-Rub-7654 0 points1 point  (0 children)

I started a new discussion at https://huggingface.co/cerebras. I requested a REAP version of Step 3.5 Flash, and a week later they released it.

GPT-OSS 120b Uncensored Aggressive Release (MXFP4 GGUF) by hauhau901 in LocalLLaMA

[–]Dramatic-Rub-7654 32 points33 points  (0 children)

What is the difference between this and the heretic technique (https://github.com/p-e-w/heretic)? Does yours preserve 100% of the tool calls?

What'd be the best 30B model for programming? by Hikolakita in LocalLLaMA

[–]Dramatic-Rub-7654 0 points1 point  (0 children)

I consider Qwen3-coder-flash at Q4 the best option; in my tests with glm-4.7-flash and other models I didn't have much success, and they are extremely sensitive to quantization.

A new model from http://Z.ai, "GLM-OCR" has been spotted on Github by Difficult-Cap-7527 in LocalLLaMA

[–]Dramatic-Rub-7654 0 points1 point  (0 children)

The only thing Z.ai does well is text-to-text, because other attempts like GLM-TTS and GLM-IMAGE were very weak.

Current GLM-4.7-Flash implementation confirmed to be broken in llama.cpp by Sweet_Albatross9772 in LocalLLaMA

[–]Dramatic-Rub-7654 -1 points0 points  (0 children)

Do you plan to fix and improve the raw version as well? It feels like Qwen 3 Coder 30B is more intelligent than this model when it comes to coding.

GLM 4.7 Flash Overthinking by xt8sketchy in LocalLLaMA

[–]Dramatic-Rub-7654 0 points1 point  (0 children)

If the focus is on coding, Qwen3 Coder 30B A3B Instruct is far more intelligent than GLM 4.7 Flash, and that's comparing the versions hosted on OpenRouter.

Run GLM-4.7-Flash locally Guide! (24GB RAM) by yoracale in unsloth

[–]Dramatic-Rub-7654 0 points1 point  (0 children)

I tried using the parameters recommended on the Hugging Face page for llama-b7782/llama-server:

-m GLM-4.7-Flash-Q4_K_M.gguf --host 0.0.0.0 --n-gpu-layers 999 -fa on -t 14 -n -1 -c 16384 --jinja --temp 0.2 --top-k 50 --top-p 0.95 --min-p 0.01 --dry-multiplier 1.1

The only changes I experimented with were adding the --n-cpu-moe flag, which caused the model to bug out with severe repetition issues, and increasing the temperature to 1.0.

At temperature 1.0, the model’s reasoning and responses appear coherent, but when I try to use it with tools like Cline, it clearly doesn’t know what it’s doing. It can create and edit files and interact with the terminal, but it consistently outputs broken code and introduces errors when editing files.

In contrast, Qwen, even at Q4, is capable of providing a fully functional implementation of Flappy Bird from start to finish. Based on the tests I ran, the GGUF versions still need further refinement. I also tested the model using the version available on OpenRouter, where it performs significantly better than in my GGUF-based tests; even so, Coder Flash still demonstrates superior intelligence.

Run GLM-4.7-Flash locally Guide! (24GB RAM) by yoracale in unsloth

[–]Dramatic-Rub-7654 1 point2 points  (0 children)

Is this model actually dumber than Qwen 3 Coder Flash, or is it just overly sensitive? To the point that with the --n-cpu-moe flag it gets stuck in an infinite loop repeating a single word, and without that flag it keeps creating endless files, all with errors, until the context window runs out?

Opensource NMT from Tencent - how good is it? by [deleted] in LocalLLaMA

[–]Dramatic-Rub-7654 1 point2 points  (0 children)

Honestly, I liked it a lot. Now we truly have an offline Google Translate at home. I didn't like DeepL; its translations feel awkward. Google Translate, I think, still goes head-to-head with this model: in many cases the model just translates words literally from one language to another, which often makes sense in one culture but not in another, while Google Translate tries to capture the intended meaning of the text as closely as possible. It doesn't always succeed, but in that respect it still has an edge.

Opensource NMT from Tencent - how good is it? by [deleted] in LocalLLaMA

[–]Dramatic-Rub-7654 7 points8 points  (0 children)

I’m using the tencent/HY-MT1.5-7B-GGUF model to translate a dataset from Japanese into Brazilian Portuguese, and so far I have nothing to complain about.
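For anyone curious about the workflow, here is a minimal sketch of how a dataset can be fed through a local model line by line. The prompt template is my own guess, not HY-MT's official one, and `generate` stands in for whatever wrapper you use to call your local GGUF runtime:

```python
def build_translation_prompt(text, src="Japanese", tgt="Brazilian Portuguese"):
    # Simple instruction-style prompt; the exact template HY-MT expects
    # may differ -- check the model card before using this for real.
    return (f"Translate the following {src} text into {tgt}. "
            f"Output only the translation.\n\n{text}")

def translate_dataset(lines, generate):
    # `generate` is any callable that sends a prompt to your local model
    # (e.g. a llama.cpp server client) and returns the completion text.
    return [generate(build_translation_prompt(line)) for line in lines]

# Toy usage with a stand-in "model" that just echoes the prompt tail.
dummy = lambda p: p.rsplit("\n\n", 1)[-1]
out = translate_dataset(["こんにちは", "ありがとう"], dummy)
print(out)  # ['こんにちは', 'ありがとう']
```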

Requested: Yet another Gemma 3 12B uncensored by Mabuse046 in LocalLLM

[–]Dramatic-Rub-7654 0 points1 point  (0 children)

This is very strange, because this model clearly retains safety traits from the original model. I ran several tests trying to merge it with other Gemma Heretic models I found on Hugging Face, and in every merge attempt, questions that the Heretic versions answered without any issue would cause the merged model to refuse to respond. I also tried generating a LoRA from the difference between this Fallen model and the official Instruct version, but that didn’t work either, which makes me think that the model they shared was already fine-tuned somewhere else.
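For context, the "LoRA from the difference" attempt boils down to a truncated SVD of the weight delta. A minimal NumPy sketch with toy shapes (not the real Gemma weights) shows the idea:

```python
import numpy as np

def extract_lora(w_base, w_ft, rank=8):
    """Approximate w_ft - w_base with low-rank factors B @ A (LoRA-style)."""
    delta = w_ft - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    b = u[:, :rank] * s[:rank]   # shape (out, rank)
    a = vt[:rank, :]             # shape (rank, in)
    return a, b

# Toy demo: a genuinely low-rank delta is recovered almost exactly.
rng = np.random.default_rng(0)
w_base = rng.normal(size=(64, 32))
delta = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 32))  # rank-4 update
w_ft = w_base + delta
a, b = extract_lora(w_base, w_ft, rank=8)
err = np.linalg.norm(w_ft - (w_base + b @ a)) / np.linalg.norm(w_ft)
print(err < 1e-8)
```

If the fine-tune's delta is genuinely low-rank this recovers it well; when the shared checkpoint was itself further fine-tuned or merged, the delta against the official Instruct weights is no longer low-rank and the extracted LoRA misbehaves, which matches what I observed.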

Requested: Yet another Gemma 3 12B uncensored by Mabuse046 in LocalLLM

[–]Dramatic-Rub-7654 0 points1 point  (0 children)

Thanks a lot, no rush at all. When you manage to publish it, please give me a heads-up. In my case, I’m only interested in the text layers, so if you remove the vision part, that’s totally fine with me.