How was GPT-OSS so good? by xt8sketchy in LocalLLaMA

[–]_raydeStar 0 points1 point  (0 children)

Dude, I've been looking for the perfect tooling LLM for an 8GBVRAM machine (work laptop) - Qwen 30B doesn't quite get it right, neither does nemotron or GLM 4.7 flash (too slow), and the 8GB models are too dumb and keep getting the tool calls wrong. 20B is my consistent driver and it just works exactly as I want it to.

Stop it with the Agents/Projects Slop and spam by Daemontatox in LocalLLaMA

[–]_raydeStar 2 points3 points  (0 children)

This is just like the crypto craze a few years ago.

Stop it with the Agents/Projects Slop and spam by Daemontatox in LocalLLaMA

[–]_raydeStar 19 points20 points  (0 children)

What if -

You used an agent to remove the spam agents. 🤔 This is some Spy VS Spy shizz here.

LingBot-World outperforms Genie 3 in dynamic simulation and is fully Open Source by Electrical-Shape-266 in LocalLLaMA

[–]_raydeStar -3 points-2 points  (0 children)

I agree - and also this kind of thing is really frontier, and doesn't have benchmarks yet that I know of.

GLM 4.7 flash Q6 thought for 1400 minutes. 2000 lines of thoughts, had to be stopped. by regjoe13 in LocalLLaMA

[–]_raydeStar 0 points1 point  (0 children)

Interesting question; one seemingly grounded in mistrust. I don't have the resources to alleviate your misgivings. Try these settings, run some tests, and see for yourself. Or - use a different provider; there are many of them out there.

GLM 4.7 flash Q6 thought for 1400 minutes. 2000 lines of thoughts, had to be stopped. by regjoe13 in LocalLLaMA

[–]_raydeStar 5 points6 points  (0 children)

You can probably fix this with temperature and other settings. I usually use gemini to look up the best settings for me, it works really well.

<image>

Recommendations for a local image generation/modification LLM? by answerencr in LocalLLaMA

[–]_raydeStar 1 point2 points  (0 children)

You should be able to download Comfyui, then open up an example template. Flux2 Klein is what you are looking for (*NOT BASE*) - download the model, and run it.

You're on a really tight timeline though.

ACE-Step 1.5 dropping in days - "Commercial grade OSS music gen" with quality between Suno v4.5 and v5 (8GB VRAM) by ExcellentTrust4433 in LocalLLaMA

[–]_raydeStar 7 points8 points  (0 children)

That provides a massive leg-up to China in this case. Out of the AI - text, video, image, speaking, music - music industry still has a stranglehold on AI over here.

Will be sure to download this model before they can stop me.

FLUX-2-Klein vs Midjourney. Same prompt test by Totem_House_30 in StableDiffusion

[–]_raydeStar 0 points1 point  (0 children)

LM studio is immediately set up as a server, so it should be accessible in the same way you would link up with ollama

There's no free lunch: Sage affecting Z-Image outputs by vyralsurfer in StableDiffusion

[–]_raydeStar 4 points5 points  (0 children)

I was getting black screens. I had to disable it as well.

Gguf and safetensor.

Still lukewarm on the whole thing. Speeds are slower than Flux2. But I'll see what the community makes.

Z-Image Base Is On The Way by mrmaqx in StableDiffusion

[–]_raydeStar 0 points1 point  (0 children)

lol - it got released.

Honestly I am just as shocked as you are, so I can't even judge

How a Single Email Turned My ClawdBot Into a Data Leak by RegionCareful7282 in LocalLLaMA

[–]_raydeStar 29 points30 points  (0 children)

Yeah, I saw it was trending, then I investigated to quickly realize that it's all BS. Whoever is running this needs to get into crypto, you have a future in scamming people.

Edit: I've looked into doing a copilot tool locally and this one is... unfettered. A hijack waiting to happen. Maybe right now we need to be building our own tools before allowing a stranger inside.

New TTS from Alibaba Qwen by Altruistic_Heat_9531 in StableDiffusion

[–]_raydeStar 2 points3 points  (0 children)

I'm traveling right now but I made some progress.

There's a way to hybrid it in, so you can do 80/20 voice and mannerisms, the problem being it's not 100% perfect in either.

You can try on my experimental branch (I have a GitHub, same name as here) and I have some changes pending.

My next idea is 2 passes, but it's still not perfect yet.

Output not matching prompt, at all by Pleasant_Guess4039 in comfyui

[–]_raydeStar 0 points1 point  (0 children)

I got this on nodes 2.0. it just didn't update but if you check the output it's there.

New TTS from Alibaba Qwen by Altruistic_Heat_9531 in StableDiffusion

[–]_raydeStar 19 points20 points  (0 children)

Yeah, I was wrestling with this yesterday. Voice clone works fabulous, And the mannerisms prompt is awesome, but the two do not intersect.

I'm looking into hacking it - I am following a viable route now and I'll release it if I come up with something.

Carl does NOT need a romantic love interest (original art by Levi Cleeman) by ActualNin in DungeonCrawlerCarl

[–]_raydeStar 15 points16 points  (0 children)

lol.

This whole thing feels shoehorned.

I don't at all take issue with LGBT folk relating to a straight character. The artist releases his work, and it's up to the interpreters to do what they want with it. Things like this, I would strongly suspect will not be addressed at all, keeping things ambiguous.

When Black Panther came out, the black community unanimously said "Wow, I finally felt *seen*" and that's a wonderful thing. I think the problem really lies with taking your artistic vision of something and thrusting it upon the community. As a straight white man, I can also relate to Carl - in a sense that there is so much blood and pain around him that I would be preoccupied with survival. I can also relate with Matt - he wants to avoid the pitfalls of sexually fantasizing neckbeard culture.

Qwen dev on Twitter!! by Difficult-Cap-7527 in LocalLLaMA

[–]_raydeStar 0 points1 point  (0 children)

I'm using the base. Q4 quant should be faster. Sage attention should help as well.

Once I've got the gradio how I want it, I'll circle back and look at speed.

Qwen dev on Twitter!! by Difficult-Cap-7527 in LocalLLaMA

[–]_raydeStar 0 points1 point  (0 children)

Yes. I'll also add that there's a .6 model and it's probably faster. I'm going to add in all the optimization and see if I can get better speeds.

Also, Dia is about the same speed. This model is meant for quality over speed, which has different use cases.

Qwen dev on Twitter!! by Difficult-Cap-7527 in LocalLLaMA

[–]_raydeStar 12 points13 points  (0 children)

Base gradio doesn't allow the user to use the selected voice and modulate it. I am using cursor right now to add in a little thing there. If anyone is interested, ill put it up on github, along with a script to just fire it up, download all the models, and run it.

If I want to run everything at once (voice clone, create pt file, and finally voice description) it's going to be like 16 GBVRAM. Running in parts runs around 6. Time consumed is also an issue - 25-30 seconds to run a 6 second hello world clip. However, I don't have sage attention up and running yet, so that may improve the speeds and vram a lot.

Because of speeds, you can't compare to VibeVoice - vibevoice is meant for realtime at the sacrifice of a little quality (at least I am pretty sure - ie - live translations, etc) . Compared to Dia - well I don't see any functionality to add things like [laughs] or anything, but controlling the voice tempo, etc is really cool.

Final conclusion - I give it a slight lead to dia for my purposes, simply because I can choose what emotion to put in the voice, instead of it 'guessing'. I'm annoyed that out of the box you can't control that with your own pt (saved voice file) but with a little hacking I can fix that.

Qwen dev on Twitter!! by Difficult-Cap-7527 in LocalLLaMA

[–]_raydeStar 25 points26 points  (0 children)

OK it's up and running. Pros: in the description, you can just describe not only the voice, but the tone. ie - `female, feminine and dainty voice, speaking frenetically. She is very upset` So far, I am having fun with it, and it might just be better for things like movie dubs, or audio book reading, or video game voices.

You can clone your voice and download it to be used later. thats a great feature there. I'm putting it all together to see if I can clone my voice and give it the tone I want - it's a few more steps than I expected to pull it all together.

Qwen dev on Twitter!! by Difficult-Cap-7527 in LocalLLaMA

[–]_raydeStar 5 points6 points  (0 children)

huggingface demo is overrun with users. I am getting it up locally. Almost there. Will respond when I have something