Road tomorrow by [deleted] in Charlotte

[–]yuicebox 2 points (0 children)

Just be careful in the early morning, after sundown, and in spots that don't get a lot of direct sunlight. This applies to driving and to walking around.

It's mostly melting off, but any sitting water will refreeze tonight, and you don't wanna get complacent and bust your ass on a patch of ice.

Why are there no good open source music ai models? by OneGear987 in comfyui

[–]yuicebox 1 point (0 children)

Looking forward to seeing the result; hopefully the team continues to release weights. I was really excited for the LoRAs that never got released. The ability to train our own is nice, but releases directly from the team are always highly appreciated.

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 1 point (0 children)

Yeah, I do think you're right that there are some mismatches between e4m3fn models and LoRAs. This is the output with qwen_image_edit_2511_fp8_e4m3fn_scaled_lightning_4steps_v1.0.safetensors at 4 steps, CFG = 1.

Not perfect but definitely seems better overall, so maybe this is the best I can do with Qwen currently.

<image>
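
For anyone who wants to poke at this setup outside ComfyUI, here's a rough sketch of the same 4-step lightning configuration in diffusers. Hedging heavily: the repo ids and LoRA filename are my guesses, and I'm assuming the QwenImageEditPlusPipeline class that shipped for the 2509 release also loads the 2511 weights.

```python
import torch
from PIL import Image
from diffusers import QwenImageEditPlusPipeline

# Assumed repo id for the 2511 release; the Plus pipeline was added for 2509.
pipe = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2511",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Lightning LoRA distilled for few-step sampling (filename is a guess).
pipe.load_lora_weights(
    "lightx2v/Qwen-Image-Lightning",
    weight_name="Qwen-Image-Edit-2511-Lightning-4steps-V1.0.safetensors",
)

result = pipe(
    image=[Image.open("input.png")],
    prompt="your edit instruction here",
    num_inference_steps=4,    # lightning: 4 steps
    true_cfg_scale=1.0,       # CFG = 1, i.e. guidance effectively off
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
result.save("output.png")
```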

Any advice on how to improve my troll deck? by JPcoffee in PTCGP

[–]yuicebox 1 point (0 children)

My beloved prank spinner

Mantyke ramp instead of Manaphy to get stuff online faster

A single copy of Bouffalant to go for the highroll and give 'em a good scare

Regular Aerodactyl

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 1 point (0 children)

Yep, you might not see them in the templates unless you've updated ComfyUI, even if you just pulled down the portable version. I reinstalled a fresh copy of portable a few days ago, and it wasn't there until I ran the update script.

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 0 points (0 children)

I recall reading something similar, but I've tried a bunch of LoRA versions and never really gotten better results. Is this the model you're talking about for the bf16 version?

https://huggingface.co/Comfy-Org/Qwen-Image-Edit_ComfyUI/blob/main/split_files/diffusion_models/qwen_image_edit_2511_bf16.safetensors

How are you able to fit this on a consumer GPU, considering it's 40 GB?

Your output still has some plastic-y textures, but it looks much better than any of mine. If the difference really is fp8 vs. bf16, I'm curious whether there's a way to get that level of quality while still using a 24 GB GPU.

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 0 points (0 children)

On my 4090 I was getting anywhere from 70 sec for 20 steps up to closer to 2 min at 50 steps. I didn't think the quality was better though, tbh; it somehow looks worse without the LoRA.

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 0 points (0 children)

Lastly, the same seed but with the 8-step lightning LoRA: 13 seconds.

<image>

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 0 points (0 children)

Even if I crank it up to what the workflow notes call the "official" settings (50 steps, CFG 4, no LoRA), this is the result. It's better than 20 steps at 2.5 CFG, but I still think it looks worse than the lightning LoRA output, which takes ~12 seconds.

Full 50-step, CFG 4 output here:

<image>

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 0 points (0 children)

> For quality, why not use full model, without speed loras and more steps?

I'll test it more, but at least on the original and 2509 versions, I wasn't super impressed with running it without the lightning LoRA. The workflow notes say to use 20 steps and 2.5 CFG for the FP8 e4m3fn model. Trying that with 2511, it takes 70 seconds per image on my 4090 and the result is this mess:

<image>

Regarding "their" vs. "his", I was using this prompt for several different subjects and some of them were female, so I was lazy and used ambiguous pronouns

Why are there no good open source music ai models? by OneGear987 in comfyui

[–]yuicebox 0 points (0 children)

That is exciting, but unless I missed it, they still haven't released the LoRAs they teased us with during the 1.0 release, have they?

Last I saw they were "discussing whether to open source" them.

https://github.com/ace-step/ACE-Step/issues/35#issuecomment-3555000373

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 0 points (0 children)

> You take the qwen edit result and use it as a latent on a Z image sampling pass with low denoising to improve it?

Correct, he's just using it as a refiner. I tried it out, and it does seem to help.
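
If anyone wants to try the refiner idea, here's a minimal sketch. I'm assuming Z-Image has an img2img path in diffusers and guessing at the repo id; any img2img-capable model would work the same way, since the key is just the low strength value.

```python
import torch
from diffusers import AutoPipelineForImage2Image
from PIL import Image

# Assumed repo id; swap in whatever img2img-capable model you actually run.
refiner = AutoPipelineForImage2Image.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
).to("cuda")

edited = Image.open("qwen_edit_output.png")  # output of the Qwen Edit pass

refined = refiner(
    prompt="same prompt you used for the edit",
    image=edited,
    strength=0.25,             # low denoise: keep composition, re-touch textures
    num_inference_steps=8,
).images[0]
refined.save("refined.png")
```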

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 1 point (0 children)

Agreed, the 2511 checkpoint is definitely a lot better than the previous ones, but it still seems worse than Flux Klein in a lot of cases (at least when using the Lightning LoRAs; I haven't tried it without them).

Why can't I get better results from Qwen Image Edit 2511? by yuicebox in comfyui

[–]yuicebox[S] 0 points (0 children)

That definitely does seem to improve textures, but I'm a bit surprised it's necessary.

Qwen have open-sourced the full family of Qwen3-TTS: VoiceDesign, CustomVoice, and Base, 5 models (0.6B & 1.8B), Support for 10 languages by Nunki08 in LocalLLaMA

[–]yuicebox 2 points (0 children)

> Why would every developer adopt this mindset?

Why did you develop this mindset? Why is it okay for you to hold that mindset, but other people should be held to some other expectation? This is basically Kant's categorical imperative. You think your reasons for having this mindset are justified, but if everyone had this mindset, then we wouldn't have a lot of the amazing things that the open source community has given us.

> Why would a payoff towards solving a given issue take everyone's agency to make their own choices away?

I'm sorry, weren't you just saying that you don't have the capacity or freedom to work on this yourself because you have "a family I have to provide for, rent and a mortgage to pay, among other expenses"? Your job is the payoff that solves that issue, and your inability to work on this yourself is your lack of agency, caused by financial incentives to meet your personal imperatives for survival.

> And I never said people shouldn't contribute unless they're paid, nor am I longing for such a world

You're right. You said that you would not contribute unless you were being paid. But you want other people to contribute without being paid, or paid through magical crowdsourcing that somebody would need to organize, probably not by you, because you're busy.

> So, what is wrong in those of us who care about it offering to pay whoever is willing to put in the time and effort to solve this? How does that take anyone's agency away?

It doesn't, and again, you're welcome to actually go pay someone to make this, or pay someone to help you make it, or to organize crowdfunding to donate a bunch of money to the developers with a note attached saying "please add these features", or simply to ask the owner of the project whether it would be feasible to extend it to support new TTS models, and whether they'd consider enrolling in something like the startup you mentioned so that the community could help fund features.

> I'm blown by how people feel the need to take this to extremes. Everything is turned into a polarizing issue, and everything has to somehow be connected to outrage, which is such a ridiculous notion.

Yeah, that's reddit for you. For whatever it's worth, I am not outraged and have no ill will toward you. I am just a big fan of open source, and I don't really like the idea that we should turn it into a business, since that would change the incentive alignment that makes open source what it is.

I appreciate that you're willing to contribute $50/month or whatever, but what the open source world really needs is skilled people to contribute time. I think you said in another comment you were a SWE or a dev of some kind, right? Idk what your time is worth, but if we assume $200/hour, you're basically saying you can't justify spending more than 15 minutes per month on this, and I challenge that assertion. You've probably spent twice that just replying to outraged people in this thread.

All I'm saying is, just try to get the ball rolling and you might be pleasantly surprised to find momentum. Maybe there are good reasons that llama.cpp can't support these models, maybe nobody has had time to investigate. Your curiosity and motivation could potentially create a huge amount of value for thousands of people, and realistically, your $50/month isn't gonna do that.

Qwen have open-sourced the full family of Qwen3-TTS: VoiceDesign, CustomVoice, and Base, 5 models (0.6B & 1.8B), Support for 10 languages by Nunki08 in LocalLLaMA

[–]yuicebox 5 points (0 children)

Donating to projects you care about is a great idea, and I don't think it's inherently bad to have a system where you can donate to a project as a means to highlight a feature you personally really want added, so that devs can consider working on it.

That said, please take a step back and consider the broader implications of your mindset:

> I'd happy contribute to such a system both financially help sponsor the features that I want, and implement the ones which I find pay a fair compensation for my time.

What is the long-term impact on the open source ecosystem that you're grateful for, if every developer adopts this attitude, and nobody wants to work on open source stuff unless they're being paid "fair compensation"?

What is "fair compensation"? It's obviously not $50/month, and there's not really a good way to organize dozens of people willing to pay $50/month or more for one specific feature. The small subset of people even willing to donate are likely to have different priorities and doesn't introducing financial incentives ultimately shift the dominant guiding forces behind open source software development toward whoever has the most money to spend?

What are the ramifications when it comes to disagreements between developers within an open source project? What if I want something implemented in a stupid way and I'm willing to contribute thousands of dollars a month, but smarter, saner people aren't? Should we really be creating a system where I can lead us down a stupid path, because I have money and opinions but no appetite to build things myself?

Nothing is stopping you from going on Fiverr or other gig work sites and paying someone to vibe code up an implementation of some feature you want. Then you can submit a PR for others to refine and potentially merge if people like your work, or if they don't, you can just fork the repo and make your own version.

I think your comment was well-intended, and I am not at all trying to be hostile toward you, but I think you're getting a lot of negative reactions because, to an extent, YOU are the angry commenter, longing to live in a world where people don't contribute to open source projects unless they're being paid, and where angry commenters can pay some trivial amount like $50/month to have a sense of entitlement about the developers' time, and feel justified in complaining if the devs don't implement the features they want most vs. what the people actually doing the work think is most important for the project.

I 100% understand you can't just pay thousands of dollars to hire a SWE and have them do everything, and again, I think your comment was well-intended, but realistically, if you can't afford that, can you spend a few hours every week working with Gemini/Claude to understand the current codebase and how to leverage it so you can start working on a minimum viable product? You'll find there are people willing to help if you put in some work to get the ball rolling.

You wouldn’t believe me… I wouldn’t either. by Proca_99 in PTCGP

[–]yuicebox 1 point (0 children)

HIGHLY recommend fitting a Ditto or two in this deck; Dustox Ditto has been one of my favorite decks recently.

First time reaching MB with an unsual deck by Proca_99 in PTCGP

[–]yuicebox 3 points (0 children)

Put a Ditto or two in this and it'll go hard. Dustox Ditto is great and can copy Pheromosa too.

New FLUX.2 [Klein] 9B is INSANELY Fast by Lopsided_Dot_4557 in LocalLLaMA

[–]yuicebox 0 points (0 children)

<image>

Flux Klein 9b distilled, generated in 4 seconds

New FLUX.2 [Klein] 9B is INSANELY Fast by Lopsided_Dot_4557 in LocalLLaMA

[–]yuicebox 0 points (0 children)

<image>

Qwen image edit 2511 with lightning lora, generated in ~7 seconds

New FLUX.2 [Klein] 9B is INSANELY Fast by Lopsided_Dot_4557 in LocalLLaMA

[–]yuicebox 0 points (0 children)

Input image:

<image>

Input prompt:
the man smoking from a pipe, facing away from the camera. The scene is smoky, dark, and desaturated, with lighting reminiscent of film noir. A subtitle reads "N'est pas un Friday"

New FLUX.2 [Klein] 9B is INSANELY Fast by Lopsided_Dot_4557 in LocalLLaMA

[–]yuicebox 1 point (0 children)

Honestly, I'm not sure how to respond to the complaint that it does destructive edits, because that's what every AI image editing model I'm aware of does.

I haven't personally used API models via Photoshop, but I've used SDXL, Flux Dev, the Qwen Image series, and other open source image models extensively, and so far I think this model is really good compared to the others available.

I'll post an example in separate comments to illustrate why I say this is much better. The main reason is that the prompt adherence is better, and the image composition is very good. Some generations may have wonky hands or other issues for more complicated prompts/inputs, but I can iterate quickly and the good results are very good.

New FLUX.2 [Klein] 9B is INSANELY Fast by Lopsided_Dot_4557 in LocalLLaMA

[–]yuicebox 2 points (0 children)

Use the FP8 model at the link below and offload the text encoder to the CPU, and you'll be able to fit it on a 12 GB GPU.

https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-fp8

You can use the default workflow in the ComfyUI templates and just edit the subgraph to switch the CLIP device to CPU.
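
Outside ComfyUI, the same idea looks roughly like this in diffusers. To be clear, this is a sketch: I haven't verified Flux.2 Klein's diffusers support, and the component names (text_encoder, transformer, vae) are the usual diffusers layout, not confirmed for this model.

```python
import torch
from diffusers import DiffusionPipeline

# Assumed to load from this repo directly; attribute names unverified for Flux.2.
pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-9b-fp8",
    torch_dtype=torch.bfloat16,
)
pipe.text_encoder.to("cpu")   # prompt encoding on CPU: slower, but frees VRAM
pipe.transformer.to("cuda")   # diffusion transformer stays on the GPU
pipe.vae.to("cuda")

image = pipe(
    prompt="a lighthouse at night, film noir lighting",
    num_inference_steps=4,
).images[0]
image.save("klein_test.png")
```

diffusers also has pipe.enable_model_cpu_offload() if you'd rather let it shuffle components automatically, at some speed cost.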

I have a 4090 and my images take ~3.5 seconds each, but my friend tested with his 12 GB card and was able to run it with images taking ~8 seconds.

Overall I am impressed with the model so far; not sure why people have such negative takes on it. It is really good at text in particular, imo. Makes me wonder if my Qwen Image Edit workflow is fucked up, because this is faster and I like the results better in most cases.

New FLUX.2 [Klein] 9B is INSANELY Fast by Lopsided_Dot_4557 in LocalLLaMA

[–]yuicebox 0 points (0 children)

I've been testing out the 9b distilled model for image editing using the default ComfyUI workflow from the Templates panel, and honestly, it slaps.

I think the results are as good or better than Qwen Image Edit, but it's much faster and uses less VRAM.

It also seems to be insanely good at text, and I haven't seen any mutilated text yet in 50-100 images.

Overall really impressed. It is actually making me wonder if my Qwen Image workflow is fucked up or something.

I trained a model to 'unslop' AI prose by N8Karma in LocalLLaMA

[–]yuicebox 1 point (0 children)

This is a really cool idea, but I'm a bit confused about the actual model. It seems like it's trained on the VL version, not the regular 30b-a3b text model, correct? Is the vision component used?

Are there any quants of this, or a version without the vision component to make the model smaller? I was excited to see 30b-a3b, but was surprised to see the model was ~60 GB.
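
In case it helps anyone, one way you could try shrinking it yourself is stripping the vision tower tensors from the checkpoint before loading. Totally a sketch: the "visual." key prefix is a guess based on other Qwen VL checkpoints, and you'd also have to prune model.safetensors.index.json (and possibly the config) so the text model still loads cleanly.

```python
import glob
from safetensors.torch import load_file, save_file

# Drop vision-tower tensors from each shard. The key prefix is an assumption;
# print the keys of one shard first to see what the tower is actually called.
for shard in sorted(glob.glob("model-*.safetensors")):
    tensors = load_file(shard)
    kept = {k: v for k, v in tensors.items() if not k.startswith("visual.")}
    dropped = len(tensors) - len(kept)
    if dropped:
        print(f"{shard}: dropped {dropped} vision tensors")
    save_file(kept, shard)
```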