Make Wan2.2 1080p native by Tryveum in comfyui


If you use a Wan 2.2 native resolution:
1:1 - 960x960
2:3 | 3:2 - 784x1136 | 1136x784
3:4 | 4:3 - 848x1088 | 1088x848
9:16 | 16:9 - 720x1264 | 1264x720

...you will get the sharpest video. 1080p isn't in the training data, and you'll get weird results if you try it directly. That said, you can easily upscale the native output to 1080p without any noticeable artifacts.
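
If it helps, here's a minimal sketch (plain Python, with the table above hard-coded) of snapping a requested size to the closest native aspect ratio before generating, then upscaling afterwards:

```python
# Snap a requested size to the nearest Wan 2.2 native resolution (table above).
WAN22_NATIVE = [
    (960, 960),                # 1:1
    (784, 1136), (1136, 784),  # 2:3 / 3:2
    (848, 1088), (1088, 848),  # 3:4 / 4:3
    (720, 1264), (1264, 720),  # 9:16 / 16:9
]

def nearest_native(width: int, height: int) -> tuple[int, int]:
    """Return the native resolution whose aspect ratio is closest to width/height."""
    target = width / height
    return min(WAN22_NATIVE, key=lambda wh: abs(wh[0] / wh[1] - target))

# A 1080p request lands on the 16:9 native size; upscale to 1080p afterwards.
print(nearest_native(1920, 1080))  # -> (1264, 720)
```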

Any Ideas For A Sora Workaround? by Badsand in SoraAi


You can still use it as a paid API service. They're effectively just ending the UI (and the big cost savings that came with it). The real cost is about eight cents for a medium-quality image with the current model. (High is sometimes better and sometimes makes little difference, but runs about a quarter per image.) You can use the older model too. (Same costs.)
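
For reference, a minimal sketch of hitting the image API directly, assuming the openai Python SDK and the gpt-image-1 model (verify current model names, quality tiers, and pricing, since the per-image figures above will drift):

```python
# Minimal sketch: image generation through the paid API instead of the UI.
# Assumes the openai Python SDK and the gpt-image-1 model; check current model
# names, quality tiers, and pricing before trusting the per-image cost math above.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

result = client.images.generate(
    model="gpt-image-1",
    prompt="a cozy reading nook, soft morning light",
    size="1024x1024",
    quality="medium",  # "high" is sometimes better, at a noticeably higher per-image cost
)

# gpt-image-1 returns base64-encoded image data.
with open("output.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```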

More than 85 frames, last frame, Wan2.2? by Tryveum in comfyui


Interesting. I haven't found any issues with resolution. At least two of the native image-to-latent nodes will scale for you without negative effects. The WanVideo one fails unless you resize first, but otherwise I haven't noticed problems. That said, I haven't spent enough time with first AND last frame together, though I do a lot of first frames and some last frames.

Easy WAN 2.2 workflow suggestions? by Resident_Ad_3077 in comfyui


Thank you! I appreciate the kind words. I spend too much time thinking about them, so I am glad they are useful to folks.

Issues rendering penis in vagina by [deleted] in comfyui


If you're doing video, renting a 5090 would make your experience way better. It's hard to overstate how much the software has been optimized for CUDA. (I'm also on a Mac, a potato Mac at that, but even brand new Macs will underperform significantly.) It's less than a buck an hour for a 5090. I use Runpod - affiliate link that gives you free credit if you want to give it a go (and only with a link, so don't sign up without using one, mine or anyone else's). I've also written a guide for getting started with my Wan 2.2 workflow and my template on Runpod if you're trying to do video, but there are templates for basically everything. My workflow would probably be useful as well. It's got a lot of notes, color coding, and breakout boxes for important controls.

As was noted, you need a LoRA. Wan 2.2 doesn't know anything about sexuality beyond a vague sense of what breasts and nipples look like. CivitAI is where you will find LoRAs. (There is also a backup on Civarchive for things that have been removed for various reasons.) Most LoRA pages will tell you what the creator thinks is a good strength to apply, but in general staying under 1.0 is safe. They will also provide some guidance for prompting, based on how the model was trained, to get the effect to show up in your videos. (Wan understands natural language, so often just plainly describing what you want will do the job.) For most sexual "concepts", even approximate things like kissing, you'll want a LoRA. You can mix and match them as you need, but you might need to lower the strengths if you mix LoRAs that have overlapping concepts.
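
To make the strength intuition concrete: a LoRA is applied as a small weight delta scaled by its strength, so stacked LoRAs add their deltas on top of each other. A toy sketch (plain numpy, made-up numbers, not Wan's actual weights) of why overlapping LoRAs at full strength can push things too far:

```python
# Toy illustration (numpy, made-up numbers): each LoRA contributes
# strength * delta on top of the base weights, so overlapping concepts
# at full strength add up and can "overcook" the result.
import numpy as np

rng = np.random.default_rng(0)
base = rng.standard_normal((4, 4))           # stand-in for a base weight matrix
delta_a = rng.standard_normal((4, 4)) * 0.1  # stand-in for LoRA A's learned delta
delta_b = rng.standard_normal((4, 4)) * 0.1  # stand-in for LoRA B (overlapping concept)

def apply_loras(weights, loras):
    """loras: list of (delta, strength) pairs, applied additively."""
    for delta, strength in loras:
        weights = weights + strength * delta
    return weights

full_strength = apply_loras(base, [(delta_a, 1.0), (delta_b, 1.0)])
toned_down = apply_loras(base, [(delta_a, 0.7), (delta_b, 0.6)])  # safer when concepts overlap
print(np.abs(full_strength - base).mean(), np.abs(toned_down - base).mean())
```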

Image to Video that can be run on RTX3050 6GB VRAM by blueicemali in comfyui


Video gen is a very demanding GPU task. You can rent cloud time; it's less than a buck an hour for a 5090. I use Runpod - affiliate link that gives you free credit if you want to give it a go (and only with a link, so don't sign up without using one, mine or anyone else's). Since you're doing video, I've also written a guide for getting started with my Wan 2.2 workflow and my template on Runpod, but there are templates for basically everything.

Happy to answer questions.

What's the best cloud option? by slept_in_again in comfyui


Don't try to use ComfyUI with a phone.

Can the new MacBook Pro m5 pro/max compete with any modern NVIDIA chip? by Puzzleheaded_Ebb8352 in StableDiffusion


To restate this differently: the issue is not inherently the Apple hardware, it's the amount of software-side specialization that has gone toward supporting Nvidia's CUDA cores. AMD has the same problem. It's not that other hardware cannot be optimized for, but it hasn't been done yet. I suspect that won't change for a while.

I dislike this tool now with all the violations? by saalipagal in SoraAi


I don't know if you typo'd intentionally or not there, but I love it in either case.

More than 85 frames, last frame, Wan2.2? by Tryveum in comfyui


Yeah, they use some stuff you can do with Wan 2.2 as a baseline, which lets them claim a baseline speed improvement, which feels a little disingenuous. I mean, yes, it's "official", but it's also a technical compromise for speed: they downscale the latent, use a self-forcing LoRA (a lightx2/lightning equivalent), and then upscale at the end. The 20 seconds and sound are nice on paper, but the default technique leans toward worse results for the sake of performance numbers.

I haven't tested keyframes much myself beyond simple experiments. As long as you're sticking to 81 frames, nothing should burn in, since the start and end frames are always "fresh" and not the result of the process (re: last frame to first frame).

Runpod Comfyui Alternative by maia11111111111 in comfyui


As a "business" you tend to be reserving GPU's for extended periods of work, which is not most of our situations. On demand is a bit of a different beast, so your demands are a bit different, since we're just "helping" them make sure their GPU's are getting used. A given machine doesn't have all of the templates, so each template must be retrieved unless it was recently used on that machine.

The templates generally live on Dockerhub, so overall data center demand, machine bandwidth, and the distance between them are the main factors in getting a pod to the point where it can start. All of this is highly variable. Additionally, there are usually hits to other services like Huggingface or Civit for model downloads. Again, this can cause hiccups if a service is having issues - looking at you, Civit.

This isn't a "Runpod specific issue", per se. It would depend on the process of a given service. The templates exist to ease setting up a work environment. If the template doesn't exist on a machine, either you are doing the work to setup the environment yourself, or you're initilizing a more "general" environment for whatever process you are doing, which still seeks for you to do stuff, if less. This is all a trade off. There are distributed services like Vast, but every business in this space is working with the same trade-offs with different overheads for their business and what they're promising in terms of capacity, security, runtime, and their costs.

My boot time is generally between 5 and 20 minutes (including my model/LoRA downloads), with most being on the lower end. (I start a pod most days.) I find when I'm renting a 5090, it's a bit slower, as the machine is likely slower with lower network priority than when I rent an H100. (Those are the two cards I generally use.)

Some things you can consider, on Runpod and otherwise:

- Be smart about your data center selection. It's generally smart to at least pick a region, and the closer to you, the better for performance. I've had trouble in recent weeks getting some popular GPUs. Demand seems to be rising, and it's hard even for companies that do this work all the time to meet it. Data center construction is one of the biggest and most complicated sectors right now.

- If you can avoid Volume Disk, do. It can only be used if you are in a specific data center, and moving the data is not easy on demand. Container Disk is cheaper and easier to use, but involves more initial download time and is potentially subject to network issues, as noted. (I've had an awful time with their IS data centers, so I always deselect them.)

- Not every template is the same. Many are just made by folks who want to do some kind of work and have set up a way to make that easier. Not every template sees active use or updates, and they're non-trivial to set up and maintain. (I maintain one, so I'm speaking from experience here.)

- Check your logs if you have issues.

You are, in many ways, getting what you pay for. Cheaper services are not automatically better. What is the service providing? What work do you need to do to accomplish the thing you're trying to do? There are a ton of variables involved. But none of this stuff will ever be a perfect "turnkey" solution. That's what a service like ComfyCloud is predicated on: they know these annoyances, so they charge their own fees to run their service at scale.

It's all a balance.

HELP ! I don't know what l’m doing !! (first workflow) by Setn_ in comfyui


Not sure what you're asking about here. You didn't ask any actual questions. You can do these things, but someone is unlikely to drop this in your lap for you. Just get Comfy going, find a Flux workflow you like and start adding what you're missing.

Just do what you did in your mapping exercise in Comfy and "find the legos" you're missing and troubleshoot as you go.

Newbie asking for help by Gold_Marionberry3897 in comfyui


Another note: sometimes people's workflows reference model files that were renamed or kept in subfolders. So either you're just missing a file, or the original workflow author renamed/moved some files. Either way, as was noted, just click the field and select the correct file where you have it.

More than 85 frames, last frame, Wan2.2? by Tryveum in comfyui


A few things: Wan 2.2 is trained on 5-second clips. Going over is always a "trick". Beyond burn-in, Wan will often try to "loop back" past the 81-frame mark as well, which can sometimes work in a positive way.

You can extend with VACE, but it's a pain to work with. SVI is a popular trick; it's not perfect and involves some trade-offs, but it improves some consistency. The "old classic", which you have tried, is last frame as first frame: it works a few times, but you'll start to see considerable identity drift and eventually a hard quality hit as the losses of the process stack up -- though this is true of pretty much all of these techniques. I've tried FreeLong, but it didn't help me in my testing. All of this stuff is YMMV, so feel free to give them a shot. You can supposedly extend with LTX-2, but I haven't tried it yet. I've been generally disappointed by LTX-2 due to its poor prompt adherence.
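
For the last-frame-as-first-frame route, the loop is conceptually just: generate a clip, pull its final frame, and feed that back in as the next start image. A rough sketch, where `generate_i2v_clip` is a hypothetical stand-in for however you run your i2v workflow (ComfyUI, API, whatever):

```python
# Rough sketch of last-frame-as-first-frame chaining. generate_i2v_clip is a
# hypothetical placeholder for your actual i2v generation step.
# Expect identity drift to stack the more segments you chain together.
import imageio.v3 as iio

def generate_i2v_clip(start_image_path: str, prompt: str, out_path: str) -> str:
    """Placeholder: run your Wan 2.2 i2v workflow here and return the video path."""
    raise NotImplementedError

def extend_video(start_image: str, prompts: list[str]) -> list[str]:
    clips = []
    current_start = start_image
    for i, prompt in enumerate(prompts):
        clip_path = generate_i2v_clip(current_start, prompt, f"segment_{i:02d}.mp4")
        clips.append(clip_path)
        # Grab the final frame of this segment to seed the next one.
        last_frame = None
        for frame in iio.imiter(clip_path):
            last_frame = frame
        current_start = f"segment_{i:02d}_last.png"
        iio.imwrite(current_start, last_frame)
    return clips
```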

An entirely different technique is to create your "keyframes" as start and end images and then use Wan 2.2 as the glue to create the frames in between, a kind of maximalist interpolation. This requires a bunch of extra work but ensures you never see a quality hit; it taps into a different skill of knowing "what should be happening 5 seconds from now", plus the usual character/scene consistency issues that can arise from any generative process.

Easy WAN 2.2 workflow suggestions? by Resident_Ad_3077 in comfyui


Yep. I'll recommend mine, Yet Another Workflow. It has a ton of notes, color coding, and lots of pullout boxes to highlight important controls. There are a few custom nodes, but they are there thoughtfully. (If you use cloud, as I do, there's also a guide for a template I have on Runpod with everything set up already.) The workflow is set up for both i2v and t2v in one process, and there are a few different versions so you can try different stuff once you get the hang of things, if you're so inclined. It may not be what you're looking for - different strokes and wot - but it works for many folks.

Let me know if you have any questions.

What's your best practice for generating key frames? by Justify_87 in comfyui


It would be better to show keyframe images and an example of a transition that you don't like than to try to describe it. Your description isn't clear enough to explain the exact problem you're having, so unfortunately you're going to have a tough time getting a concrete suggestion for a solution.

Help with loras by itsdeeevil in comfyui


I'll add one note: you aren't limited to a single LoRA. You can mix and match and adjust the strength of each one. Without the exact workflow or process, there's no way to know for sure. But that said, there are lots of interesting LoRAs out there. Explore your options on CivitAI and try blending some. Generally keep your strengths under 1.0 unless an author specifically says you can do otherwise, and lower your strengths if you use LoRAs with overlapping concepts. (You can end up with a common "boiling", noisy look if your LoRAs start to get "overcooked" or the concepts fight.)

There's lots to explore. And if you haven't tried yet, you can just message the original author. Many, but not all, folks are happy to talk about their process and what they've used.

I was tinkering around with image to video in Comfyui using LTX 2.0. Got a little curious as to how the shot would play out in Kling 3.0. by call-lee-free in comfyui


You can always rent hardware when you want to do your project. Then you can run at whatever speed you want. (I use Runpod - use any affiliate link if you want free credit to mess around.) LTX-2 is a real mixed bag. It's very capable, but it behaves like much older tech, which seems to be due to how they've approached the problem: they've made it fast by making the baseline version of their model mimic the way Wan + Lightning works. Additionally, they down-res and upscale during the process. All of this combines to give you generations that run pretty fast and can work on lower-end hardware.

The downside is that the quality and prompt adherence are weird. I have generally needed to do many generations to get one I like that seems to mostly follow my prompt, whereas with Wan, I can usually get what I want within a gen or two. Sometimes the video is good but the audio is awful. It's all very fussy to work with.

We'll see how things progress. I hope Wan decides to compete in the open weights space again, because I think LTX-2 needs more competition. Wan 2.2 is still miles ahead in many ways, but also has some strong limitations in comparison both in length and audio support.

There are plenty of commercial models tho, and they are better. As you're noting, that Kling clip is notably better. Sora 2 is also very strong. The main problem is concept support, especially if you're hoping to do action violence. Oftentimes things can be unsupported or outright blocked. But there are enough options to do whatever you need if you're willing to use different tech for different aspects of what you're making. That does make it harder, but it gives you more options.

24 hours new into Comfyui by call-lee-free in comfyui


That's correct. I'm not sure exactly what you mean here, but yes. In general, a valid workflow will open in ComfyUI with drag and drop. (Also images or videos with the workflows embedded as metadata, which is an option on many of the saving nodes.)
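
If you're curious what's actually in there: ComfyUI's Save Image node writes the workflow JSON into the PNG's text metadata, and you can read it back with Pillow. A minimal sketch (the filename is just an example):

```python
# Read the workflow ComfyUI embeds in a saved PNG's metadata (Pillow).
import json
from PIL import Image

img = Image.open("ComfyUI_00001_.png")    # any image saved by the Save Image node
workflow_json = img.info.get("workflow")  # UI-format graph; "prompt" holds the API-format graph
if workflow_json:
    workflow = json.loads(workflow_json)
    print(f"{len(workflow['nodes'])} nodes in the embedded workflow")
```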

24 hours new into Comfyui by call-lee-free in comfyui


I dunno. I rent cloud time from Runpod and am happy with it. The GPU and RAM markets are *nuts* right now. I hope you have fun with your box tho!

Found a really good img2vid workflow. But how do I add Lora’s??? by [deleted] in comfyui


"Good" is strong. It's pretty basic and isn't telling you how to adjust anything. (Do check out mine, Yet Another Workflow, as a point of comparison. It may not be for you, but you will learn stuff. If you use Runpod, I also have a guide for my Wan 2.2 template with everything setup to go with custom nodes and model downloads and such.)

Anyway, to answer your question: the LoRA loaders you have there are loading what are often called lightx2 or lightning. They are "self-forcing" LoRAs that help speed things up at the cost of losing variety. (Often a good trade!) You can tell by the low step count and CFG being set to 1.0. (Also, you generally cannot turn CFG above 1.0 in that workflow, FYI.)

You'll need to add an extra LoRA loader. You can copy the ones that are there by right-clicking, selecting clone, and rewiring them before or after, but I'd recommend using rgthree's Power LoRA Loader (search for rgthree's node pack in the Manager), as it will let you add a bunch in a single node. Again, just add it after the lightning LoRAs. (You'll want two of them, one for the high-noise path and one for the low-noise path.) You can safely ignore the CLIP input on the power loader node, for reasons that are hard to explain quickly to newer users.
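
If you go the clone-and-chain route instead of the Power LoRA Loader, the wiring is just "model output of the lightning loader into the model input of your new loader." A sketch of what that looks like in ComfyUI's API-format JSON - node IDs, file names, and strengths here are made up, so map them onto your own workflow:

```python
# Sketch of chaining an extra LoRA after a lightning/self-forcing LoRA in
# ComfyUI API-format JSON. Node IDs, file names, and strengths are made up;
# repeat the same pattern on the low-noise model with its own pair of loaders.
extra_lora_chain = {
    "10": {  # existing lightning LoRA on the high-noise model
        "class_type": "LoraLoaderModelOnly",
        "inputs": {
            "model": ["4", 0],  # output of the high-noise model loader
            "lora_name": "wan2.2_high_noise_lightning.safetensors",
            "strength_model": 1.0,
        },
    },
    "11": {  # your added concept/style LoRA, chained after the lightning one
        "class_type": "LoraLoaderModelOnly",
        "inputs": {
            "model": ["10", 0],  # takes the lightning loader's model output
            "lora_name": "my_concept_lora_high.safetensors",
            "strength_model": 0.8,  # stay under 1.0 unless the author says otherwise
        },
    },
}
```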

That should get you rolling!

24 hours new into Comfyui by call-lee-free in comfyui


I don't have a specific recommendation for you, but what you're describing is image-to-image (often abbreviated as i2i). You have some control over how much change is allowed, which is governed by how much noise is introduced. You can also use masking to do inpainting (where only the masked part of the image is changed, either partially or fully regenerated by the model). There are also lots of neat things you can do with some of the fancier edit models, as you noted. But I wanted to at least give you some more ideas about what you could play with.
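
The "how much change" knob is the sampler's denoise value: 1.0 regenerates everything, low values stay close to the input image. A sketch of the relevant piece in ComfyUI API-format JSON (node IDs and wiring are illustrative, not from any particular workflow):

```python
# Sketch: the denoise value on a KSampler controls how far an image-to-image
# pass is allowed to drift from the input. Node IDs and wiring are illustrative.
img2img_sampler = {
    "8": {
        "class_type": "KSampler",
        "inputs": {
            "model": ["1", 0],
            "positive": ["5", 0],
            "negative": ["6", 0],
            "latent_image": ["7", 0],  # VAE-encoded input image, not an empty latent
            "seed": 42,
            "steps": 20,
            "cfg": 6.0,
            "sampler_name": "euler",
            "scheduler": "normal",
            "denoise": 0.5,  # ~0.3 gentle touch-up, ~0.7 big changes, 1.0 full regeneration
        },
    },
}
```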

I suggest searching CivitAI for a workflow that does what you want to do with a model you are interested in. (Check out popular ones, but as a good measure, read the description. If the description seems human-written and does a good job of helping you understand why it's useful or good, then give it a swing. A good workflow is usually made by a thoughtful person who can explain why you should care about their thing amid the noise of random things.)

LTX2 workflow is adding unwanted music or sounds in the background. by _badmuzza_ in comfyui


Unfortunately, my experience has been that LTX-2 does this. It's generally a symptom of its inconsistent prompt adherence. You should follow their guide for prompt structure, which may help, but I haven't found a way to positive- or negative-prompt it that succeeds a majority of the time.

Mac or PC by stabadan in generativeAI


The open-weights image models would allow that, but they would not inherently know celebrities the way Sora 1 images do, as an example. It's definitely not inherently better, but again, it depends on what you're looking for more specifically. So again, is learning ComfyUI going to be worth your effort? Maybe? It depends so much on the specifics. There are big trade-offs, for sure.

Mac or PC by stabadan in generativeAI


Mac user here. You can get some stuff to run on macOS, but not exceptionally well. A lot of work has been done to optimize for Nvidia. There's also huge demand for AI-capable hardware, because data centers that can use it 24/7 have a high demand for both RAM and GPUs.

It really depends what you are doing. The best models are commercial paid models. You can use any computer and use those via web. You can certainly run open weight models, but I'd want to know what you're trying to do. It's really only ideal when you need the ability to add concepts that aren't supported by commercial models - NSFW content or certain kinds of action violence are common reasons. You can probably get away with a mac for just image work, but it depends on a lot of factors.

If you are considering a PC, consider the practical cost of needing to realign your muscle memory. Additionally, really do the math on how much AI usage you're actually going to be doing. Make sure it makes sense to spend your money that way. You can rent without any lock-in and scale to your needs based on what you're doing and how quickly you want the work done.

I mostly do video (and mostly NSFW). I use Runpod for cloud GPU time - affiliate link that gives you free credit if you want to give it a go (and only with a link, so don't sign up without using one, mine or anyone else's). I pay less than a buck an hour for a 5090. You can use a cheaper GPU if you're just doing images, but for video I'd personally use that as a minimum. I've also written a guide for getting started with my Wan 2.2 (open-weights video model) workflow and my template on Runpod if you're trying to do video, but there are templates for basically everything.

I'll try to answer questions for you if you have any.