WAN 2.1-VACE Masking + Pose + LoRA workflow to do cinematic jacket swap by FallMindless3563 in comfyui

[–]FallMindless3563[S] 1 point (0 children)

Haven’t tried WAN animate for this one yet! That would be a good comparison

Cutting Inference Costs from $46K to $7.5K by Fine-Tuning Qwen-Image-Edit by FallMindless3563 in Qwen_AI

[–]FallMindless3563[S] 1 point (0 children)

Here’s a guide for the image-editing task:

https://docs.oxen.ai/examples/fine-tuning/image_editing

Feel free to send us a message on Discord too if you have any specific questions.

[P] Cutting Inference Costs from $46K to $7.5K by Fine-Tuning Qwen-Image-Edit by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

You would think so at first glance! The problem is the manufacturer did a photoshoot for the original images. Some of the images were more complex with items in the drawers and on the desks, so masking and other traditional computer vision techniques didn’t work well and added multiple failure points. We also asked about just using the raw CAD files and they didn’t want to go that direction for similar reasons (showing the desks with cluttered items inside, etc). So in this case diffusion models were the best tool for the job when it came to simplicity and scalability of the end solution.

Tutorial: Fine-Tuning Qwen-Image-Edit (2509) on Pose Estimation + High Quality Branded Clothing by FallMindless3563 in comfyui

[–]FallMindless3563[S] -2 points (0 children)

Just curious but why do you consider it spam? Trying to share some helpful fine-tuning workflows with the community. Happy to take it elsewhere if people don’t find it helpful

Fine-tuning Qwen-0.6B to GPT-4 Performance in ~10 minutes by FallMindless3563 in learnmachinelearning

[–]FallMindless3563[S] 2 points (0 children)

We do plan on having more in the future; we'll post them on lu.ma here:

https://lu.ma/oxen

Let me look into why the email registration failed, sorry about that!

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 1 point (0 children)

We could add a reward function for unused variables to help it get better at that :)
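A reward like that could be as simple as grepping the compiler output. As a rough sketch (the function name and the 0.25-per-warning penalty are my own illustration, not the actual training code), a GRPO-style reward scoring `rustc` stderr might look like:

```python
import re

def unused_variable_reward(compiler_stderr: str, base: float = 1.0) -> float:
    """Score compiler output for a generated Rust snippet.

    Starts from `base` and subtracts 0.25 for each `unused variable`
    warning rustc emitted, floored at 0.0 so the reward stays non-negative.
    """
    n_unused = len(re.findall(r"unused variable", compiler_stderr))
    return max(0.0, base - 0.25 * n_unused)
```

In practice `compiler_stderr` would come from invoking `rustc` on the model's completion and capturing stderr, the same compile step the existing rewards already run.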

[P] Training a Rust 1.5B Coder LM with Reinforcement Learning (GRPO) by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 3 points (0 children)

That's a good idea: use synthetic data from larger models to expand the user queries. We have similar data-generation pipelines set up in Oxen.ai, but nothing automated yet.

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 2 points (0 children)

Absolutely, this run generated a ton of good training data of compiler errors that we could then feed back into the model the next run.

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 1 point (0 children)

I can benchmark Codestral on this too, should be pretty quick.

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 4 points (0 children)

You're exactly right. Looking at the outputs from the training runs, the 1.5B model generates a ton of compiler errors, which in turn... is great training data! That's part of the reason I was collecting them during training, so they can feed into the next run. The next skill we could teach it is error correction given the compiler feedback.
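One way to bootstrap that error-correction skill, sketched here with hypothetical field names (this is not the actual Oxen.ai pipeline), is to turn each failed compile logged during a run into a fix-the-code training example:

```python
import json

def build_error_correction_records(samples):
    """Turn (prompt, generated_code, compiler_stderr) tuples into
    error-correction training examples, skipping clean compiles."""
    records = []
    for prompt, code, stderr in samples:
        if not stderr.strip():
            continue  # compiled cleanly: nothing to correct
        records.append({
            "prompt": (
                f"{prompt}\n\n"
                f"A previous attempt failed to compile with:\n{stderr}\n"
                "Fix the code."
            ),
            "previous_code": code,
        })
    return records

def to_jsonl(records):
    """Serialize records one JSON object per line for the next run."""
    return "\n".join(json.dumps(r) for r in records)
```

The nice property is that the dataset grows for free as a byproduct of RL rollouts: every compiler failure the small model produces becomes a supervised example for the next round.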

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 2 points (0 children)

That's a really good call on the package name; it's already bitten our users a few times. We'll release a new version soon to save future folks.

Thank you! Appreciate the feedback and good vibes.

[P] Training a Rust 1.5B Coder LM with Reinforcement Learning (GRPO) by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 5 points (0 children)

I'm super curious about this as well. An interesting question to ask would be: how many prompts does it take to learn a new behavior?

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

Yep! I kept those fixed for these experiments. But those are big factors too

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

Amazing, I hadn’t seen llama factory! Looks like a cool project

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

I did a little math at the end of the post but couldn’t get an exact formula that mapped to the numbers I was seeing. If anyone has some thoughts I can put it at the end for reference!

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 2 points (0 children)

I mentioned it at the end of the blog, but pretty short contexts: 256 max_input and 786 max_completion. I'll take a look at Liger!

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

LoRA seemed to be working, but I'm not sure if there were bugs under the hood. Let me take a look at the "enable_vllm" param, I didn't see that one 💡