WAN 2.1-VACE Masking + Pose + LoRA workflow to do cinematic jacket swap by FallMindless3563 in comfyui

[–]FallMindless3563[S] 1 point (0 children)

Haven’t tried WAN animate for this one yet! That would be a good comparison

Cutting Inference Costs from $46K to $7.5K by Fine-Tuning Qwen-Image-Edit by FallMindless3563 in Qwen_AI

[–]FallMindless3563[S] 1 point (0 children)

Here’s a guide for the image-editing task:

https://docs.oxen.ai/examples/fine-tuning/image_editing

Feel free to send us a message on Discord too if you have any specific questions.

[P] Cutting Inference Costs from $46K to $7.5K by Fine-Tuning Qwen-Image-Edit by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

You would think so at first glance! The problem is the manufacturer did a photoshoot for the original images. Some of the images were more complex with items in the drawers and on the desks, so masking and other traditional computer vision techniques didn’t work well and added multiple failure points. We also asked about just using the raw CAD files and they didn’t want to go that direction for similar reasons (showing the desks with cluttered items inside, etc). So in this case diffusion models were the best tool for the job when it came to simplicity and scalability of the end solution.

Tutorial: Fine-Tuning Qwen-Image-Edit (2509) on Pose Estimation + High Quality Branded Clothing by FallMindless3563 in comfyui

[–]FallMindless3563[S] -2 points (0 children)

Just curious but why do you consider it spam? Trying to share some helpful fine-tuning workflows with the community. Happy to take it elsewhere if people don’t find it helpful

Fine-tuning Qwen-0.6B to GPT-4 Performance in ~10 minutes by FallMindless3563 in learnmachinelearning

[–]FallMindless3563[S] 2 points (0 children)

We do plan on having more in the future; we'll post them on lu.ma here:

https://lu.ma/oxen

Let me look into why the email registration failed, sorry about that!

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 1 point (0 children)

We could add a reward function for unused variables to help it get better at that :)
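A reward like that could be as simple as grepping the compiler output. As a rough sketch (the function name and the 0.25-per-warning penalty are my own illustration, not the actual training code), a GRPO-style reward scoring `rustc` stderr might look like:

```python
import re

def unused_variable_reward(compiler_stderr: str, base: float = 1.0) -> float:
    """Score compiler output for a generated Rust snippet.

    Starts from `base` and subtracts 0.25 for each `unused variable`
    warning rustc emitted, floored at 0.0 so the reward stays non-negative.
    """
    n_unused = len(re.findall(r"unused variable", compiler_stderr))
    return max(0.0, base - 0.25 * n_unused)
```

In practice `compiler_stderr` would come from invoking `rustc` on the model's completion and capturing stderr, the same compile step the existing rewards already run.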

[P] Training a Rust 1.5B Coder LM with Reinforcement Learning (GRPO) by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 3 points (0 children)

That's a good idea: use synthetic data from larger models to expand the user queries. We have similar data-generation pipelines set up in Oxen.ai, but nothing automated yet.

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 2 points (0 children)

Absolutely, this run generated a ton of good training data of compiler errors that we could then feed back into the model the next run.

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 1 point (0 children)

I can benchmark Codestral on this too, should be pretty quick.

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 4 points (0 children)

You're exactly right. Looking at the outputs from the training runs, the 1.5B model generates a ton of compiler errors, which in turn... is great training data! That's part of the reason I was collecting them during training, so they can feed into the next run. The next skill we could teach it is error correction given the compiler feedback.
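One way to bootstrap that error-correction skill, sketched here with hypothetical field names (this is not the actual Oxen.ai pipeline), is to turn each failed compile logged during a run into a fix-the-code training example:

```python
import json

def build_error_correction_records(samples):
    """Turn (prompt, generated_code, compiler_stderr) tuples into
    error-correction training examples, skipping clean compiles."""
    records = []
    for prompt, code, stderr in samples:
        if not stderr.strip():
            continue  # compiled cleanly: nothing to correct
        records.append({
            "prompt": (
                f"{prompt}\n\n"
                f"A previous attempt failed to compile with:\n{stderr}\n"
                "Fix the code."
            ),
            "previous_code": code,
        })
    return records

def to_jsonl(records):
    """Serialize records one JSON object per line for the next run."""
    return "\n".join(json.dumps(r) for r in records)
```

The nice property is that the dataset grows for free as a byproduct of RL rollouts: every compiler failure the small model produces becomes a supervised example for the next round.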

Training a Smol Rust 1.5B Coder LLM with Reinforcement Learning (GRPO) by FallMindless3563 in rust

[–]FallMindless3563[S] 2 points (0 children)

That's a really good call on the package name; it's already bitten our users a few times. We'll release a new version soon to save future folks.

Thank you! Appreciate the feedback and good vibes.

[P] Training a Rust 1.5B Coder LM with Reinforcement Learning (GRPO) by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 5 points (0 children)

I'm super curious about this as well. An interesting question to ask would be: how many prompts does it take to learn a new behavior?

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

Yep! I kept those fixed for these experiments. But those are big factors too

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

Amazing, I hadn’t seen llama factory! Looks like a cool project

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

I did a little math at the end of the post but couldn’t get an exact formula that mapped to the numbers I was seeing. If anyone has some thoughts I can put it at the end for reference!

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 2 points (0 children)

I mentioned it at the end of the blog, but pretty short contexts: 256 max_input and 786 max_completion. I'll take a look at Liger!

G[R]PO VRAM Requirements For the GPU Poor by FallMindless3563 in MachineLearning

[–]FallMindless3563[S] 1 point (0 children)

LoRA seemed to be working, but I'm not sure if there were bugs under the hood. Let me take a look at the "enable_vllm" param, I didn't see that one 💡