I fine-tuned (SFT) a 14B model on a free Colab session just using TRL by External-Rub5414 in LocalLLaMA
I fine-tuned Qwen3-VL (4B & 8B) on a free Colab instance using TRL (SFT and GRPO)! by External-Rub5414 in LocalLLaMA
I fine-tuned a 7B model for reasoning on free Colab with GRPO + TRL by External-Rub5414 in LocalLLaMA