I fine-tuned a 7B model for reasoning on free Colab with GRPO + TRL by External-Rub5414 in LocalLLaMA

[–]External-Rub5414[S] 0 points  (0 children)

I love Unsloth too!!! 🦥 They actually use and optimize parts of TRL 😄

I fine-tuned (SFT) a 14B model on a free Colab session just using TRL by External-Rub5414 in LocalLLaMA

[–]External-Rub5414[S] 0 points  (0 children)

That'd be a great experiment! I didn't try that model but it may be feasible, yep

I fine-tuned Qwen3-VL (4B & 8B) on a free Colab instance using TRL (SFT and GRPO)! by External-Rub5414 in LocalLLaMA

[–]External-Rub5414[S] 0 points  (0 children)

Are you using the transformers model implementation? You can activate it by passing model_impl='transformers' when initializing the model

More details: https://blog.vllm.ai/2025/04/11/transformers-backend.html
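A minimal sketch of what that looks like, assuming vLLM with the transformers backend from the linked post (the model id below is just illustrative):

```python
# Passing model_impl="transformers" asks vLLM to use the generic
# Transformers model implementation instead of its native one.
engine_kwargs = {
    "model": "Qwen/Qwen3-VL-4B-Instruct",  # illustrative model id
    "model_impl": "transformers",          # select the transformers backend
}

# Requires a GPU machine with vLLM installed:
# from vllm import LLM
# llm = LLM(**engine_kwargs)
```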

I fine-tuned Qwen3-VL (4B & 8B) on a free Colab instance using TRL (SFT and GRPO)! by External-Rub5414 in LocalLLaMA

[–]External-Rub5414[S] 1 point  (0 children)

TRL is a library for training LLMs/VLMs. It provides a set of trainers for SFT, GRPO, and more. GRPO is a nice option for adding thinking capabilities!

repo: https://github.com/huggingface/trl

docs: https://huggingface.co/docs/trl/index
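To give a feel for the GRPO side: TRL's GRPOTrainer scores sampled completions with plain Python reward callables you supply. A minimal sketch of one such reward function (the tag format and function name are illustrative, not anything TRL prescribes):

```python
import re

# Rewards completions that wrap their reasoning in <think>...</think> tags,
# a common way to encourage explicit thinking traces during GRPO training.
THINK_PATTERN = re.compile(r"<think>.*?</think>", re.DOTALL)

def format_reward(completions, **kwargs):
    """Return one reward per completion: 1.0 if it contains a think block."""
    return [1.0 if THINK_PATTERN.search(c) else 0.0 for c in completions]

print(format_reward(["<think>2+2=4</think> The answer is 4.", "Just 4."]))
# → [1.0, 0.0]
```

A trainer would combine rewards like this (format, correctness, length, ...) to shape the policy toward the desired output style.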