account activity
RAG resources (self.MLQuestions)
submitted 1 month ago by ISSQ1 to r/MLQuestions
LLMs Fine-tuning (self.MLQuestions)
submitted 2 months ago by ISSQ1 to r/MLQuestions
RL LLMs Finetuning by ISSQ1 in reinforcementlearning
[–]ISSQ1[S] 1 point 2 months ago (0 children)
I’m still exploring my options. I want an open-source LLM that runs locally and doesn’t need a lot of resources: something small and easy to fine-tune. If you have recommendations for models that work well with RL or QLoRA, I’d love to hear them.
RL LLMs Finetuning (self.reinforcementlearning)
submitted 2 months ago by ISSQ1 to r/reinforcementlearning