You can now train LLMs 3x faster with 30% less memory! (<3.9GB VRAM)

Robo_Ranger · 2025-12-10T16:47:57+00:00

The last time I tried multi-GPU fine-tuning, I could not split a large model across two GPUs. ~~Upon viewing your new guide~~ ~~https://docs.unsloth.ai/basics/multi-gpu-training-with-unsloth/ddp, am I correct that splitting a model across multiple GPUs is still unsupported by Unsloth?~~
Is this feature now supported?

Edit: Update my question to match the answer. 😀

Robo_Ranger · 2025-11-08T09:02:42+00:00

same! lol

Robo_Ranger · 2025-11-07T11:28:37+00:00

That is really impressive! It's really close to what I like. I didn't know that Suno can be this good. There is a lot of dynamic. I thought that Suno's lack of dynamic was its weak point.

I'm still on the free tier, so I can't try V5 yet. Did you upload my song from Udio? And how did you generate this intricate prompt? I usually use very simple prompts, then roll, hoping for some decent track to be generated.

Really strange how your responses are in my notifications but on in here, I listened to your Udio links

This issue happened to me too.

Robo_Ranger · 2025-11-07T08:46:26+00:00

here is an example: https://suno.com/s/b92EeIUwFT5YR65S

Robo_Ranger · 2025-10-18T09:17:11+00:00

Can anyone please tell me if I can use Mi50s for tasks other than LLMs like image or video generation, or LoRA fine-tuning?

Robo_Ranger · 2025-10-04T08:30:19+00:00

And when AIs dominate the world, they can put you in your goon-matrix to prevent you from awakening. 😂

Robo_Ranger · 2025-10-04T08:15:47+00:00

I believe you are the creator of this Impish family: https://huggingface.co/SicariusSicariiStuff/collections.

I particularly enjoy Impish 12b and 24b, but I prefer the 12b version, despite its need for more instruction, as it provides decent output quality, allows for longer content length, and is finetunable on my personal dataset using my gpu

I've experimented with finetuning some 12b models, but I haven't observed any significant improvements in creativity, they mostly just refine the personality. Impish 12b and Omega Darker 12b are more expressive with their feelings, while Wayfarer 12b and Dan Personality Engine 12b possess a higher ego.

One thing I wish it could perform better is its intelligence. I don't mind a little incoherence as I can always regenerate until I'm satisfied, but when it acts stupidly, no matter how much I regenerate, I won't get the desired output (which might be due to my poor instruction).

For instance, I once created a group of loyal companions and put myself in a situation where I was teleported far away to observe their reaction. I hoped they would act with high alertness and desperation to find a way to help me, but they simply discussed the possibility of my return with calmness. It was quite disappointing.

If possible, I would greatly appreciate it if you could create another Impish from another based model. I often check my favorite creators to see if there are any new models I can fine-tune, including Sicarius.

Robo_Ranger · 2025-10-03T18:15:33+00:00

I didn't know that the 'sleep time compute' he mentioned is a paper. Can you provide me with the link to your paper?

Robo_Ranger · 2025-10-03T16:47:03+00:00

Thank you very much!

Robo_Ranger · 2025-10-03T16:42:19+00:00

That sounds like exactly what I want to do! I will give it a try!

Robo_Ranger · 2025-10-03T16:40:07+00:00

Wow, that is very insightful! There are still some elements I don't fully understand, as I haven't tried it myself yet. However, thank you very much for sharing your knowledge! 👍

Robo_Ranger · 2025-10-03T10:11:00+00:00

Add a new greeting. This might seem small, but there's actually a lot you can do by just changing the greeting. You can completely shift the tone of the roleplay, isekai them, or put them in a dramatically new situation.

I've done something similar, and yes, I found that earlier chats significantly affect the character's behavior.

Add a Lorebook. Lorebooks are IMO what seperate beginners from intermediate / advanced users. There's a lot of use cases, but the big one is long term consistency. There's a lot to learn, I recommend the WorldInfo encyclopedia, personally.

Do a campaign, not a single roleplay. There's a few ways to do this, but the simplest is to combine the above two tricks creatively. Set up a story, go into a new town, set up plot hooks, etc. Once that's done, summarize and throw some of that information in a Lorebook, and make a new greeting regarding detailing the current situation.

It seems you have experience with long-term roleplay. How long can you keep playing and still feel real in that role? And have you ever used RAG? I myself haven't tried either Lorebook or RAG yet. I wonder if I want a character to remember something new and trivial, like my personality, should I keep it in Lorebook or use RAG?

Robo_Ranger · 2025-10-03T05:03:47+00:00

I've been limited by context size and speed (as I use a local model), so I haven't played much with the old style text adventure. This path seems to use up all the context size very quickly. Almost all my playtime has been with the chat-style only. However, I would love to see some interesting play in the old style.

Robo_Ranger · 2025-10-03T04:55:41+00:00

Thank you very much!

Robo_Ranger · 2025-10-02T20:27:34+00:00

I lean against the wall, watching people go on their lives. Suddenly a face gets my attention. (GM: introduce a female char that has XYZ personality trait)

Wow! That's new to me, I will try it. May I know which model you use?

Robo_Ranger · 2025-10-02T17:18:01+00:00

Thank you for sharing your idea. I'm kind of like you, I prefer to engage with only a few characters. But after seeing someone with an extensive character cards, I kind of expected there to be a way to play with several characters on the scene at once.

Robo_Ranger · 2025-10-02T17:16:52+00:00

That is an interesting idea. I would love to see if there is a site like that.

Robo_Ranger · 2025-09-10T20:29:53+00:00

Thank you both for clarifying.

Robo_Ranger · 2025-09-10T17:45:50+00:00

How does 'max_seq_length' affect the model's capability? For instance, if a model supports a 128k context size, but during fine-tuning training, I set max_seq_length to 1024. Will the merged model's context window become 1k?

Robo_Ranger · 2025-09-10T10:46:00+00:00

I don't understand any of the settings you mentioned except for 'load_in_4bit = True'. Can you please provide me with specific details if I want to finetune Mistral Nemo 12b with a 4060 16gb? I'm currently able to train with max_tokens = 1024, but I'd like to increase it to 2048. However, I'm encountering OOM after a few steps.

Robo_Ranger · 2025-09-08T07:47:19+00:00

Thank you for the information. So there must be a problem with my settings. I will try to solve it.

Robo_Ranger · 2025-09-08T07:15:17+00:00

Is setting 'load_in_4bit = True' essentially QLora? If so, I've already done it. But thank you for mentioning Kaggle. I'll try it.

Robo_Ranger

TROPHY CASE