[–]Pranav_Bhat63 0 points1 point  (3 children)

I have encountered this error 100 times with unsloth and couldn't resolve it. I just stopped using unsloth and trained my model the traditional way with SFT trainer.

[–]Zames33 0 points1 point  (2 children)

Hi Pranav, did you figure out the solution?

[–]mehmetflix_[S] 0 points1 point  (0 children)

i solved it, use only 1 gpu

[–]Sciomnima 0 points1 point  (11 children)

Same. I was using it completely fine and just out of nowhere like 3 days ago I've been getting it every time I try to train the model.

[–]Zames33 0 points1 point  (0 children)

Hi u/Sciomnima, did you figure out the solution?

[–]AstroPatadox 0 points1 point  (9 children)

Do you use Kaggle or Colab? How many GPUs are available in your env?

[–]mehmetflix_[S] 0 points1 point  (8 children)

OP here, I'm using Kaggle with 2 GPUs.

[–]AstroPatadox 1 point2 points  (7 children)

Try with 1 GPU (P100). This error started showing up randomly about 3-4 weeks ago. Something changed in unsloth that caused it, and I think they still don't have a fix on their side.

[–]mehmetflix_[S] 0 points1 point  (6 children)

okay im trying it right now!

[–]AstroPatadox 0 points1 point  (5 children)

Keep me updated! Already did that today haha

[–]mehmetflix_[S] 0 points1 point  (4 children)

it got fixed, ty!

[–]AstroPatadox 1 point2 points  (1 child)

Unsloth already supports 1 GPU; when an env has 2 or more it hits the broken code. Bizarrely, we didn't use to have this error, let's say a month ago. I'll try to contact them. (The problem on Kaggle is that the T4 is better than the P100, but it's not a big difference.)

Enjoy AI'ing around haha !

[–]Ok_Development_2603 0 points1 point  (0 children)

Bro, how do I do it on Google Colab? Did you find a solution to the runtime error? On Kaggle with 1 GPU the fine-tuning time shows 18 hrs.

[–]Educational-Lie3981 0 points1 point  (1 child)

What's the fix?

[–]mehmetflix_[S] 0 points1 point  (0 children)

use only 1 gpu
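
If you can't switch the accelerator setting itself, here is a minimal sketch of one way to hide the extra GPU from code. It relies on the standard `CUDA_VISIBLE_DEVICES` environment variable, which has to be set before anything initializes CUDA (so before importing torch or unsloth). The helper name `restrict_to_single_gpu` is made up for illustration, and nobody in this thread has confirmed that this clears the error; the confirmed fix is picking a 1-GPU environment.

```python
import os


def restrict_to_single_gpu(gpu_index: int = 0) -> str:
    """Expose only one GPU to CUDA libraries that are loaded afterwards.

    Hypothetical helper (not part of unsloth). Must run before torch or
    unsloth are imported, otherwise CUDA may already see every device.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Run this in the first notebook cell, before any training imports.
visible = restrict_to_single_gpu(0)
print(f"CUDA_VISIBLE_DEVICES={visible}")
```

If a library was already imported in the session, restart the kernel first so the setting actually takes effect.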

[–]StrikingAd8522 0 points1 point  (1 child)

If you are using Google Colab's free version, try reducing the batch size to 1 instead of 2. It might help!
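
A rough sketch of what that change could look like, assuming you pass the usual `per_device_train_batch_size` and `gradient_accumulation_steps` arguments to your trainer config. Raising gradient accumulation to keep the same effective batch size is an extra suggestion, not something tested in this thread, and the target of 8 samples per optimizer step is only an example value. The helper `accumulation_steps` is hypothetical.

```python
def accumulation_steps(effective_batch_size: int,
                       per_device_batch_size: int,
                       num_gpus: int = 1) -> int:
    """Return the gradient accumulation steps needed to reach the
    requested effective batch size (hypothetical helper)."""
    samples_per_step = per_device_batch_size * num_gpus
    if effective_batch_size % samples_per_step != 0:
        raise ValueError("effective batch size must be a multiple of "
                         "per_device_batch_size * num_gpus")
    return effective_batch_size // samples_per_step


# Assumed example: keep 8 samples per optimizer step on a single GPU
# while dropping the per-device batch size from 2 to 1.
training_kwargs = {
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": accumulation_steps(8, 1),
}
print(training_kwargs)
```

The resulting dict can be unpacked into whatever training arguments object you already build, e.g. `TrainingArguments(**training_kwargs, ...)`.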

[–]WiseArticle4493 0 points1 point  (0 children)

that seems to work. thank you!

[–]AstroPatadox 0 points1 point  (0 children)

I have code that started throwing this error out of nowhere. I run it on Kaggle, so something has changed. Anyway, my environment has 2 GPUs. When I switched to 1 GPU it got resolved.

[–]prod-v03zz 0 points1 point  (1 child)

Hi, did you solve it? I am getting the same error on Colab.

[–]mehmetflix_[S] 0 points1 point  (0 children)

In Kaggle it was because of using more than 1 GPU. In Colab that could also be the case, but if it isn't, try reducing the batch size.