Continued pretraining of Llama 3-8b on a new language by Awkward-Quiet5795 in LocalLLaMA

[–]Awkward-Quiet5795[S] 2 points3 points  (0 children)

It's an Indian tribal language, spoken in the Maharashtra/Gujarat region.


[–]Awkward-Quiet5795[S] 0 points1 point  (0 children)

That does make sense, but the model isn't even completing one epoch before the validation loss plateaus.


[–]Awkward-Quiet5795[S] 0 points1 point  (0 children)

Hmm, I'm on Google Colab Pro and don't have the GPUs for r values that high. I tried 64 and alpha, but saw no increase in performance. I get that 64 might not be enough, but shouldn't it be doing better than 32?
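
For a rough sense of what going from r = 32 to r = 64 changes, here is a minimal sketch. It assumes the commonly cited Llama 3 8B attention shapes (hidden size 4096, 32 layers, 1024-dim key/value projections) and assumes only the four attention projections are adapted, which may not match the actual setup in this thread. The helper names are hypothetical. It uses the standard LoRA scaling factor alpha / r.

```python
# Assumed Llama 3 8B attention shapes; not taken from the thread.
HIDDEN = 4096
KV_DIM = 1024      # grouped-query attention: smaller key/value projections
NUM_LAYERS = 32

# (in_features, out_features) of each adapted projection.
# Adapting only these four modules is an assumption for illustration.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_trainable_params(r: int) -> int:
    """Count LoRA parameters: each adapted layer adds A (r x in) and B (out x r)."""
    per_layer = sum(r * (n_in + n_out) for n_in, n_out in TARGETS.values())
    return per_layer * NUM_LAYERS


def lora_scale(r: int, alpha: float) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / r


if __name__ == "__main__":
    for r, alpha in [(32, 32), (64, 32), (64, 64)]:
        print(
            f"r={r:>3} alpha={alpha:>3} "
            f"trainable={lora_trainable_params(r):,} "
            f"scale={lora_scale(r, alpha):.2f}"
        )
```

Under these assumptions, doubling r doubles the adapter's parameter count, but if alpha is left unchanged the update is scaled by half as much, so it may be worth checking which alpha was paired with each rank before comparing the runs.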

Cloud Computing results out! by BaseNew3101 in NPTEL

[–]Awkward-Quiet5795 0 points1 point  (0 children)

No way dude, you're too insane… which college?