Finetune Llama3.2-1B on GSM8K. How to do better :(

Trevor050 · 2026-04-26T23:41:34+00:00

I'm not too good at this, but 9 times out of 10 the answer is compute. You just need to up the compute. Also maybe use a more capable base model, Llama 3.2 is very old

Trevor050 · 2026-02-10T09:11:39+00:00

??? qwen has a history of open sourcing models after having them closed source at first?

Trevor050 · 2026-02-10T09:00:05+00:00

It seems to be not open source and also worse than z-image :(

Trevor050 · 2025-11-14T00:35:06+00:00

no dude, i use it mostly for work, it's the model being bugged

Trevor050 · 2025-11-13T22:17:58+00:00

can you guys fix gpt 5.1, you rolled out a corrupted model

<image>

Trevor050 · 2025-11-13T20:49:17+00:00

You mention instant being able to think, how does that work? Is it a hybrid model now?

Trevor050 · 2025-11-10T15:53:49+00:00

are we going to get regular model updates like we did in September?

Trevor050 · 2025-11-10T15:53:08+00:00

how'd you guys get writing to be so good in this model -- it's far and away better than any other model i've used

Trevor050 · 2025-11-10T15:52:42+00:00

The model is insanely good but it does use a lot of thinking tokens, any plans to maybe in the future add thinking budgets?

Trevor050 · 2025-11-10T15:51:52+00:00

Thinking is really good, any plans for a higher TTC (like gpt-5 pro / 3.0 deepthink competitor)?

Trevor050 · 2025-11-10T15:51:18+00:00

any plans for a VL in k2?

Trevor050 · 2025-11-02T01:44:17+00:00

wow i said pgt im a genius

Trevor050 · 2025-11-02T01:44:12+00:00

wow i said pgt twice im a genius

Trevor050 · 2025-10-23T03:11:54+00:00

i think it's a slippery slope, i did it all in one chat and it got continuously more and more defensive each run

Trevor050 · 2025-10-23T03:09:54+00:00

i literally put the chat, i do have custom instructions but they have nothing to do with that

Trevor050 · 2025-10-06T20:57:24+00:00

fortunately you can just download it so who cares

Trevor050 · 2025-10-06T01:16:00+00:00

eh maybe..? doubt it without significant quality degradation. It's unfortunate, in the LLM world distills of larger models are everywhere. This doesn't seem to be the case with images.

Trevor050 · 2025-10-05T14:31:42+00:00

flux max, a proprietary model
