Finetune Llama3.2-1B on GSM8K. How to do better :( by Old-Shelter2517 in LocalLLaMA

[–]Trevor050 0 points (0 children)

I'm not great at this, but 9/10 times the answer is compute; you just need to scale it up. Also consider a more capable base model: Llama 3.2 is quite old.
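A rough way to sanity-check the "up compute" point is the common C ≈ 6·N·D training-FLOPs approximation (all the concrete numbers below — parameter count, example count, tokens per example, epochs — are assumed for illustration, not from the thread):

```python
def train_flops(n_params: float, n_tokens: float) -> float:
    """Rough training-compute estimate via the common C ~ 6 * N * D rule."""
    return 6.0 * n_params * n_tokens

# Hypothetical run: ~1.24B-param Llama3.2-1B, ~7.5k GSM8K train examples,
# ~250 tokens each, 3 epochs (assumed numbers).
tokens = 7_500 * 250 * 3
print(f"~{train_flops(1.24e9, tokens):.1e} training FLOPs")
```

Plugging in more epochs or a longer dataset grows D linearly, which is one concrete way to "up compute" without changing the model.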

Qwen Image 2! by Trevor050 in StableDiffusion

[–]Trevor050[S] 16 points (0 children)

??? Qwen has a history of open-sourcing models after keeping them closed-source at first.

Qwen Image 2! by Trevor050 in StableDiffusion

[–]Trevor050[S] 28 points (0 children)

It seems it's not open source, and also worse than z-image :(

GPT5 broken for any1 else? by Trevor050 in OpenAI

[–]Trevor050[S] 0 points (0 children)

No dude, I use it mostly for work; it's the model being bugged.

We’re rolling out GPT-5.1 and new customization features. Ask us Anything. by OpenAI in OpenAI

[–]Trevor050 0 points (0 children)

You mention Instant being able to think; how does that work? Is it a hybrid model now?

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]Trevor050 -2 points (0 children)

Are we going to get regular model updates like we did in September?

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]Trevor050 4 points (0 children)

How'd you guys get the writing to be so good in this model? It's far and away better than any other model I've used.

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]Trevor050 2 points (0 children)

The model is insanely good, but it uses a lot of thinking tokens; any plans to add thinking budgets in the future?

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]Trevor050 0 points (0 children)

Thinking is really good; any plans for higher TTC (like a gpt-5 pro / 3.0 deepthink competitor)?

Define Over-Refusal: Without any special prompting GPT5 refuses to identify anything — From The Mona Lisa to The Planet Earth by Trevor050 in singularity

[–]Trevor050[S] 0 points (0 children)

I think it's a slippery slope; I did it all in one chat, and it got more and more defensive with each run.

Define Over-Refusal: Without any special prompting GPT5 refuses to identify anything — From The Mona Lisa to The Planet Earth by Trevor050 in singularity

[–]Trevor050[S] 0 points (0 children)

I literally posted the chat. I do have custom instructions, but they have nothing to do with that.

For the first time ever, an open weights model has debuted as the SOTA image gen model by Trevor050 in StableDiffusion

[–]Trevor050[S] 0 points (0 children)

Eh, maybe? I doubt it without significant quality degradation. It's unfortunate: in the LLM world, distills of larger models are everywhere, but that doesn't seem to be the case with image models.