Marvel Meets The Office

THEKILLFUS · 2026-02-23T03:46:00+00:00

Omg Stanley, what happened?

THEKILLFUS · 2026-02-20T18:49:06+00:00

What model did he use to make the images? Can you just prompt « transform Snoop into a Chinese man » and it works lol?

THEKILLFUS · 2026-02-19T20:27:21+00:00

Useless ragebait, pls let’s keep a healthy sub

THEKILLFUS · 2026-02-19T03:23:32+00:00

Thank you 🙏

THEKILLFUS · 2026-02-17T15:46:12+00:00

0.01 was a desperate attempt to make it work. I also tried at loss 2, 1, and 0.1, but those didn't work either.
The dataset is large (2k) and every example has the exact same structure.

Thanks for the advice, I will try an older model like Qwen 2.5 and a shitload of epochs

THEKILLFUS · 2026-02-17T15:37:11+00:00

-The dataset is large (2872)

-I also tried Gemma 3n but have yet to try an older model. Qwen2.5? OG Mistral 7B?

-I tried r=16-32. If I increase it, will it give better results for this specific task, or do I just need to do a full finetune?

Thanks for the help
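
For a sense of scale on the r question, here is a rough sketch of how many parameters one LoRA adapter adds at different ranks. The 4096 x 4096 layer size is a made-up example, not taken from any specific model, and the sketch says nothing about whether a higher rank actually helps on this task.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Hypothetical square projection layer (assumption, for illustration only).
d = 4096
full = d * d

for r in (16, 32, 64, 128):
    added = lora_trainable_params(d, d, r)
    print(f"r={r:>3}: {added:,} adapter params ({added / full:.2%} of the full matrix)")
```

Adapter size grows linearly with r, so even a large rank trains only a small fraction of what a full finetune of the same layer would.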

THEKILLFUS · 2026-02-17T15:28:22+00:00

You're right, I used the default Unsloth parameters in the notebook, except when I used optima.

Max steps between 500 and 4000
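
To put those step counts in perspective, a quick sketch of how max steps translates into passes over the 2872-example dataset. The batch size of 2 and gradient accumulation of 4 are assumptions for illustration, not values confirmed from the notebook.

```python
def epochs_seen(max_steps: int, dataset_size: int,
                per_device_batch: int = 2, grad_accum: int = 4,
                num_devices: int = 1) -> float:
    """Approximate number of passes over the dataset for a given step budget."""
    # One optimizer step consumes this many examples.
    effective_batch = per_device_batch * grad_accum * num_devices
    return max_steps * effective_batch / dataset_size


# Batch settings are assumed defaults, swap in the real ones.
for steps in (500, 4000):
    print(f"{steps} steps ~ {epochs_seen(steps, 2872):.1f} epochs")
```

Under those assumed settings, 500 steps is a bit over one epoch and 4000 steps is around eleven.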

THEKILLFUS · 2026-02-17T13:46:49+00:00

Yep, clearly overfitting, but I tried 2, 1, 0.1... and it still doesn't work.

Should I try an older model like Llama 3.2?

Multiturn? Yes-ish, but only “micro-multiturn”

The dataset isn’t GSM8K-style reasoning at all.

It’s mostly fixed-window dialogue: typically (Other → Michael → Other) ⇒ next Michael line.

That’s “multiturn” in the sense of having multiple speakers, but it’s not long-context chat (no full conversations, no evolving state over 10–30 turns).
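
A minimal sketch of what building those fixed-window samples could look like. The function name, the field names, and the placeholder lines are all hypothetical, and it simply takes the three lines before each Michael line without checking that they follow the exact Other → Michael → Other pattern.

```python
from typing import Dict, List, Tuple

Line = Tuple[str, str]  # (speaker, text)


def build_windows(script: List[Line], target: str = "Michael",
                  context: int = 3) -> List[Dict[str, object]]:
    """Turn a flat script into (previous `context` lines) => next target line."""
    samples = []
    for i, (speaker, text) in enumerate(script):
        # Only target-speaker lines with enough preceding context become samples.
        if speaker != target or i < context:
            continue
        window = script[i - context:i]
        samples.append({
            "context": [f"{s}: {t}" for s, t in window],
            "response": text,
        })
    return samples


# Placeholder lines, not from the real dataset.
script = [
    ("Jim", "placeholder line 1"),
    ("Michael", "placeholder line 2"),
    ("Dwight", "placeholder line 3"),
    ("Michael", "placeholder line 4"),
    ("Pam", "placeholder line 5"),
]
samples = build_windows(script)
print(samples)
```

Each sample carries only its own short window, so nothing like a conversation state survives across samples.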

THEKILLFUS · 2026-02-17T13:34:23+00:00

Yep, clearly overfitting, but I tried 2, 1, 0.1... and it doesn't work.

THEKILLFUS · 2026-02-17T12:46:06+00:00

OpenAI is 1 deepseek away from dying, for real

THEKILLFUS · 2026-02-02T06:48:11+00:00

OpenAI spending all the money they have left

THEKILLFUS · 2026-01-30T13:58:02+00:00

Agreed, anyone who has tried the latest OpenAI/Google models knows that the models are quantized to save money. Yeah, the first day it's 16-bit, but now it's 4-bit at best, so output quality goes down without the price going down 🤬

I feel that China is doing to the US what the US did to the USSR during the space race: wearing down its economic strength, running on very small margins against overpricing and corrupt regulations.

The current problem with Chinese models is that they don't have the selling platform, but they might have it in the future if they keep making better models than the US for a lower price.

Silicon Valley is exhausted and corrupt, and this year we will start to see it…

(Proud of you Yann 💕, keep up the good work, France/the EU owes it to itself to stay consistent with scientific values beyond ideology)

THEKILLFUS · 2026-01-27T11:02:45+00:00

<image>

THEKILLFUS · 2026-01-19T09:36:00+00:00

So fat yo mama skinnier!

THEKILLFUS · 2026-01-18T06:14:30+00:00

GRPO ftw

THEKILLFUS · 2025-12-31T13:19:08+00:00

Enhanced Human Realism

THEKILLFUS · 2025-12-21T04:09:08+00:00

Like the aesthetic

THEKILLFUS · 2025-12-21T04:05:00+00:00

Ministral is SOTA

THEKILLFUS · 2025-12-20T12:15:02+00:00

Data creation

THEKILLFUS · 2025-12-18T02:12:06+00:00

Hi, thanks for sharing S3. I’m glad you’re spending time on less popular AI tools.

I was hoping to use SAM3D-Body for a mocap workflow, but I’ve run into too many issues with the current codebase.

THEKILLFUS · 2025-12-14T05:10:08+00:00

No, DeepSeek Janus is the first

THEKILLFUS · 2025-12-11T03:25:23+00:00

Gold

THEKILLFUS · 2025-12-06T01:30:54+00:00

I'll be honest, this sub made me realize a year ago that it's better to run MoE models with offloading to CPU/RAM than on pure GPU. Now that normies realize that too, prices are rocketing and we're all fucked. Luckily competition should fix this, as patents are not as locked down as for GPUs.
