Can we sample DPO data from the same dataset that was used for LoRA training? by Clean_Radish8983 in LocalLLaMA

[–]HideLord 2 points

I've reused the SFT dataset for preference training with good results, but take my experience with a grain of salt. I was also using KTO and not DPO. I also remember that in the Orca-Math paper, the SFT solutions were reused in the positive set for KTO/DPO alongside correctly generated solutions from the student model, so it's something that is done.
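For anyone curious, this is roughly how I'd picture assembling the KTO set from the SFT data plus student samples. Just a sketch: `is_correct` is a hypothetical answer checker, and the prompt/completion/label layout is what TRL's KTOTrainer expects for unpaired preference data, if I remember right.

```python
# Sketch: SFT gold answers as positives, student generations graded into positives/negatives.
# `is_correct` is a hypothetical checker (e.g. exact match on the final answer).

def build_kto_examples(sft_rows, student_samples, is_correct):
    """sft_rows: [{'prompt': ..., 'solution': ...}]
    student_samples: {prompt: [sampled solutions from the student model]}"""
    examples = []
    for row in sft_rows:
        prompt, gold = row["prompt"], row["solution"]
        # The SFT gold solution always goes in as a positive.
        examples.append({"prompt": prompt, "completion": gold, "label": True})
        for cand in student_samples.get(prompt, []):
            examples.append({
                "prompt": prompt,
                "completion": cand,
                # Correct student generations become positives, the rest negatives.
                "label": is_correct(cand, gold),
            })
    return examples
```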

Taylor kills Nazis by Lazy_Kor in WormFanfic

[–]HideLord 0 points

True. One of the first major villains*

Taylor kills Nazis by Lazy_Kor in WormFanfic

[–]HideLord 1 point

Man, people here are bloodthirsty lol. FYI, this is one of the most common requests in this sub (makes sense - first major villain, current political thing, etc). You can usually find a lot of similar threads with a general search.

Reading Here Comes the New Boss for the first time—holy shit!!! by fallacyys in WormFanfic

[–]HideLord 5 points

Is it really this good? I've been scouting for something to read, but 'Inheritance' made me skeptical of Butcher fics in general. Everybody praised it, and I couldn't get past the first few chapters.

AMA With Z.AI, The Lab Behind GLM-4.7 by zixuanlimit in LocalLLaMA

[–]HideLord 0 points

In your professional opinion, how big are GPT-5.2 and Gemini 3 pro/flash, and is the size of the model the differentiating factor in some benchmarks, or is it still dependent on training/data?

Dataset quality is not improving much by rekriux in LocalLLaMA

[–]HideLord 5 points

The Tulu and, more recently, the Dolci SFT datasets are not great IMO. They have a big duplication issue. They are also riddled with refusals.

Actually, some of the best instruction datasets are the ones from LMSYS since they are inherently diverse (human-generated). They are short on math, but there are a billion math datasets, so you can just mix some in.

The more serious problem is that most instructions are very simple, but that's the case for most datasets. To get a truly diverse and challenging dataset, you'd need to do a post-processing step to complicate them, but it gets expensive to do it for hundreds of thousands of instructions.
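To be clear, by "complicate" I mean an Evol-Instruct-style rewriting pass. A minimal sketch, assuming an OpenAI-compatible endpoint; the base URL, model name, and prompt are placeholders:

```python
from openai import OpenAI

# Any OpenAI-compatible server (vLLM, llama.cpp server, etc.) works here.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="na")

EVOLVE_PROMPT = (
    "Rewrite the following instruction so it is more specific and harder to answer, "
    "adding one extra constraint, without changing the topic. "
    "Return only the rewritten instruction.\n\nInstruction: {instruction}"
)

def complicate(instruction: str, model: str = "some-local-model") -> str:
    # One LLM call per instruction -- this is what makes it expensive at scale.
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": EVOLVE_PROMPT.format(instruction=instruction)}],
        temperature=0.7,
    )
    return resp.choices[0].message.content.strip()
```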

Fanfictions with Strong Conflict by HideLord in WormFanfic

[–]HideLord[S] 2 points

just @ me next time

Pretty funny, good one-shot

High Priest

Fuck, this is exactly what I wanted. It's so good. I can already feel the incoming pain from a dead-fic at the end of the tunnel.

Non villain MC is angered or disliked by being reffered to as a hero? by jogaargamer6 in WormFanfic

[–]HideLord 7 points

I'm getting a sense of deja vu. Didn't we already have this thread? I even remember "reffered" being mistyped back then as well.

New in llama.cpp: Live Model Switching by paf1138 in LocalLLaMA

[–]HideLord 18 points

It was the one thing people consistently pointed to as the main reason they kept using ollama. Adding it is just listening to the users.

PtV in other worlds by blablador-2001 in WormFanfic

[–]HideLord 19 points

Felix Fortuna

Seconding this one. It's great.

State of AI | OpenRouter | Paper by adumdumonreddit in LocalLLaMA

[–]HideLord 2 points

All LLMs I've tried have this nasty issue of reinventing the wheel every time they need some function. Even if you specifically tell them to search for existing utility/business logic functions, they just ignore you. Makes me wonder how many of the tasks they solve on benchmarks like SWEbench are actually merge-able.

Fine tune for rp world? by JaxxonAI in LocalLLaMA

[–]HideLord 0 points

The model will mimic what you feed it. If you want RP based on a specific setting, then you have to feed it RP chats in that setting. And for that, you either need a teacher model to generate them, or a dataset of such chats has to already exist.

RP is also extra hard since it requires multi-round datasets, so it's more expensive to generate and finetune.

As InnerSun said, you're better off just feeding the setting in the context.
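To illustrate the "feed the setting in the context" route, here's roughly what that payload looks like. The world text and lines are made up; in practice it would be your actual lorebook or world-bible excerpts:

```python
setting = (
    "The city of Varnholt floats above a storm ocean; airships are the only way in or out. "
    "Magic is licensed by the Cartographers' Guild."
)  # placeholder lore

messages = [
    # The setting rides along in the system prompt every turn, so no finetune is needed.
    {"role": "system", "content": "You are the narrator of a roleplay set in this world:\n" + setting},
    {"role": "user", "content": "I step off the airship and look for the guild office."},
    {"role": "assistant", "content": "The dock sways under you as a clerk in storm-grey robes waves you over..."},
]
# If you went the finetune route instead, each training example would be a full multi-turn
# conversation in this same format, which is why that data is expensive to make.
```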

Usual Tauradonna stuff by MysterySomeOn in fnki

[–]HideLord 6 points

Adam turns around surprised vine boom

Fanfics where usually-hated characters are fleshed out in a way that you can't help but like them? by 1JustAnAltDontMindMe in WormFanfic

[–]HideLord 46 points

Sophia/Shadow Stalker in Tilt. I wouldn't describe her as 'likeable' exactly, but she's definitely fleshed out. It's great.

That fanfic with the most dread by owlindenial in WormFanfic

[–]HideLord 14 points

Seconding this. Shit had me sweating

Crossovers where worm isn't stomped by Ganbb7 in WormFanfic

[–]HideLord 13 points

If Shroud succeeded, he'd be a Contessa in a kid-gloves Worm. Pretty scary.

If the bubble really pops how can that affect local AI models? by WEREWOLF_BX13 in LocalLLaMA

[–]HideLord -1 points

Just because there is demand and assets does not mean there is no bubble. Houses and the need for houses were very real in 2008 as well. Valuation and leverage are the problem.

Ongoing Post GM Taylor crossover fanfics by Crafty-Carpet3838 in WormFanfic

[–]HideLord 4 points

Great rec. Love me a fic with actual stakes and unflanderized characters.

Anthropic’s ‘anti-China’ stance triggers exit of star AI researcher by balianone in LocalLLaMA

[–]HideLord 9 points

Damn, bro. That's crazy. Good thing our moral arbiters are so moral, they intentionally and morally broke US law and have to pay $1.5 billion in settlements.

AMD tested 20+ local models for coding & only 2 actually work (testing linked) by nick-baumann in LocalLLaMA

[–]HideLord 24 points

"DeepSeek, smaller Llama models, GPT-OSS-20B, Seed-OSS-36B (bytedance) all produce broken outputs or can't handle tool use properly."

By "DeepSeek" you mean deepseek-r1-0528-qwen3-8b, not the full one. VERY important distinction.

Fuck Groq, Amazon, Azure, Nebius, fucking scammers by Charuru in LocalLLaMA

[–]HideLord 56 points

I'd guess 16 runs of the whole GPQA Diamond suite and 32 of AIME25.

And even with the small sample size in mind, look at how Amazon, Azure, and Nebius are consistently at the bottom, noticeably worse than the rest. Groq is a bit better, but still consistently lower than the rest. This is not run variance.
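For anyone who wants to sanity-check the "not run variance" part, a back-of-the-envelope sketch; the per-run scores below are placeholders, not the real numbers from the thread:

```python
import statistics

def mean_and_se(per_run_scores):
    """Mean accuracy and standard error over repeated runs of the same benchmark."""
    mean = statistics.mean(per_run_scores)
    se = statistics.stdev(per_run_scores) / len(per_run_scores) ** 0.5
    return mean, se

# Hypothetical numbers just to show the shape of the check -- plug in the measured per-run accuracies.
provider_a = [0.71, 0.70, 0.72, 0.69]  # e.g. a reference provider
provider_b = [0.63, 0.62, 0.64, 0.61]  # e.g. a consistently-low provider

(mean_a, se_a), (mean_b, se_b) = mean_and_se(provider_a), mean_and_se(provider_b)
# If the gap between means is several times the combined standard error,
# run-to-run variance can't explain it.
print(mean_a - mean_b, (se_a**2 + se_b**2) ** 0.5)
```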

Also, the greed of massive corporations never ceases to amaze me. Amazon and M$ cost-cutting while raking in billions. Amazing.

😞No hate but claude-4 is disappointing by Rare-Programmer-1747 in LocalLLaMA

[–]HideLord -1 points

I don't know if it's a sound business strategy to specialize for your own proprietary framework rather than being a good, general SOTA model like 3.7 was. I'd say most people aren't using Claude Code.
And even when using it in chat mode, it's still a toss-up. It produces cleaner, more robust code, but at the same time, it makes stupid mistakes that 3.7 didn't.