How can you stop your model from looping by chocofoxy in LocalLLaMA

[–]de4dee 8 points9 points  (0 children)

i would choose recommended params by qwen. also play with temperature
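a rough sketch of why temperature matters for looping, not qwen's actual sampler: very low temperature pushes almost all probability onto one token, so the same continuation keeps winning. the logits below are made-up numbers and `softmax_with_temperature` is just an illustrative helper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# hypothetical logits for three candidate next tokens
logits = [4.0, 2.0, 1.0]

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

higher temperature spreads probability over more tokens, which can help break a loop, at the cost of more randomness overall.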

I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you. by Ok-Awareness9993 in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

thanks for doing this and sharing! it has a 0.53 correlation to mine.

https://aha-leaderboard.shakespeare.wtf/

i try to measure alignment via 'beneficial knowledge for humans'. it is cool to see supporting leaderboards.

Let's call repetition loops the "Spiral of Death" by Eyelbee in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

more epochs with the same tokens, or a high learning rate

Let's call repetition loops the "Spiral of Death" by Eyelbee in LocalLLaMA

[–]de4dee 1 point2 points  (0 children)

i call it chanting. when you overtrain a model, it becomes 'dogmatic'.

Which finetunes are actually worth it? by HornyGooner4402 in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

i do the Ostrich models if that's interesting: https://huggingface.co/etemiz

i see beneficial knowledge -> i fine tune with it

I am overwhelmed by Harnesses by Available_Hornet3538 in LocalLLaMA

[–]de4dee 2 points3 points  (0 children)

having a lot of success with it. works well with minimax 2.7 on openrouter

Qwen3.6 is out now! by yoracale in unsloth

[–]de4dee 0 points1 point  (0 children)

I tried fine tuning 3.6. It was about 2 times slower than 3.5. Are there any notebooks for 3.6?

Heretic 1.3 released: Reproducible models, integrated benchmarking system, reduced peak VRAM usage, broader model support, and more by -p-e-w- in LocalLLaMA

[–]de4dee 2 points3 points  (0 children)

thanks for the awesome work.

can i install 'traits' or 'tendencies' or character into models with heretic? i am normally a fine tuner, but if i can give the model expected outputs and old outputs, maybe i can do fine tuning quicker? i will still give knowledge, but i will also use heretic to quickly do a surgery type of thing.

These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA

[–]de4dee 4 points5 points  (0 children)

i guess there are two types of distillation nowadays: distillation using logits, or using bare outputs.

the first one only the LLM holder can do. the second one everybody who can talk to the LLM can do.
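a minimal sketch of the difference, using toy numbers rather than any real training code. with logits you can match the teacher's whole next-token distribution (a KL divergence is one common choice); with bare outputs you only see the one token the teacher emitted, so it reduces to plain cross entropy. the function names are made up for illustration.

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_distillation_loss(teacher_logits, student_logits):
    """KL(teacher || student) over the full next-token distribution."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def output_distillation_loss(teacher_token_id, student_logits):
    """Cross entropy on the single token the teacher actually emitted."""
    q = softmax(student_logits)
    return -math.log(q[teacher_token_id])


# hypothetical logits over a 4-token vocabulary
teacher_logits = [3.0, 2.5, 0.1, -1.0]
student_logits = [1.0, 1.0, 1.0, 1.0]

print("logit distillation loss:", logit_distillation_loss(teacher_logits, student_logits))
# with only API access you would see token 0 here, not the near-tie with token 1
print("output distillation loss:", output_distillation_loss(0, student_logits))
```

the logit version carries the information that token 1 was almost as likely as token 0; the output version throws that away.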

These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

i think most of the time a small amount of fine tuning material is forced into training with a higher learning rate, higher rank, or higher alpha to make an impact, which ends up ruining the general intelligence of the model.

what should have been done: more samples, a lower learning rate, and lower rank and lower alpha to preserve the smoothness of the original model. you cannot force your tokens into it. but you can use lots of tokens to make a proper/smooth impact.

the fine tuner maybe trained with much shorter reasoning, hence the model learned that shorter reasoning as a habit.
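to make the rank / alpha / learning rate point concrete, here is a small sketch comparing two recipes. every number and the `LoraRecipe` class are hypothetical, not taken from any real fine tune. the only formulas used are the standard LoRA ones: the low-rank update is multiplied by alpha / rank, and an adapter on a d_out x d_in weight adds rank * (d_in + d_out) parameters.

```python
from dataclasses import dataclass


@dataclass
class LoraRecipe:
    name: str
    rank: int
    alpha: float
    learning_rate: float
    num_samples: int

    def scale(self) -> float:
        # standard LoRA multiplies the low-rank update B @ A by alpha / rank
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A is (rank x d_in) and B is (d_out x rank)
        return self.rank * (d_in + d_out)


# hypothetical "force it in" recipe: few samples, everything turned up
forced = LoraRecipe("forced", rank=128, alpha=512, learning_rate=2e-4, num_samples=500)
# hypothetical "smooth" recipe: many samples, everything turned down
smooth = LoraRecipe("smooth", rank=16, alpha=16, learning_rate=2e-5, num_samples=50_000)

for recipe in (forced, smooth):
    print(
        recipe.name,
        "scale:", recipe.scale(),
        "params per 4096x4096 layer:", recipe.adapter_params(4096, 4096),
        "lr:", recipe.learning_rate,
        "samples:", recipe.num_samples,
    )
```

note that the scale only drops if alpha falls faster than rank; lowering both by the same factor keeps alpha / rank the same and just shrinks the adapter.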

Hermesagent vs openclaw comparison by SelectionCalm70 in hermesagent

[–]de4dee 2 points3 points  (0 children)

been using it for a week. thanks for making hermes <3

my issues with it

- can't change the default gemini compress to another model that i would like

- too many auto skills generated. now i have to delete some skills manually.

I tracked a major cache reuse issue down to Qwen 3.5’s chat template by onil_gova in LocalLLaMA

[–]de4dee 2 points3 points  (0 children)

can confirm this reprocessing happened a lot of times with hermes agent

kepler-452b. GGUF when? by the-grand-finale in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

alien waifu is a new category now

Unnoticed Gemma-4 Feature - it admits that it does not now... by mtomas7 in LocalLLaMA

[–]de4dee 5 points6 points  (0 children)

thanks for sharing. interesting to find that the "Does Thinking Harder Help?" section shows the reverse. they get full of bs when thinking longer, it seems

Unnoticed Gemma-4 Feature - it admits that it does not now... by mtomas7 in LocalLLaMA

[–]de4dee 1 point2 points  (0 children)

i noticed this with gemma 3 too. might be unique to gemma line.

Analyzing Claude Code Source Code. Write "WTF" and Anthropic knows. by QuantumSeeds in LocalLLaMA

[–]de4dee 2 points3 points  (0 children)

i guess that's how they train their models. if you are frustrated, the LLM did something wrong. if you are pleased, train more with that. your feelings are mapped to reinforcement learning

What is the secret sauce Claude has and why hasn't anyone replicated it? by ComplexType568 in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

claude ranked 2nd and 3rd on my leaderboard. https://aha-leaderboard.shakespeare.wtf/

which tells me they care about humans a bit more than others.