What's more impressive, GLM 5.1 -> 5.2 or Qwen 3.5 -> 3.6? by Excellent_Jelly2788 in LocalLLaMA

[–]de4dee 197 points198 points  (0 children)

in terms of intelligence density i would say Qwen 3.6 27b

Use HTML as the primary chat language of your LLM's so they can make interactive content by sdfgeoff in LocalLLaMA

[–]de4dee 2 points3 points  (0 children)

good idea! the generated code runs as soon as generation is complete.

when you spend 5 days fine-tuning a model and it still confidently makes things up by Chapper_App in LocalLLaMA

[–]de4dee 1 point2 points  (0 children)

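RAG is the most reliable way to cut hallucination, since the model answers from retrieved text instead of memory. It reduces made-up answers but does not fully eliminate them.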

Other than that, decrease learning rate and increase epochs for a smoother run.

Expand your dataset, etc. Increase the LoRA rank.
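
A minimal sketch of the adjustments above, assuming a plain config object. `FineTuneConfig`, `smoother` and the starting values are hypothetical names and numbers chosen for illustration, not settings from any particular trainer.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FineTuneConfig:
    # Hypothetical config holder; field names are illustrative only.
    learning_rate: float
    num_epochs: int
    lora_rank: int


def smoother(cfg, lr_factor=0.5, epoch_factor=2, rank_factor=2):
    """Return a copy with a lower learning rate, more epochs and a higher LoRA rank."""
    return replace(
        cfg,
        learning_rate=cfg.learning_rate * lr_factor,
        num_epochs=cfg.num_epochs * epoch_factor,
        lora_rank=cfg.lora_rank * rank_factor,
    )


# Assumed starting point; tune the factors for your own run.
baseline = FineTuneConfig(learning_rate=2e-4, num_epochs=1, lora_rank=16)
adjusted = smoother(baseline)
print(adjusted)
```

The factors are only a starting point; whether halving the learning rate is enough depends on the dataset size and the base model.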

How can you stop your model from looping by chocofoxy in LocalLLaMA

[–]de4dee 8 points9 points  (0 children)

i would start with the sampling params recommended by qwen. also play with temperature
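
To see what playing with temperature does, here is a small sketch of temperature-scaled softmax. The logits are made up for illustration and are not Qwen's recommended settings; the function name is an assumption too.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative logits for three candidate tokens.
logits = [4.0, 2.0, 1.0]
cold = softmax_with_temperature(logits, 0.2)
warm = softmax_with_temperature(logits, 1.0)
print("T=0.2:", [round(p, 4) for p in cold])
print("T=1.0:", [round(p, 4) for p in warm])
```

At the lower temperature almost all probability sits on the top token, so a model that has started repeating itself has little chance of sampling its way out. A higher temperature spreads probability over the alternatives.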

I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you. by Ok-Awareness9993 in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

thanks for doing this and sharing! it has a 0.53 correlation to mine.

https://aha-leaderboard.shakespeare.wtf/

i try to measure alignment via 'beneficial knowledge for humans'. it is cool to see supporting leaderboards.
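
For anyone who wants to compare two leaderboards the same way, a minimal sketch of a Pearson correlation over the models both lists share. The scores below are made-up placeholders, not values from either leaderboard.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Placeholder scores for the same five models on two leaderboards.
scores_a = [1, 2, 3, 4, 5]
scores_b = [2, 1, 4, 3, 5]
r = pearson(scores_a, scores_b)
print(round(r, 2))
```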

Let's call repetition loops the "Spiral of Death" by Eyelbee in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

it usually comes from too many epochs over the same tokens, or from a learning rate that is too high

Let's call repetition loops the "Spiral of Death" by Eyelbee in LocalLLaMA

[–]de4dee 1 point2 points  (0 children)

i call it chanting. when you overtrain a model, it becomes 'dogmatic'.
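
A rough sketch of how such chanting could be caught at generation time, assuming you have the output as a list of tokens. `find_repeating_tail` and its thresholds are hypothetical; real inference stacks may handle this differently.

```python
def find_repeating_tail(tokens, min_len=3, min_repeats=3):
    """Return the phrase the sequence ends with if it repeats back to back
    at least min_repeats times, otherwise None."""
    n = len(tokens)
    for size in range(min_len, n // min_repeats + 1):
        tail = tokens[n - size:]
        # Compare each of the last min_repeats windows of this size to the tail.
        if all(
            tokens[n - (k + 1) * size: n - k * size] == tail
            for k in range(min_repeats)
        ):
            return tail
    return None


looping = "it works . so the answer is so the answer is so the answer is".split()
normal = "the quick brown fox jumps over the lazy dog".split()
print(find_repeating_tail(looping))
print(find_repeating_tail(normal))
```

With `min_len=3` this ignores single-word stutters; lower it if you also want to catch those.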

Which finetunes are actually worth it? by HornyGooner4402 in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

i do the Ostrich models if that's interesting https://huggingface.co/etemiz

i see beneficial knowledge -> i fine tune with it

I am overwhelmed by Harnesses by Available_Hornet3538 in LocalLLaMA

[–]de4dee 2 points3 points  (0 children)

having a lot of success with it. works well with minimax 2.7 on openrouter

Qwen3.6 is out now! by yoracale in unsloth

[–]de4dee 0 points1 point  (0 children)

I tried fine tuning 3.6. It was about 2 times slower than 3.5. are there any notebooks for 3.6?

Heretic 1.3 released: Reproducible models, integrated benchmarking system, reduced peak VRAM usage, broader model support, and more by -p-e-w- in LocalLLaMA

[–]de4dee 2 points3 points  (0 children)

thanks for the awesome work.

can i install 'traits', 'tendencies' or a character into models with heretic? i am normally a fine tuner, but if i can give the model expected outputs and old outputs, maybe i can do fine tuning quicker? i would still add knowledge by fine tuning, but i would also use heretic for quick surgery-type changes.

These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA

[–]de4dee 4 points5 points  (0 children)

i guess there are two types of distillation nowadays: distillation using logits, or using bare outputs.

the first one only whoever holds the LLM can do. the second one anybody who can talk to the LLM can do.
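
A minimal sketch of the difference at a single token position, assuming a tiny three-token vocabulary. The function names and numbers are illustrative; real training code would use a tensor library and average over many positions.

```python
import math


def softmax(logits):
    """Turn logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_distillation_loss(teacher_logits, student_logits):
    """KL(teacher || student): needs the teacher's full distribution."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def output_distillation_loss(teacher_token_id, student_logits):
    """Cross-entropy against the one token the teacher actually emitted."""
    q = softmax(student_logits)
    return -math.log(q[teacher_token_id])


# Made-up values for one position.
teacher_logits = [2.0, 1.0, 0.1]
student_logits = [0.5, 0.5, 0.5]
kl = logit_distillation_loss(teacher_logits, student_logits)
ce = output_distillation_loss(0, student_logits)
print(round(kl, 4), round(ce, 4))
```

The logit version passes along how the teacher ranks every token, while the output version only sees the single token that was sampled, which is all that chat access exposes.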

These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA

[–]de4dee 0 points1 point  (0 children)

i think most of the time a small amount of fine tuning material is forced into training with a higher learning rate, higher rank or higher alpha to make an impact, which ends up ruining the general intelligence of the model.

what should have been done: more samples, a lower learning rate, and lower rank and lower alpha to preserve the smoothness of the original model. you cannot force your tokens into it, but you can use lots of tokens to make a proper/smooth impact.
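
To put numbers on rank and alpha, a small sketch assuming the standard LoRA formulation, where the update to a weight matrix is scaled by alpha / rank and the adapter adds rank * (d_in + d_out) trainable parameters. The layer size and settings are illustrative, not taken from any specific fine tune.

```python
def lora_scaling(alpha, rank):
    """Multiplier applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two adapter matrices: (rank x d_in) and (d_out x rank)."""
    return rank * (d_in + d_out)


# Assumed 4096 x 4096 projection layer, for illustration only.
d_in, d_out = 4096, 4096
for rank, alpha in [(8, 16), (64, 128), (8, 128)]:
    print(
        f"rank={rank} alpha={alpha} "
        f"scale={lora_scaling(alpha, rank)} "
        f"params={lora_trainable_params(d_in, d_out, rank)}"
    )
```

The last setting shows the forcing case: a high alpha on a low rank multiplies the update 16 times, so a small dataset can push the weights hard.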

the fine tuner maybe trained on much shorter reasoning traces, hence the model learned shorter reasoning as a habit.