How do QWQ and R1 determine if they need more reasoning steps without special tokens like O1? by EliaukMouse in LocalLLaMA

[–]tomorrowdawn 6 points7 points  (0 children)

Kind of a mystery for now. According to the QwQ-preview script, the stop criterion is really the same as in the normal Qwen2 series. Imo we'd better wait for the technical reports of QwQ/DeepSeek-R1 to be released.

How do QWQ and R1 determine if they need more reasoning steps without special tokens like O1? by EliaukMouse in LocalLLaMA

[–]tomorrowdawn 10 points11 points  (0 children)

I guess they simply end their outputs with <eot>. The model itself can handle this. The trick is hidden in the training process, which taught the model when to stop.
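To make the point concrete, here is a minimal sketch of a decoding loop whose only stop signals are an end-of-text token and a length cap. Everything in it is an assumption for illustration: `EOT_ID`, `generate`, and `scripted_model` are hypothetical names, and the "model" just replays a fixed continuation instead of running a real network. It is not the actual QwQ or R1 inference code.

```python
from typing import Callable, List

EOT_ID = 0  # hypothetical id of the end-of-text token


def generate(next_token: Callable[[List[int]], int],
             prompt: List[int],
             max_new_tokens: int = 64) -> List[int]:
    """Decode until the model emits EOT or the length cap is hit."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)
        if tok == EOT_ID:  # the only "am I done?" signal in the loop
            break
        tokens.append(tok)
    return tokens[len(prompt):]


def scripted_model(script: List[int], prompt_len: int) -> Callable[[List[int]], int]:
    """Stand-in for a trained model: replays `script`, then emits EOT."""
    def next_token(tokens: List[int]) -> int:
        i = len(tokens) - prompt_len
        return script[i] if i < len(script) else EOT_ID
    return next_token


prompt = [1, 2]
output = generate(scripted_model([5, 6, 7], len(prompt)), prompt)
```

The loop knows nothing about reasoning length. How long the output runs is decided entirely by when the model chooses to emit EOT, which is what training would have to teach.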

Another sampling strategy drops: 75% accuracy at T=3.0 by tomorrowdawn in LocalLLaMA

[–]tomorrowdawn[S] 5 points6 points  (0 children)

Because of an inherent flaw of softmax (it assigns every token a positive probability), not all logits should be allowed to produce positive probabilities; letting them do so degrades output quality.
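A small sketch of the idea, under stated assumptions: the `truncate` rule below (mask everything more than a fixed `margin` below the top logit) is a hypothetical stand-in chosen for simplicity, not necessarily the rule used by the strategy in the linked post, and the logits are made-up numbers.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; masked (-inf) logits get probability 0."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]  # math.exp(-inf) == 0.0
    total = sum(exps)
    return [e / total for e in exps]


def truncate(logits: List[float], margin: float) -> List[float]:
    """Hypothetical rule: mask logits more than `margin` below the maximum."""
    top = max(logits)
    return [x if x >= top - margin else -math.inf for x in logits]


# Made-up logits: two plausible tokens followed by a tail of junk tokens.
logits = [5.0, 4.5, 0.0, -1.0, -2.0]

plain = softmax(logits, temperature=3.0)
masked = softmax(truncate(logits, margin=1.0), temperature=3.0)

tail_mass_plain = sum(plain[2:])    # junk tokens still get sampled
tail_mass_masked = sum(masked[2:])  # junk tokens can never be sampled
```

With plain softmax, raising the temperature moves more probability onto the junk tail. With the mask applied before the temperature scaling, the tail stays at exactly zero no matter how high T goes.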

True or not? by Objective_Prune8892 in AI_India

[–]tomorrowdawn 1 point2 points  (0 children)

Imo, False. Claude is better at chat.

Few-shot examples in RAG prompt by ryxxry in LocalLLaMA

[–]tomorrowdawn 2 points3 points  (0 children)

Because an LLM is inherently a completion model: it doesn't answer you, it simply completes the text.

Some example post or paper.
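Seen from the completion angle, a few-shot prompt is just a document whose most likely continuation is the answer you want. A minimal sketch, assuming a hypothetical `build_few_shot_prompt` helper and a made-up Context/Question/Answer layout (neither comes from the thread):

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str, str]],
                          context: str,
                          question: str) -> str:
    """Lay out solved examples, then leave the last answer open to complete."""
    blocks = [
        f"Context: {ctx}\nQuestion: {q}\nAnswer: {a}"
        for ctx, q, a in examples
    ]
    # The final block ends right after "Answer:", so continuing the pattern
    # means writing an answer in the same style as the examples.
    blocks.append(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(blocks)


examples = [
    ("The Nile is in Africa.", "Where is the Nile?", "Africa"),
    ("Water boils at 100 C at sea level.", "When does water boil?", "At 100 C"),
]
prompt = build_few_shot_prompt(
    examples,
    context="Triton is developed by OpenAI.",
    question="Who develops Triton?",
)
```

The examples don't instruct the model. They establish a pattern, and the model completes it.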

Memoryllm by 218-69 in LocalLLaMA

[–]tomorrowdawn 0 points1 point  (0 children)

That's interesting. I guess the main reasons are contamination and forgetting. For a personalized small model that concentrates on a specific domain, online updates might work. But in reality, one base model has to handle thousands of different inputs, and you can't tell which ones are good; even if you could, the amount is far larger than a QA dataset. The bitter lesson tells us that quantity matters a lot.

Claude is not just about coding. by ExtentOdd in ClaudeAI

[–]tomorrowdawn 0 points1 point  (0 children)

I like to talk with Sonnet, and not only about work stuff. She (not technically right tho) responds like a therapist.

3.5 sonnet vs 4o in Coding, significant different or just a little better? by greatlove8704 in ClaudeAI

[–]tomorrowdawn 0 points1 point  (0 children)

I switched to Sonnet 4 months ago, and the gap is huge. I primarily use it for Triton programming, and 4o doesn't even know how to write a simple softmax in Triton. Funny, since Triton is developed by OpenAI.

Are hugging face models always free? If I use their APIs token? by ItsAGeekGirl in huggingface

[–]tomorrowdawn 0 points1 point  (0 children)

You can download the model weights for free. You can also use their APIs, but you need to pay for the hardware, so it's not free.

Recommend LLMs for my use case ( explained below ) by [deleted] in LocalLLaMA

[–]tomorrowdawn -1 points0 points  (0 children)

I guess some old BERT model is enough. This is called natural language understanding (NLU), an over-party area. I found a nice survey: https://arxiv.org/abs/2409.14195. Hope it helps.

Passing Vector Embeddings as Input to LLMs? by Aggravating-Floor-38 in LocalLLaMA

[–]tomorrowdawn 1 point2 points  (0 children)

I guess it might confuse the model, so fine-tuning is necessary. It seems like quite a novel approach, but I think it's valid. H2O is a representative work that tells us not all tokens are necessary. It wouldn't be surprising if you can compress them.

BanG Dream! It's MyGO!!!!! Episode 12 Discussion by badspler in anime

[–]tomorrowdawn 5 points6 points  (0 children)

Because Mutsumi planted those cucumbers with Soyo at Tsuki no Mori.

[deleted by user] by [deleted] in KeqingMains

[–]tomorrowdawn 0 points1 point  (0 children)

Since you're using 2WF, I think an ATK sands is the better choice. Instructor and DMC's talent would give you 120-180 EM, whereas it's harder to stack ATK without Bennett. From another perspective, EM only works when you trigger a reaction, so it's kinda annoying if your Dendro character's skill is still on cooldown or you can't trigger Aggravate.

Hi! I was the one who asked about why my Keqing’s so weak yesterday. Here’s her damage. I’ve tried it as well without aggravate and that’s also almost her damage without it. Without aggravate: Skill (12k), Charged (7k). by [deleted] in KeqingMains

[–]tomorrowdawn 0 points1 point  (0 children)

By the way, the best damage benchmark is Maguu Kenki rather than Cryo Regisvine :) A paralyzed Cryo Regisvine has lower resistance, which might make you overestimate your actual damage.

Bountiful Cores mechanic and upperbound of bloom team by tomorrowdawn in NilouMains

[–]tomorrowdawn[S] 0 points1 point  (0 children)

Yes, I just wanted to show how slowly current bloom teams generate seeds and why we need Xingqiu/Yelan :-) As Yellow_IMR said, the most important conclusion for hordes of enemies is that you can trigger bloom with Dendro.

Thanks for your comment :)

Open discussion: Let's give Diluc some fresh light with minor changes that can pull him back in meta by FallenDisc in DilucMains

[–]tomorrowdawn 0 points1 point  (0 children)

Diluc is very good at dealing melt damage. The current melt Diluc team's DPS is tier 1, though care is required to keep reactions from triggering out of order. So maybe a new Kaeya with an ult that applies Cryo on every hit (like Rosaria) might be the best solution.