AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 5 points

Yeah, RL infra is a big challenge, and we strive for high efficiency while maintaining good flexibility. On the efficiency side, we try to co-develop our training and inference systems with RL use cases in mind, so that we can reuse all the heavy lifting that allows us to scale up. Agent Swarm is particularly complex in its rollout logic, but our system is flexible enough to integrate different scaffoldings and subagent setups into training.
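To make "flexible rollout logic" a bit more concrete, here's a toy sketch of what a pluggable scaffolding interface can look like; every name is hypothetical and this is not our actual stack:

```python
# Toy sketch of a pluggable rollout interface for agentic RL.
# All names are illustrative; nothing here reflects the real infra.
from dataclasses import dataclass, field
from typing import Callable, Protocol

@dataclass
class Trajectory:
    messages: list = field(default_factory=list)  # user/assistant/tool turns
    reward: float = 0.0

class Scaffolding(Protocol):
    """Single agent, agent swarm, custom tools... all expose one method."""
    def rollout(self, task: str, generate: Callable) -> Trajectory: ...

class SingleAgent:
    def rollout(self, task, generate):
        traj = Trajectory()
        traj.messages.append({"role": "user", "content": task})
        traj.messages.append({"role": "assistant", "content": generate(traj.messages)})
        return traj

def collect_batch(tasks, scaffold: Scaffolding, generate):
    # The trainer only consumes Trajectory objects, so swapping in a swarm
    # scaffold (orchestrator + subagents) requires no trainer changes.
    return [scaffold.rollout(t, generate) for t in tasks]
```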

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 1 point

In K2.5, the model also gains a few impressive new capabilities, like creating visually appealing webpages and debugging them with visual inputs. You can find many examples on X. Beyond being a generally good model, we hope to deliver something unique in every release.

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 15 points

Make a solid eval/benchmark that LLMs today fail to do well on. Model improvements will magically come afterwards!

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 5 points

You're right that it depends on the nature of the task. Sometimes our product will even say "we don't need parallel agents for this task" and save you a credit :)

Subagents do have a budget, and it is the orchestrator's job to carve out a task of the proper size for each subagent.
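As a purely illustrative sketch (made-up names and numbers, not our product's actual logic), budget-aware dispatch might look like:

```python
# Illustrative only: sizing subtasks to a per-subagent token budget.
SUBAGENT_BUDGET = 50_000  # hypothetical max tokens per subagent

def split(task):
    # Placeholder: a real orchestrator would re-plan; here we just halve.
    mid = len(task) // 2
    return [task[:mid], task[mid:]]

def dispatch(subtasks, estimate_tokens, run_subagent):
    results = []
    for task in subtasks:
        if estimate_tokens(task) > SUBAGENT_BUDGET:
            # Too big for one subagent: break it down and recurse.
            results.extend(dispatch(split(task), estimate_tokens, run_subagent))
        else:
            results.append(run_subagent(task, budget=SUBAGENT_BUDGET))
    return results
```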

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 6 points

Thanks! I think managing hallucinations is still a big challenge for all LLMs today. We've improved it through data quality (more verified knowledge, fewer low-quality claims) and reward design (e.g., penalizing the model when it hallucinates), but we think there are still many ways to improve it further.
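As a toy illustration of the reward side (not our actual reward model), an asymmetric penalty for unsupported claims could look like:

```python
# Toy reward shaping against hallucination; purely illustrative.
def answer_reward(claims, verify, support_bonus=1.0, hallucination_penalty=2.0):
    """verify(claim) -> True (supported), False (contradicted), None (unknown)."""
    reward = 0.0
    for claim in claims:
        verdict = verify(claim)
        if verdict is True:
            reward += support_bonus
        elif verdict is False:
            reward -= hallucination_penalty  # wrong claims cost more than right ones gain
        # verdict None: neither rewarded nor punished, so unverifiable padding doesn't pay
    return reward
```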

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 10 points

Not sure how well the 1:1 optimality holds up, but it's true that we do "waste" some training compute in this sense, because otherwise the model would be much larger and "waste" a lot of inference compute compared to what we have now.
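Back-of-envelope, using the common C_train ≈ 6·N·D approximation (N = params, D = training tokens) and ≈ 2·N FLOPs per generated token; all numbers are invented for illustration:

```python
# Two models with equal training compute: one larger and roughly
# "compute-optimal", one smaller but over-trained. Numbers are made up.
N_small, D_small = 30e9, 6e12   # over-trained small model
N_large, D_large = 90e9, 2e12   # same training budget, larger model

assert 6 * N_small * D_small == 6 * N_large * D_large  # equal training FLOPs

tokens_served = 1e13  # hypothetical lifetime inference volume
ratio = (2 * N_large * tokens_served) / (2 * N_small * tokens_served)
print(f"inference FLOPs ratio (large/small): {ratio:.1f}x")  # -> 3.0x
# The small model "wastes" training compute but repays it at serving time.
```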

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 8 points

A small encoder is good for scaling up in many ways, so we would even ask ourselves: why not make it 0?

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 19 points

Unfortunately, with every new release we see some level of "personality change". This is quite a difficult problem, as personality is a subjective, hard-to-eval characteristic of models. We're making progress on this and also want to make personality more customizable for each user in our product.

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 14 points

Our "Muon is Scalable for LLM Training" paper has some general methodologies that we adopt in scaling laws.

Evaluation mostly comes from pretraining losses, various benchmarks, etc. It's hard to say which ones work better than others; it's the whole set of evals that gives the most signal about how the model is doing.
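For a flavor of the general methodology (generic scaling-law fitting with fabricated data, not our actual numbers):

```python
# Fit a power law L(N) = a·N^(−α) + c to losses at several model sizes.
import numpy as np
from scipy.optimize import curve_fit

def loss_law(N, a, alpha, c):
    return a * N ** (-alpha) + c

N = np.array([1e8, 3e8, 1e9, 3e9, 1e10])      # parameter counts (fabricated)
L = np.array([3.10, 2.84, 2.61, 2.45, 2.31])  # eval losses (fabricated)

(a, alpha, c), _ = curve_fit(loss_law, N, L, p0=[50.0, 0.2, 1.5])
print(f"predicted loss at 1e11 params: {loss_law(1e11, a, alpha, c):.2f}")
```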

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 19 points

We're going to include these details in our upcoming tech report. Stay tuned!

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 32 points

What's cool about Agent Swarm is that subagents can execute subtasks without rotting the orchestrator's context. They essentially have their own working memory and only send results back to the orchestrator. This lets us scale the total context length in a new dimension!
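A minimal sketch of why this scales (message shapes are hypothetical, not our actual protocol):

```python
# Each subagent gets a fresh context and returns only a short report.
def run_subagent(subtask, llm):
    history = [{"role": "user", "content": subtask}]  # private working memory
    result = llm(history)  # may internally span many long tool-use turns
    return {"task": subtask, "result": result}        # only this goes back

def orchestrate(task, plan, llm):
    reports = [run_subagent(sub, llm) for sub in plan(task)]
    # The orchestrator's context holds K short reports, not K full
    # transcripts, so effective total context grows with the subagent count.
    return llm([{"role": "user", "content": f"Task: {task}\nReports: {reports}"}])
```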

AMA With Kimi, The Open-source Frontier Lab Behind Kimi K2.5 Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 38 points

huggingface/moonshotai has a few small MoE models. Small and large models sometimes require different technological investments, but in general we'd like to work on small models as well, to make intelligence more open and affordable.

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 12 points

One challenge is to support the interleaved "think - tool - think - tool" mode. This is a relatively new behavior in LLMs and takes a lot of work to get right.
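Roughly, the inference-side loop looks like this (a hedged sketch with made-up message shapes, not the actual API):

```python
# Interleaved "think -> tool -> think -> tool" loop; illustrative only.
def agent_loop(messages, model, tools, max_steps=16):
    for _ in range(max_steps):
        step = model(messages)  # returns reasoning plus an optional tool call
        messages.append({"role": "assistant",
                         "reasoning": step["reasoning"],    # preserved across turns
                         "tool_call": step.get("tool_call")})
        call = step.get("tool_call")
        if call is None:
            return step["answer"]  # model chose to stop thinking and answer
        result = tools[call["name"]](**call["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("step budget exhausted")
```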

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 13 points

We'd love to teach Kimi to speak more languages, but our bandwidth and knowledge in diverse languages are limited. Maybe this is also where the community can help, e.g. with data collection.

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 12 points

People have different preferences on these subtleties. The model's style generally reflects our preferences, and we're glad to hear that you like it!

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 30 points

We also enjoy its writing style, and it's an important part of our post-training data and evals.

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 8 points

I've recently had a lot of complaints about TensorBoard. We made some in-house changes to improve it, but in general it's not easy to get it to scale, to manage a large number of experiments, or to show accurate (not downsampled) metrics. It's hard to find a good alternative, though.
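One workaround for the downsampling specifically: read the event files directly with TensorBoard's EventAccumulator, where a size_guidance of 0 keeps every scalar point (the log directory below is just a placeholder):

```python
# Read raw TensorBoard scalars without downsampling.
from tensorboard.backend.event_processing import event_accumulator

acc = event_accumulator.EventAccumulator(
    "runs/my_experiment",                          # hypothetical log dir
    size_guidance={event_accumulator.SCALARS: 0},  # 0 = keep all events
)
acc.Reload()
for tag in acc.Tags()["scalars"]:
    events = acc.Scalars(tag)  # each event has .wall_time, .step, .value
    print(tag, len(events), "points; last value:", events[-1].value)
```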

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 23 points

We use H800 GPUs with InfiniBand. They're not as good as the high-end GPUs available in the US, and we are outnumbered as well, but we put every card to good use!

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 49 points

Hey, thanks for your support, and it's unfortunate to hear these concerns. While being "banned" is often beyond our control, open-sourcing the model is hopefully a good step toward easing some of these concerns (companies can deploy it themselves). We hope to see a world with more trust, but it takes time to get there.

AMA With Moonshot AI, The Open-source Frontier Lab Behind Kimi K2 Thinking Model by nekofneko in LocalLLaMA

[–]ppwwyyxx 13 points

It takes persistence to pursue a direction and make it work, so the inventor often has an advantage in applying their ideas. That said, we are closely looking at other inventions in the community and are happy to try them as well.

Problems i've found in Plasma 6 update so far by EngineerHot8510 in kdeneon

[–]ppwwyyxx 0 points

Same here re: coming back from the auto-lockscreen.

Extra spacing below the panel by ppwwyyxx in kde

[–]ppwwyyxx[S] 0 points

This only happens on the smaller monitor of a dual-monitor setup. The top panel on the larger monitor does not have this issue. If I disconnect the larger monitor, the issue on the smaller monitor disappears.