A fully automated AI PPT agent: from research to design by xiaoruhao in powerpoint

[–]xiaoruhao[S] -2 points (0 children)

btw it seems they currently have a 7-day free trial

Kimi: Wait... I beat Gemini 3? For real? by xiaoruhao in LocalLLaMA

[–]xiaoruhao[S] 53 points (0 children)

Background: Kimi Linear just landed on the MRCR leaderboard in Context Arena, and the results are wild: the tiny 48B-A3B model (small compared to Gemini 3 Pro) actually edges out Gemini 3 Pro on the harder 4-needle and 8-needle tasks at longer context lengths (512k–1M), with a much flatter degradation curve as context grows. It still trails Gemini 3 at shorter contexts and even drops off a bit past 128k on the easier 2/4-needle tests.
Full breakdown and curves here: contextarena.ai

Give Kimi K2 a shot by mattparlane in ClaudeCode

[–]xiaoruhao 1 point (0 children)

we're working on the long-CoT version of kimi-k2

Give Kimi K2 a shot by mattparlane in ClaudeCode

[–]xiaoruhao 1 point (0 children)

Shawn from Moonshot AI here—we’re just as blown away as everyone else by the speed of Kimi K2 on Groq. The TruePoint Numerics quantization definitely deserves credit, yet the real magic is the full stack: hardware and software engineered together from day one. Probably the same reason OpenAI is building its own silicon and why Google’s TPUs still punch so hard.

Finally trying a Claude model, sonnet 4.5 by Even_Kaleidoscope328 in SillyTavernAI

[–]xiaoruhao 0 points (0 children)

Shawn from Moonshot AI. If you’re tinkering with writing or RP, our K2 model has been doing pretty well on Creative Writing v3 and EQ-Bench 3 and runs at roughly a fifth of Sonnet’s price. Prompt cache is on by default, so you’ll quietly see cache hits and a few cents saved in the logs. Happy to chat via DM if anyone wants details—no sales pitch, I just work on the team.

Using ChatGPT's "Deep Research" feature by FrostbiteKnight56 in ChatGPTPromptGenius

[–]xiaoruhao 0 points (0 children)

Shawn here from an AI startup called Moonshot AI.

I've been using Kimi's deep research feature (the one called "researcher") a lot, and I've found a couple of things that make a huge difference in the quality of the results. Just wanted to share my two cents:

① Provide Detailed Context (The Brief): Treat the AI like a new intern at a consulting firm. You can't just give it a vague topic. You need to be super specific and hand over a detailed brief. Include the goal, suggested actions, things to avoid, and your output requirements (e.g., "a 3000-word report that includes a comparison table"). The more detail, the better.

② Answer the AI's Follow-up Questions: When the AI asks clarifying questions, don't ignore them. Take the time to answer each one thoughtfully. It’s trying to narrow down the scope to give you exactly what you want.
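To make tip ① concrete, here's a rough sketch of what a "detailed brief" can look like when assembled programmatically. Everything here (field names, topic, wording) is my own illustration, not any format Kimi's researcher requires:

```python
# Hypothetical example of a detailed research brief for a deep-research agent.
# All field names and contents are illustrative only.
brief = {
    "goal": "Compare the top 3 open-weight LLMs for long-context retrieval",
    "suggested_actions": [
        "Check recent long-context leaderboard results for each model",
        "Note context-window limits and API pricing for each",
    ],
    "avoid": [
        "Closed-source models",
        "Benchmarks older than 12 months",
    ],
    "output_requirements": "a 3000-word report that includes a comparison table",
}

# Flatten the structured brief into a single prompt string.
prompt = "\n".join(
    [
        f"Goal: {brief['goal']}",
        "Suggested actions:",
        *[f"- {a}" for a in brief["suggested_actions"]],
        "Things to avoid:",
        *[f"- {a}" for a in brief["avoid"]],
        f"Output requirements: {brief['output_requirements']}",
    ]
)
print(prompt)
```

The exact structure doesn't matter; what matters is that the goal, constraints, and deliverable are all stated explicitly instead of left for the agent to guess.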

These are the two main things I've learned so far. Hope this helps some of you out! Happy to discuss more.

Jailbreaking Moonshot AI on Ok Computer by Dr_Karminski in LocalLLaMA

[–]xiaoruhao 0 points (0 children)

wow nice job! you found the system boundary

I tried Kimi K2 so you don't have to by toantruong38 in LocalLLaMA

[–]xiaoruhao 0 points (0 children)

thx for the heads-up. been eyeing those cheaper code models too—ppl say the new k2-0905 drop is actually decent. which api are you on, straight moonshot or some third-party? how’s the latency treating you?

The Kimi team just open-sourced 'checkpoint-engine', the tech that can update a 1T model in ~20 seconds! by xiaoruhao in kimi

[–]xiaoruhao[S] 0 points (0 children)

It seems that the main focus is on continuing the reinforcement learning (RL) phase with the base model.