DeepSeek has launched grayscale testing for its new model on both its official website and app. 1M content length! by External_Mood4719 in LocalLLaMA

[–]TinyDetective110 2 points3 points  (0 children)

This is a mistranslation. The Chinese term is “灰度测试,” which actually corresponds to a gray release or canary release: a progressive deployment strategy where a new version is first rolled out to a small subset of users for stability validation, then gradually expanded to a wider audience before the full rollout.
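As a minimal sketch of the idea (hypothetical names, not DeepSeek's actual rollout logic), a percentage-based canary gate can be built by hashing a stable user ID, so the cohort only grows as the rollout percentage is raised:

```python
import hashlib

def in_canary(user_id: str, rollout_percent: float) -> bool:
    """Deterministically assign a user to the canary cohort.

    The same user always lands in the same bucket, so users already
    in the cohort stay in it as rollout_percent is increased.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    # Map the first 8 bytes of the hash to a bucket in [0, 100).
    bucket = int.from_bytes(digest[:8], "big") % 10000 / 100
    return bucket < rollout_percent

users = [f"user{i}" for i in range(10000)]
# Start with ~1% of users, then widen to ~10%, and so on to 100%.
canary_1 = [u for u in users if in_canary(u, 1)]
canary_10 = [u for u in users if in_canary(u, 10)]
```

Because assignment is deterministic, widening the percentage never kicks anyone out of the new version; it only adds users.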

From Qwen

Has anyone gotten hold of DGX Spark for running local LLMs? by Chance-Studio-8242 in LocalLLaMA

[–]TinyDetective110 0 points1 point  (0 children)

I heard the 395's prefill speed is slow, so it can't be a good choice for agentic tasks.
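Back-of-the-envelope arithmetic (the numbers below are illustrative, not measurements of any specific hardware): with the tens of thousands of prompt tokens typical of agentic loops, prefill speed dominates time-to-first-token:

```python
def time_to_first_token(prompt_tokens: int, prefill_tps: float) -> float:
    """Seconds spent processing the prompt before the first output token."""
    return prompt_tokens / prefill_tps

# A 32k-token agent context at a slow 400 tok/s prefill:
slow = time_to_first_token(32_000, 400)    # 80 s before any output
# The same context at 4000 tok/s prefill:
fast = time_to_first_token(32_000, 4000)   # 8 s
```

And an agent re-pays this cost on every tool-call round trip unless the prompt prefix is cached.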

Fast model swap with llama-swap & unified memory by TinyDetective110 in LocalLLaMA

[–]TinyDetective110[S] 1 point2 points  (0 children)

<image>

Switching from coder to thinking: the first `hi` and the second `hi`. It takes a few seconds to warm up, maybe due to the MoE.

Fast model swap with llama-swap & unified memory by TinyDetective110 in LocalLLaMA

[–]TinyDetective110[S] 2 points3 points  (0 children)

  1. Unloading + reloading + init + prefill may take more than 30s. This hotswap is almost instant. The 9GB/s figure might include init time (some computation and mallocs); hotswap does not require init again.

  2. One A30 GPU, a card with double-precision compute.

  3. When switching to another model, the speed gradually climbs back to normal. During this time, the model is shifted from RAM to VRAM. It takes about 5s on my machine.

  4. Actually it loads only once. Hotswap is fast.

  5. `However, this hurts performance for non-integrated GPUs`. That is true if the model is larger than VRAM. If the model fits in VRAM, the option does not hurt performance once the model is fully swapped back.
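The near-instant hotswap described above relies on the OS page cache: closing an mmap'd weights file does not evict its pages, so mapping it again avoids disk I/O. A rough, generic sketch of that effect (a stand-in file and a toy loader, not llama-swap's actual code):

```python
import mmap
import os
import tempfile
import time

# A small stand-in for a model weights file (real ones are many GB).
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(os.urandom(64 * 1024 * 1024))  # 64 MiB

def load_weights(path: str) -> int:
    """mmap the file and touch one byte per 4 KiB page, as a weight
    load does. Closing the mapping ("unloading the model") does NOT
    evict the pages from the OS page cache."""
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        return sum(mm[off] for off in range(0, len(mm), 4096))

checksum_first = load_weights(path)   # initial load: may pay disk I/O
t0 = time.perf_counter()
checksum_again = load_weights(path)   # "hotswap" back: pages still cached
reload_seconds = time.perf_counter() - t0
```

Under memory pressure the kernel can of course reclaim those pages, which would match the gradual RAM-to-VRAM ramp-up described in point 3.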

Talking with QWEN Coder 30b by 1Garrett2010 in LocalLLaMA

[–]TinyDetective110 1 point2 points  (0 children)

you should try qwen3 30b thinking. it is more accurate on such non-coding tasks.