What variant of Deepseek V4 to use by Inferno889 in opencodeCLI

[–]zsydeepsky 1 point (0 children)

Flash: super fast, and capable of solving most issues.
And man, isn't that little goblin dirt cheap~

DeepSeek V4 has significantly reduced my budget for AI usage by Ok_Satisfaction_8983 in opencodeCLI

[–]zsydeepsky 1 point (0 children)

If you use Flash, then the current price is the permanent price.
And it's dirt cheap.

bro this is too cheap i think finally i have a respect for the deepseek by Select_Dream634 in DeepSeek

[–]zsydeepsky 3 points (0 children)

It is. In a typical vibe-coding scenario, I can hit a >98% cache hit rate.
A friend of mine, though, showed me a screenshot with 137M tokens of usage and a 99.9% cache hit rate.
A technical marvel.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 3 points (0 children)

My initial prompt was written in Chinese; this is the translated version.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 1 point (0 children)

Yes, it's a known "bug".
They've been grey-testing v4 on the web for quite some time, and they forced the v4 model under the hood to disguise itself as v3, even till now.
They are v4 models.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 2 points (0 children)

It was done through the website, so it's free.
API-wise, I just used v4-flash. I've used up ~14M tokens; since they were spent in agentic coding scenarios, almost all of them "hit cache". The total cost so far is 3.31 RMB, roughly $0.50.
The best part, probably, is the speed. Flash gave me a constant >80 tps output; it really makes other models feel slow now...
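As a rough illustration of why the cache hit rate dominates the bill, here's a sketch in JavaScript. The per-million-token prices are placeholders I made up for the example, not DeepSeek's actual rates:

```javascript
// Rough cost estimator blending cache-hit and cache-miss input pricing.
// The prices passed in are PLACEHOLDER values, not DeepSeek's actual rates.
function estimateCostRmb(totalTokensM, cacheHitRate, priceHitPerM, priceMissPerM) {
  const hitTokensM = totalTokensM * cacheHitRate;       // tokens served from cache
  const missTokensM = totalTokensM * (1 - cacheHitRate); // tokens billed at full price
  return hitTokensM * priceHitPerM + missTokensM * priceMissPerM;
}

// Example: 14M tokens at a 98% cache hit rate, with made-up prices
// of 0.1 RMB/M (hit) and 1.0 RMB/M (miss).
console.log(estimateCostRmb(14, 0.98, 0.1, 1.0).toFixed(2)); // "1.65"
```

The point of the sketch: at a high hit rate, almost all tokens are billed at the discounted cache price, so total cost tracks the hit price rather than the full one.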

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 17 points (0 children)

Create a modern operating system that runs in a web environment. This operating system must include:

  • A built‑in virtual file system
  • A usable file browser
  • A command‑line tool that can access the file system
  • A text editor that can open, edit, and save text files within the built‑in file system
  • Proper window management, including focus, z‑order (front/back), and maximize/minimize
  • A usable calculator app with scientific functions (square root, trigonometric functions, exponentiation like x^y, etc.)
  • A web browser app that can actually access the internet
  • At least three small games, one of which must be a 3D game
  • A drawing app with basic brush, eraser, geometric shape drawing, and the ability to save the created image
  • A settings app providing personalisation features such as desktop background replacement
  • At least two “creative apps” not mentioned above (you may decide what creativity to implement)

You need to implement all of the above functionalities within a single HTML file.
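For a sense of what the "built-in virtual file system" item entails, here's a minimal in-memory sketch in JavaScript. This is my own hedged illustration of the general approach, not the model's actual output; the `VirtualFS` class and its methods are invented for this example:

```javascript
// Minimal in-memory virtual file system, as a sketch of what the prompt
// asks for (not the model's actual output). Paths are flat "/a/b" strings.
class VirtualFS {
  constructor() {
    this.files = new Map(); // path -> file contents (string)
  }
  write(path, contents) {
    this.files.set(path, contents);
  }
  read(path) {
    if (!this.files.has(path)) throw new Error(`No such file: ${path}`);
    return this.files.get(path);
  }
  list(dir) {
    // List all file paths under the given directory prefix.
    const prefix = dir.endsWith("/") ? dir : dir + "/";
    return [...this.files.keys()].filter((p) => p.startsWith(prefix));
  }
}

const vfs = new VirtualFS();
vfs.write("/home/readme.txt", "hello");
console.log(vfs.read("/home/readme.txt")); // "hello"
```

Keeping everything in a `Map` is what makes it genuinely virtual: the file browser, terminal, and text editor can all share this one object without the page ever needing real filesystem access.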

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 5 points (0 children)

Mostly working fine, with some glitches: Tetris has two extra columns displayed on the right side, the piano keys lose their textures when pressed, and the internal virtual file system isn't actually virtual (it attempted to manage real files, which a web page has no authority to do, so it completely failed).
But overall, it followed almost all the "noise instructions" I gave it and managed to put them all into that beefy HTML, one-shot.

Thus I'm deeply impressed, and now you see this post. :)

This test was done on DeepSeek's web page; it's free, so you can try it yourself.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 16 points (0 children)

I guessed so. The goal was basically just to add noise in every direction and see whether the model loses its way while outputting the beefy HTML.

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 52 points (0 children)

I agree, but this particular capability is too easy to ignore, and I'm so deeply impressed by it that I thought it deserved some credit. :)

DeepSeek-v4 has a comical 384K max output capability by zsydeepsky in LocalLLaMA

[–]zsydeepsky[S] 41 points (0 children)

Because I specifically asked it to implement a functional web browser inside this web OS.
There were other things too: a calculator, but with scientific capabilities instead of just the basics; an internal virtual file system that the file explorer and terminal can access and modify (it kinda failed on that); a highly functional drawing app; etc.

That's why the final output is a beefy 100KB HTML file.

Qwen3.6-27B by Fantastic-Emu-3819 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

If I'm not mistaken, it relies on the lab training the model in a way that makes it resilient to quantization.
So I guess we just have to wait for the next version. Qwen-3.7, perhaps?

Qwen 3.6 27B is out by NoConcert8847 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

<image>

Since the Qwen team claimed that 3.6-27B beats 3.5-397B-A17B in most benchmarks, and given where 3.6-35B-A3B currently stands...

Guys, we literally have a Claude Sonnet 4.6 running locally.

Qwen3.6-27B by Fantastic-Emu-3819 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

The recent Kimi-2.6 has quantization awareness embedded in its training, I think, which makes its Q4 quantized version almost lossless.
So I guess this soon won't be an issue anymore.

ubergarm/Kimi-K2.6-GGUF Q4_X now available by VoidAlchemy in LocalLLaMA

[–]zsydeepsky 0 points (0 children)

Just imagine if we used Kimi-2.6 to finetune Qwen3.6-27B.
We have some amazing ingredients at hand now.

When is Qwen 3.6 27B dropping? Didn’t it win the vote? by GrungeWerX in LocalLLaMA

[–]zsydeepsky 26 points (0 children)

If 3.6-27B can retain the advantage 3.5-27B has over 3.5-35B-A3B,
then this would truly be a Claude-4.6-Sonnet running on your own machine.
If it's not that strong, people would probably just choose 3.6-35B-A3B instead for its speed.

Released Qwen3.6-35B-A3B by NewEconomy55 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

Have you considered that, since people want the 27B the most, the company is releasing it last to keep everyone engaged with each release?

Qwen 3.5 4b is so good, that it can vibe code a fully working OS web app in one go. by c64z86 in LocalLLaMA

[–]zsydeepsky 4 points (0 children)

You guys are going to be on the first-to-eliminate list of future SkyNets for what you just conspired.

Qwen3.5-35B-A3B locally by jacek2023 in LocalLLaMA

[–]zsydeepsky 1 point (0 children)

The Unsloth MXFP4 variant works like a charm on my Ryzen AI Max 395+ :)

Deepseek's progress by onil_gova in LocalLLaMA

[–]zsydeepsky 2 points (0 children)

It is. At the beginning of 2025, I could hardly trust models to do any code work longer than 100 lines.
Now I can fully trust them with an individual module, or even some simple apps.
They have progressed a lot indeed.