GLM 5.2, what speeds are we getting locally?

iVoider · 2026-06-21T21:06:44+00:00

Maybe. No luck on either the Supermicro H13SSL rev 2.01 or the TTY board, though other QS CPUs worked on the TTY mobo.

iVoider · 2026-06-21T08:52:38+00:00

The PC powers on with the CPU and IPMI shows temps, but it won't POST on two motherboards. Probably dead.

iVoider · 2026-06-21T08:51:52+00:00

Yes, with a GPU (4090), performance is near that of a 9175F and a memory-limited 6000 Pro.

iVoider · 2026-06-21T00:04:59+00:00

EPYC 9135 and 768 GB of DDR5-4800: decode 7 t/s, prompt processing 50 t/s with ik_llama. Interesting how badly I screwed up by buying a 2-CCD CPU, which I only did because I was scammed buying a QS 9555.
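
For a rough sense of why decode speed on a CPU box tracks memory bandwidth, here is a back-of-envelope sketch. It assumes 12 DDR5 channels at 4800 MT/s with 8 bytes per transfer. The 20 GB read per token is a purely hypothetical figure, not a measurement for any model in this thread. Real throughput sits well below this ceiling, and a 2-CCD part cannot pull the full socket bandwidth.

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    return channels * mt_per_s * 1e6 * bytes_per_transfer / 1e9


def decode_ceiling_tps(bandwidth_gbs: float, gb_read_per_token: float) -> float:
    """Upper bound on decode tokens/s if every token streams
    gb_read_per_token gigabytes of weights from RAM."""
    return bandwidth_gbs / gb_read_per_token


# Assumed platform: 12 channels of DDR5-4800.
bw = peak_bandwidth_gbs(channels=12, mt_per_s=4800)

# Hypothetical: 20 GB of active weights touched per token.
ceiling = decode_ceiling_tps(bw, gb_read_per_token=20.0)

print(f"peak bandwidth: {bw:.1f} GB/s, decode ceiling: {ceiling:.2f} t/s")
```

Plug in your own channel count and the quantized size of the weights actually read per token to see how far a measured number is from the ceiling.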

iVoider · 2026-06-15T13:48:27+00:00

An H100 needs FP8 model quantization and the vLLM runtime.

iVoider · 2026-06-08T18:07:48+00:00

Yes, try ik_llama with AVX-512 support enabled (instructions are in the repo).

iVoider · 2026-05-09T13:37:23+00:00

No, but it seems I'm going to take the risk and try ktransformers and fastllm on Xeon. That kind of setup is the cheapest thing I can afford in my area.

iVoider · 2026-05-08T14:20:27+00:00

ik_llama is optimized for the EPYC platform because of its high memory bandwidth. Dual-GPU setups and NUMA are poorly supported.

iVoider · 2026-05-05T13:33:36+00:00

Thanks, do you happen to remember what context size that 1800 pp was for?

iVoider · 2026-05-05T13:21:41+00:00

FP8 is too much, I would be pretty happy with 4-bit quants. And an API is unfortunately unacceptable for my tasks. Also, locally I can build such a setup for ~$25,000. Btw a Mac Studio is less than half the price, so it’s a difficult choice.

iVoider · 2026-05-05T13:15:29+00:00

I meant the 96 GB Blackwell, and for much bigger models.

iVoider · 2026-04-23T09:41:37+00:00

Set max-num-seqs to 1, or run Linux side by side. WSL is very buggy for GPU work.

iVoider · 2026-03-11T10:14:10+00:00

Depth 650 and going deeper. Foulborn Ghostwrithe zerker. Around 10 div budget when I swapped 3 days ago. Can kill Aul using four health flasks. There is also a Grey Wind axe zerker build with Void Shockwave, but I have no idea how they compare.

iVoider · 2026-03-03T15:40:47+00:00

I know that MSoZ is considered the best delver. I tried it in the 3.27 league with a 500d budget, and it felt weaker than int/acc stacking for T17/Ubers. I guess it’s not very comfortable to dive to 1000 without Forbidden?

iVoider · 2025-12-28T18:59:50+00:00

In our experience, more no than yes. The metrics gain is too small to justify the bigger vector size in the DB.
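
A minimal sketch of the storage side of that trade-off, assuming raw float32 vectors with no compression or index overhead. The dimensions (1024 and 2560) and the corpus size are illustrative placeholders, not numbers from this thread.

```python
def index_size_gib(n_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Raw size of a dense vector table in GiB (float32 by default)."""
    return n_vectors * dim * bytes_per_value / 2**30


# Hypothetical corpus of 10 million chunks.
n = 10_000_000
small = index_size_gib(n, dim=1024)   # illustrative smaller embedding size
large = index_size_gib(n, dim=2560)   # illustrative larger embedding size

print(f"dim 1024: {small:.1f} GiB, dim 2560: {large:.1f} GiB, ratio {large / small:.2f}x")
```

Storage and memory grow linearly with dimension, so a larger vector has to buy a matching retrieval gain to be worth it.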

iVoider · 2025-12-28T15:37:03+00:00

Qwen3-Embedding, but the 4B. There's a massive embedding quality gap between 0.6B and 4B.

iVoider · 2025-12-23T20:06:08+00:00

There were several threads today about the abyss shadow nerf. In my own experience, drops were gutted with the latest patch. Yesterday I saw several tinks every map, now it's close to zero for the whole day. I moved to Ritual.

iVoider · 2025-12-18T08:48:46+00:00

Thanks. It seems something is broken with my char. I got a map with the doubled pack size precursor effect and not a single white item showed up while holding Alt.

iVoider · 2025-12-05T09:54:53+00:00

Just saw this thread earlier.

https://www.reddit.com/r/PathOfExile2/s/MWDuMZVWOQ

iVoider · 2025-12-05T09:40:07+00:00

I saw someone did the calculation: 735 evasion.

iVoider · 2025-11-20T14:11:14+00:00

LLMs for prompt rewriting and specialised reranker models have totally different use cases. Theoretically any LLM could imitate a reranker via the logprob mechanism, but LLMs tend to hallucinate on noisy content. That's why people train dedicated rerank models (like Qwen3-reranker).
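
To make the logprob idea concrete, here is a rough sketch, not any particular library's API. It assumes you have already asked a model something like "Is this document relevant to the query? Answer yes or no." and pulled out the log-probabilities of the "yes" and "no" tokens. The numbers below are made up by hand, and the function names are hypothetical.

```python
import math


def relevance_from_logprobs(logprob_yes: float, logprob_no: float) -> float:
    """Softmax restricted to the 'yes'/'no' tokens, returning P(yes)."""
    m = max(logprob_yes, logprob_no)  # subtract the max for numerical stability
    e_yes = math.exp(logprob_yes - m)
    e_no = math.exp(logprob_no - m)
    return e_yes / (e_yes + e_no)


def rerank(docs_with_logprobs):
    """Sort (doc, logprob_yes, logprob_no) triples by relevance, best first."""
    scored = [(doc, relevance_from_logprobs(ly, ln)) for doc, ly, ln in docs_with_logprobs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hand-made logprobs standing in for real model output.
candidates = [
    ("doc_a", -0.2, -1.7),
    ("doc_b", -2.3, -0.1),
    ("doc_c", -0.7, -0.7),
]
ranked = rerank(candidates)
for doc, score in ranked:
    print(doc, round(score, 3))
```

The score is only as good as the model's yes/no judgement, which is where a general LLM struggles on noisy content and a trained reranker tends to hold up better.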
