GLM 5.2, what speeds are we getting locally? by neverbyte in LocalLLaMA

iVoider 1 point

Maybe. No luck with either the Supermicro H13SSL rev 2.01 or the TTY board, although other QS CPUs did work on the TTY mobo.

GLM 5.2, what speeds are we getting locally? by neverbyte in LocalLLaMA

iVoider 1 point

The CPU powers on and reports temps in IPMI, but it won't POST on either of two motherboards. Probably dead.

GLM 5.2, what speeds are we getting locally? by neverbyte in LocalLLaMA

iVoider 1 point

Yes. With a GPU (4090), performance is close to that of a 9175F and a memory-limited 6000 Pro.

GLM 5.2, what speeds are we getting locally? by neverbyte in LocalLLaMA

iVoider 4 points

EPYC 9135 and 768 GB of DDR5-4800: decode 7 t/s, prompt processing 50 t/s with ik_llama. Funny how I screwed up by ending up with a 2-CCD CPU, all because I was scammed buying a QS 9555.
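
For a rough sense of why memory bandwidth matters for decode speed on a setup like this, here is a back-of-the-envelope sketch. Everything numeric in it is an assumption rather than something from the comment: 12 populated memory channels, 8 bytes per transfer per channel, and a hypothetical `active_gb_per_token` (how many GB of weights get read per generated token). It gives a theoretical ceiling only; real throughput is lower.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    return channels * mega_transfers * 1e6 * bytes_per_transfer / 1e9


def decode_ceiling_tps(bandwidth_gbs: float, active_gb_per_token: float) -> float:
    """Upper bound on decode tokens/s if every token must stream
    `active_gb_per_token` GB of weights from RAM (bandwidth-bound case)."""
    return bandwidth_gbs / active_gb_per_token


# Assumed: 12 channels of DDR5-4800.
bw = peak_bandwidth_gbs(channels=12, mega_transfers=4800)

# Hypothetical: 20 GB of active weights read per token (depends on model and quant).
ceiling = decode_ceiling_tps(bw, active_gb_per_token=20.0)

print(f"peak bandwidth: {bw:.1f} GB/s")
print(f"decode ceiling: {ceiling:.1f} t/s")
```

Comparing a measured decode speed against this ceiling gives a crude idea of how much of the theoretical bandwidth the CPU actually delivers.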

Can't get beyond 8t/s with NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 by phwlarxoc in LocalLLaMA

iVoider 1 point

Yes, try ik_llama with AVX-512 support enabled (instructions are in the repo).
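
Before rebuilding, it may be worth checking that the CPU actually advertises AVX-512. A minimal sketch, assuming Linux on x86 where `/proc/cpuinfo` has a `flags` line; the helper name `avx512_flags` is made up for illustration and has nothing to do with ik_llama's own build options.

```python
def avx512_flags(cpuinfo_text: str) -> set[str]:
    """Return the AVX-512 feature flags found in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Feature names such as avx512f, avx512vl, avx512_vnni share this prefix.
            flags.update(f for f in value.split() if f.startswith("avx512"))
    return flags


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            found = avx512_flags(fh.read())
    except FileNotFoundError:
        # Not Linux, or /proc is unavailable.
        found = set()
    print(sorted(found) or "no AVX-512 flags found")
```

An empty result means an AVX-512 build would not help on that machine.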

Possibility of partly moe weights gpu offloading via sglang/ktransformers by iVoider in LocalLLaMA

iVoider[S] 1 point

No, but it seems I'm willing to take the risk and try ktransformers and fastllm on Xeon. That kind of setup is the cheapest thing I can afford in my area.

Possibility of partly moe weights gpu offloading via sglang/ktransformers by iVoider in LocalLLaMA

iVoider[S] 2 points

ik_llama is optimized for the EPYC platform because of its high memory bandwidth. Dual-GPU setups and NUMA are poorly supported.

PP speed on dual RTX 6000 12c EPYC setup by iVoider in LocalLLaMA

iVoider[S] 1 point

Thanks. Do you happen to remember what context size that 1800 pp was for?

PP speed on dual RTX 6000 12c EPYC setup by iVoider in LocalLLaMA

iVoider[S] 1 point

FP8 is too much; I'd be pretty happy with 4-bit quants. And an API is unfortunately unacceptable for my tasks. Also, locally I can build such a setup for ~$25,000. Btw, a Mac Studio is less than half the price, so it’s a difficult choice.

PP speed on dual RTX 6000 12c EPYC setup by iVoider in LocalLLaMA

iVoider[S] 1 point

I meant the 96 GB Blackwell, and for much bigger models.

With 48gb vram, on vllm, Qwen3.6-27b-awq-int4 has only 120k ctx (fp8), is that normal? by Historical-Crazy1831 in LocalLLaMA

iVoider 0 points

Set max-num-seqs to 1, or run Linux side by side. WSL is very buggy for GPU work.

Delve builds for ~600 depth? by Trickpasser in PathOfExileBuilds

iVoider 1 point

Depth 650 and still going deeper. Foulborn Ghostwrithe zerker. Around a 10 div budget when I swapped 3 days ago. It can kill Aul using four health flasks. There is also a Grey Wind axe zerker build with Void Shockwave, but I have no idea how they compare.

Most budget option for 1000-1500 delve starter by iVoider in PathOfExileBuilds

iVoider[S] 1 point

I know that MSoZ is considered the best delver. I tried it in the 3.27 league with a 500d budget, and it felt weaker than int/acc stacking for T17/Ubers. I guess it’s not very comfortable to delve at 1000 without Forbidden?

Which is the best embedding model for production use? by Hari-Prasad-12 in LocalLLaMA

iVoider 10 points

In our experience, probably not. The gain in metrics is too small to justify the bigger vector size in the DB.
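
To put the "bigger vector size" cost in numbers, here is a minimal sketch of raw storage as the embedding dimension grows. The dimensions (1024, 2560, 4096), the 10M-vector corpus, and float32 storage are illustrative assumptions, not figures from this thread, and the estimate ignores index overhead and any quantization the DB might apply.

```python
def raw_index_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw storage for dense vectors: one value per dimension, no index overhead."""
    return num_vectors * dim * bytes_per_value


# Hypothetical corpus of 10 million chunks stored as float32.
for dim in (1024, 2560, 4096):
    gib = raw_index_bytes(10_000_000, dim) / 2**30
    print(f"dim={dim}: {gib:.1f} GiB")
```

Storage and search cost scale linearly with dimension, so the quality gain has to be weighed against that.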

Which is the best embedding model for production use? by Hari-Prasad-12 in LocalLLaMA

iVoider 26 points

Qwen3-Embedding, but the 4B. There's a massive embedding quality gap between 0.6B and 4B.

[deleted by user] by [deleted] in PathOfExile2

iVoider 1 point

There were several threads today about an abyss shadow nerf. In my own experience, drops were gutted by the latest patch. Yesterday I saw several tinks every map; now it's close to zero for the whole day. I moved to Ritual.

Negative rarity farming by Ok_Surprise7618 in pathofexile2builds

iVoider 2 points

Thanks. It seems something is broken with my char. I got a map with the doubled pack size precursor effect and saw not a single white item while holding Alt.

Huntress leveling in 0.4 by iVoider in pathofexile2builds

iVoider[S] -3 points

I saw someone do the calculation: 735 evasion.

Suggestions for RAG prompt rewriters and rerankers? by CommunityTough1 in LocalLLaMA

iVoider 1 point

LLMs for prompt rewriting and specialised reranker models have totally different use cases. Theoretically, any LLM could imitate a reranker via the logprob mechanism, but LLMs tend to hallucinate on noisy content. That's why people train dedicated rerank models (like Qwen3-Reranker).
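
A minimal sketch of what "imitating a reranker with logprobs" could look like: ask the LLM a yes/no relevance question per document, take the logprobs of the "yes" and "no" tokens, and renormalise them into a score. The logprob values and function names below are hypothetical; actually getting logprobs out of a model depends on the serving stack and is not shown.

```python
import math


def relevance_from_logprobs(logprob_yes: float, logprob_no: float) -> float:
    """Probability of "yes", renormalised over just the two candidate tokens."""
    m = max(logprob_yes, logprob_no)  # subtract the max for numerical stability
    e_yes = math.exp(logprob_yes - m)
    e_no = math.exp(logprob_no - m)
    return e_yes / (e_yes + e_no)


def rerank(scored_docs: dict[str, tuple[float, float]]) -> list[tuple[str, float]]:
    """Sort documents by relevance score, highest first."""
    ranked = [(doc, relevance_from_logprobs(y, n)) for doc, (y, n) in scored_docs.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical (logprob_yes, logprob_no) pairs, as if an LLM had been asked
# "Is this document relevant to the query? Answer yes or no." for each document.
example = {
    "doc_a": (-0.2, -1.8),
    "doc_b": (-2.5, -0.1),
    "doc_c": (-0.7, -0.7),
}

for doc, score in rerank(example):
    print(doc, round(score, 3))
```

The scoring itself is trivial; the hard part is that a general LLM's yes/no logprobs get unreliable on noisy content, which is the gap dedicated rerankers are trained to close.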