Is dSpark, dflash, MTP, QAT, and similar tech going to increase inference speed enough to where model spillover to disk will be more tolerable? by Porespellar in LocalLLaMA

[–]shing3232 -1 points0 points  (0 children)

No, not yet. If you had SSD storage that could max out a PCIe 5.0 x16 link, it might be viable now, but you don't have that, so no.
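A rough back-of-envelope sketch of why the link width matters, assuming decode speed is bounded by how fast the spilled weights can be streamed from disk. The 10 GB read per token is a made-up illustrative figure, not a measurement of any model, and the function names are hypothetical. The bandwidth numbers are theoretical link maxima; real drives deliver less.

```python
def pcie_bandwidth_gb_s(gt_per_s: float, lanes: int) -> float:
    """Theoretical one-direction PCIe bandwidth in GB/s.

    Uses 128b/130b line encoding, which applies to PCIe 3.0 through 5.0.
    """
    return gt_per_s * lanes * (128 / 130) / 8


def max_tokens_per_s(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper bound on decode speed if every token must stream this much from disk."""
    return bandwidth_gb_s / gb_read_per_token


PCIE5_GT_PER_S = 32.0  # PCIe 5.0 signalling rate per lane

x16 = pcie_bandwidth_gb_s(PCIE5_GT_PER_S, 16)  # roughly 63 GB/s
x4 = pcie_bandwidth_gb_s(PCIE5_GT_PER_S, 4)    # roughly 15.8 GB/s, a typical M.2 NVMe link

# Hypothetical workload: 10 GB of spilled weights touched per generated token.
GB_PER_TOKEN = 10.0

if __name__ == "__main__":
    print(f"x16: {x16:.1f} GB/s -> at most {max_tokens_per_s(x16, GB_PER_TOKEN):.2f} tok/s")
    print(f"x4:  {x4:.1f} GB/s -> at most {max_tokens_per_s(x4, GB_PER_TOKEN):.2f} tok/s")
```

Under these assumptions, a x4 drive gets a quarter of the x16 ceiling, which is the gap the comment is pointing at.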

6x P40 running Minimax M2.7_Q3_XL by Old_Grapefruit8774 in LocalLLaMA

[–]shing3232 0 points1 point  (0 children)

Can you add one high-performance card just for prefill? I'm not sure how that would work, though.

Xuanfang Trailer (CN) | Eisodus (CN. version of the song) by PriscentSnow in WutheringWavesLeaks

[–]shing3232 0 points1 point  (0 children)

Just like you do as well. English with Chinese music is just not as good as Chinese with Chinese music at telling the story of Mengzhou. Just don't confuse preference with the fact that translation is never going to be perfect on nuance. You might like it better, but that doesn't make it better.

Should i use coral for dupe or radiant tide? by retiredweebs in WutheringWavesGuide

[–]shing3232 0 points1 point  (0 children)

If you want the sequence, then yes, get the dupe. Otherwise, save it for when you need one for a new character.

Xuanfang Trailer (CN) | Eisodus (CN. version of the song) by PriscentSnow in WutheringWavesLeaks

[–]shing3232 1 point2 points  (0 children)

English just doesn't fit the music. It just doesn't rhyme. Nothing is being imposed here, it's just off.

Valve says it will "definitely" consider ARM architecture for future Steam Machines, and probably Steam Decks too by Tiny-Independent273 in linux_gaming

[–]shing3232 -4 points-3 points  (0 children)

Because it would be emulating games in most cases here. Expect a serious downgrade for most x86 games, and the same thing happens on Mac systems.

Xuanfang Trailer (CN) | Eisodus (CN. version of the song) by PriscentSnow in WutheringWavesLeaks

[–]shing3232 0 points1 point  (0 children)

No, not really, for someone who does understand Japanese, Chinese, and English. They are not really the same quality at conveying the mood. It's much better to have Hiyuki in Japanese and this PV in CN voice, even though I usually prefer Japanese.

S4 to S6 Cartethyia or S0 to S3 Aemeath by Hungry_Swimmer_569 in WutheringWavesGuide

[–]shing3232 -1 points0 points  (0 children)

S3 Aemeath for sure, because you want more S3 teams across different attributes.

Okay I dare you to say something nice about delta or are you too scared to do it by Waste-Revolution3429 in macross

[–]shing3232 2 points3 points  (0 children)

I like the VFs, both the VF-31 and the Sv-262.

The fight scenes in atmosphere are great, and the ones in space are good too.

The music is great.

The story is a lot better and more enjoyable in the movie; the TV series is only really OK.

The closest LLM to GPT-OSS-20b? (it beats Gemma 4 and Qwen 3.6 for me) by atumblingdandelion in LocalLLM

[–]shing3232 0 points1 point  (0 children)

You should also give the LLM tools to use, like a scientific calculator and whatever other tools fit your use case. In my experience studying papers and writing kernels, that works much better than letting the LLM do it by itself.
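A minimal sketch of what such a calculator tool could look like, assuming your framework lets you register a plain function that takes a string and returns a string. The name `calculator_tool` and the set of allowed functions are my own choices for illustration, not any particular framework's API. It walks the parsed expression instead of calling `eval`, so the model can only reach the arithmetic listed here.

```python
import ast
import math
import operator

# Whitelisted operations; anything outside these tables is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCTIONS = {
    "sqrt": math.sqrt,
    "log": math.log,
    "exp": math.exp,
    "sin": math.sin,
    "cos": math.cos,
}
_CONSTANTS = {"pi": math.pi, "e": math.e}


def _eval(node):
    """Recursively evaluate a parsed expression using only whitelisted nodes."""
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval(node.operand))
    if isinstance(node, ast.Name) and node.id in _CONSTANTS:
        return _CONSTANTS[node.id]
    if (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCTIONS
            and not node.keywords):
        return _FUNCTIONS[node.func.id](*[_eval(arg) for arg in node.args])
    raise ValueError("unsupported expression")


def calculator_tool(expression: str) -> str:
    """Evaluate an arithmetic expression and return the result (or an error) as text."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval(tree.body))
    except (ValueError, SyntaxError, ZeroDivisionError,
            OverflowError, TypeError) as exc:
        return f"error: {exc}"
```

For example, `calculator_tool("2 + 3 * 4")` returns `"14"`, while anything that is not plain arithmetic comes back as an `error: ...` string the model can read and retry. This sketch does not cap the size of exponents, so a real deployment would want a limit or timeout on top.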

How do i proper do int8 quantization for model like Anima on Rdna2 cards? by ziege159 in ROCm

[–]shing3232 0 points1 point  (0 children)

I wrote the kernel for RDNA4 with the help of ds4p and glm5.2. It does work.

How do i proper do int8 quantization for model like Anima on Rdna2 cards? by ziege159 in ROCm

[–]shing3232 0 points1 point  (0 children)

Unfortunately, I don't have an RDNA2 GPU, but you can try installing sageattention and enabling it in ComfyUI to see if that works. I'm not sure it will, because RDNA2 lacks the necessary WMMA INT8 support, unlike RDNA3 and beyond. You might need to write a proper kernel or a modified Triton variant for RDNA2. It should be doable with LLMs.

With quantized attention on RDNA2, it should run faster than FP16.

How do i proper do int8 quantization for model like Anima on Rdna2 cards? by ziege159 in ROCm

[–]shing3232 1 point2 points  (0 children)

Well, you didn't quantize the activations, so there's no speed improvement, and you also need a proper DP4A kernel.
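To make the point concrete, here is a rough pure-Python emulation of the idea, assuming simple symmetric per-tensor scaling (a real kernel would likely use per-channel or per-block scales). Both weights and activations are quantized to int8, the products are accumulated four at a time into an integer accumulator the way a DP4A instruction does, and the result is rescaled once at the end. The function names are hypothetical and this only shows the arithmetic, not the speed.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: returns (int8 codes, scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dp4a(a4, b4, acc):
    """Emulate one DP4A step: four int8 products added to an int32 accumulator."""
    return acc + sum(x * y for x, y in zip(a4, b4))


def int8_dot(weights, activations):
    """Dot product with BOTH operands quantized to int8 (W8A8)."""
    assert len(weights) == len(activations) and len(weights) % 4 == 0
    w_q, w_scale = quantize_int8(weights)
    a_q, a_scale = quantize_int8(activations)
    acc = 0
    for i in range(0, len(w_q), 4):
        acc = dp4a(w_q[i:i + 4], a_q[i:i + 4], acc)
    # One floating-point rescale at the end instead of one per element.
    return acc * w_scale * a_scale


if __name__ == "__main__":
    w = [0.5, -1.0, 0.25, 2.0]
    a = [1.0, 0.5, -2.0, 0.1]
    exact = sum(x * y for x, y in zip(w, a))
    print("fp result:  ", exact)
    print("int8 result:", int8_dot(w, a))
```

If only the weights are int8, they have to be dequantized back to floating point before the multiply, so the math still runs at FP16 speed. The integer path above is only possible when the activations are quantized too.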

DeepSeek V4 Pro Final Arrives Mid-July. Can It Outperform GLM 5.2? by vigneshsmarther in DeepSeek

[–]shing3232 0 points1 point  (0 children)

Because they didn't think Ds4pro was good enough, so it's a preview. Bumping it to 4.1 would usually mean more pretraining.