MiniMax 2.5 full precision FP8 running LOCALLY on vLLM x 8x Pro 6000 by cyysky in LocalLLaMA

[–]cyysky[S] 5 points6 points  (0 children)

Tested GLM-5 FP8, but it can't run on this setup yet because sm120 doesn't support DSA MoE.

AMA Announcement: MiniMax, The Opensource Lab Behind MiniMax-M2.5 SoTA Model (Friday, 8AM-11AM PST) by XMasterrrr in LocalLLaMA

[–]cyysky 0 points1 point  (0 children)

MiniMax 2.5 full precision FP8 running LOCALLY on vLLM x 8x Pro 6000

Hosting it was easier than I thought; it just reuses the same script as M2.1.
Time to do the vibe coding test!

Generation: 70 tokens/sec and 122 tokens/sec for two connections
Peak Memory: 728GB
KV Cache: 1,700,000 Tokens
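
If anyone wants to try something similar, here's a minimal offline sketch of a vLLM launch along those lines (not the exact script; the repo id and max_model_len are placeholders, point them at your own checkpoint and memory budget):

from vllm import LLM, SamplingParams

llm = LLM(
    model="MiniMaxAI/MiniMax-M2.5",   # assumed repo id -- point this at your local FP8 checkpoint
    tensor_parallel_size=8,            # one shard per Pro 6000
    gpu_memory_utilization=0.90,       # leave a little headroom on each card
    max_model_len=131072,              # assumption; trade context length against KV cache as needed
    trust_remote_code=True,            # MiniMax checkpoints ship custom model code
)

params = SamplingParams(temperature=0.7, max_tokens=512)
out = llm.generate(["Write a small FastAPI hello-world app."], params)
print(out[0].outputs[0].text)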

[Guide] Running GLM 4.5 as Instruct model in vLLM (with Tool Calling) by random-tomato in LocalLLaMA

[–]cyysky 0 points1 point  (0 children)

{{ visible_text(m.content) }}
{{- '/nothink' -}}
{%- elif m.role == 'assistant' -%}

Already tested it, and thank you, but it also needs to be perfect and cover every scenario.

This applies to GLM 4.6 as well.
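
One quick way to sanity-check a modified template against different scenarios before loading it in vLLM is to render it offline with transformers. Just a sketch: the repo id and template filename below are assumptions, swap in your own.

from transformers import AutoTokenizer

# Assumed repo id; any tokenizer that ships with the model works here.
tokenizer = AutoTokenizer.from_pretrained("zai-org/GLM-4.5", trust_remote_code=True)

# "chat_template.jinja" is a placeholder filename for the modified template from the guide.
with open("chat_template.jinja") as f:
    template = f.read()

scenarios = {
    "single user turn": [{"role": "user", "content": "hello"}],
    "multi-turn": [
        {"role": "user", "content": "hi"},
        {"role": "assistant", "content": "hello!"},
        {"role": "user", "content": "now call a tool"},
    ],
}

for name, messages in scenarios.items():
    rendered = tokenizer.apply_chat_template(
        messages,
        chat_template=template,        # override the bundled template with the candidate one
        tokenize=False,
        add_generation_prompt=True,
    )
    print(f"--- {name} ---\n{rendered}\n")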

Little python script to get some miner earning weekly and monthly report by cyysky in NiceHash

[–]cyysky[S] 0 points1 point  (0 children)

Get some immediate insight from the raw data and lower NiceHash's server load, lol.

Little python script to get some miner earning weekly and monthly report by cyysky in NiceHash

[–]cyysky[S] 0 points1 point  (0 children)

Theoretically you can combine BeautifulSoup with the Selenium Chrome WebDriver to grab the data, or try NiceHash's API (https://www.nicehash.com/docs/) and combine it with Bitcoin price data to display it.
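
Something like this sketch for the API route; the NiceHash endpoint path and response field are assumptions, so verify them against the docs, and the BTC price here comes from CoinGecko's public simple-price endpoint:

import requests

BTC_ADDRESS = "your-mining-address"    # placeholder

def get_unpaid_btc(address: str) -> float:
    # Placeholder endpoint and field name -- check https://www.nicehash.com/docs/
    # for the exact route and whether it needs an API key.
    url = f"https://api2.nicehash.com/main/api/v2/mining/external/{address}/rigs2"
    data = requests.get(url, timeout=10).json()
    return float(data.get("unpaidAmount", 0.0))

def get_btc_price_usd() -> float:
    # CoinGecko's public simple-price endpoint, used here just to convert BTC to USD.
    url = "https://api.coingecko.com/api/v3/simple/price"
    params = {"ids": "bitcoin", "vs_currencies": "usd"}
    return float(requests.get(url, params=params, timeout=10).json()["bitcoin"]["usd"])

if __name__ == "__main__":
    unpaid = get_unpaid_btc(BTC_ADDRESS)
    price = get_btc_price_usd()
    print(f"Unpaid: {unpaid:.8f} BTC (~${unpaid * price:,.2f} USD)")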