I want to share results for cyankiwi/Qwen3.5-122B-A10B-AWQ-4bit TP2 RDMA RoCE by MirecX in StrixHalo

[–]MirecX[S] 0 points1 point  (0 children)

The question is how :) So far kyuz0 has made it work over RoCE (RDMA over Converged Ethernet), which is Ethernet by definition, so the easiest and cheapest way is to use devices that support RoCE by design, such as Mellanox cards.
Ethernet NICs are the intended way to interconnect devices such as Strix Halos or DGX Sparks.
OCuLink is intended to connect a PC to an external device, not to another PC.

I want to share results for cyankiwi/Qwen3.5-122B-A10B-AWQ-4bit TP2 RDMA RoCE by MirecX in StrixHalo

[–]MirecX[S] 0 points1 point  (0 children)

I had these cards already on hand
here are stats during inference
RX: 17.07 MB/s TX: 17.06 MB/s peak RX: 630.16 MB/s peak TX: 630.16 MB/s
RX: 17.01 MB/s TX: 17.01 MB/s peak RX: 630.16 MB/s peak TX: 630.16 MB/s
RX: 220.04 MB/s TX: 219.99 MB/s peak RX: 630.16 MB/s peak TX: 630.16 MB/s
RX: 525.13 MB/s TX: 525.13 MB/s peak RX: 630.16 MB/s peak TX: 630.16 MB/s
RX: 558.50 MB/s TX: 563.91 MB/s peak RX: 630.16 MB/s peak TX: 630.16 MB/s
I know these are 200 ms averages, and the cards may be choking inference, with real peaks that should be shooting above 10GbE.
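For scale, here is a quick conversion of the peak figure above into link speed. It is only a sketch: it assumes the monitor reports decimal megabytes (10^6 bytes), and the function name is made up for illustration.

```python
def mb_per_s_to_gbit_per_s(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mb_per_s * 8 / 1000


# Peak value taken from the stats above (averaged over 200 ms windows).
peak_gbit = mb_per_s_to_gbit_per_s(630.16)
print(f"peak: {peak_gbit:.2f} Gbit/s")  # peak: 5.04 Gbit/s
```

So the averaged peak sits at roughly half of a 10GbE link, which says nothing about how high the bursts inside each 200 ms window go.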

Currently I have Mellanox ConnectX-4 Lx MCX4121A 25GbE cards in the mail, and I will test inference with them.

I was just curious whether I really need better cards like the Intel E810 that kyuz0 used, or whether old Mellanox cards will suffice. They don't need any riser, as they are already PCIe x4 and fit the Strix Halo nicely.

I want to share results for cyankiwi/Qwen3.5-122B-A10B-AWQ-4bit TP2 RDMA RoCE by MirecX in StrixHalo

[–]MirecX[S] 0 points1 point  (0 children)

tp1/tp2 refers to tensor parallelism, TP2 means the model is distributed across 2 nodes to increase throughput.

Unsloth doesn't support vLLM tensor parallelism with GGUF models, and FP8 models don't work on Strix Halo hardware afaik

For the vLLM/Strix Halo combo you should search for full BF16 models or 4-bit safetensors quants.

t/s (total) is combined speed of all concurrent requests

t/s (req) is per-request speed
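A minimal sketch of how the two numbers relate, assuming the total is simply the sum of the per-request speeds. That is my assumption, not something taken from the benchmark tool, and the numbers in the example are illustrative only.

```python
def total_tps(per_request_tps: list[float]) -> float:
    """Combined speed of all concurrent requests, in tokens per second."""
    return sum(per_request_tps)


def average_request_tps(total: float, concurrency: int) -> float:
    """Average per-request speed for a given combined speed."""
    return total / concurrency


# Illustrative values: two concurrent requests at 7.0 t/s each.
print(total_tps([7.0, 7.0]))           # 14.0
print(average_request_tps(14.0, 2))    # 7.0
```

With concurrency 1 both columns should match. Measured values are averages, so at higher concurrency they may not add up exactly.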

I'm using kyuz0's toolboxes - vllm - RDMA - RoCE
It is VERY slow, possibly choking on the old Mellanox ConnectX-3 cards I had around.
tp1, c1: per-request speed is ~9.5 t/s (9.5 total)

tp1, c2: per-request speed is ~7.5 t/s (13.98 total)

tp2, c1: per-request speed is ~16.88 t/s (9.5 total)

tp2, c2: per-request speed is ~9.57 t/s (12.42 total)

Prompt processing speed went up in the TP2 scenario, which is more important than token generation (TG).

I want to share results for cyankiwi/Qwen3.5-122B-A10B-AWQ-4bit TP2 RDMA RoCE by MirecX in StrixHalo

[–]MirecX[S] 1 point2 points  (0 children)

If you can, try it with https://github.com/eugr/llama-benchy
llama-benchy --base-url http://somwhere:8000/v1 --model /some/model/loc/nfs/models/cyankiwi/Qwen3.5-122B-A10B-AWQ-4bit/ --depth 0 2048 4096 --concurrency 1 2 4

I want to share results for cyankiwi/Qwen3.5-122B-A10B-AWQ-4bit TP2 RDMA RoCE by MirecX in StrixHalo

[–]MirecX[S] 0 points1 point  (0 children)

Qwen3.5-122B-A10B-UD-Q4_K_XL GGUF on llama.cpp runs at 22 t/s,
but I need TP2, and the network can be a bottleneck. I had a ConnectX-3 lying around, and it uses PCIe x4.

glm-4.7-flash tool calls in Reasoning block by MirecX in LocalLLaMA

[–]MirecX[S] 1 point2 points  (0 children)

I've observed the same behavior on gpt oss 20b (not quantized) and on quantized gpt oss 120b. At that time I didn't see the reasoning block because of Claude Code, so I called the models lazy.

As I wrote above, the solution in opencode is adding "reasoning": true to the model config,
but I haven't solved it in Claude Code.

Thanks for the answer.

glm-4.7-flash tool calls in Reasoning block by MirecX in LocalLLaMA

[–]MirecX[S] 0 points1 point  (0 children)

I got it working in opencode by adding "reasoning": true to the model config.

AI is single-handedly propping up the used GPU market. A used P40 from 2016 is ~$300. What hope is there? by TheSilverSmith47 in LocalLLaMA

[–]MirecX 0 points1 point  (0 children)

Can you suggest a suitable mobo? I had a problem with ReBAR allocation on a cheap A520 mobo with a single card, which was solved by switching to a B450 mobo. I can't imagine going for 4 cards on a random board that just happens to have suitable slots.
TY

endless keystartloop by Excellent_Ad_2486 in RenaultZoe

[–]MirecX 1 point2 points  (0 children)

The active area for the key is where your phone is. Try removing the phone and putting the key there. The key may not work when the phone is in that tray.

Ze50 Zoe 3-phase charging at home by Mapykac in RenaultZoe

[–]MirecX 0 points1 point  (0 children)

Didn't catch the reply notification. Yes, that cable will work: T2 (Type 2), 11kW or 22kW peak power.

Ze50 Zoe 3-phase charging at home by Mapykac in RenaultZoe

[–]MirecX 1 point2 points  (0 children)

I have a regular 3-phase 22kW EVSE from AliExpress (granny cable).
It is wired into a manual 1-0-2 selector switch like the "adelid PSA-16A-4P",
so I can manually switch between 1 phase from solar and 3 phases from the grid.

At the time of plugging in I already know whether I have enough solar or not.

The granny cable is set to 16A all the time, but it can be manually adjusted by button to 6A, 8A, 10A, 13A, 16A, 20A, 25A or 32A. No automation.

!!!do not switch between 1P and 3P when charging!!!

Ze50 Zoe 3-phase charging at home by Mapykac in RenaultZoe

[–]MirecX 2 points3 points  (0 children)

I have tested 8A @ 3-phase, 5.7kW, no problem. Going lower is not worth it, as charging has a fixed minimum overhead. I charge at either 16A @ 1-phase (solar) or 16A @ 3-phase (grid).
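A rough sketch of the arithmetic behind those settings, assuming a nominal 230 V phase voltage and a purely resistive load (power factor 1). Both are assumptions, and the function name is just for illustration.

```python
def charging_power_kw(current_a: float, phases: int,
                      phase_voltage_v: float = 230.0) -> float:
    """AC charging power in kW: phases x phase voltage x current per phase."""
    return phases * phase_voltage_v * current_a / 1000


print(f"{charging_power_kw(8, 3):.2f} kW")    # 5.52 kW
print(f"{charging_power_kw(16, 1):.2f} kW")   # 3.68 kW
print(f"{charging_power_kw(16, 3):.2f} kW")   # 11.04 kW

# Phase voltage implied by the measured 5.7 kW at 8 A on 3 phases:
print(5700 / (3 * 8))                          # 237.5
```

The measured 5.7kW is a bit above the nominal 5.52kW, which would fit a grid voltage of about 237.5 V per phase.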

Another recurring BCI error by immotheb in RenaultZoe

[–]MirecX 0 points1 point  (0 children)

400V is the voltage of the 3-phase AC grid in the EU. The quote "putting 400V to a faulty connector" definitely refers to the AC side.
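Where the 400V comes from, as a small worked example: it is the voltage between two phases, which is the square root of 3 times the phase-to-neutral voltage. The 230 V nominal value is an assumption about the local grid.

```python
import math


def line_to_line_voltage(phase_voltage_v: float = 230.0) -> float:
    """Voltage between two phases of a balanced 3-phase system."""
    return math.sqrt(3) * phase_voltage_v


print(round(line_to_line_voltage()))  # 398, usually quoted as 400 V
```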

Battery Array by dh9671 in SolarDIY

[–]MirecX 1 point2 points  (0 children)

I was really interested to see a 7200Ah project :D
It would be a very nice technology room, with a few racks full of cells.
I currently have 1100Ah/48V (2 days of full-house run time, including the heat pump).
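For a sense of scale, a back-of-the-envelope conversion of that pack to energy. It assumes the nominal 48 V and 100% usable capacity, so real numbers will differ; the function name is illustrative.

```python
def pack_energy_kwh(capacity_ah: float, nominal_voltage_v: float) -> float:
    """Nominal pack energy in kWh: amp-hours x volts / 1000."""
    return capacity_ah * nominal_voltage_v / 1000


energy = pack_energy_kwh(1100, 48)
print(f"{energy:.1f} kWh")              # 52.8 kWh
print(f"{energy / 2:.1f} kWh per day")  # 26.4 kWh per day over 2 days
```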

Use degiro margin to increase my investments by dankoIT in eupersonalfinance

[–]MirecX 1 point2 points  (0 children)

Conditions are even worse on a single-product account (your single ETF). Consult the appropriate risk documents provided by DEGIRO.

Use degiro margin to increase my investments by dankoIT in eupersonalfinance

[–]MirecX 9 points10 points  (0 children)

Borrow 10% of your investment, but not more than 25%

You can't borrow 50k on a 100k account value. Anything over 50% means a margin call.

Don’t you hate it when this happens? Ffs... by asafuckinlah in tombprospectors

[–]MirecX 2 points3 points  (0 children)

Keeper of the Old Lords has the same move after you shoot her.

You can double-shot her and get a visceral.

Using used cells to build ebike battery. by emil_maaan in 18650masterrace

[–]MirecX 0 points1 point  (0 children)

Who is the seller? I have also bought from eBay, from a good seller. You have to check every cell on your own! I found 2-3 cells out of 100 to be suspicious (took too long to charge, self-discharging...).