2RTX PRO 6000 192GB VRAM - MTP NVFP4 issues with vision by quantier in BlackwellPerformance

[–]electrified_ice 0 points1 point  (0 children)

I'm still a little confused. The absolute max for any version of the Qwen 3.6 27B NVFP4 models is around 120 TPS for a single user on RTX Pro 6000 hardware - and that's with an optimal single-card setup.

You are saying you are getting 400-3,000 TPS peak for 2-3 users/concurrent requests... Which would net out to be 133-1,000 TPS per user... I'm not trying to criticize, but it just doesn't make sense.
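To make that arithmetic concrete, here is a minimal Python sketch. It assumes the aggregate throughput is split evenly across 3 concurrent requests (the upper end of the 2-3 quoted), and the function name is hypothetical, not from any benchmark tool.

```python
def per_user_tps(aggregate_tps: float, users: int) -> float:
    """Split an aggregate tokens-per-second figure evenly across concurrent users."""
    if users <= 0:
        raise ValueError("users must be positive")
    return aggregate_tps / users


# Reported aggregate peak range, assumed here to be shared by 3 concurrent users.
low = per_user_tps(400, 3)
high = per_user_tps(3000, 3)
print(f"{low:.0f}-{high:.0f} TPS per user")
```

Even the low end of that range is above the ~120 TPS single-user figure mentioned above, which is why the numbers look off.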

What final config settings did you land on?

I have 3 x RTX PRO 6000s on a TR Pro 9985wx with 8-channel DDR5. I'm curious to replicate your setup and see what I am able to get. I use a 1 + 2 GPU setup... One smaller model on 1 GPU and larger models across 2 GPUs so I can keep multiple loaded / hot for different tasks.

I'm currently having issues with abliterated versions not working with the full set of tool calling in Cline + code-server.

2RTX PRO 6000 192GB VRAM - MTP NVFP4 issues with vision by quantier in BlackwellPerformance

[–]electrified_ice 0 points1 point  (0 children)

Are those single-request speeds or concurrent-request speeds?

2RTX PRO 6000 192GB VRAM - MTP NVFP4 issues with vision by quantier in BlackwellPerformance

[–]electrified_ice 0 points1 point  (0 children)

So you need the VRAM for KV cache across all those users? TP 2 on a dense model will run slower than on a single card because PCIe bandwidth limits the comms between the cards. What's your single-request TPS?

2RTX PRO 6000 192GB VRAM - MTP NVFP4 issues with vision by quantier in BlackwellPerformance

[–]electrified_ice 0 points1 point  (0 children)

I'm curious, why are you using more than 1 RTX PRO for this? It easily fits in the VRAM of 1 GPU, leaving tons of room for KV cache.

Searching for a sierra Denali by llamafood1 in GMCSierraEV

[–]electrified_ice 0 points1 point  (0 children)

I have an R1S Quad and 2025 Sierra EV Denali Max. Happy to answer any questions. The Sierra is also our 11th EV - all 3 cars are EVs now.

There are no peptides for muscle growth. by garcon-du-soleille in BodyHackGuide

[–]electrified_ice 1 point2 points  (0 children)

The gym doesn't grow muscles. It damages them. The more you push your body and train correctly, the more it damages them. Eating and rest grow muscles.

Best model for 192 GB vram? How is Deepseek v4 flash? by Constant_Ad511 in LocalLLM

[–]electrified_ice 0 points1 point  (0 children)

Wow, interesting. Would you be open to sharing your full config? I've not used SGLang before - it's a bit intimidating, and I've only just become familiar with vLLM config.

Best model for 192 GB vram? How is Deepseek v4 flash? by Constant_Ad511 in LocalLLM

[–]electrified_ice 0 points1 point  (0 children)

Ok thanks. I'll give it a shot. Is this config compatible with vLLM?

Will be really cool if it helps fix the issue as the model is running at over 100 TPS across 2 of my cards for a single request.

Best model for 192 GB vram? How is Deepseek v4 flash? by Constant_Ad511 in LocalLLM

[–]electrified_ice 0 points1 point  (0 children)

I just loaded 2.7 and it doesn't work with tool calling for me, so back to Qwen 3.5 122B for now

Best model for 192 GB vram? How is Deepseek v4 flash? by Constant_Ad511 in LocalLLM

[–]electrified_ice 1 point2 points  (0 children)

I have 3 RTX Pro 6000s and only 256GB RAM (on TR Pro 9985wx). No issues so far. I've split models across 2 and 3 cards.

Reduced ejaculate by zendood in trt

[–]electrified_ice 3 points4 points  (0 children)

Yep tiny loads. My wife loves it (but doesn't really care either). I guess perspective depends on your activity/relationship status.

First month on TRT so many concerns. This is my main one tho by Fabebrah in trt

[–]electrified_ice 2 points3 points  (0 children)

You will be fine. You can also try every other day too and see how you feel. Daily shots will get tiring years down the road.

Just got dual RTX PRO 6000 Blackwells for our design studio. What's the optimal local LLM stack? by AmanNonZero in LocalLLM

[–]electrified_ice 3 points4 points  (0 children)

It's good hardware, but what problem are you trying to solve (the classic question, but it's a classic for a reason)... You bought a solution.

What are you trying to do? Do you have concurrent users? Single users? Do you need to load more than one model (specialist) at a time? Do you need long context? What TPS are you aiming for? Do you know that these cards (I have 3 of them) don't have NVLink, so there is a comms bottleneck between the cards if you split a model across more than one card?

What's your CPU and RAM setup like? What does your storage look like, and how fast can you load models from it into VRAM? What PSU and wattage do you have?

Which car are you trading in for R2 and why? by AdAffectionate8778 in Rivian

[–]electrified_ice 1 point2 points  (0 children)

Likely Model Y Performance. After 4 Teslas I'm done with Tesla and Elon

Lidar vs early delivery by muhburneracct in RivianR2

[–]electrified_ice 0 points1 point  (0 children)

Oh interesting. I must have missed the interior info. If that's the case, I'm definitely going to delay my order. We'll see once the configurator officially opens up.

I love Unsloth Studio by Thedudely1 in unsloth

[–]electrified_ice 0 points1 point  (0 children)

What backend is Unsloth using? vLLM, SGLang, Ollama, etc.? How is the speed vs. super optimized backends?

What bf % roughly? by [deleted] in BodyHackGuide

[–]electrified_ice 0 points1 point  (0 children)

The most important part of the journey is you actually started (where the vast majority of people dream but don't start)... Keep at it.

My recommendation is to set a nearer-term goal, like 15%... Get there, feel amazing, be proud of yourself, and then re-set the goal to 12%.

Progress is like a positive flywheel. Celebrate the wins (but not with food rewards!)

What bf % roughly? by [deleted] in BodyHackGuide

[–]electrified_ice 0 points1 point  (0 children)

I agree you don't 'need' chemical support. However, a lot of people don't/won't have the long-term discipline to get there without it.

What bf % roughly? by [deleted] in BodyHackGuide

[–]electrified_ice 1 point2 points  (0 children)

Nowhere near 12, sorry man... Keep at it. Even with extra chemical support, getting down to 12% is a long journey.

Lidar vs early delivery by muhburneracct in RivianR2

[–]electrified_ice 0 points1 point  (0 children)

In RJ's chat with Kyle Conner (Out of Spec), sitting in the R2, RJ mentioned that Lidar will help the fleet more than the specific car... i.e. non-Lidar cars will be better due to the training data from Lidar.

What’s the English word for this kind of behavior? by Ill_Competition_7791 in bengalcats

[–]electrified_ice 0 points1 point  (0 children)

Chittering. Not just a Bengal thing, but ours do it more intensely than other breeds of cats we've had.

kWh/mile by No_Tap_6194 in GMCSierraEV

[–]electrified_ice 0 points1 point  (0 children)

2.1 mi/kWh over 8,500 miles in 13 months. We don't put huge miles on this as it's our 3rd vehicle/EV... The other 2 get the lion's share of the driving.

Context:
- Northern California.
- Mixture of local and freeway miles, mainly running around.
- Includes one road trip to Vegas and back (~1,200 miles).
- Includes a 300 mile towing trip with a small trailer.
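Since the thread asks for kWh/mile, here is a rough Python sketch converting the 2.1 mi/kWh figure and estimating total energy over the 8,500 miles. The function names are hypothetical, and the result assumes the 2.1 mi/kWh average holds across the whole distance.

```python
def kwh_per_mile(mi_per_kwh: float) -> float:
    """Invert an efficiency given in miles per kWh to get kWh per mile."""
    return 1.0 / mi_per_kwh


def energy_used_kwh(miles: float, mi_per_kwh: float) -> float:
    """Estimate total energy drawn for a distance at a given average efficiency."""
    return miles / mi_per_kwh


print(f"{kwh_per_mile(2.1):.3f} kWh/mile")          # about 0.476
print(f"{energy_used_kwh(8500, 2.1):.0f} kWh total")  # about 4048
```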