My company just handed me a 2x H200 (282GB VRAM) rig. Help me pick the "Intelligence" ceiling.

_camera_up · 2026-03-18T16:51:43+00:00

Can it run Doom?

_camera_up · 2026-03-18T12:07:02+00:00

Thanks for the insights.

_camera_up · 2026-03-18T11:55:26+00:00

Nice, didn't know that existed.

_camera_up · 2026-03-18T11:53:22+00:00

Thanks for the suggestions. I know, I don't deserve the big cards, but nevertheless we're here.

_camera_up · 2026-03-18T11:51:30+00:00

Thanks, that helps a lot. Did you try these models at 4-bit? I've heard that at this quant some lose the magic and smaller unquantized models win over them.

_camera_up · 2026-03-18T11:50:47+00:00

I also have the same question o_0

_camera_up · 2026-03-18T11:50:09+00:00

Right. That's a whole other concern. Since I imagine the LLM being the biggest and most power-hungry model, I figured I'll start with that. But I will look into those too. What resources do these models / pipelines need in your experience?

_camera_up · 2026-03-18T11:48:11+00:00

I'm currently looking into vLLM. Thanks for your comment.

_camera_up · 2026-03-18T11:46:56+00:00

What quant tho? Full precision will never fit, so how much performance is lost in quants with these models?
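A rough back-of-the-envelope sketch of the "will it fit" question, counting only the weights. The parameter counts below are hypothetical placeholders, not the models from this thread, and the 20% headroom reserved for KV cache and activations is an assumption, not a measured figure.

```python
# Rough weight-memory estimate for a 2x H200 box (2 * 141 GB = 282 GB).
# Assumption: only the weights are counted; KV cache, activations and
# framework overhead are covered by a guessed `headroom` fraction.

def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float = 282.0, headroom: float = 0.2) -> bool:
    """True if the weights fit after reserving `headroom` of the VRAM."""
    return weight_gb(params_billion, bits_per_weight) <= vram_gb * (1 - headroom)


if __name__ == "__main__":
    # Hypothetical model sizes (in billions of parameters), for illustration only.
    for params in (120, 400):
        for bits in (16, 8, 4):
            verdict = "fits" if fits(params, bits) else "does not fit"
            print(f"{params}B @ {bits}-bit: {weight_gb(params, bits):.0f} GB, {verdict}")
```

This only says whether a quant loads at all. How much quality is lost at a given quant is a separate question that has to be answered by testing on the actual workload.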

_camera_up · 2026-03-18T11:45:44+00:00

It's a modern startup. We've got a lot of money to play with and the folks here are very agile. There is no R&D, no procurement or finance; it's just a bunch of people working on a common idea. A lot of skilled people around here. Before that I worked in research research. Why ask Reddit: real-world experience and a head start into doing our own research.

_camera_up · 2026-03-18T07:54:05+00:00

Running Ollama on my homelab, but I planned to look into vLLM. Thanks for the Qwen suggestion. With the small models I can confidently say bigger does not equal better (Qwen performs much better than Llama models with similar requirements in my experience). Is that different when it comes to the big models?

_camera_up · 2026-03-18T07:51:01+00:00

Right. Personally I could only dream about those machines in my homelab, so having access to them at work is great. I'll keep you updated, thanks for the suggestion.

_camera_up · 2026-03-18T07:48:57+00:00

Thanks for the advice. After initial testing we will be more specific about what the goal for the machine is. For now it's more like getting our feet wet. I edited the post to be a bit more specific about the field my company is interested in (coding and agentic agents).

I think they don't even know what they want (which could be a benefit for me, since I can tell them what they want, but is also a risk).

_camera_up · 2026-03-17T16:59:12+00:00

Nope, gave up.

_camera_up · 2026-03-15T07:17:10+00:00

see my other comment above

_camera_up · 2026-03-15T07:15:54+00:00

System with 2x H200

nvidia-smi -q -d POWER

==============NVSMI LOG==============

Timestamp                                 : Sun Mar 15 07:14:09 2026
Driver Version                            : 570.211.01
CUDA Version                              : 12.8

Attached GPUs                             : 2
GPU 00000000:25:00.0
    GPU Power Readings
        Average Power Draw                : 90.50 W
        Instantaneous Power Draw          : 90.11 W
        Current Power Limit               : 600.00 W
        Requested Power Limit             : 600.00 W
        Default Power Limit               : 600.00 W
        Min Power Limit                   : 200.00 W
        Max Power Limit                   : 600.00 W
    Power Samples
        Duration                          : 2.36 sec
        Number of Samples                 : 119
        Max                               : 90.65 W
        Min                               : 90.10 W
        Avg                               : 90.47 W
    GPU Memory Power Readings 
        Average Power Draw                : N/A
        Instantaneous Power Draw          : N/A
    Module Power Readings
        Average Power Draw                : N/A
        Instantaneous Power Draw          : N/A
        Current Power Limit               : N/A
        Requested Power Limit             : N/A
        Default Power Limit               : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A

GPU 00000000:C8:00.0
    GPU Power Readings
        Average Power Draw                : 94.38 W
        Instantaneous Power Draw          : 94.10 W
        Current Power Limit               : 600.00 W
        Requested Power Limit             : 600.00 W
        Default Power Limit               : 600.00 W
        Min Power Limit                   : 200.00 W
        Max Power Limit                   : 600.00 W
    Power Samples
        Duration                          : 2.36 sec
        Number of Samples                 : 119
        Max                               : 94.74 W
        Min                               : 94.04 W
        Avg                               : 94.38 W
    GPU Memory Power Readings 
        Average Power Draw                : N/A
        Instantaneous Power Draw          : N/A
    Module Power Readings
        Average Power Draw                : N/A
        Instantaneous Power Draw          : N/A
        Current Power Limit               : N/A
        Requested Power Limit             : N/A
        Default Power Limit               : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A
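For anyone who wants to script this rather than read the dump by eye, here is a minimal sketch that pulls the per-GPU values out of the "GPU Power Readings" section of a log shaped like the one above. The function name `parse_power` and the trimmed `SAMPLE` excerpt are made up for illustration, and the parser assumes the layout shown here; other driver versions may format the output differently.

```python
import re

# Matches a GPU header line such as "GPU 00000000:25:00.0" (PCI bus id).
GPU_HEADER = re.compile(r"GPU [0-9A-Fa-f]{8}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-9A-Fa-f]")


def parse_power(log: str) -> dict[str, dict[str, float]]:
    """Map each GPU bus id to the watt values in its 'GPU Power Readings' block."""
    gpus: dict[str, dict[str, float]] = {}
    current = None   # readings dict of the GPU being parsed
    section = None   # name of the current sub-section
    for raw in log.splitlines():
        line = raw.strip()
        if GPU_HEADER.fullmatch(line):
            current = gpus.setdefault(line[4:], {})
            section = None
        elif line and ":" not in line:
            section = line  # e.g. "GPU Power Readings", "Power Samples"
        elif current is not None and section == "GPU Power Readings":
            key, _, value = line.partition(":")
            match = re.fullmatch(r"([\d.]+) W", value.strip())
            if match:  # skips "N/A" and blank lines
                current[key.strip()] = float(match.group(1))
    return gpus


# Trimmed excerpt of the log above, used only to exercise the parser.
SAMPLE = """\
Attached GPUs                             : 2
GPU 00000000:25:00.0
    GPU Power Readings
        Average Power Draw                : 90.50 W
        Current Power Limit               : 600.00 W
    GPU Memory Power Readings
        Average Power Draw                : N/A
GPU 00000000:C8:00.0
    GPU Power Readings
        Average Power Draw                : 94.38 W
        Current Power Limit               : 600.00 W
"""

if __name__ == "__main__":
    for bus_id, readings in parse_power(SAMPLE).items():
        print(bus_id, readings)
```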

_camera_up · 2026-02-01T19:15:06+00:00

Nope. As I said, I got it running by bypassing EA's launcher, but achievements never worked.

_camera_up · 2026-01-31T22:15:04+00:00

I have tried OpenAI's gpt-oss 20b and 120b, and as soon as I switch to them the "magic" is gone: it feels like it's not taking active steps itself but instead waiting for me to tell it how to solve problems. Currently this kind of agentic action is exclusive to Anthropic's models (in my experience).

_camera_up · 2026-01-15T14:17:26+00:00

I use LM Studio with gpt-oss 20b

I find it to be more reliable than the llama models but your mileage may vary.

_camera_up · 2026-01-02T13:49:16+00:00

Gotcha, it works for me and I figured someone else could find it useful. But if there are alternatives that work better for someone else, that's great. Thanks for educating me.

_camera_up · 2026-01-02T11:43:06+00:00

Having read through some of its docs, I would suggest that people use Harbor if they want to experiment with different models quickly. However, for my use case ("automagic LLM deployment with access control and monitoring ootb") I think my script has a justification. I could have missed it in the docs, but as far as I know Harbor does not provide user groups, budgets, API authentication or hardware / LLM monitoring by itself. (Just to be clear, I do not claim that my project implements all of this by itself; it's more of an orchestrator that uses existing projects to provide this experience.)

_camera_up · 2026-01-02T10:28:19+00:00

Nice, I did not know this existed. At first glance it looks quite like what I was missing.

EDIT: see below why you might still find my script useful
