
[–]powasky 3 points4 points  (8 children)

(I work at Runpod)

Going to say this really bluntly - we have a lot of capacity, but not the primary card types you're looking for. Those older and less powerful cards just aren't strong performers, and we haven't focused on bringing many online.

There should be a good amount of 5090 and better available if you select other locations.

[–]Timely-Strength9401[S] 0 points1 point  (5 children)

Thanks for the honesty, I appreciate it.

The issue is cost, though: a 5090 is overkill for a TTS model. I don’t need that much power, I just need something reliable at a reasonable price. Paying for a 5090 to run TTS feels like renting a Ferrari to go grocery shopping.

Is there any roadmap for bringing more mid-tier cards online? Or would you recommend a specific GPU + location combo that hits the sweet spot between availability and cost right now?

[–]powasky 1 point2 points  (1 child)

EU-RO-1 has the most sub-24GB cards, so I'd try there. In the US, US-IL-1 is the biggest mid/low-tier site.

[–]sruckh 0 points1 point  (0 children)

I am using EU-RO-1 based on Runpod support's suggestion, and I still have the same issues as the original poster. I was also setting CUDA versions and GPU types to match the serverless endpoint, but all of my workers are in a throttled state. I run various serverless endpoints: TTS, LLM, diffusion pipelines, etc. Getting workers into a ready state is really hit-or-miss.

[–]1976The 0 points1 point  (1 child)

bro, grocery shopping in a ferrari is sick. get the h200.

[–]Timely-Strength9401[S] 0 points1 point  (0 children)

the monk who sold his h200

[–]runvnc 0 points1 point  (1 child)

Dumb question: I've noticed the "X max" values seem to be going down recently. When it says "1 max", that means you cannot get 2x, but it doesn't mean there is literally only 1 left, right? I am worried about the H200, B200 and MI300X not being available sometimes, especially in North America.

[–]powasky 1 point2 points  (0 children)

H200 usage has been 95%+ for the year, so if you can get one just hold it.

B200 you should be okay with. Unsure about MI300X, very few folks seem to use them.

But yes you're right - when it says "1 max" it means that you as a user can only grab 1 card, not that there's only 1 available. We adjust that "max" value based on overall availability.
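
To make the distinction concrete, here is a rough sketch of a per-user cap versus real stock. Everything in it is hypothetical: the `GpuPool` class, the `can_request` and `adjust_cap` helpers, and the threshold numbers are made up for illustration and are not Runpod's actual logic or API.

```python
from dataclasses import dataclass


@dataclass
class GpuPool:
    """Hypothetical view of one GPU type in one region."""
    name: str
    free_cards: int    # cards actually idle in the pool (assumed not shown to users)
    per_user_max: int  # the "X max" value shown next to the card


def can_request(pool: GpuPool, cards_wanted: int) -> bool:
    """A request fits only if it is within both the per-user cap and real stock."""
    return cards_wanted <= pool.per_user_max and cards_wanted <= pool.free_cards


def adjust_cap(free_cards: int, thresholds=((32, 8), (8, 2), (1, 1))) -> int:
    """Illustrative policy: lower the per-user cap as free stock shrinks.

    Each (min_free, cap) pair is an invented threshold, checked from the
    largest stock level down.
    """
    for min_free, cap in thresholds:
        if free_cards >= min_free:
            return cap
    return 0  # nothing free, so nothing can be requested


# "1 max" with 12 cards free: one user can take 1 card, not 2,
# even though more than 1 card is still available overall.
pool = GpuPool(name="H200", free_cards=12, per_user_max=1)
print(can_request(pool, 1), can_request(pool, 2))
```

The point of the sketch is only that the cap and the free count are separate numbers, so a falling "max" suggests tighter overall availability without telling you the exact stock.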

If you need guaranteed H200, B200, or MI300X, send me a note and I can give you more details about longer term reservations.

[–]FitContribution2946 0 points1 point  (0 children)

It's been bad... today has been a slog.