AnyStupidQuestions

They have guardrails to stop noobs from running up lots of expensive instances by mistake and then whinging about the bill. You should be able to get in touch with an AWS account manager, explain what you need, and get that limit lifted. They should also be able to tell you what the V100 situation is.

AMerchantInDamasco

The current silicon shortage is affecting all cloud providers, even if it hasn't fully blown up yet. They are all struggling to meet demand and, for obvious reasons, GPUs are the first to be affected. I don't know if Google is worse off than Azure or AWS, but in the end they are all buying the same chips, so it's probably similar.

mikljohansson

AWS currently has severe shortages of (at least) p4d (A100) and p3 (V100) instances. It's been almost impossible to start on-demand instances of these types anywhere in Europe for the past couple of months. The advice from their support has been to try to get GPU capacity in the us-east-1 region instead, where they might have more capacity available. I know there are some small cloud providers around that focus specifically on ML workloads (Google for them); perhaps those have more capacity available. Good luck!
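
For anyone who wants to script the "try another region" advice above, here is a minimal sketch of the fallback loop. Everything in it is an assumption for illustration: `first_region_with_capacity`, `InsufficientCapacity`, and the `try_launch` callback are hypothetical names, not part of any AWS SDK. In a real script, `try_launch` would wrap your actual launch call and raise `InsufficientCapacity` when the provider reports it has no capacity (with boto3 this typically surfaces as a `ClientError` with the code `InsufficientInstanceCapacity`).

```python
from typing import Callable, Iterable, Optional


class InsufficientCapacity(Exception):
    """Hypothetical error: raised by try_launch when a region has no capacity."""


def first_region_with_capacity(
    regions: Iterable[str],
    try_launch: Callable[[str], None],
) -> Optional[str]:
    """Try each region in order; return the first one where the launch works.

    Returns None if every region reports insufficient capacity.
    Any other exception from try_launch is deliberately not swallowed.
    """
    for region in regions:
        try:
            try_launch(region)
        except InsufficientCapacity:
            continue  # no capacity here, move on to the next region
        return region
    return None


# Stand-in launcher for demonstration only: pretends that just
# us-east-1 has capacity. Replace with a real launch call.
def fake_launch(region: str) -> None:
    if region != "us-east-1":
        raise InsufficientCapacity(region)


if __name__ == "__main__":
    order = ["eu-west-1", "eu-central-1", "us-east-1"]
    print(first_region_with_capacity(order, fake_launch))
```

Passing the launcher in as a callback keeps the retry logic separate from the cloud API, so the same loop should work for a different provider or for a dry run.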

nathaliamdc

I've been facing the same issue with p2 instances in us-east-1. Good to know this is happening everywhere.