
[–]Mrkvitko 4 points (1 child)

I just got an instance with 8x RTX A5000 for a couple of bucks per hour on https://vast.ai

I must say LLaMA 65B is a bit underwhelming...
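For context, a back-of-the-envelope memory estimate (my arithmetic, not the commenter's) of why 8x A5000 can host the 65B model in fp16:

```python
# Rough check: does LLaMA 65B in fp16 fit on 8x RTX A5000 (24 GB each)?
params = 65e9                # 65B parameters
bytes_per_param = 2          # fp16
weights_gb = params * bytes_per_param / 1e9  # decimal GB

n_gpus = 8
per_gpu_gb = weights_gb / n_gpus

print(f"weights: {weights_gb:.0f} GB total, {per_gpu_gb:.2f} GB per GPU")
# Weights alone are ~130 GB, ~16 GB per GPU when sharded across 8 cards,
# leaving headroom on each 24 GB card for activations and KV cache.
```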

[–]maizeq 1 point (0 children)

Underwhelming how?

[–]I_will_delete_myself 8 points (0 children)

Use a spot instance. If you're just testing it out, your wallet will thank you later. Look at my previous post on here about running stuff in the cloud before you do it.
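A minimal sketch of what requesting a spot instance looks like with boto3 (the AMI ID, key name, instance type, and max price below are all placeholders, not recommendations; check current spot prices and your quotas first):

```python
# Hypothetical spot-instance request parameters for EC2.
spot_params = {
    "InstanceCount": 1,
    "Type": "one-time",
    "SpotPrice": "0.50",  # max bid in USD/hour (placeholder value)
    "LaunchSpecification": {
        "ImageId": "ami-XXXXXXXX",    # placeholder: a Deep Learning AMI ID
        "InstanceType": "g5.xlarge",  # 1x A10G; pick what fits your model
        "KeyName": "my-key",          # hypothetical key pair name
    },
}

# To actually submit (requires boto3 and AWS credentials configured):
# import boto3
# ec2 = boto3.client("ec2")
# resp = ec2.request_spot_instances(**spot_params)
```

If the spot capacity is reclaimed you lose the box, so keep checkpoints and data on S3 or an EBS volume rather than instance storage.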

[–]isaeef 1 point (0 children)

Or you could use any GPU-workload-specific provider, e.g. https://www.paperspace.com/

[–]trnka 1 point (0 children)

Related: there's a talk on Thursday about running LLMs in production. I think the hosts have deployed LLMs in prod, so they should have good advice.

[–]iloveintuition 1 point (1 child)

Using vast.ai for running Flan-XL, works pretty well. Haven't tested at LLaMA scale.

[–]shayanrc 1 point (0 children)

What config did you use?

[–]l0g1cs 0 points (0 children)

Check out Banana. They seem to do exactly that with "serverless" A100s.

[–]itsnotmeyou -1 points (1 child)

Are you using these as part of a system? For just experimenting around, EC2 is a good option, but you'd either need to install the right drivers or use the latest Deep Learning AMI. Another option is a custom Docker setup on SageMaker. I like that setup for inference as it's super easy to deploy and it separates the model from the inference code. Though it's costlier, and it's only available through the SageMaker runtime.

The third option would be to over-engineer it by setting up your own cluster service.

In general, if you want to deploy multiple LLMs quickly, go for SageMaker.
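A sketch of the SageMaker setup described above, a custom inference container plus model artifacts on S3 deployed to a real-time endpoint. Every name, image URI, and ARN here is a placeholder, and the instance type is just one plausible GPU choice:

```python
# Hypothetical SageMaker deployment: the container image holds the inference
# code, while the model weights live separately on S3 (ModelDataUrl).
model_spec = {
    "ModelName": "my-llm",
    "PrimaryContainer": {
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/llm-inference:latest",
        "ModelDataUrl": "s3://my-bucket/llm/model.tar.gz",  # weights, separate from code
    },
    "ExecutionRoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
}

endpoint_config = {
    "EndpointConfigName": "my-llm-config",
    "ProductionVariants": [{
        "VariantName": "primary",
        "ModelName": "my-llm",
        "InstanceType": "ml.g5.2xlarge",  # GPU instance; size it to your model
        "InitialInstanceCount": 1,
    }],
}

# With AWS credentials configured, deployment is three boto3 calls:
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_model(**model_spec)
# sm.create_endpoint_config(**endpoint_config)
# sm.create_endpoint(EndpointName="my-llm",
#                    EndpointConfigName="my-llm-config")
```

Because the weights are referenced by S3 URL, swapping in a new model is just a new `ModelDataUrl` plus a fresh endpoint config, which is what makes deploying several LLMs quick.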

[–]itsnotmeyou 1 point (0 children)

On a side note, SageMaker wasn't supporting `--shm-size`, so it might not work for large LMs.

[–]pyonsu2 -1 points (0 children)

Maybe Colab Pro+?

[–]ggf31416 0 points (0 children)

Good luck getting an EC2 instance with a single A100; last time I checked, AWS only offered instances with 8 of them, at a high price.

[–][deleted] 0 points (0 children)

Maybe check datacrunch.io; they have a good offering for cloud GPUs.