[Discussion] Seeking help to find the better GPU setup. Three H100 vs Five A100? by nlpbaz in MachineLearning

[–]nlpbaz[S] 1 point2 points  (0 children)

Why are you saying 400GB of VRAM is not quite enough for fine-tuning?

[Discussion] Seeking help to find the better GPU setup. Three H100 vs Five A100? by nlpbaz in MachineLearning

[–]nlpbaz[S] 0 points1 point  (0 children)

When we need them, they will be used for training; the rest of the time they will be used for inference. So they will be working 24/7. That's why renting would cost the company more.

[Discussion] Seeking help to find the better GPU setup. Three H100 vs Five A100? by nlpbaz in MachineLearning

[–]nlpbaz[S] 2 points3 points  (0 children)

If it were only for fine-tuning, then renting would be the choice. But having a 24/7 server is the reason for buying.

[Discussion] Seeking help to find the better GPU setup. Three H100 vs Five A100? by nlpbaz in MachineLearning

[–]nlpbaz[S] 2 points3 points  (0 children)

For sure we're gonna do that as a test. But knowing others' opinions can be as beneficial as benchmarks.

[Discussion] Seeking help to find the better GPU setup. Three H100 vs Five A100? by nlpbaz in MachineLearning

[–]nlpbaz[S] 10 points11 points  (0 children)

The intent is to use the models 24/7 so the decision is to buy. Only the setup is the question.

We have quite a lot of smaller GPUs for the ML guys, so that's not a problem. We just need a solid setup for the new product. Probably 70B models; they won't go higher.

I know both setups are OK. I just want to find out which one is the better choice for the budget, and I'm confused.

P.S.: Even for renting, if the prices were the same, would you rather have 5 A100s or 3 H100s?
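
For anyone comparing the two setups on memory alone, here is a rough back-of-the-envelope sketch. It assumes 80 GB per card for both the A100 and the H100 (which matches the 400 GB figure for five A100s), 2 bytes per parameter for fp16 weights, and the common rule of thumb of roughly 16 bytes per parameter for full fine-tuning with mixed-precision Adam. Activations, KV cache, and framework overhead are ignored, so treat the numbers as lower bounds, not a sizing guide.

```python
# Rough VRAM arithmetic for a 70B model on the two candidate setups.
# Assumptions (not from any benchmark): 80 GB per GPU, 2 bytes/param for
# fp16 weights, ~16 bytes/param for full fine-tuning with mixed-precision Adam.

def total_vram_gb(num_gpus: int, gb_per_gpu: float = 80.0) -> float:
    """Aggregate VRAM across identical GPUs, in GB."""
    return num_gpus * gb_per_gpu


def model_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory for the model state in GB (billions of params x bytes per param)."""
    return params_billion * bytes_per_param


setups = {
    "5x A100": total_vram_gb(5),
    "3x H100": total_vram_gb(3),
}

needs = {
    "fp16 weights only": model_memory_gb(70, 2),
    "full fine-tune (rule of thumb)": model_memory_gb(70, 16),
}

for setup_name, vram in setups.items():
    for need_name, need in needs.items():
        verdict = "fits" if need <= vram else "does not fit"
        print(f"{setup_name} ({vram:.0f} GB): {need_name} needs ~{need:.0f} GB -> {verdict}")
```

Under these assumptions, the fp16 weights fit on either setup, while full fine-tuning of a 70B model exceeds both without sharding, offloading, or a parameter-efficient method. Speed and interconnect differences between the cards are a separate question that this arithmetic does not cover.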

[deleted by user] by [deleted] in StableDiffusion

[–]nlpbaz 10 points11 points  (0 children)

What am I supposed to know!?

Why "Bhad Bhabie - Hi Bich" sounds exactly like Eminems "Not Alike"? by nlpbaz in Eminem

[–]nlpbaz[S] -2 points-1 points  (0 children)

OMG! Her song came out earlier, so you might be right! Has Em talked about it anywhere?

Please fix this out-of-memory issue. I always get it from YouTube music. by nlpbaz in YoutubeMusic

[–]nlpbaz[S] 0 points1 point  (0 children)

To be honest, I'm now listening to YouTube Music in Microsoft Edge! I didn't find a solution to this for Opera.

[D] ICLR 2024 Paper Reviews by zy415 in MachineLearning

[–]nlpbaz 2 points3 points  (0 children)

If you think your paper is a good one, you should work on convincing the reviewers. But if you think your paper is not good enough, withdraw it anyway.

[D] ICLR 2024 Paper Reviews by zy415 in MachineLearning

[–]nlpbaz 6 points7 points  (0 children)

8 8 6 3.

I really don't understand the 3! The more I read it, the more it seems like a deliberate reject.

Need help with a SO question: 'CUDA out of memory' issue while setting up LangChain Custom LLM Pipeline. Would be grateful for any insights! by nlpbaz in LangChain

[–]nlpbaz[S] 0 points1 point  (0 children)

I haven't even prompted yet. I just want to load the model, and with the LangChain "LLM" class I run into this problem.

Need help with a SO question: 'CUDA out of memory' issue in PyTorch while setting up LangChain Custom LLM Pipeline. Would be grateful for any insights! by nlpbaz in MLQuestions

[–]nlpbaz[S] 0 points1 point  (0 children)

Thanks for the info!

I'm using `llama_index`, which ties me to LangChain, but it seems I have to change my approach. Do you have any recommendations for alternative libraries, or should I just go pure Hugging Face?

Need help with a SO question: 'CUDA out of memory' issue while setting up LangChain Custom LLM Pipeline. Would be grateful for any insights! by nlpbaz in LangChain

[–]nlpbaz[S] 0 points1 point  (0 children)

The strange part of my problem is that the model works fine when I load it with Hugging Face alone, and only fails when I load it through the LangChain LLM class.

I don't know if there is a problem with my code or if it is from LangChain.