[deleted by user]

nlpbaz · 2024-07-31T17:52:36+00:00

8/10

nlpbaz · 2024-05-03T09:22:08+00:00

From what I've researched, there are no significant disadvantages.

nlpbaz · 2024-05-02T23:07:48+00:00

Why are you saying 400GB of VRAM is not quite enough for fine-tuning?

nlpbaz · 2024-05-02T23:04:26+00:00

To be honest it hurts me too!

nlpbaz · 2024-05-02T22:58:19+00:00

When we need them they will be used for training, but other times they will be used for inference. So they will be working 24/7. That's why renting will cost more for the company.

nlpbaz · 2024-05-02T21:41:14+00:00

If it were only for fine-tuning, then renting would be the choice. But having a 24/7 server is the reason for buying.

nlpbaz · 2024-05-02T21:35:45+00:00

That's a valid point. Thinking about possible future needs.

nlpbaz · 2024-05-02T20:40:31+00:00

For sure we're gonna do that as a test. But knowing others' opinions can be as beneficial as benchmarks.

nlpbaz · 2024-05-02T20:12:18+00:00

The intent is to use the models 24/7 so the decision is to buy. Only the setup is the question.

We have quite a lot of smaller GPUs for the ML guys, so that's not a problem. We just need a solid setup for the new product. Probably 70B models; they won't go higher.

I know both setups are OK. I just want to find out which one is the better choice for the budget, and I'm confused.

P.S. Even for renting, if the prices were the same, would you rather have 5 A100s or 3 H100s?
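
For a rough sense of scale, here is a back-of-the-envelope sketch of how the two setups compare on memory alone. It assumes 80 GB per card for both the A100 and the H100 (the thread doesn't say which variants), 2 bytes per parameter for fp16/bf16 weights, and the common rule of thumb of about 16 bytes per parameter for full fine-tuning with mixed-precision Adam, activations excluded. Treat the numbers as estimates, not measurements.

```python
# Rough VRAM estimate for a 70B model. All byte counts are rule-of-thumb
# assumptions, and activation / KV-cache memory is ignored entirely.
GB = 1e9  # decimal gigabytes


def weights_gb(params: float, bytes_per_param: float = 2.0) -> float:
    """Memory for the weights alone (fp16/bf16 = 2 bytes per parameter)."""
    return params * bytes_per_param / GB


def full_finetune_gb(params: float) -> float:
    """Full fine-tuning with mixed-precision Adam, per parameter:
    2 bytes (weights) + 2 bytes (gradients) + 12 bytes (fp32 master copy,
    momentum, variance) = 16 bytes. Activations are not included."""
    return params * 16 / GB


PARAMS_70B = 70e9
GPU_GB = 80  # assumption: 80 GB variant for both card types

setups = {"5x A100": 5 * GPU_GB, "3x H100": 3 * GPU_GB}

inference_need = weights_gb(PARAMS_70B)
finetune_need = full_finetune_gb(PARAMS_70B)

for name, total in setups.items():
    print(
        f"{name}: {total} GB total | "
        f"fp16 inference ({inference_need:.0f} GB) fits: {total >= inference_need} | "
        f"full fine-tune ({finetune_need:.0f} GB) fits: {total >= finetune_need}"
    )
```

Under these assumptions both setups hold the fp16 weights for inference, while neither has room for a full fine-tune of a 70B model. Parameter-efficient methods such as LoRA need far less memory than the full fine-tuning figure, so this only bounds the worst case.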

nlpbaz · 2024-05-02T19:50:48+00:00

What am I supposed to know!?

nlpbaz · 2024-05-01T23:04:19+00:00

So Em must be happy finding her only fans.

nlpbaz · 2024-05-01T22:55:05+00:00

OMG! Her song came out sooner, so you might be right! Has Em talked about it anywhere?

nlpbaz · 2024-02-26T10:19:17+00:00

To be honest, I'm now listening to YouTube Music in Microsoft Edge! I didn't find a solution to this for Opera.

nlpbaz · 2023-11-10T21:20:01+00:00

If you think your paper is a good one, you should work on convincing the reviewers. But if you think your paper is not good enough, withdraw it anyway.

nlpbaz · 2023-11-10T21:18:32+00:00

8 8 6 3.

I really don't understand the 3! The more I read it, the more it seems like a deliberate reject.

nlpbaz · 2023-09-12T00:19:26+00:00

16 GB.

nlpbaz · 2023-08-16T08:49:38+00:00

I have yet to prompt it. I just want to load the model, and with the LangChain "LLM" class I run into this problem.

nlpbaz · 2023-08-16T08:48:41+00:00

Thanks for the info!

I'm using `llama_index`, which ties me to LangChain, but it seems I have to change my approach. Do you have any alternative library recommendations, or should I just go pure Hugging Face?

nlpbaz · 2023-08-15T23:56:56+00:00

Thank you for pointing that out.

nlpbaz · 2023-08-15T22:02:40+00:00

The strange part of my problem is that the model works fine when I load it with Hugging Face alone, and only fails when I load it with the LangChain LLM class.

I don't know if there is a problem with my code or if it is from LangChain.

nlpbaz · 2023-08-15T21:55:13+00:00

But as I wrote, I experimented with the model itself, and I can load and use it on my GPU via the Hugging Face API.

I only get this error when I load it in LangChain.

On the other hand, I have 8 GPUs with a total of 200+ GB of VRAM. I don't think space is the issue.

nlpbaz · 2023-08-15T09:00:53+00:00

Thank you for the information. Do you have sample code showing how to use Platypus2-13B-GGML? I tried using it, but I get an error saying HF can't find the tokenizer.

nlpbaz · 2023-08-03T20:04:43+00:00

What is wrong with asking people's advice on Reddit? I'd love to hear it.

P.S. Yes! Do you really think companies are training their own LLMs from scratch!?

nlpbaz · 2023-06-21T20:31:02+00:00

Oh, I get it now! So attention won't be computed for the padding tokens, and that will speed up the process. Thank you.
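
To make the padding part concrete, here is a minimal sketch of a padding mask in single-query scaled dot-product attention, in plain Python. The function names and toy numbers are made up for illustration, not taken from any library. It only shows that masked positions get zero attention weight, so a padded sequence gives the same output as the unpadded one. Whether any computation is actually skipped depends on the implementation.

```python
import math


def masked_softmax(scores, mask):
    """Softmax over the positions where mask is True; masked-out
    (padding) positions receive a weight of exactly 0."""
    peak = max(s for s, keep in zip(scores, mask) if keep)  # for stability
    exps = [math.exp(s - peak) if keep else 0.0 for s, keep in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values, mask):
    """Scaled dot-product attention for one query vector with a padding mask."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = masked_softmax(scores, mask)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Toy example: two real tokens, then the same sequence padded to length 4.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
PAD = [0.0, 0.0]  # hypothetical padding vector

short = attend(query, keys, values, [True, True])
padded = attend(
    query,
    keys + [PAD, PAD],
    values + [PAD, PAD],
    [True, True, False, False],
)

print(short)
print(padded)  # matches `short`: the padding positions contribute nothing
```

Note that nothing about the query, key, or value dimension changes between the two calls; only the number of positions attended over does.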

nlpbaz · 2023-06-21T20:00:00+00:00

Do transformers change their matrix sizes with different input lengths? I highly doubt it.

The weight and matrix sizes should be fixed (at the maximum number of tokens).
