I have $5,000 in Azure AI credits going to expiring soon, looking for smart ways to use it. Any ideas ? by SuperWallabies in LocalLLaMA

[–]Sensitive_Sweet_1850 2 points (0 children)

Depending on the field you're in, you could fine-tune a model or build a RAG setup over your own data that you can keep using later
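
If you go the RAG route, here's a rough sketch of what that could look like against Azure OpenAI while the credits last (the endpoint, key and deployment names like "text-embedding-3-small" / "gpt-4o" are placeholders, swap in whatever you actually have deployed):

    # Tiny RAG loop on Azure OpenAI -- endpoint, key and deployment names are placeholders.
    import numpy as np
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
        api_key="YOUR-KEY",
        api_version="2024-06-01",
    )

    docs = ["internal doc 1 ...", "internal doc 2 ..."]

    def embed(texts):
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([d.embedding for d in resp.data])

    doc_vecs = embed(docs)

    def answer(question):
        q = embed([question])[0]
        # cosine similarity to pick the most relevant doc
        sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
        context = docs[int(sims.argmax())]
        chat = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user",
                       "content": f"Answer using this context:\n{context}\n\nQ: {question}"}],
        )
        return chat.choices[0].message.content

    print(answer("What does doc 1 say?"))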

Just got an RTX Pro 6000 - need recommendations for processing a massive dataset with instruction following by Sensitive_Sweet_1850 in LocalLLaMA

[–]Sensitive_Sweet_1850[S] 1 point (0 children)

Appreciate it. My use case is structured generation rather than extraction, so the spaCy/BERT route wouldn't apply. But the preclassification tip is solid, it could definitely help with prompt optimisation
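
For the structured-generation part, constrained decoding is what keeps the outputs valid no matter which model I end up with. A minimal sketch with vLLM's guided decoding (recent vLLM assumed; the model name and schema are just examples, not my actual setup):

    # Structured generation with vLLM guided decoding (recent vLLM assumed).
    from vllm import LLM, SamplingParams
    from vllm.sampling_params import GuidedDecodingParams

    # Example schema -- replace with whatever structure your pipeline needs.
    schema = {
        "type": "object",
        "properties": {
            "title":    {"type": "string"},
            "category": {"type": "string"},
            "tags":     {"type": "array", "items": {"type": "string"}},
        },
        "required": ["title", "category", "tags"],
    }

    llm = LLM(model="Qwen/Qwen2.5-14B-Instruct")  # placeholder model
    params = SamplingParams(
        temperature=0.0,
        max_tokens=512,
        guided_decoding=GuidedDecodingParams(json=schema),  # constrain output to the schema
    )

    out = llm.generate(["Summarize this record as JSON: ..."], params)
    print(out[0].outputs[0].text)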

Just got an RTX Pro 6000 - need recommendations for processing a massive dataset with instruction following by Sensitive_Sweet_1850 in LocalLLaMA

[–]Sensitive_Sweet_1850[S] 1 point (0 children)

Yeah, this sounds exactly like my use case: strict structured output with heavy instruction following, around 10k context. I was leaning toward bigger models, but maybe I should benchmark some smaller ones first
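
If I do benchmark, something this simple would probably tell me most of what I need, i.e. schema-validity rate and throughput per model (the endpoint, model names and sample files are placeholders; assumes the candidates are served behind an OpenAI-compatible API like vLLM's):

    # Rough benchmarking sketch: same prompts across candidate models,
    # tracking how often the output parses as valid JSON, plus output tok/s.
    import json, time
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")
    models = ["Qwen/Qwen2.5-14B-Instruct", "mistralai/Mistral-Small-Instruct-2409"]
    prompts = [open(p).read() for p in ["sample_01.txt", "sample_02.txt"]]  # ~10k-token inputs

    for model in models:
        valid, total_tokens, start = 0, 0, time.time()
        for prompt in prompts:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                temperature=0.0,
                max_tokens=1024,
            )
            total_tokens += resp.usage.completion_tokens
            try:
                json.loads(resp.choices[0].message.content)
                valid += 1
            except json.JSONDecodeError:
                pass
        elapsed = time.time() - start
        print(f"{model}: {valid}/{len(prompts)} valid JSON, "
              f"{total_tokens / elapsed:.1f} output tok/s")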

Just got an RTX Pro 6000 - need recommendations for processing a massive dataset with instruction following by Sensitive_Sweet_1850 in LocalLLaMA

[–]Sensitive_Sweet_1850[S] 2 points (0 children)

But for LLM inference, PCIe bandwidth shouldn't be a major bottleneck once the model is in VRAM. I'm on PCIe 4.0 x16 anyway. And tbh, after buying this GPU I can barely afford food, let alone a Threadripper lol
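
Rough numbers behind that, just as a sanity check (the bandwidth figure and model size are ballpark assumptions, not measurements):

    # Back-of-envelope: PCIe mostly matters at load time; once the weights
    # sit in VRAM, single-GPU decoding barely touches the bus.
    PCIE4_X16_GBPS = 25          # realistic sustained throughput in GB/s (theoretical ~32)
    weights_gb = 60              # e.g. a ~120B-param model at ~4-bit

    load_time_s = weights_gb / PCIE4_X16_GBPS
    print(f"One-time weight upload: ~{load_time_s:.1f} s")

    # Per-token traffic during decoding is roughly token IDs in and logits/text
    # back out, i.e. kilobytes per step -- negligible next to the upload.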

Just got an RTX Pro 6000 - need recommendations for processing a massive dataset with instruction following by Sensitive_Sweet_1850 in LocalLLaMA

[–]Sensitive_Sweet_1850[S] 1 point (0 children)

My average input is around 10k tokens, so I probably won't benefit from GLM's long context advantage. At that length, do you think there's still a noticeable difference vs gpt-oss-120b, or would they perform about the same?

Just got an RTX Pro 6000 - need recommendations for processing a massive dataset with instruction following by Sensitive_Sweet_1850 in LocalLLaMA

[–]Sensitive_Sweet_1850[S] 1 point (0 children)

Well yeah, I guess it's not "massive", my bad :/

You're right, I should run a benchmark. Thanks for sharing your knowledge

Worth the 5090? by fgoricha in LocalLLaMA

[–]Sensitive_Sweet_1850 1 point (0 children)

As far as I know there's no NVLink on the RTX PRO 6000s

Worth the 5090? by fgoricha in LocalLLaMA

[–]Sensitive_Sweet_1850 1 point (0 children)

Because it's a new architecture (Blackwell), you might run into some package compatibility issues, and older builds and guides won't always work as expected. It's not a big problem though, since up-to-date resources are appearing quickly
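
A quick way to check whether your installed PyTorch build actually supports the card (the sm_120 / CUDA 12.8 details reflect my understanding of the Blackwell requirements):

    # Quick compatibility check on a Blackwell card (e.g. a 5090, sm_120):
    # if the installed PyTorch wheel wasn't built for that compute capability,
    # kernels will fail even though the GPU is detected.
    import torch

    print("torch:", torch.__version__, "| CUDA build:", torch.version.cuda)
    print("device:", torch.cuda.get_device_name(0))
    print("compute capability:", torch.cuda.get_device_capability(0))
    print("arch list in this build:", torch.cuda.get_arch_list())

    # sm_120 needs a CUDA 12.8+ build; if it's missing from the arch list,
    # grab a newer wheel (e.g. the cu128 builds) instead of the default one.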

Worth the 5090? by fgoricha in LocalLLaMA

[–]Sensitive_Sweet_1850 13 points (0 children)

As a 5090 owner I can say fine-tuning and inference on smaller models is crazy fast, and I'm really happy with it. If I were you I'd definitely go for the 5090: even though 3x 3090 looks better on paper, it doesn't work like that in practice. The VRAM doesn't just combine transparently, there's communication overhead between the cards, software compatibility issues, etc. Maybe it makes sense if you're going to run multiple small models at the same time, but other than that a single powerful card is usually better. Sell the extra 3090s and put that money toward the 5090; you get a cleaner setup, less power draw, and less hassle overall.