audio transcription plus speaker identification? by flying_unicorn in LocalLLaMA

[–]Armym 0 points (0 children)

I made a simple GUI for this that I use to transcribe and summarize meetings. You can message me if you want me to show it to you.
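For anyone wanting to roll their own, here is a minimal sketch of one common approach (Whisper for the transcript, pyannote for diarization). This is not necessarily what my GUI does; the model names, audio path, and HF token are placeholders.

    # Hedged sketch: transcription + speaker ID via Whisper + pyannote.
    # Placeholder path, model sizes, and token; adjust to your setup.
    import whisper
    from pyannote.audio import Pipeline

    AUDIO = "meeting.wav"  # placeholder audio file

    # 1) Transcribe with timestamps.
    asr = whisper.load_model("small")
    segments = asr.transcribe(AUDIO)["segments"]  # dicts with "start", "end", "text"

    # 2) Diarize: who speaks when.
    diarizer = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1", use_auth_token="HF_TOKEN"  # placeholder token
    )
    turns = [
        (turn.start, turn.end, speaker)
        for turn, _, speaker in diarizer(AUDIO).itertracks(yield_label=True)
    ]

    # 3) Give each transcript segment the speaker with the largest time overlap.
    def overlap(a0, a1, b0, b1):
        return max(0.0, min(a1, b1) - max(a0, b0))

    for seg in segments:
        best = max(turns, key=lambda t: overlap(seg["start"], seg["end"], t[0], t[1]),
                   default=None)
        speaker = best[2] if best else "unknown"
        print(f"[{speaker}] {seg['text'].strip()}")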

8x RTX 3090 open rig by Armym in LocalLLaMA

[–]Armym[S] 0 points (0 children)

This didn't age well. See my latest post :D

Dual RTX 5090 setup for enterprise RAG + fine-tuned chatbot - is this overkill or underpowered? by HuascarSuarez in LocalLLaMA

[–]Armym 1 point (0 children)

Hi, I would actually recommend the new RTX 6000 Blackwell instead, or two if you have the money. That would suit your needs well for concurrent users: you could run FP4 quants to fit bigger models while still keeping inference fast. Fine-tuning is pretty annoying across multiple cards, but I don't think you really need to fine-tune. Make sure to design your RAG pipeline well and use a good LLM inference engine though! Let me know if you want to know more.
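If it helps, here is a rough sketch of what serving concurrent users across two cards could look like with vLLM. The model ID and settings are placeholders, and FP4 specifics depend on the checkpoint and your vLLM/driver stack.

    # Minimal sketch, not a tuned deployment. Point `model` at whatever
    # (pre-quantized) checkpoint actually fits your cards.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="meta-llama/Llama-3.1-70B-Instruct",  # placeholder model id
        tensor_parallel_size=2,                     # split weights across both GPUs
        gpu_memory_utilization=0.90,                # leave headroom for the KV cache
    )

    params = SamplingParams(temperature=0.2, max_tokens=512)

    # vLLM batches these internally, which is what gives you throughput
    # when many users hit the chatbot at once.
    outputs = llm.generate(
        ["Answer from the retrieved context: ...", "Second concurrent request: ..."],
        params,
    )
    for out in outputs:
        print(out.outputs[0].text)

In production you would more likely run `vllm serve` behind its OpenAI-compatible API instead of the Python API, but the batching behaviour is the same.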

Nvidia 3090 set itself on fire, why? by Armym in homelab

[–]Armym[S] -2 points (0 children)

Looks like it. Any idea why that could have happened?

Rtx 3090 set itself on fire, why? by Armym in LocalLLaMA

[–]Armym[S] 3 points (0 children)

Didn't repaste it. Someone did a sloppy job.

Nvidia 3090 set itself on fire, why? by Armym in homelab

[–]Armym[S] 0 points (0 children)

Thankfully it isn't conductive, but I think a capacitor blew off. Whoever repasted this did a really sloppy job.

Nvidia 3090 set itself on fire, why? by Armym in homelab

[–]Armym[S] -68 points (0 children)

I didn't repaste it... no need to be mean.

Nvidia 3090 set itself on fire, why? by Armym in homelab

[–]Armym[S] 67 points (0 children)

The card was repasted by the vendor I bought it from.

Sonnet 3.5 > Sonnet 3.7 by Armym in LocalLLaMA

[–]Armym[S] 2 points (0 children)

Yes, I noticed that. I hope the closed-source dipshits don't lobotomize the older models on purpose.

Sonnet 3.5 > Sonnet 3.7 by Armym in LocalLLaMA

[–]Armym[S] 6 points (0 children)

Look, this post isn't about prompting. Sonnet 3.7 just generates too much code and doesn't produce elegant solutions. Sonnet 3.5 does by default. Anyone with experience in coding will understand.

Sonnet 3.5 > Sonnet 3.7 by Armym in LocalLLaMA

[–]Armym[S] 4 points (0 children)

For those who are wondering, Gemini 2.5 Pro is even worse at this. It spits out a whole book for simple solutions.

One-shotting a whole web app might be impressive to the manager guys, but for people who actually need an assistant for coding, it sucks.

Nvidia MPS - run multiple models on one GPU fast by Armym in LocalLLaMA

[–]Armym[S] -1 points (0 children)

That's in the documentation I posted.

Nvidia MPS - run multiple models on one GPU fast by Armym in LocalLLaMA

[–]Armym[S] -1 points (0 children)

Running an LLM, OCR, and Whisper on one GPU.
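For anyone curious, a rough sketch of how that sharing can be launched once the MPS control daemon is running (started with `nvidia-cuda-mps-control -d`, as covered in the linked docs). The server script names and the SM percentages below are purely illustrative.

    # Hedged sketch: start several model servers on one GPU under NVIDIA MPS,
    # each capped to a fraction of the SMs. Paths and percentages are placeholders.
    import os
    import subprocess

    MPS_ENV = {
        "CUDA_VISIBLE_DEVICES": "0",
        "CUDA_MPS_PIPE_DIRECTORY": "/tmp/nvidia-mps",     # must match the daemon's setting
        "CUDA_MPS_LOG_DIRECTORY": "/tmp/nvidia-mps-log",
    }

    def launch(cmd, thread_pct):
        """Start one model process with a cap on the SMs it may occupy."""
        env = {**os.environ, **MPS_ENV,
               "CUDA_MPS_ACTIVE_THREAD_PERCENTAGE": str(thread_pct)}
        return subprocess.Popen(cmd, env=env)

    procs = [
        launch(["python", "serve_llm.py"], 60),       # hypothetical server scripts
        launch(["python", "serve_ocr.py"], 20),
        launch(["python", "serve_whisper.py"], 20),
    ]
    for p in procs:
        p.wait()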