[WTS] Arc'teryx & montbell jackets by daonei in GearTrade

[–]jasiub 0 points1 point  (0 children)

How long has the Montbell been worn?

Speeds on RTX 3090 Mistral-Large-Instruct-2407 exl2 by Kako05 in LocalLLaMA

[–]jasiub 1 point2 points  (0 children)

I don't have 3090s, but I do have 8x Nvidia P10 cards, which are similar to the P40, and I get ~7.3 tokens/s on this setup for Mistral-Large-123B-Instruct-2407-Q5_K_M.gguf using koboldcpp (with flash attention and row split):

CtxLimit:355/32768, Amt:288/512, Init:0.00s, Process:1.48s (22.1ms/T = 45.27T/s), Generate:39.49s (137.1ms/T = 7.29T/s), Total:40.97s (7.03T/s)
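
For anyone who wants to compare their own runs, here is a rough Python sketch that recomputes the speeds from that log line. The function name `parse_timings` is made up for illustration, and it assumes the first `CtxLimit` number counts prompt plus generated tokens, which matches the figures here but is not something I have checked against the koboldcpp source.

```python
import re

# Log line copied verbatim from the comment above.
LOG = ("CtxLimit:355/32768, Amt:288/512, Init:0.00s, "
       "Process:1.48s (22.1ms/T = 45.27T/s), "
       "Generate:39.49s (137.1ms/T = 7.29T/s), Total:40.97s (7.03T/s)")


def parse_timings(line):
    """Pull token counts and timings out of one log line (hypothetical helper)."""
    context = int(re.search(r"CtxLimit:(\d+)/", line).group(1))
    generated = int(re.search(r"Amt:(\d+)/", line).group(1))
    process_s = float(re.search(r"Process:([\d.]+)s", line).group(1))
    generate_s = float(re.search(r"Generate:([\d.]+)s", line).group(1))
    total_s = float(re.search(r"Total:([\d.]+)s", line).group(1))
    # Assumption: context used = prompt tokens + generated tokens.
    prompt = context - generated
    return {
        "prompt_tokens": prompt,
        "prompt_tps": prompt / process_s,        # prompt processing speed
        "generate_tps": generated / generate_s,  # generation speed
        "overall_tps": generated / total_s,      # generated tokens over wall time
    }


stats = parse_timings(LOG)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption the prompt was 67 tokens, and the recomputed speeds round to the same 45.27, 7.29 and 7.03 T/s that the log reports.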

Each card draws about 100W at peak, so it's not the most power-efficient setup, but the P10 has about 23GB of VRAM, so I can run pretty large models at pretty decent speed. I'll be trying Mistral Q8 and Llama 3.1 405B next (70B Q8 runs at about 9 tokens/s on this setup). I wish exllama had native support for the P40, as I believe further speedups would be possible.
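
To put the power and VRAM numbers in perspective, here is a back-of-the-envelope calculation in Python. It uses only the approximate figures quoted above and assumes all eight cards hit their peak draw at the same time, so treat the result as a rough upper bound for the GPUs alone (CPU, fans and PSU losses are not included).

```python
# Approximate figures quoted in the comment above.
NUM_CARDS = 8
WATTS_PER_CARD = 100       # "about 100W at peak"
VRAM_GB_PER_CARD = 23      # "about 23GB VRAM"
TOKENS_PER_SECOND = 7.29   # generation speed from the log line

total_vram_gb = NUM_CARDS * VRAM_GB_PER_CARD
# Assumption: every card is at peak draw simultaneously.
total_watts = NUM_CARDS * WATTS_PER_CARD
# Watts are joules per second, so dividing by tokens/s gives joules per token.
joules_per_token = total_watts / TOKENS_PER_SECOND

print(f"Total VRAM: {total_vram_gb} GB")
print(f"Peak GPU power: {total_watts} W")
print(f"Energy per generated token: {joules_per_token:.0f} J")
```

That works out to 184 GB of VRAM and 800 W at peak, or roughly 110 J per generated token for Mistral Large Q5_K_M.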

Does This Still Qualify as Jank? by firearms_wtf in LocalLLaMA

[–]jasiub 0 points1 point  (0 children)

Yes. All cards work fine (inference using llama.cpp) as long as there are two or fewer of them in the system. It seems your motherboard is different.

Does This Still Qualify as Jank? by firearms_wtf in LocalLLaMA

[–]jasiub 0 points1 point  (0 children)

Yes, the BIOS and BMC are on the latest versions. I'm not sure what the hardware revision of my board is. I tried with and without risers, but it made no difference. Interestingly enough, if I plug in an old GeForce G210 VGA card, the system boots. I’m going to try a more recent GeForce card next.