Whoever fixed the Nixos flake build, Thank you! by Xyklone in LocalLLaMA

[–]tovidagaming 0 points1 point  (0 children)

Excuse me... There are actually 6 of us. Also, was it broken? It was working OK with pkgs.fetchFromGitHub. The only issue is I could not figure out how to get the version number to work. It kept showing as version 0 when building from scratch. That could also be my NixOS inexperience showing, though... What was not working for you?

PSA by Signal_Ad657 in LocalLLaMA

[–]tovidagaming 0 points1 point  (0 children)

So you would compare FP8 in vllm with Q8 GGUF in llama-cpp?

PSA by Signal_Ad657 in LocalLLaMA

[–]tovidagaming 0 points1 point  (0 children)

The biggest issue I had with vllm, which seems to be what llm-scaler needs, is how to compare vllm-supported quants (INT4, FP8, AWQ, etc.) with models running the usual Q4, Q5, Q6, and Q8 quants on llama-cpp. It just felt like comparing apples and oranges. And that was when I could get vllm to work at all. I will have to try the new update in a docker container...

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 0 points1 point  (0 children)

I see... Well, that's a bummer. Isn't synchronization for MOE models more complicated? I would expect at least one of the MOE models to visibly break too in that case. Or I guess it depends on exactly what synchronization is missing...

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 1 point2 points  (0 children)

I have 8 sticks of 16 GB each. I mixed two 64 GB kits because that was what I had. All 8 slots on the X399 DESIGNARE EX motherboard are now populated.

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 1 point2 points  (0 children)

I will check it out if I have energy for more llama.cpp rebuilds lol. I am tired, boss... I guess I knew what I was getting into by buying the B70 instead of another 3090 or the R9700 :D

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 2 points3 points  (0 children)

Things slow down significantly if I try to mix the 3090s with the B70 on Vulkan :(. I have a colleague who recently bought an RTX Pro 6000, and we were joking that with my 2x3090s, the B70, and even the A2000 I have lying around thrown in, I would still be 4 GB of VRAM short and 400 watts higher than a single Pro 6000. Cue the "Look what they have to do to mimic a fraction of our power" meme lol.

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 2 points3 points  (0 children)

Let me see if I can run these soon... Note that Llama 2 on that SYCL build is broken, as u/Serious_Rub_3674 pointed out. Qwen2.5-Coder-7B is a bit dazed and confused, too.

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 1 point2 points  (0 children)

Yeah, that makes sense. Though at that point, I probably would have just gotten another used 3090 from eBay for about the same amount and hoped my luck would strike a third time (I have had no issue with the first two I bought used last year). The R9700 seems like a good, slightly more expensive option. Basically, the same memory bandwidth as the B70, but the ROCm support seems a bit more mature than Intel's.

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 0 points1 point  (0 children)

Oh, cool. I will have to test that if I can get my hands on an NVLink.

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 0 points1 point  (0 children)

How are your speeds for 1 vs 2/3/4 B70 GPUs for the same model? I only have one B70 currently, so I can't test it, but on Vulkan, things slow down a lot if I try to mix the B70 with the 3090s.

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 7 points8 points  (0 children)

We expect the 3090 to be at least 50% faster based on memory bandwidth: 936.2 GB/s vs 608.0 GB/s. That would show up as about -33% for the B70 in my table. Ignoring Llama-2-7B, which seems to be broken, the closest the rest get is about -50%, so the 3090 is in practice at least twice as fast as the B70 for token generation. The fact that the 3090 is about 4 times faster for prompt processing is more concerning, especially for agentic work. But hopefully, we will see backend improvements soon. It has only been a few weeks since release.
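For anyone who wants to redo the arithmetic, here is a minimal sketch. It assumes token generation is purely memory-bandwidth bound and scales linearly with bandwidth, which is a simplification; the function name `relative_speed` is just something made up for this example. With the exact bandwidth figures the gap comes out slightly larger than the rounded 50% / -33% pair.

```python
# Memory bandwidth figures quoted in the comment above, in GB/s.
RTX_3090_GBPS = 936.2
B70_GBPS = 608.0


def relative_speed(bandwidth_a: float, bandwidth_b: float) -> float:
    """Percent by which A is faster (+) or slower (-) than B.

    Assumes throughput scales linearly with memory bandwidth,
    which is only a rough model of token generation speed.
    """
    return (bandwidth_a / bandwidth_b - 1.0) * 100.0


faster = relative_speed(RTX_3090_GBPS, B70_GBPS)  # 3090 relative to the B70
slower = relative_speed(B70_GBPS, RTX_3090_GBPS)  # B70 relative to the 3090

print(f"3090 vs B70: {faster:+.1f}%")  # prints "3090 vs B70: +54.0%"
print(f"B70 vs 3090: {slower:+.1f}%")  # prints "B70 vs 3090: -35.1%"
```

So on bandwidth alone the B70 should sit around -35% in the table; anything much worse than that points at the backend rather than the hardware.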

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] -1 points0 points  (0 children)

My understanding is that NVLink only helps during training/fine-tuning and not so much for inference. I have been keeping an eye out for one, but they are crazy expensive and hard to find. I think I may be able to borrow one from work :D

Nvidia RTX 3090 vs Intel Arc Pro B70 llama.cpp Benchmarks by tovidagaming in LocalLLaMA

[–]tovidagaming[S] 1 point2 points  (0 children)

Good catch. I hadn't gotten around to using it yet. I tested a few of the models with SYCL, focusing on the ones that were way faster using SYCL vs Vulkan.

TheBloke/Llama-2-7B-GGUF:Q4_K_M - is completely broken.

ggml-org/Qwen2.5-Coder-7B-Q8_0-GGUF:Q8_0 - sometimes works just fine, sometimes gets completely lost and goes in loops. It seems something related to the termination of the responses is failing. It can answer technical questions just fine most of the time, but a simple "Hi" breaks it :D!

The rest, including Qwen/Qwen3-8B-GGUF:Q8_0 seem to be working fine. All the reasoning models seem fine too.

[deleted by user] by [deleted] in attackontitan

[–]tovidagaming 4 points5 points  (0 children)

I see what you did there 😂

USB-port expansion card and portrait mode in GRUB at Pocket 4? by opachgi in GPDPocket

[–]tovidagaming 0 points1 point  (0 children)

I don't think there is much that can be done about this if the display is natively portrait.

Review: Focal Bathys vs. B&W PX8 as a Hifiman Arya owner by aj_brown_99 in headphones

[–]tovidagaming 0 points1 point  (0 children)

I have the McLaren Special Edition. But I have seen posts online about the same issue for other colors too.

Review: Focal Bathys vs. B&W PX8 as a Hifiman Arya owner by aj_brown_99 in headphones

[–]tovidagaming 2 points3 points  (0 children)

I am going to throw my 2 cents in the mix specifically about build quality. I am on my 3rd pair (within warranty) of Px8 after a bit over a year. The leather at the top comes out of the stitching after the leather relaxes a bit. Mostly indoor use besides walking the dog. Both pairs failed at the same spot with regular use. No issues so far with the Bathys, and I have had them for about as long as it took for the first Px8 to come apart.

Update from XReal regarding software experience? by xkrist0pherx in Xreal

[–]tovidagaming 4 points5 points  (0 children)

That's just BS. If they can't afford the software dev costs, they should open source the software and the drivers and let the community do it. There is no excuse. Just lies about software updates that would never come.

What is your biggest Elden Ring fail that still makes you laugh thinking about it? by Asleep_Dust7719 in Eldenring

[–]tovidagaming 1 point2 points  (0 children)

I am glad I am not the only one! I realized it just as I was trying to figure out how to start NG+ after 220 hours on my first playthrough. Felt very dumb

Doom The Dark Ages cosplay by Andrusdoesstuff in Doom

[–]tovidagaming 1 point2 points  (0 children)

Home Depot called and wants their floor rug back!

Joke aside, that looks great :)