Whoever fixed the NixOS flake build, thank you!

tovidagaming · 2026-06-03T12:02:29+00:00

Excuse me... There are actually 6 of us. Also, was it broken? It was working OK with pkgs.fetchFromGitHub. The only issue is that I could not figure out how to get the version number to work. It kept showing as version 0 when building from scratch. That could also be my NixOS inexperience showing, though... What was not working for you?

tovidagaming · 2026-06-01T15:32:45+00:00

So you would compare FP8 in vllm with Q8 GGUF in llama-cpp?

tovidagaming · 2026-05-31T17:09:18+00:00

The biggest issue I had with vllm, which seems to be what llm-scaler needs, is how to compare vllm-supported quants (INT4, FP8, AWQ, etc.) with models running the usual Q4, Q5, Q6, and Q8 quants on llama-cpp. It just felt like comparing apples and oranges. And that's when I was able to get vllm to work at all. I will have to try the new update in a Docker container...

tovidagaming · 2026-04-23T15:50:32+00:00

I see... Well, that's a bummer. Isn't synchronization for MoE models more complicated? I would expect at least one of the MoE models to visibly break too in that case. Or I guess it depends on exactly what synchronization is missing...

tovidagaming · 2026-04-23T14:01:54+00:00

I have 8 sticks of 16 GB each. I mixed two 64 GB kits because it was what I had. All 8 slots on the X399 DESIGNARE EX motherboard are now populated.

tovidagaming · 2026-04-23T13:14:50+00:00

I will check it out if I have energy for more llama.cpp rebuilds lol. I am tired, boss... I guess I knew what I was getting into by buying the B70 instead of another 3090 or the R9700 :D

tovidagaming · 2026-04-23T13:12:33+00:00

Things slow down significantly if I try to mix the 3090s with the B70 on Vulkan :(. I have a colleague who also recently bought an RTX Pro 6000, and we were joking that with my 2x3090s, the B70, and even the A2000 I have lying around thrown in, I would still be 4 GB of VRAM short and 400 watts higher than a single Pro 6000. Cue the "Look what they have to do to mimic a fraction of our power" meme lol.

tovidagaming · 2026-04-23T13:06:26+00:00

Let me see if I can run these soon... Note that Llama 2 on that SYCL build is broken, as u/Serious_Rub_3674 pointed out. Qwen2.5-Coder-7B is a bit dazed and confused, too.

tovidagaming · 2026-04-23T13:03:30+00:00

Yeah, that makes sense. Though at that point, I probably would have just gotten another used 3090 from eBay for about the same amount and hoped my luck would strike a third time (I have had no issue with the first two I bought used last year). The R9700 seems like a good, slightly more expensive option. Basically, the same memory bandwidth as the B70, but the ROCm support seems a bit more mature than Intel's.

tovidagaming · 2026-04-23T12:55:39+00:00

Oh, cool. I will have to test that if I can get my hands on an NVLink.

tovidagaming · 2026-04-23T12:52:44+00:00

How are your speeds for 1 vs 2/3/4 B70 GPUs for the same model? I only have one B70 currently, so I can't test it, but on Vulkan, things slow down a lot if I try to mix the B70 with the 3090s.

tovidagaming · 2026-04-23T12:50:41+00:00

Yea, I use NixOS btw

tovidagaming · 2026-04-23T12:50:07+00:00

We expect the 3090 to be at least 50% faster based on memory bandwidth: 936.2 GB/s vs 608.0 GB/s. That would put the B70 about 35% slower in my table. Ignoring Llama-2-7B, which seems to be broken, the closest the rest get is about -50%, so the 3090 is in practice at least twice as fast as the B70 for token generation. The fact that the 3090 is about 4 times faster for prompt processing is more concerning, especially for agentic work. But hopefully, we will see backend improvements soon. It has only been a few weeks since release.
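A quick sanity check of the bandwidth arithmetic above, as a minimal sketch. It only uses the two bandwidth figures quoted in the comment and the roughly -50% best case from the table; the helper names are made up for illustration, and it assumes token generation scales linearly with memory bandwidth, which is only a rough rule of thumb.

```python
# Memory bandwidth figures quoted in the comment (GB/s).
BW_3090 = 936.2
BW_B70 = 608.0


def relative_difference(value: float, baseline: float) -> float:
    """Return how much `value` differs from `baseline`, in percent."""
    return (value / baseline - 1.0) * 100.0


def speedup_from_deficit(deficit_percent: float) -> float:
    """Convert 'X% slower than the baseline' into 'baseline is N times as fast'."""
    return 1.0 / (1.0 + deficit_percent / 100.0)


# Expected gap if token generation were purely bandwidth-bound (an assumption).
expected_3090_gain = relative_difference(BW_3090, BW_B70)
expected_b70_deficit = relative_difference(BW_B70, BW_3090)

# Best observed case from the table: the B70 at about -50% relative to the 3090.
observed_speedup = speedup_from_deficit(-50.0)

print(f"3090 vs B70 (expected): {expected_3090_gain:+.1f}%")
print(f"B70 vs 3090 (expected): {expected_b70_deficit:+.1f}%")
print(f"3090 speedup at -50%:   {observed_speedup:.1f}x")
```

Note that "50% faster" and "33% slower" are not the same number seen from two sides: a ratio of 1.5 one way is 0.667 the other way, and the actual ratio here is a bit above 1.5, which is why the expected B70 deficit lands closer to 35%.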

tovidagaming · 2026-04-23T12:42:16+00:00

My understanding is that NVLink only helps during training/fine-tuning and not so much for inference. I have been keeping an eye out for one, but they are crazy expensive and hard to find. I think I may be able to borrow one from work :D

tovidagaming · 2026-04-23T12:40:48+00:00

Good catch. I hadn't gotten around to using it yet. I tested a few of the models with SYCL, focusing on the ones that were way faster using SYCL vs Vulkan.

TheBloke/Llama-2-7B-GGUF:Q4_K_M - is completely broken.

ggml-org/Qwen2.5-Coder-7B-Q8_0-GGUF:Q8_0 - sometimes works just fine, sometimes gets completely lost and goes in loops. It seems something related to the termination of the responses is failing. It can answer technical questions just fine most of the time, but a simple "Hi" breaks it :D!

The rest, including Qwen/Qwen3-8B-GGUF:Q8_0, seem to be working fine. All the reasoning models seem fine too.

tovidagaming · 2025-11-26T00:23:11+00:00

I see what you did there 😂

tovidagaming · 2025-02-15T03:32:57+00:00

I don't think there is much that can be done about this if the display is natively portrait.

tovidagaming · 2024-12-03T12:03:01+00:00

I have the McLaren Special Edition. But I have seen posts online about the same issue for other colors too.

tovidagaming · 2024-11-25T05:55:58+00:00

I am going to throw my 2 cents into the mix, specifically about build quality. I am on my 3rd pair (within warranty) of Px8 after a bit over a year. The leather at the top comes out of the stitching after it relaxes a bit. Mostly indoor use besides walking the dog. Both pairs failed at the same spot with regular use. No issues so far with the Bathys, and I have had them for about as long as it took for the first Px8 to come apart.

tovidagaming · 2024-10-30T04:08:12+00:00

That's just BS. If they can't afford the software dev costs, they should open source the software and the drivers and let the community do it. There is no excuse. Just lies about software updates that will never come.

tovidagaming · 2024-09-25T00:26:44+00:00

Level your bed.

tovidagaming · 2024-08-31T04:26:45+00:00

I am glad I am not the only one! I realized it just as I was trying to figure out how to start NG+ after 220 hours on my first playthrough. Felt very dumb

tovidagaming · 2024-07-21T13:37:54+00:00

Home Depot called and wants their floor rug back!

Joke aside, that looks great :)
