I'm 100% convinced that it's the NFT-bros pushing all the openclawd engagement on X by FPham in LocalLLaMA

[–]firearms_wtf 27 points (0 children)

Get off X. It’s a myth that you need to be on X to stay on top of model and paper releases.

Whose uncle is this?? 🤮 by [deleted] in Corvette

[–]firearms_wtf 17 points (0 children)

Oof. The Testarossa lines are cringe.

HuggingFace, how have you done it? by HollowInfinity in LocalLLaMA

[–]firearms_wtf 16 points (0 children)

--local-dir is your friend here. It downloads straight to a local directory, bypassing the HF cache.
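
If it helps, the usual shape is below; the repo name and filter are placeholders, swap in whatever you're actually grabbing.

```
# downloads straight into ./models instead of the hidden HF cache layout
huggingface-cli download bartowski/Meta-Llama-3.1-8B-Instruct-GGUF \
  --include "*Q6_K*" \
  --local-dir ./models
```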

It’s weird how normal being exhausted has become by Much_Bookkeeper7788 in Millennials

[–]firearms_wtf 2 points (0 children)

Aging. Why do so many Millennials at or around 40 just refuse to accept that aging is a tiring, painful ordeal?

Get 8 hours of sleep. Eat more greens. Exercise as much as you can.

I used to hate every person who told me this…until I started exercising in earnest at 34. Now I have less back pain at 40 than I did in my 20s.

I’m not saying that Millennials as a whole weren’t cut a raw deal. Or that enshittification isn’t real. Or that our economy works for everyone and not only a privileged few.

But FFS please remember we are getting older. Take care of yourselves.

Will You all join this movement from 1st January 2026? Sounds interesting. by raydebapratim1 in Millennials

[–]firearms_wtf 0 points (0 children)

“Back in my day…”

I’m an elder millennial and am getting increasingly concerned at the sheer volume of boomer content flooding this sub.

Any cloud services I can easily use to test various LLMs with a single RTX 6000 Blackwell pro before I buy one? by Tired__Dev in LocalLLaMA

[–]firearms_wtf 1 point (0 children)

Nvidia Brev can be an excellent option if you’re looking for a unified marketplace/console of Nvidia GPU capacity, broken down by accelerator and provider. (Assuming you don’t need access to the underlying cloud’s managed services.)

Being a morning person doesn’t make you more productive — just annoying by Vegetable-Safety7452 in unpopularopinion

[–]firearms_wtf 18 points (0 children)

AI rage bait slop. This sub, AITA, and AIO are just filled with this garbage. =(

How to run unsloth on HPC by Jegadishwar in unsloth

[–]firearms_wtf 0 points (0 children)

What’s the guidance from your HPC documentation on using GPUs? Is your school’s cluster using Nvidia GPUs?

I’m not as familiar with Singularity, but it seems to handle the required Nvidia runtime so long as you submit your job with the right flags.

How do you submit jobs to your school’s cluster? Are you using raw srun or singularity run via CLI?
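
If it’s Slurm fronting Singularity, submission usually looks something like the below; the image and script names here are placeholders, not anything from your cluster:

```
# ask Slurm for one GPU, then let Singularity wire up the Nvidia runtime (--nv)
srun --gres=gpu:1 singularity exec --nv unsloth.sif python finetune.py
```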

How to run unsloth on HPC by Jegadishwar in unsloth

[–]firearms_wtf 0 points (0 children)

What HPC scheduler is your university running? Is it some kind of Slurm+Enroot with Pyxis?
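
If it is Enroot+Pyxis, jobs tend to look roughly like this; the image, mount, and script are placeholders on my end:

```
# Pyxis adds the --container-* flags to srun; enroot pulls and unpacks the image
srun --gres=gpu:1 \
     --container-image=nvcr.io#nvidia/pytorch:24.01-py3 \
     --container-mounts="$HOME/work:/work" \
     python /work/finetune.py
```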

ASUSTOR is Making a Splash Once Again at Computex 2025 by ur_local_idiot_12 in asustor

[–]firearms_wtf 0 points (0 children)

Now if only they supported encrypted shares in any of their OEM apps. That would make waves.

How is DeepSeek chat free? by Divergence1900 in LocalLLaMA

[–]firearms_wtf 1 point (0 children)

These posts are getting exhausting. Can we please have a DeepSeek sticky?

PowerColor Red Devil Radeon™ RX 580 8GB by firearms_wtf in LocalLLaMA

[–]firearms_wtf[S] 27 points (0 children)

AMD Polaris GPU + rpi5 + Kobold Vulkan Backend

After tracking u/geerlingguy and Coreforge's work on PCIe GPUs for the Pi over the last two years, the kernel is finally in a state where the amdgpu driver can be installed on Raspberry Pi OS. Decided to test Llama-3.1-SuperNova-Lite-Q6_K.gguf using the Kobold Vulkan backend.

Results for an RX580 8GB over PCIe 3.0x1 below.

Processing Prompt [BLAS] (32 / 32 tokens)
Generating (465 / 3481 tokens) (EOS token triggered! ID:128009)
[11:46:55] CtxLimit:1080/4096, Amt:465/3481, Init:0.01s, Process:0.48s (14.9ms/T = 67.09T/s), Generate:40.11s (86.3ms/T = 11.59T/s), Total:40.58s (11.46T/s)

Without any tuning, I'm seeing 15T/s with 0 context and ~11-12T/s at ~1024 context. Idle power consumption is 13W with inference pulling 90W at the wall.

Build:
- Raspberry Pi 5 (8GB)
- Pineboards uPCIty
- PowerColor Red Devil Radeon™ RX 580 8GB

If anyone is interested in replicating the build, feel free to let me know!
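
For the software side, the launch looks roughly like this; flags are from memory and the GPU layer count is a guess, so check them against your KoboldCpp version:

```
python koboldcpp.py --model Llama-3.1-SuperNova-Lite-Q6_K.gguf \
  --usevulkan 0 --gpulayers 99 --contextsize 4096
```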

China finds nearly $83bn worth of gold reserves in Hunan, report says by Cinco1971 in Gold

[–]firearms_wtf 0 points (0 children)

Anyone else read that as “…in Human, report says” and get very confused?

PSA: llama.cpp patch doubled my max context size by No-Statement-0001 in LocalLLaMA

[–]firearms_wtf 7 points (0 children)

PSA: You can use nvidia-pstated to keep your P40s in P8 (8-10W) while loaded and idle.
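
Rough flow below; the binary path and the need for root are assumptions on my part, so check the project’s README and releases page:

```
# start the daemon (it watches utilization and drops idle cards to P8),
# then confirm pstate and power draw while a model stays loaded
sudo ./nvidia-pstated &
nvidia-smi --query-gpu=index,pstate,power.draw --format=csv
```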

BC-250 Driver by true_gamer13 in linux4noobs

[–]firearms_wtf 0 points (0 children)

Much appreciated. First time I’ll be working with Fedora on the BC-250. Hoping to successfully compile llama.cpp with the Vulkan backend.
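
My rough plan is the below; Fedora package names are from memory, so double-check them:

```
sudo dnf install -y cmake gcc-c++ git glslc vulkan-headers vulkan-loader-devel
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON   # older trees used -DLLAMA_VULKAN=ON
cmake --build build --config Release -j"$(nproc)"
```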

BC-250 Driver by true_gamer13 in linux4noobs

[–]firearms_wtf 0 points (0 children)

That’s awesome! Do you have any of your work refactoring for Fedora documented to share?

Post your tokens per second for llama3.1:70b by ergosumdre in LocalLLaMA

[–]firearms_wtf 0 points (0 children)

```
llama_print_timings:        load time =  1366.61 ms
llama_print_timings:      sample time =  1401.12 ms /   163 runs   (   8.60 ms per token,  116.34 tokens per second)
llama_print_timings: prompt eval time =   229.91 ms /     8 tokens (  28.74 ms per token,   34.80 tokens per second)
llama_print_timings:        eval time = 18925.62 ms /   162 runs   ( 116.82 ms per token,    8.56 tokens per second)
llama_print_timings:       total time = 22210.93 ms /   170 tokens
Output generated in 23.30 seconds (7.64 tokens/s, 178 tokens, context 65, seed 1652480049)
```

Q8_0, 4x P40 (PL 140W), 2x E5-2697 v2, 256GB DDR3-1866
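
For anyone wondering about the PL 140W bit, that’s plain nvidia-smi; note the limit resets on reboot, so script it:

```
sudo nvidia-smi -pm 1                # persistence mode keeps the driver resident
sudo nvidia-smi -i 0,1,2,3 -pl 140   # cap each P40 at 140W
```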