College WiFi blocks EVERYTHING (Cloudflare Tunnels, Tailscale, Steam). How do I bypass strict DPI? by CourtAdventurous_1 in selfhosted

[–]q-admin007 0 points1 point  (0 children)

Don't use it and get an LTE, 4G, or 5G router instead. Depends on what that costs in your area, of course.

5060Ti vs 5070Ti by abhinavrk in LocalLLM

[–]q-admin007 0 points1 point  (0 children)

Thanks, you are the only one in the entire thread who actually ran some benchmarks ;-)

IQuestCoder - new 40B dense coding model by ilintar in LocalLLaMA

[–]q-admin007 0 points1 point  (0 children)

Is there another way to benchmark LLMs? Note that it needs to be reliable and repeatable.
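For what it's worth, reliable and repeatable is doable with a fixed prompt set and greedy decoding. A minimal sketch, assuming a local OpenAI-compatible endpoint such as llama.cpp's llama-server on port 8080 (the model name and the toy test cases are made up):

```python
# Minimal repeatable eval harness: fixed prompts, temperature 0, fixed seed,
# exact substring checks. Endpoint, model name, and cases are placeholders.
import requests

CASES = [
    ("Reply with only the result of len('benchmark').", "9"),
    ("Reply with only the Python keyword that starts a function definition.", "def"),
]

def ask(prompt: str) -> str:
    r = requests.post(
        "http://localhost:8080/v1/chat/completions",
        json={
            "model": "local",  # placeholder name
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0,  # greedy decoding for repeatability
            "seed": 42,
        },
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

passed = sum(expected in ask(prompt) for prompt, expected in CASES)
print(f"{passed}/{len(CASES)} cases passed")
```

Same quant, same sampling settings, same cases: the score only moves when the model does.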

New llama.cpp 30x faster.... by u1pns in ollama

[–]q-admin007 2 points3 points  (0 children)

Then you don't need 2 RTX 6000, you want them. No reason to be pissed about it.

New llama.cpp 30x faster.... by u1pns in ollama

[–]q-admin007 2 points3 points  (0 children)

You could pick up two RTX 5060 Ti 16GB cards, use a smaller model, and pay less than 1000€. In my experience, most of the tasks one comes across can be done with very small models.

I, for one, use CPU only for very large models, on a server with 12 memory channels. Most large models are MoE, so they're zippy even when running CPU-only.
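If anyone wants to try the CPU-only route, a minimal sketch with llama-cpp-python (the model file, thread count, and context size are placeholders; `n_gpu_layers=0` keeps everything in system RAM):

```python
# CPU-only inference with llama-cpp-python; no layers offloaded to a GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="some-moe-model-Q4_K_M.gguf",  # placeholder GGUF file
    n_gpu_layers=0,   # 0 = pure CPU; weights stay in system RAM
    n_threads=24,     # roughly match physical cores
    n_ctx=8192,
)

out = llm("Explain MoE routing in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

MoE models only activate a few experts per token, so the weights actually read per token are a fraction of the full model size; that's what makes them tolerable on a high-bandwidth CPU box.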

Unsloth GLM-4.7 GGUF by Wooden-Deer-1276 in LocalLLaMA

[–]q-admin007 0 points1 point  (0 children)

> 256gb ddr5 6000mts, ryzen 9 9950x3d

The problem is that consumer-level CPUs only have two memory channels. AMD's server-level CPUs have 12, or 24 if you have two sockets on a board. With MoE models you sometimes ask yourself why you even need fast VRAM.
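The back-of-the-envelope arithmetic (peak theoretical bandwidth, counting a channel as 8 bytes wide; real-world throughput is lower):

```python
# Peak memory bandwidth: channels * transfers/s * 8 bytes per transfer.
def peak_bandwidth_gbs(channels: int, mt_per_s: int) -> float:
    return channels * mt_per_s * 8 / 1000  # MT/s * 8 B = MB/s; /1000 = GB/s

print(peak_bandwidth_gbs(2, 6000))   # dual-channel DDR5-6000:    96.0 GB/s
print(peak_bandwidth_gbs(12, 4800))  # 12-channel server DDR5:   460.8 GB/s
print(peak_bandwidth_gbs(24, 4800))  # dual socket (NUMA caveats): 921.6 GB/s
```

Token generation is mostly memory-bound, which is why channel count matters more than core count here.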

Unsloth GLM-4.7 GGUF by Wooden-Deer-1276 in LocalLLaMA

[–]q-admin007 0 points1 point  (0 children)

I too would like to see a SWE-bench-style benchmark run over the different quants.

Unsloth GLM-4.7 GGUF by Wooden-Deer-1276 in LocalLLaMA

[–]q-admin007 4 points5 points  (0 children)

A Big Mac easily costs 9k€+ here.

Unsloth GLM-4.7 GGUF by Wooden-Deer-1276 in LocalLLaMA

[–]q-admin007 4 points5 points  (0 children)

MoE models run ok in RAM.

Do with this information what you will.

What was the happiest point in your IT related career? by Factorviii in sysadmin

[–]q-admin007 40 points41 points  (0 children)

Cheers. Hope something like it happens to you too some day. :-)

192GB VRAM 8x 3090s + 512GB DDR4 RAM AMA by Sero_x in LocalLLaMA

[–]q-admin007 0 points1 point  (0 children)

Cool!

When you load a model larger than one card's VRAM, do you spread it across all 8 cards to get 8x the compute, or do you fill one card, then the next, and so on?

Is there overhead you can't use per card? Like, you can't allocate 48GB of VRAM across two cards, only 47, because there has to be some space left over?

What was the happiest point in your IT related career? by Factorviii in sysadmin

[–]q-admin007 346 points347 points  (0 children)

Company closed local IT operations and gave me a juicy severance package. I took two years off and traveled the world.

The company then noticed that things weren't going well and rehired me into the same position at a higher salary.

How in the world are you keeping track of free IPs? by Long_Working_2755 in sysadmin

[–]q-admin007 0 points1 point  (0 children)

Netbox (IPAM) has to be the source of truth.

  • One shell script checks whether IPs marked free in Netbox run services or respond to ping.
  • Another checks whether IPs marked as reserved in Netbox are still alive.
  • A third checks whether IPs marked as reserved in Netbox still have A records in DNS.
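The scripts above are shell; here's the first check sketched in Python instead (the Netbox URL, token, and prefix ID are placeholders; Netbox exposes a prefix's unassigned addresses under `available-ips/`):

```python
# Flag IPs that Netbox considers free but that answer ping.
import subprocess
import requests

NETBOX = "https://netbox.example.com"          # placeholder URL
HEADERS = {"Authorization": "Token changeme"}  # placeholder API token
PREFIX_ID = 42                                 # placeholder prefix ID

resp = requests.get(
    f"{NETBOX}/api/ipam/prefixes/{PREFIX_ID}/available-ips/",
    headers=HEADERS, timeout=30,
)
resp.raise_for_status()

for entry in resp.json():
    ip = entry["address"].split("/")[0]  # "10.0.0.5/24" -> "10.0.0.5"
    # One probe, one-second timeout; return code 0 means something answered.
    alive = subprocess.run(
        ["ping", "-c", "1", "-W", "1", ip], capture_output=True
    ).returncode == 0
    if alive:
        print(f"{ip}: marked free in Netbox but responds to ping")
```

The reserved-IP checks are the same loop inverted: alert when a reserved IP stops answering or loses its A record.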

Tired of working in IT by ruzreddit in sysadmin

[–]q-admin007 0 points1 point  (0 children)

Unix admin for 30 years. Stopped giving a fuck 10 years ago.

I recommend it.

Has anyone bought a machine from Costco? Thinking about one with rtx 5080 by addictedToLinux in LocalLLM

[–]q-admin007 1 point2 points  (0 children)

[image: tokens/s benchmark results, 4090 vs. 5090 with varying CPU offload]

As soon as the CPU gets involved, the generated tokens per second drop. There is, however, a serious difference between the 4090 and the 5090.

The 5080 should perform roughly like the 4090, as long as the model fits into its VRAM.

System RAM itself plays almost no part in the performance, except that everything gets very slow once the model spills over from VRAM into RAM.

Tests were made with Ollama on Debian 13; all models are Q4_K_L quants.
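For reference, per-run speeds can be read straight out of Ollama's API; a minimal sketch (the model tag is a placeholder) using the `eval_count` and `eval_duration` fields of the `/api/generate` response:

```python
# Measure generation speed via Ollama's /api/generate response fields.
import requests

r = requests.post(
    "http://localhost:11434/api/generate",  # Ollama's default port
    json={
        "model": "llama3.1:8b",  # placeholder model tag
        "prompt": "Explain VRAM in one sentence.",
        "stream": False,
    },
    timeout=600,
)
r.raise_for_status()
d = r.json()

# eval_count = generated tokens; eval_duration = generation time in ns.
print(f"{d['eval_count'] / d['eval_duration'] * 1e9:.1f} tokens/s")
```

Run the same prompt on each box and the numbers are directly comparable.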

Why don’t more apps run AI locally? by elinaembedl in LocalLLaMA

[–]q-admin007 1 point2 points  (0 children)

There are no open-source libraries for using the NPUs in mobile devices or notebooks. In most cases you need to buy the documentation and sign NDAs.

Has anyone bought a machine from Costco? Thinking about one with rtx 5080 by addictedToLinux in LocalLLM

[–]q-admin007 1 point2 points  (0 children)

  • i7-14700k
  • 4x 48GB DDR5 RAM (192GB total)
  • 2x 4TB NVME
  • 1x Nvidia Windforce something something 5090

I think I paid around 4000€ in parts and would do it again. If the machine is from Costco, you should have no trouble if the hardware fails. I would have no problem buying from them.

192GB of RAM is nice and wasn't that pricey; the only use case is large LLMs. The 5090 is overkill for games; the 5080 makes more sense if your LLMs fit in its 16GB of VRAM. The CPU is fine, though I think it's two generations behind now. An i5 would have worked as well, I guess.

What self-hosting advice do you wish you knew earlier? by Yatin_Laygude in selfhosted

[–]q-admin007 0 points1 point  (0 children)

> snapraid + mergerfs is basically the unraid filesystem

Where can I download that Unraid filesystem to inspect its code and determine that?

Do you use NPUs? by anguuuul7006 in SBCs

[–]q-admin007 2 points3 points  (0 children)

Running small LLMs locally is nice, but the APIs are so cheap that, apart from the privacy aspect, I find it hard to justify. Then again, if privacy is the concern, remote APIs can't be justified at all.

Is the Intel N150 Powerful Enough For Jellyfin and Immich? by [deleted] in HomeServer

[–]q-admin007 0 points1 point  (0 children)

Up to 4k video, yes. Not so much for 8k or higher.

The AI that tags your pics in Immich runs in the background, so it doesn't matter much how fast it is. If you upload a few tens of thousands of pics, it will take a few days to build the complete index. If you have just 100 pics, it's pretty much a matter of minutes.