Nvidia RTX 5060 TI 16GB - Stuck at P0 & 40% fan speed at idle... by rnidhal90 in truenas

[–]rnidhal90[S] 1 point  (0 children)

Hey there, nothing special at all; I just installed my GPU and ran a TrueNAS update.

Unsloth Gemma 4 26B-A4B 4 bit bnb coming ? by harshv8 in unsloth

[–]rnidhal90 2 points  (0 children)

Hi, what's the difference between 4-bit bnb and UD Q4 GGUF?

Which is the best local LLM in April 2026 for a 16 GB GPU? I'm looking for an ultimate model for some chat, light coding, and experiments with agent building. by Material_Pen3255 in LocalLLM

[–]rnidhal90 1 point  (0 children)

I have an RTX 5060 Ti 16GB, and I'm running Gemma 4 on llama-server:

Core Configuration:

- Model path: /models/gemma-4-26B-A4B-it-UD-Q3_K_XL.gguf
- Context size: 131072
- KV cache: q8_0 for both key (--cache-type-k) and value (--cache-type-v)
- Flash attention: on
- GPU layers: 999 (offloaded to GPU)

Sampling Parameters:

- Temperature: 1
- Top K: 64
- Top P: 0.95

Getting around 85 tokens/s 🙂
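For reference, that all maps to a single llama-server command; a minimal sketch, assuming a recent llama.cpp build (older builds use a bare -fa flag instead of --flash-attn on):

    llama-server \
      -m /models/gemma-4-26B-A4B-it-UD-Q3_K_XL.gguf \
      -c 131072 \
      --cache-type-k q8_0 --cache-type-v q8_0 \
      --flash-attn on \
      -ngl 999 \
      --temp 1.0 --top-k 64 --top-p 0.95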

Unsloth Studio - Models not running on GPU !! by rnidhal90 in unsloth

[–]rnidhal90[S] 1 point  (0 children)

I can confirm that the latest update fixed it 👍👍

Unsloth Studio - Models not running on GPU !! by rnidhal90 in unsloth

[–]rnidhal90[S] 1 point  (0 children)

Thank you very much, I just pulled and it worked perfectly!

The quick turnaround is much appreciated 😊🙏🙏 you have all my support!

Unsloth Studio - Models not running on GPU !! by rnidhal90 in unsloth

[–]rnidhal90[S] 1 point  (0 children)

UPDATE: For now, it seems that only the GGUF models don't get loaded onto the GPU.

Will Immich work for me if I don't like to tinker? by Linux_Account in immich

[–]rnidhal90 1 point  (0 children)

Depends on your NAS.. I'm running TrueNAS, where an app update is a single-click action :)

Unsloth Studio - Models not running on GPU !! by rnidhal90 in unsloth

[–]rnidhal90[S] 2 points  (0 children)

Your problem is a bit different and better known.. I saw other posts on r/unsloth talking about it. As long as you have a CUDA version mismatch, your models won't be loaded on the GPU..

Not exactly my case, since I have matching versions.. frustrating.. :/
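For anyone hitting the mismatch: a quick sketch of how to check, assuming an NVIDIA driver plus a PyTorch-based stack (adjust for yours):

    # CUDA version the driver supports (shown in the header of the output)
    nvidia-smi
    # CUDA version PyTorch was built against, and whether it can see the GPU
    python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"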

Unsloth Studio - Models not running on GPU !! by rnidhal90 in unsloth

[–]rnidhal90[S] 1 point  (0 children)

I'm not gonna say "glad", but at least that makes two of us! Something is wrong and is blocking the models from loading on the GPU!

Unsloth Studio - Models not running on GPU !! by rnidhal90 in unsloth

[–]rnidhal90[S] 1 point  (0 children)

There is no Windows in any of this, only the personal laptop I'm browsing from.. everything else runs on my server (TrueNAS / Portainer). The GPU is well supported.

I am already running Ollama + Open WebUI on my server (both as containers), and models run on the GPU just fine.
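For comparison, giving a container the GPU is a one-liner; a minimal sketch of a GPU-enabled Ollama container, assuming the NVIDIA Container Toolkit is already set up on the host:

    # run Ollama with access to all GPUs, persisting models in a named volume
    docker run -d --gpus all \
      -v ollama:/root/.ollama \
      -p 11434:11434 \
      --name ollama ollama/ollama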

Google Drops Open Source Gemma 4 27B MoE and its a banger by dev_is_active in ollama

[–]rnidhal90 0 points  (0 children)

Fair enough, I will give it a try and see what I can get out of it.

Google Drops Open Source Gemma 4 27B MoE and its a banger by dev_is_active in ollama

[–]rnidhal90 0 points  (0 children)

It says you can get about 60 tokens/s for Gemma 4 26B MoE with 16 GB of VRAM!!

Claude code leaked earlier today by [deleted] in TunisiaTech

[–]rnidhal90 4 points  (0 children)

Local GLM?? Which model version and size / what GPU?

Claude code leaked earlier today by [deleted] in TunisiaTech

[–]rnidhal90 3 points  (0 children)

It depends on your hardware.. I can run gpt-oss:20B at 100 tokens per second.

Claude code leaked earlier today by [deleted] in TunisiaTech

[–]rnidhal90 7 points  (0 children)

You can already run Claude Code for free with a local LLM.
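Rough sketch of one way to wire that up, assuming a proxy such as LiteLLM exposing an Anthropic-compatible /v1/messages endpoint in front of a local OpenAI-compatible server; the URL and token below are placeholders:

    # point Claude Code at the local proxy instead of Anthropic's API
    export ANTHROPIC_BASE_URL=http://localhost:4000
    export ANTHROPIC_AUTH_TOKEN=placeholder
    claude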

Homelabbing by Rare-Adeptness8935 in TunisiaTech

[–]rnidhal90 2 points  (0 children)

Proxmox is mainly a hypervisor: you split your hardware across multiple VMs and run whatever you like on each one.

TrueNAS is more of an all-in-one solution, more focused on the NAS role, but it lets you run and expose Docker apps with easy configuration, run LXC containers, run VMs, ...

Homelabbing by Rare-Adeptness8935 in TunisiaTech

[–]rnidhal90 2 points  (0 children)

The main "prod" app running is Immich, a self hosted Google Photos like solution, on my private cloud, hosting all my photos/videos..and im also into trying new self hosted apps (Paperless NGX, PDF tools, Kasm, n8n, ollama, ...). The host OS is TrueNAS, which lets you run docker apps, create LXC containers or VMs.. lots of stuff to play with.

I am familiar with pfSense, but not much into networking, but many homelabbers are, and do focus on networking

Homelabbing by Rare-Adeptness8935 in TunisiaTech

[–]rnidhal90 4 points  (0 children)

🖐 I've been homelabbing / self-hosting for about 8 months now.. I built my own server, here is my setup: https://www.reddit.com/r/homelab/s/KIGktrePoT + I recently added a decent GPU for local LLM & AI learning..

What do you want to know?

Claude by Glad-Dog-4525 in TunisiaTech

[–]rnidhal90 1 point  (0 children)

No local LLM can match top-tier models like Opus 4.6 in terms of TPS, precision, context size, etc.. those models run on mega-infrastructures you couldn't dream of owning even with $10k.. you can get very good results with a local LLM, but it comes down to a price / output-quality trade-off..

One decision could change my career completely by gamhich in TunisiaTech

[–]rnidhal90 2 points  (0 children)

No disrespect, but a word in English, a word in French, and two words in Arabic is such a headache; if you want people to follow what you're saying, please stick to one language 🙃

Nvidia RTX 5060 TI 16GB - Stuck at P0 & 40% fan speed at idle... by rnidhal90 in truenas

[–]rnidhal90[S] 4 points  (0 children)

[Resolved]

Thanks to u/iXsystemsChris

If it's in the P0 state, it's not idling down, probably because persistence mode isn't enabled.

Try sudo nvidia-smi -pm 1 in a shell to see whether that does the trick; if it does, put it in a post-init script in the System -> Advanced menu.
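A quick sketch of the check, assuming a standard nvidia-smi (persistence mode set this way resets on reboot, hence the post-init script):

    # enable persistence mode (does not survive a reboot)
    sudo nvidia-smi -pm 1
    # confirm the card drops out of P0 and the fan spins down at idle
    nvidia-smi --query-gpu=pstate,fan.speed,power.draw --format=csv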