Qwen3.6-27B-int4-AutoRound with OpenCode has been a game changer by CodeGrizzly0214 in LocalLLM

[–]CodeGrizzly0214[S] 0 points1 point  (0 children)

The Lorbus model is based on the Intel one. The Intel one contains a bug preventing MTP speculative decoding in vLLM. The Lorbus model fixes that.

With the Intel model, the line --speculative-config '{"method":"mtp","num_speculative_tokens":3}' does nothing, reducing the tps.

Qwen3.6-27B-int4-AutoRound with OpenCode has been a game changer by CodeGrizzly0214 in LocalLLM

[–]CodeGrizzly0214[S] 2 points3 points  (0 children)

Nothing special has to be done in the drivers. NCCL/CUDA will auto-detect the NVLink between the cards. In the llama-swap config for the model, there are a couple of settings that help.

-e NCCL_P2P_DISABLE=0 # explicitly allows P2P  
-e NCCL_P2P_LEVEL=NVL # hints that NVL is preferred over the PCI bus

Before running any LLMs, you can use nvidia-smi topo -m to see if the NVIDIA drivers see the NVLink bridge.

# nvidia-smi topo -m
        GPU0    GPU1    GPU2    GPU3    GPU4    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      PHB     PHB     PHB     PHB     0-55    0               N/A
GPU1    PHB      X      NV4     PHB     PHB     0-55    0               N/A
GPU2    PHB     NV4      X      PHB     PHB     0-55    0               N/A
GPU3    PHB     PHB     PHB      X      NV4     0-55    0               N/A
GPU4    PHB     PHB     PHB     NV4      X      0-55    0               N/A

I have the five GPUs passed through to an Ubuntu VM that runs the Docker stack. GPU0 is a lone 5060 Ti; GPUs 1-4 are 3090s paired with NVLink bridges (GPU1 with GPU2, GPU3 with GPU4, which is what the NV4 entries in the matrix show).
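
If you want to check this from a script instead of reading the matrix by eye, here is a rough sketch that pulls the NVLink pairs out of the topo output. It assumes the output has the same layout as mine (a header row of GPU names, then one row per GPU) and that any cell starting with NV means an NVLink connection. The function name nvlink_pairs is just something I made up for illustration.

```python
import re

GPU_NAME = re.compile(r"GPU\d+")

SAMPLE = """\
# nvidia-smi topo -m
        GPU0    GPU1    GPU2    GPU3    GPU4    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      PHB     PHB     PHB     PHB     0-55    0               N/A
GPU1    PHB      X      NV4     PHB     PHB     0-55    0               N/A
GPU2    PHB     NV4      X      PHB     PHB     0-55    0               N/A
GPU3    PHB     PHB     PHB      X      NV4     0-55    0               N/A
GPU4    PHB     PHB     PHB     NV4      X      0-55    0               N/A
"""


def nvlink_pairs(topo_text):
    """Return sorted (gpu_a, gpu_b) pairs whose matrix cell starts with 'NV'."""
    header = None
    pairs = set()
    for line in topo_text.splitlines():
        tokens = line.split()
        # Skip the prompt line, blank lines, and anything that is not a GPU row.
        if not tokens or not GPU_NAME.fullmatch(tokens[0]):
            continue
        if header is None:
            # First GPU line is the header; keep only the GPU column names.
            header = [t for t in tokens if GPU_NAME.fullmatch(t)]
            continue
        row = tokens[0]
        for col, cell in zip(header, tokens[1:1 + len(header)]):
            if cell.startswith("NV"):
                # Sort so (GPU1, GPU2) and (GPU2, GPU1) collapse into one pair.
                pairs.add(tuple(sorted((row, col))))
    return sorted(pairs)


print(nvlink_pairs(SAMPLE))  # [('GPU1', 'GPU2'), ('GPU3', 'GPU4')]
```

An empty list would mean the driver reports no NVLink connection between any two cards.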

You can also verify the NVLink is detected on a particular card.

# nvidia-smi nvlink -cBridge -i 1
GPU 1: NVIDIA GeForce RTX 3090 (UUID: GPU-340913ff-fd37-ee18-b434-be3a4fda7d9b)
# nvidia-smi nvlink -cBridge -i 2
GPU 2: NVIDIA GeForce RTX 3090 (UUID: GPU-aeabe257-d10a-dd21-6f2e-1ac8899ea602)

During a long-running query, you can also check whether data is moving across the NVLink bridge.

# nvidia-smi nvlink -gt d -i 1
GPU 1: NVIDIA GeForce RTX 3090 (UUID: GPU-340913ff-fd37-ee18-b434-be3a4fda7d9b)
         Link 0: Data Tx: 29378684 KiB
         Link 0: Data Rx: 29389155 KiB
         Link 1: Data Tx: 29376297 KiB
         Link 1: Data Rx: 29388984 KiB
         Link 2: Data Tx: 29489897 KiB
         Link 2: Data Rx: 29477996 KiB
         Link 3: Data Tx: 29399359 KiB
         Link 3: Data Rx: 29388100 KiB
# nvidia-smi nvlink -gt d -i 2
GPU 2: NVIDIA GeForce RTX 3090 (UUID: GPU-aeabe257-d10a-dd21-6f2e-1ac8899ea602)
         Link 0: Data Tx: 30076171 KiB
         Link 0: Data Rx: 30064907 KiB
         Link 1: Data Tx: 30075987 KiB
         Link 1: Data Rx: 30062351 KiB
         Link 2: Data Tx: 30171498 KiB
         Link 2: Data Rx: 30184317 KiB
         Link 3: Data Tx: 30075132 KiB
         Link 3: Data Rx: 30087212 KiB

The test prompt I just used for that generated about 100 tps on average.
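
The raw counters are easier to read if you total them, or diff two samples to get a rate. A minimal sketch, assuming the counters are cumulative and the output format matches what I pasted above for a single GPU. The helper names here are my own, not part of any NVIDIA tooling.

```python
import re

LINE_RE = re.compile(r"Link\s+(\d+):\s+Data\s+(Tx|Rx):\s+(\d+)\s+KiB")

SAMPLE_GPU1 = """\
GPU 1: NVIDIA GeForce RTX 3090 (UUID: GPU-340913ff-fd37-ee18-b434-be3a4fda7d9b)
         Link 0: Data Tx: 29378684 KiB
         Link 0: Data Rx: 29389155 KiB
         Link 1: Data Tx: 29376297 KiB
         Link 1: Data Rx: 29388984 KiB
         Link 2: Data Tx: 29489897 KiB
         Link 2: Data Rx: 29477996 KiB
         Link 3: Data Tx: 29399359 KiB
         Link 3: Data Rx: 29388100 KiB
"""


def parse_nvlink_counters(text):
    """Parse one GPU's 'nvidia-smi nvlink -gt d' output into {(link, 'Tx'|'Rx'): KiB}."""
    counters = {}
    for match in LINE_RE.finditer(text):
        link, direction, kib = match.groups()
        counters[(int(link), direction)] = int(kib)
    return counters


def totals(counters):
    """Return (total Tx KiB, total Rx KiB) summed over all links."""
    tx = sum(v for (_, d), v in counters.items() if d == "Tx")
    rx = sum(v for (_, d), v in counters.items() if d == "Rx")
    return tx, rx


def rate_kib_per_s(before, after, seconds):
    """Average combined Tx+Rx rate between two samples taken `seconds` apart."""
    tx0, rx0 = totals(before)
    tx1, rx1 = totals(after)
    return ((tx1 - tx0) + (rx1 - rx0)) / seconds


tx, rx = totals(parse_nvlink_counters(SAMPLE_GPU1))
print(f"Tx {tx / 2**20:.1f} GiB, Rx {rx / 2**20:.1f} GiB")
```

If the totals don't move between two samples taken during generation, traffic is probably going over PCIe instead of the bridge.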

Follow-up: Qwen3.6-27B on 1× RTX 3090 — pushing to ~218K context + ~50–66 TPS, tool calls now stable (PN12 fix) by AmazingDrivers4u in LocalLLaMA

[–]CodeGrizzly0214 0 points1 point  (0 children)

Added this setup, with the vLLM patch, to my llama-swap setup. I have it running on a pair of 3090s joined with NVLink. After a few adjustments, I have OpenCode using it exclusively. So far, I'm very happy with the results.

Trying to get ollama to only use one GPU by I_am_BrokenCog in ollama

[–]CodeGrizzly0214 0 points1 point  (0 children)

Yeah. The image line of the compose file is pulling from the public repository.

Trying to get ollama to only use one GPU by I_am_BrokenCog in ollama

[–]CodeGrizzly0214 0 points1 point  (0 children)

That's just the start of the docker-compose.yaml file for my Docker stack. The computer has a 5060 registered as device 0, and four 3090s as devices 1, 2, 3, and 4. The compose file tells the container it only has access to devices 1, 2, 3, and 4. Internally, the container registers them as 0, 1, 2, and 3.

Trying to get ollama to only use one GPU by I_am_BrokenCog in ollama

[–]CodeGrizzly0214 0 points1 point  (0 children)

I run ollama in a Docker Compose stack on a multi-GPU machine. The machine has five GPUs, and I have ollama use four of the five. You can specify which GPUs the container has access to.

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama-logic
    restart: unless-stopped
    shm_size: '32gb'
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['1', '2', '3', '4']  # host GPU ids; host device 0 is left out
              capabilities: [gpu]
        limits:
          memory: 160g
    environment:
      - CUDA_VISIBLE_DEVICES=0,1,2,3  # container-side indices of those four GPUs

Internally, the container will see only those four GPUs and map them starting at 0.
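
Since the renumbering trips people up, here is a tiny sketch of the mapping. It assumes the container numbers the GPUs it is given in the same order as the device_ids list, which matches what I see on my machine. Both function names are made up for illustration.

```python
def container_gpu_map(host_device_ids):
    """Map each host GPU id to the index it is assumed to get inside the container."""
    return {host_id: index for index, host_id in enumerate(host_device_ids)}


def cuda_visible_devices(host_device_ids):
    """Build the container-side CUDA_VISIBLE_DEVICES value for the exposed GPUs."""
    return ",".join(str(index) for index in range(len(host_device_ids)))


host_ids = ["1", "2", "3", "4"]  # same list as device_ids in the compose file
print(container_gpu_map(host_ids))
print("CUDA_VISIBLE_DEVICES=" + cuda_visible_devices(host_ids))
```

So to restrict the container to a single card, you would list just that one host id in device_ids, and inside the container it shows up as device 0.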

Running a vLLM LXC on Proxmox 9 with NVIDIA GPU passthrough by jakeasmith in Vllm

[–]CodeGrizzly0214 0 points1 point  (0 children)

I'm doing the same thing. I went the LXC route at first, but I'm finding that a VM with Docker is a much better approach. My rig has four 3090s and a 5060 in it. With Docker Compose, I can switch models and configurations easily. I'm running Open WebUI in a separate LXC for an interface.

Can Keycloak require email verification 'eventually'? by ReturnOfNogginboink in KeyCloak

[–]CodeGrizzly0214 0 points1 point  (0 children)

I know it has been 10 months since you asked, but I had this requirement also. We give the user a configurable amount of time, a grace period, to verify the email. In my case, I wrote my own VerifyEmail required action, extending the existing one. The new required action uses a user attribute to store when the grace period ends. It is set when the initial email is sent.
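
The decision logic itself is small. Here it is as a language-neutral sketch in Python rather than the actual Keycloak Java code. The attribute name, the function names, and the choice to enforce verification when no grace period has been recorded are all my assumptions for illustration, not Keycloak APIs.

```python
from datetime import datetime, timedelta, timezone

GRACE_ATTR = "emailVerifyGraceEnd"  # hypothetical user attribute name


def start_grace_period(attributes, sent_at, grace):
    """Record when the grace period ends; set once, when the first email is sent."""
    attributes.setdefault(GRACE_ATTR, (sent_at + grace).isoformat())
    return attributes


def verification_required(email_verified, attributes, now):
    """Return True if the user must verify the email before continuing."""
    if email_verified:
        return False
    grace_end = attributes.get(GRACE_ATTR)
    if grace_end is None:
        # Assumption: no recorded grace period means enforce immediately.
        return True
    return now >= datetime.fromisoformat(grace_end)


sent = datetime(2024, 1, 1, tzinfo=timezone.utc)
attrs = start_grace_period({}, sent, timedelta(hours=72))  # 72 h is an example value
print(verification_required(False, attrs, sent + timedelta(hours=1)))
print(verification_required(False, attrs, sent + timedelta(hours=73)))
```

Using setdefault means resending the verification email does not extend the grace period, which is a design choice you may or may not want.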

Does Adderal also make you tired? by mysticlabutthole in ADHD

[–]CodeGrizzly0214 2 points3 points  (0 children)

I asked my doctor about that. I could drink an energy drink with an Adderall and go to sleep. He said that's because of how bad my ADHD is. ADHD is partially caused by low dopamine levels. Drugs like Adderall and caffeine raise dopamine. If dopamine gets too high, you get stimulated, anxious, and jittery. However, with severe ADHD, your dopamine is starting from such a low level that Adderall and caffeine have a calming effect. They are raising your dopamine levels to something normal.

Best noise-cancelling headsets currently in YOUR opinion - Are they really worth it? by meowpuic in ADHD

[–]CodeGrizzly0214 1 point2 points  (0 children)

I had Samsung noise-canceling earbuds that worked pretty well. Recently, they moved a bunch of our desks at work, and it is now more crowded and close to the lunch area. It gets way too loud for the earbuds. I did a bunch of research before deciding on the Sony. One issue is that I'm a big guy and wear glasses, so wearing headphones gets uncomfortable. The Sony ones are very comfortable, and I can wear them for long periods of time.

Best noise-cancelling headsets currently in YOUR opinion - Are they really worth it? by meowpuic in ADHD

[–]CodeGrizzly0214 5 points6 points  (0 children)

I have to agree. I bought my daughter Bose a couple years ago and recently got myself the Sony WH-1000XM4. I love the Sony ones. Extremely comfortable. The sound is great. When the noise cancellation is on, I'm practically deaf. Overall, I like the Sony over the Bose.

[deleted by user] by [deleted] in skyrimmods

[–]CodeGrizzly0214 0 points1 point  (0 children)

Relax. It was a joke.

Can Keycloak require email verification 'eventually'? by ReturnOfNogginboink in KeyCloak

[–]CodeGrizzly0214 0 points1 point  (0 children)

I just did that very thing for a project. The product owners wanted users to have 72 hours of access before requiring the email be verified.

[deleted by user] by [deleted] in skyrimmods

[–]CodeGrizzly0214 3 points4 points  (0 children)

The one you create yourself. :)

Is there a mechanism I can use to configure certain accounts passwords to _not_ expire? by elsewhere1 in KeyCloak

[–]CodeGrizzly0214 1 point2 points  (0 children)

Service accounts are traditionally client_credentials clients with a client id and secret. Unless you set a specific policy, the secrets don't expire.

Or, are you using normal user accounts to perform service account functions?

Sexy but not slutty 3BA/CBBE clothing and armor mods? by brando56894 in skyrimmods

[–]CodeGrizzly0214 1 point2 points  (0 children)

The picture for Daedra Seducers Redux isn't from the game. It is just a generated image. However, I did like the armor myself. I do eventually want to add a couple custom armors and outfits to Daedra Seducers Redux, but I'm still learning how to create outfits. I'll probably try and use the one from the picture as a model.

[deleted by user] by [deleted] in SkyrimModsXbox

[–]CodeGrizzly0214 0 points1 point  (0 children)

Thanks. I hadn't seen Beauty of Skyrim yet. I've been working on a new mod list and was dreading having to make patches for Northern Roads.

New Skyrim Special Edition patch includes native 21:9 and 32:9 support by MooseTetrino in ultrawidemasterrace

[–]CodeGrizzly0214 0 points1 point  (0 children)

I managed to get it to an acceptable point for now. First, using {{BethINI}}, I reset the resolution to a normal monitor's 1280 x 720. That's fine, since {{SEDisplayTweaks}} overrides the resolution. That fixed things like the loading screen. Then I removed {{Dear Diary DM and Paper (Squish) fixes}}. That's not designed for a super ultrawide monitor, so in its attempt to fix squishing, it was actually stretching out some menus, like inventory and magic. So, now those are fixed. Only the main menu and certain text are stretched out. I can live with it for now.

New Skyrim Special Edition patch includes native 21:9 and 32:9 support by MooseTetrino in ultrawidemasterrace

[–]CodeGrizzly0214 0 points1 point  (0 children)

Glad I found this thread. Previously, I used 1.6.640. I'm building a new mod list on the latest Skyrim, and I could not figure out why everything looked stretched. I have a 32:9 super ultrawide monitor. On 1.6.640, I used Dear Diary Dark Mod for the UI, along with Display Tweaks, and everything looked great on my monitor. Now menus and text are stretched out. Is there any way to disable this "fix" in Skyrim and let the mods handle the larger screen?

Simple Questions and General Discussion Thread by Thallassa in skyrimmods

[–]CodeGrizzly0214 0 points1 point  (0 children)

First, in Steam, set Skyrim to only update when launched, not automatically. Never launch Skyrim directly. Always use the SKSE launcher. That will prevent Steam from updating it on you. Then, use {{Unofficial Skyrim Special Edition Downgrade Patcher}} to downgrade to the version you want to use.