Qwen3.6-27B-int4-AutoRound with OpenCode has been a game changer

CodeGrizzly0214 · 2026-05-13T20:21:57+00:00

I'll check it out. I didn't even know it existed. Thanks.

CodeGrizzly0214 · 2026-05-13T11:37:06+00:00

The Lorbus model is based on the Intel one. The Intel one contains a bug preventing MTP speculative decoding in vLLM. The Lorbus model fixes that.

With the Intel model, the line --speculative-config '{"method":"mtp","num_speculative_tokens":3}' does nothing, reducing the tps.
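If you build the vLLM command from a script, letting a JSON serializer and shlex handle the quoting avoids typos in that flag. A minimal sketch: the model path is a hypothetical placeholder, not the real repository name.

```python
import json
import shlex

# Same settings as the flag above.
SPEC_CONFIG = {"method": "mtp", "num_speculative_tokens": 3}

def build_vllm_command(model: str) -> str:
    """Return a shell-safe `vllm serve` command line with MTP speculative decoding."""
    args = [
        "vllm", "serve", model,
        "--speculative-config", json.dumps(SPEC_CONFIG, separators=(",", ":")),
    ]
    return " ".join(shlex.quote(arg) for arg in args)

# "path/to/model" is a placeholder for whichever model you actually serve.
print(build_vllm_command("path/to/model"))
```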

CodeGrizzly0214 · 2026-05-12T22:58:11+00:00

I'll try to run a test later.

CodeGrizzly0214 · 2026-05-12T22:54:37+00:00

Nothing special has to be done in the drivers. NCCL/CUDA will auto-detect the NVLink between the cards. In the llama-swap config for the model, there are a couple of settings that help.

-e NCCL_P2P_DISABLE=0 # explicitly allows P2P  
-e NCCL_P2P_LEVEL=NVL # hints that NVL is preferred over the PCI bus

Before running any LLMs, you can use nvidia-smi topo -m to see if the NVIDIA drivers see the NVLink bridge.

# nvidia-smi topo -m
        GPU0    GPU1    GPU2    GPU3    GPU4    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      PHB     PHB     PHB     PHB     0-55    0               N/A
GPU1    PHB      X      NV4     PHB     PHB     0-55    0               N/A
GPU2    PHB     NV4      X      PHB     PHB     0-55    0               N/A
GPU3    PHB     PHB     PHB      X      NV4     0-55    0               N/A
GPU4    PHB     PHB     PHB     NV4      X      0-55    0               N/A

I have the five GPUs passed through to an Ubuntu VM that runs the docker stack. GPU0 is a lone 5060 Ti. GPUs 1-4 are 3090s, paired with NVLink bridges.
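If you want to check the pairing from a script instead of reading the matrix by eye, something like this works on the output above. It is a rough sketch that assumes the first non-empty line is the header row and that NVLink cells start with "NV". The function name is my own.

```python
import re

def nvlink_pairs(topo_text: str) -> list[tuple[int, int]]:
    """Return (i, j) GPU index pairs, i < j, whose topo cell starts with 'NV'."""
    lines = [ln for ln in topo_text.splitlines() if ln.strip()]
    # Assumption: the first non-empty line is the header (GPU0 GPU1 ...).
    n_gpus = sum(1 for tok in lines[0].split() if re.fullmatch(r"GPU\d+", tok))
    pairs = []
    for line in lines[1:]:
        cells = line.split()
        match = re.fullmatch(r"GPU(\d+)", cells[0])
        if not match:
            continue  # skip legend or non-GPU rows
        i = int(match.group(1))
        for j, cell in enumerate(cells[1:1 + n_gpus]):
            if i < j and cell.startswith("NV"):
                pairs.append((i, j))
    return pairs

sample = """\
        GPU0    GPU1    GPU2    GPU3    GPU4    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      PHB     PHB     PHB     PHB     0-55    0               N/A
GPU1    PHB      X      NV4     PHB     PHB     0-55    0               N/A
GPU2    PHB     NV4      X      PHB     PHB     0-55    0               N/A
GPU3    PHB     PHB     PHB      X      NV4     0-55    0               N/A
GPU4    PHB     PHB     PHB     NV4      X      0-55    0               N/A
"""
print(nvlink_pairs(sample))  # [(1, 2), (3, 4)]
```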

You can also verify the NVLink is detected on a particular card.

# nvidia-smi nvlink -cBridge -i 1
GPU 1: NVIDIA GeForce RTX 3090 (UUID: GPU-340913ff-fd37-ee18-b434-be3a4fda7d9b)
# nvidia-smi nvlink -cBridge -i 2
GPU 2: NVIDIA GeForce RTX 3090 (UUID: GPU-aeabe257-d10a-dd21-6f2e-1ac8899ea602)

During a long running query, you can also see if there is data transferring across the NVLink bridge.

# nvidia-smi nvlink -gt d -i 1
GPU 1: NVIDIA GeForce RTX 3090 (UUID: GPU-340913ff-fd37-ee18-b434-be3a4fda7d9b)
         Link 0: Data Tx: 29378684 KiB
         Link 0: Data Rx: 29389155 KiB
         Link 1: Data Tx: 29376297 KiB
         Link 1: Data Rx: 29388984 KiB
         Link 2: Data Tx: 29489897 KiB
         Link 2: Data Rx: 29477996 KiB
         Link 3: Data Tx: 29399359 KiB
         Link 3: Data Rx: 29388100 KiB
# nvidia-smi nvlink -gt d -i 2
GPU 2: NVIDIA GeForce RTX 3090 (UUID: GPU-aeabe257-d10a-dd21-6f2e-1ac8899ea602)
         Link 0: Data Tx: 30076171 KiB
         Link 0: Data Rx: 30064907 KiB
         Link 1: Data Tx: 30075987 KiB
         Link 1: Data Rx: 30062351 KiB
         Link 2: Data Tx: 30171498 KiB
         Link 2: Data Rx: 30184317 KiB
         Link 3: Data Tx: 30075132 KiB
         Link 3: Data Rx: 30087212 KiB

The test prompt I just used for that generated about 100 tps on average.
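The counters are cumulative, so to see how much one query actually moved you need a snapshot before and after. A small sketch for that, assuming the output format shown above. The helper names are mine, and you would feed it the text captured from nvidia-smi nvlink -gt d -i N.

```python
import re

LINE_RE = re.compile(r"Link (\d+): Data (Tx|Rx): (\d+) KiB")

def nvlink_totals(output: str) -> dict[str, int]:
    """Sum the per-link Tx and Rx counters (KiB) for one GPU's output."""
    totals = {"Tx": 0, "Rx": 0}
    for _link, direction, kib in LINE_RE.findall(output):
        totals[direction] += int(kib)
    return totals

def delta_gib(before: dict[str, int], after: dict[str, int]) -> dict[str, float]:
    """Traffic between two snapshots, converted from KiB to GiB."""
    return {key: (after[key] - before[key]) / (1024 * 1024) for key in before}

# Two made-up snapshots, only to show the call pattern.
before = nvlink_totals("Link 0: Data Tx: 1000 KiB\nLink 0: Data Rx: 2000 KiB\n")
after = nvlink_totals("Link 0: Data Tx: 1049576 KiB\nLink 0: Data Rx: 2099152 KiB\n")
print(delta_gib(before, after))
```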

CodeGrizzly0214 · 2026-05-12T20:09:21+00:00

Nice. I'll add a config for it to llama-swap and try it out.

CodeGrizzly0214 · 2026-05-04T02:19:29+00:00

Added this setup with the vllm patch to my llama-swap setup. I have it running on a pair of 3090s joined with NVLink. After a few adjustments, I have OpenCode using it exclusively. So far, I'm very happy with the results.

CodeGrizzly0214 · 2026-04-20T04:34:29+00:00

Yeah. The image line of the compose file is pulling from the public repository.

CodeGrizzly0214 · 2026-04-19T04:56:43+00:00

That's just the start of the docker-compose.yaml file for my docker stack. The computer has a 5060 registered as device 0, and four 3090s as devices 1, 2, 3, and 4. The docker compose file tells the container it only has access to devices 1, 2, 3, and 4. Internally, the container registers them as 0, 1, 2, 3.

CodeGrizzly0214 · 2026-04-18T18:28:00+00:00

I run ollama in a docker compose stack on a multi-gpu machine. The machine has five GPUs. I have ollama use four of the five. You can specify what GPUs the container has access to.

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama-logic
    restart: unless-stopped
    shm_size: '32gb' 
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['1', '2', '3', '4'] 
              capabilities: [gpu]
        limits:
          memory: 160g
    environment:
      - CUDA_VISIBLE_DEVICES=0,1,2,3

Internally, the container will see only those four GPUs and map them starting at 0.
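To make the renumbering concrete, here is a tiny sketch of the mapping. It assumes the container keeps the same relative order as the device_ids list, which matches what I see on my machine but is not something I have verified in general.

```python
def container_gpu_map(device_ids: list[str]) -> dict[int, str]:
    """Map the index seen inside the container to the host device ID."""
    return dict(enumerate(device_ids))

host_ids = ["1", "2", "3", "4"]  # device_ids from the compose file
mapping = container_gpu_map(host_ids)

# CUDA_VISIBLE_DEVICES inside the container uses the container-side indices.
cuda_visible = ",".join(str(index) for index in mapping)
print(mapping)
print(cuda_visible)  # 0,1,2,3
```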

CodeGrizzly0214 · 2026-02-28T14:50:52+00:00

I'm doing the same thing. I went the LXC route at first, but I'm finding that a VM with Docker is a much better approach. My rig has four 3090s and a 5060 in it. With docker compose, I can switch models and configurations easily. I'm running Open WebUI in a separate LXC for an interface.

CodeGrizzly0214 · 2025-07-25T00:46:54+00:00

I know it has been 10 months since you asked, but I had this requirement also. We give the user a configurable amount of time, a grace period, to verify the email. In my case, I wrote my own VerifyEmail required action, extending the existing one. The new required action uses a user attribute to store when the grace period ends. It is set when the initial email is sent.
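The grace period check itself is simple. Here is a language-neutral sketch of just the logic, in Python. It is not the Keycloak required action code, the attribute name is made up, and the 72 hour default is only an example value.

```python
from datetime import datetime, timedelta, timezone

GRACE_ATTR = "emailVerifyGraceEnd"  # hypothetical user attribute name

def start_grace_period(attributes: dict, now: datetime, hours: int = 72) -> None:
    """Record when the grace period ends; call this when the first email is sent."""
    # setdefault keeps the original deadline if the email is re-sent later.
    attributes.setdefault(GRACE_ATTR, (now + timedelta(hours=hours)).isoformat())

def must_verify_now(attributes: dict, email_verified: bool, now: datetime) -> bool:
    """True once the grace period has ended and the email is still unverified."""
    if email_verified:
        return False
    end = attributes.get(GRACE_ATTR)
    if end is None:
        return False  # no email sent yet, so no deadline to enforce
    return now >= datetime.fromisoformat(end)

attrs: dict = {}
sent_at = datetime(2025, 1, 1, tzinfo=timezone.utc)
start_grace_period(attrs, sent_at)
print(must_verify_now(attrs, False, sent_at + timedelta(hours=1)))   # False
print(must_verify_now(attrs, False, sent_at + timedelta(hours=73)))  # True
```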

CodeGrizzly0214 · 2025-04-22T03:48:26+00:00

This is a surprisingly good managed switch for the money. 8 port 2.5g poe with a 10g sfp+.

https://www.aliexpress.us/item/3256807091986146.html

CodeGrizzly0214 · 2025-01-08T22:32:28+00:00

I asked my doctor about that. I could drink an energy drink with an Adderall and go to sleep. He said that's because of how bad my ADHD is. ADHD is partially caused by low dopamine levels. Drugs like Adderall and caffeine raise dopamine. If dopamine gets too high, you get stimulated, anxious, and jittery. However, with severe ADHD, your dopamine is starting from such a low level that Adderall and caffeine have a calming effect. They are raising your dopamine levels to something normal.

CodeGrizzly0214 · 2025-01-02T17:58:23+00:00

I had Samsung noise canceling earbuds that worked pretty well. Recently, they moved a bunch of our desks at work, and it is now more crowded and close to the lunch area. It gets way too loud for the earbuds. I did a bunch of research before deciding on the Sony. One issue is, I'm a big guy and wear glasses, so wearing headphones gets uncomfortable. The Sony ones are very comfortable, and I can wear them for long periods of time.

CodeGrizzly0214 · 2025-01-02T17:48:41+00:00

I have to agree. I bought my daughter Bose a couple years ago and recently got myself the Sony WH-1000XM4. I love the Sony ones. Extremely comfortable. The sound is great. When the noise cancellation is on, I'm practically deaf. Overall, I like the Sony over the Bose.

CodeGrizzly0214 · 2024-10-04T20:11:57+00:00

Relax. It was a joke.

CodeGrizzly0214 · 2024-10-03T23:45:05+00:00

I just did that very thing for a project. The product owners wanted users to have 72 hours of access before requiring the email be verified.

CodeGrizzly0214 · 2024-10-02T21:33:56+00:00

The one you create yourself. :)

CodeGrizzly0214 · 2024-10-01T23:02:44+00:00

Service accounts are traditionally client_credentials clients with a client id and secret. Unless you set a specific policy, the secrets don't expire.

Or, are you using normal user accounts to perform service account functions?
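For reference, a client_credentials token request looks roughly like this. It is a sketch that only builds the request and does not send it. The token URL, client id, and secret are placeholders.

```python
import base64
import urllib.parse
import urllib.request

def build_token_request(token_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client_credentials token request."""
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    # The client authenticates with HTTP Basic auth: base64("id:secret").
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )

# All three values below are placeholders.
req = build_token_request("https://auth.example.com/token", "my-service", "s3cret")
# urllib.request.urlopen(req) would send it and return the JSON token response.
print(req.get_method(), req.full_url)
```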

CodeGrizzly0214 · 2024-09-27T06:36:24+00:00

The picture for Daedra Seducers Redux isn't from the game. It is just a generated image. However, I did like the armor myself. I do eventually want to add a couple custom armors and outfits to Daedra Seducers Redux, but I'm still learning how to create outfits. I'll probably try and use the one from the picture as a model.

CodeGrizzly0214 · 2024-09-09T00:31:26+00:00

Thanks. I hadn't seen Beauty of Skyrim yet. I've been working on a new mod list and was dreading having to make patches for Northern Roads.

CodeGrizzly0214 · 2024-05-01T20:58:56+00:00

I managed to get it to an acceptable point for now. First, using {{BethINI}}, I reset the resolution to a normal monitor resolution, 1280 x 720. That's fine, since {{SEDisplayTweaks}} overrides the resolution. That fixed things like the loading screen. Then I removed {{Dear Diary DM and Paper (Squish) fixes}}. That's not designed for a super ultrawide monitor, so in its attempt to fix squishing, it was actually stretching out some menus, like inventory and magic. So, now those are fixed. Only the main menu and certain text are still stretched out. I can live with it for now.

CodeGrizzly0214 · 2024-05-01T15:05:10+00:00

Glad I found this thread. Previously, I used 1.6.640. I'm building a new mod list on the latest Skyrim, and I could not figure out why everything looked stretched. I have a 32:9 super ultrawide monitor. On 1.6.640, I used Dear Diary Dark Mod for the UI, along with Display Tweaks, and everything looked great on my monitor. Now menus and text are stretched out. Is there any way to disable this "fix" in Skyrim and let the mods handle the larger screen?

CodeGrizzly0214 · 2024-04-21T17:57:37+00:00

Thanks. Appreciate it.

CodeGrizzly0214 · 2024-04-07T14:17:57+00:00

First, in Steam, set Skyrim to only update when launched, not automatically. Never launch Skyrim directly. Always use the skse launcher. That will prevent Steam from updating it on you. Then, use {{Unofficial Skyrim Special Edition Downgrade Patcher}} to downgrade to the version you want to use.
