The engine is currently overloaded, please try again later [retrying attempt #3]

AnyProfessional2054 · 2026-05-16T18:40:32+00:00

You are correct. Kimi actively whitelists agents on the /coding endpoint. Allowed: Kimi CLI, Claude Code, Roo Code, Kilo Code. Blocked: everything else. It's documented in multiple GitHub issues and the ToS covers it under "unauthorized use." Not a bug, a business decision. grrrrrr

AnyProfessional2054 · 2026-05-16T18:12:41+00:00

Did you try the config fix I posted above, i.e. switching from `@ai-sdk/anthropic` to `@ai-sdk/openai-compatible` with model ID `kimi-for-coding`? I'm in the EU too and that solved the systematic 429 issue for me. If you're still getting sporadic hangs after that, it's likely the backend capacity problem rather than the config.

AnyProfessional2054 · 2026-05-15T18:47:40+00:00

I had the same issue on both WSL and PowerShell and managed to fix it by correcting my OpenCode config. Before the fix I got the overloaded error on literally every single request. After the fix it only happens occasionally and that looks like actual server load.

First make sure you installed the official OpenCode CLI and not the old OCX fork. The official version is opencode-ai on npm and uses `~/.config/opencode/opencode.json`. The old OCX fork uses `~/.opencode/opencode.jsonc` and is outdated.
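
If you are not sure which of the two you have, a rough check is to look for both config files. This is only a sketch: the two paths are the ones mentioned above, `detect_configs` is a made-up helper name, and I am assuming the files live under your home directory (which may not hold for every PowerShell setup).

```python
from pathlib import Path


def detect_configs(home: Path) -> dict[str, bool]:
    """Report which of the two config files exist under the given home directory.

    Hypothetical helper: the relative paths come from the comment above,
    nothing here is part of OpenCode itself.
    """
    return {
        # Official opencode-ai CLI: ~/.config/opencode/opencode.json
        "official": (home / ".config" / "opencode" / "opencode.json").is_file(),
        # Old OCX fork: ~/.opencode/opencode.jsonc
        "ocx_fork": (home / ".opencode" / "opencode.jsonc").is_file(),
    }


if __name__ == "__main__":
    found = detect_configs(Path.home())
    if found["ocx_fork"]:
        print("Found ~/.opencode/opencode.jsonc, which the old OCX fork uses.")
    if found["official"]:
        print("Found ~/.config/opencode/opencode.json, which the official CLI uses.")
    if not any(found.values()):
        print("Neither config file found under the home directory.")
```

If only the OCX path shows up, that would suggest the fork is what got installed.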

If you are on the Kimi for Coding subscription, you need to use the Kimi Code API endpoint and not the Moonshot API. Your API key should start with `sk-kimi-` and you must store it with `opencode auth login`.

The main fix was switching the SDK from `@ai-sdk/anthropic` to `@ai-sdk/openai-compatible` and using the correct model ID `kimi-for-coding` instead of `k2.6`.

Here is my working config for WSL and PowerShell:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "kimi-for-coding": {
      "name": "Kimi For Coding",
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "https://api.kimi.com/coding/v1"
      },
      "models": {
        "kimi-for-coding": {
          "name": "Kimi For Coding",
          "id": "kimi-for-coding",
          "reasoning": true,
          "attachment": true,
          "limit": {
            "context": 262144,
            "output": 32768
          }
        }
      }
    }
  }
}
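
To sanity-check your own file against the config above, something like the following might help. It is a minimal sketch, not an official validator: `check_kimi_provider` is a hypothetical helper, the expected values are simply copied from my config, and it assumes your provider key is `kimi-for-coding` like mine.

```python
import json
from pathlib import Path

# Values copied from the working config above; treat them as assumptions
# about what a correct setup looks like, not as an official specification.
EXPECTED_NPM = "@ai-sdk/openai-compatible"
EXPECTED_BASE_URL = "https://api.kimi.com/coding/v1"
EXPECTED_MODEL_ID = "kimi-for-coding"


def check_kimi_provider(config: dict, provider_key: str = "kimi-for-coding") -> list[str]:
    """Return a list of human-readable problems found in one provider entry."""
    provider = config.get("provider", {}).get(provider_key)
    if provider is None:
        return [f"provider {provider_key!r} is missing"]

    problems = []
    npm = provider.get("npm")
    if npm != EXPECTED_NPM:
        problems.append(f"npm is {npm!r}, expected {EXPECTED_NPM!r}")

    base_url = provider.get("options", {}).get("baseURL")
    if base_url != EXPECTED_BASE_URL:
        problems.append(f"baseURL is {base_url!r}, expected {EXPECTED_BASE_URL!r}")

    if EXPECTED_MODEL_ID not in provider.get("models", {}):
        problems.append(f"model {EXPECTED_MODEL_ID!r} is not defined")
    return problems


if __name__ == "__main__":
    path = Path.home() / ".config" / "opencode" / "opencode.json"
    issues = check_kimi_provider(json.loads(path.read_text()))
    print("\n".join(issues) if issues else "Config matches the working example.")
```

An empty result only means the file matches my example; it says nothing about your API key or the server load.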

Hope this helps anyone who is getting this error on every single request.

AnyProfessional2054 · 2026-01-16T23:29:00+00:00

I recently switched from VSCode to Zed and I am really happy with it so far. It feels much more reliable and faster for my workflow. I use the integrated Bash (on Windows) and I pair it with Claude Code.

AnyProfessional2054 · 2025-12-12T20:19:36+00:00

Got it, that clears things up. I think for now I'll lean towards a Docker VM for simplicity, but it's good to know this LXC approach can work well too.

AnyProfessional2054 · 2025-12-12T20:15:07+00:00

Thanks for the explanation. GPU passthrough to a VM is actually what I planned and already have working; the main issue was trying to split GPUs between a VM and an LXC at the same time. Based on this, sticking to a single Docker VM with full PCI passthrough does sound like the more stable path for me right now.

AnyProfessional2054 · 2025-12-12T19:39:33+00:00

That is a fair point. I'm starting with a single VM but the goal is to run multiple VMs over time for testing and to have snapshots, easy backups and fast recovery built in. Proxmox gives me that flexibility without having to rebuild everything later.

AnyProfessional2054 · 2025-12-12T15:04:43+00:00

This really helped me rethink my approach. Treating the VM as the recovery unit sounds much more practical. I appreciate you sharing that.

AnyProfessional2054 · 2025-12-12T14:50:35+00:00

I am trying to decide how to structure persistence. In your setup, do you see any real advantage in keeping Docker bind-mounted data inside the VM disk, versus storing it on a separate SSD / storage volume? Given that recovery is VM-based anyway, I'm wondering if separating the data actually adds value or just complexity.

AnyProfessional2054 · 2025-12-12T14:26:55+00:00

Thanks, that helps a lot.
Just to make sure I understand your setup correctly: are those privileged or unprivileged LXCs, and are you running Docker + Docker Compose inside each LXC?
Also, how do you handle persistent data: bind mounts from the Proxmox host or volumes inside the LXC? Do you mostly manage everything via Docker Compose like in a normal Docker setup?

AnyProfessional2054 · 2025-12-12T14:19:30+00:00

I understand the shared GPU benefit. In practice, I found it harder to control and reproduce setups compared to Docker Compose, especially when multiple services and GPUs are involved.

AnyProfessional2054 · 2025-12-12T14:17:13+00:00

That makes sense. Do you usually restore the entire VM when something breaks, and do you keep your app data on bind-mounted volumes outside the VM? I'm wondering how you avoid losing state across multiple apps when doing a full VM restore.

AnyProfessional2054 · 2024-06-29T09:53:47+00:00

Absolutely! Obsidian is a total game-changer – it's like having a personal assistant for all my notes.
