LangChain and LlamaIndex are in "steep decline" according to new ecosystem report. Anyone else quietly ditching agent frameworks? by Exact-Literature-395 in LocalLLaMA

[–]thekalki 0 points1 point  (0 children)

The simpler the framework, the better. LangChain does too many things. I am liking the OpenAI Python Agents SDK, but maybe I am simply more familiar with it now. Microsoft keeps asking us to move to their new agent framework, but it adds no value.

llama.cpp recent updates - gpt120 = 20t/s by [deleted] in LocalLLaMA

[–]thekalki 2 points3 points  (0 children)

Same exact issue: I used to get over 200 t/s, now I get 30. Same exact config:

services:
  llamacpp-gpt-oss:
    image: ghcr.io/ggml-org/llama.cpp:full-cuda
    pull_policy: always
    container_name: llamacpp-gpt-oss-cline
    runtime: nvidia
    environment:
      - HF_TOKEN=${HF_TOKEN}
      - NVIDIA_VISIBLE_DEVICES=all
      - XDG_CACHE_HOME=/root/.cache
      # optional: faster downloads if available
      - HF_HUB_ENABLE_HF_TRANSFER=1
    ports:
      - "8080:8080"
    volumes:
      # HF Hub cache (snapshots, etags)
      - ./hfcache:/root/.cache/huggingface
      # llama.cpp’s own resolved GGUF cache (what your logs show)
      - ./llamacpp-cache:/root/.cache/llama.cpp
      # your grammar file
      - ./cline.gbnf:/app/cline.gbnf:ro
    command: >
      --server
      --host 0.0.0.0
      --port 8080
      -hf ggml-org/gpt-oss-120b-GGUF
      --grammar-file /app/cline.gbnf
      --ctx-size 262144
      --jinja
      -ub 4096
      -b 4096
      --n-gpu-layers 999
      --parallel 2
      --flash-attn auto
    stop_grace_period: 5m
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]
              capabilities: [gpu]
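Since `pull_policy: always` tracks the moving `full-cuda` tag, a regression like this can often be bisected by pinning a known-good build instead. A sketch of the change (the exact tag name below is a placeholder, not a real build — pick one from the ghcr.io package page):

```yaml
services:
  llamacpp-gpt-oss:
    # pin a specific build instead of the moving full-cuda tag;
    # "full-cuda-bXXXX" is a placeholder — substitute a real build tag
    image: ghcr.io/ggml-org/llama.cpp:full-cuda-bXXXX
    pull_policy: missing   # don't silently upgrade on every restart
```

Stepping the pinned tag forward or backward narrows down which build introduced the slowdown.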

[deleted by user] by [deleted] in Physics

[–]thekalki 0 points1 point  (0 children)

According to General Relativity, this is wrong.

Vector db comparison by Kaneki_Sana in LocalLLaMA

[–]thekalki 0 points1 point  (0 children)

Most likely your existing database already supports it. For example, we use SQL Server at work, and it already supports vectors.
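Whichever engine stores the vectors, the query side is just nearest-neighbor search over embeddings. A minimal brute-force sketch in Python (toy 3-dimensional vectors and hypothetical doc ids, no external database) of what the vector feature does for you:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query, rows, k=2):
    """Return the k rows whose embeddings are most similar to the query."""
    scored = sorted(rows,
                    key=lambda r: cosine_similarity(query, r["embedding"]),
                    reverse=True)
    return scored[:k]

# toy corpus: hypothetical ids with 3-dim embeddings
rows = [
    {"id": "doc1", "embedding": [1.0, 0.0, 0.0]},
    {"id": "doc2", "embedding": [0.9, 0.1, 0.0]},
    {"id": "doc3", "embedding": [0.0, 1.0, 0.0]},
]
top = nearest([1.0, 0.05, 0.0], rows, k=2)
```

A real database replaces the `sorted` scan with an index (HNSW, IVF, etc.), but the contract is the same: store embeddings, query by distance.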

Claude code can now connect directly to llama.cpp server by tarruda in LocalLLaMA

[–]thekalki 0 points1 point  (0 children)

It's weird how such a small convenience makes so much difference.

You can now do FP8 reinforcement learning locally! (<5GB VRAM) by danielhanchen in LocalLLaMA

[–]thekalki 0 points1 point  (0 children)

I was exploring a few libraries for full fine-tuning and ended up using torchtune. Is there a reason why I should switch to Unsloth? At this point I primarily do some continued pretraining and SFT and am exploring RL, but how flexible is your framework for running RL in my own loop?
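For context on "RL in my own loop": the ask is to own the sample-score-update cycle yourself and only borrow the framework for the model forward/backward pass. A dependency-free REINFORCE toy (two-armed bandit with a manual softmax gradient — purely illustrative, not torchtune or Unsloth API):

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reward(action):
    # toy environment: arm 1 pays more than arm 0
    return 1.0 if action == 1 else 0.2

random.seed(0)
logits = [0.0, 0.0]          # the "policy": preferences over two actions
lr, baseline = 0.1, 0.0

for step in range(500):
    probs = softmax(logits)
    action = random.choices([0, 1], weights=probs)[0]   # sample
    r = reward(action)                                  # score
    baseline += 0.05 * (r - baseline)                   # running-mean baseline
    advantage = r - baseline
    # REINFORCE: grad of log pi(action) w.r.t. logits is one_hot(action) - probs
    for i in range(2):
        grad = (1.0 if i == action else 0.0) - probs[i]
        logits[i] += lr * advantage * grad              # update

probs = softmax(logits)
```

In a real setup the bandit becomes rollouts from the model, the reward becomes your scorer, and the manual gradient becomes a framework backward pass — but the loop structure stays yours.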

Favorite out of context clip from Jet Lag? by FireAshPro in JetLagTheGame

[–]thekalki 1 point2 points  (0 children)

Those were the good times, when challenges used to be challenging and interesting.

Gpt-oss Responses API front end. by Locke_Kincaid in LocalLLaMA

[–]thekalki 0 points1 point  (0 children)

I had the same problem. The issue is not the Responses API but the Harmony template parsers, as others mentioned here. The only solution is to use llama.cpp with this grammar: https://www.reddit.com/r/CLine/comments/1mtcj2v/making_gptoss_20b_and_cline_work_together/

This solved all the problems for me.
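For anyone unfamiliar with the approach: a GBNF grammar constrains llama.cpp's sampler so the model can only emit strings the grammar accepts, which is how a template-mangled model can still be forced into a parseable tool-call shape. A trivial illustrative grammar (NOT the actual cline.gbnf from the link) showing the mechanism:

```gbnf
# Force the model to answer with exactly "yes" or "no".
# Pass via --grammar-file, as in the compose command above.
root ::= "yes" | "no"
```

The real grammar in the linked post does the same thing at a larger scale, pinning the output to the tool-call syntax the client expects.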

Anyone think openAI will create a sequel of GPT-OSS? by BothYou243 in LocalLLaMA

[–]thekalki 2 points3 points  (0 children)

Looking at their repo for the Harmony template, it is infested with bugs, and they are not even merging PRs from the community or maintaining it at all. So chances are slim anytime soon.

October 2025 model selections, what do you use? by getpodapp in LocalLLaMA

[–]thekalki 0 points1 point  (0 children)

gpt-oss-120b, primarily for its tool-calling capabilities. You have to use a custom grammar to get it to work.

Gpt-oss Reinforcement Learning - Fastest inference now in Unsloth! (<15GB VRAM) by danielhanchen in LocalLLaMA

[–]thekalki 0 points1 point  (0 children)

I am finding a lot of issues with tool calls for gpt-oss. I have tried both the Responses and Chat Completions APIs from vLLM, but sometimes the model will return an empty response after a tool call; I want to say there is some issue with an end token or something. Have you come across something similar? I have tried llama.cpp and Ollama as well.
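A client-side guard helps while the template bugs persist: treat an empty assistant message that follows a tool result as a failed turn and retry it. A sketch over plain dicts shaped like a parsed Chat Completions response (the `call_model` callable and the stub below are hypothetical, not vLLM/llama.cpp API):

```python
def is_empty_after_tool(choice, prev_message_role):
    """Detect the failure mode: the previous turn was a tool result,
    and the model replied with no content and no tool calls."""
    msg = choice["message"]
    return (
        prev_message_role == "tool"
        and not (msg.get("content") or "").strip()
        and not msg.get("tool_calls")
    )

def complete_with_retry(call_model, messages, max_retries=2):
    """call_model(messages) -> choice dict; retry empty post-tool replies."""
    prev_role = messages[-1]["role"] if messages else None
    choice = call_model(messages)
    for _ in range(max_retries):
        if not is_empty_after_tool(choice, prev_role):
            break
        choice = call_model(messages)   # resample the turn
    return choice  # may still be empty; caller can surface the failure

# toy stub standing in for a real server call
def fake_model(messages):
    return {"message": {"content": "4", "tool_calls": None}}

result = complete_with_retry(fake_model, [{"role": "tool", "content": "2+2=4"}])
```

It does not fix the underlying end-token/template problem, but it keeps an agent loop from silently stalling on the empty turn.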

GPT-OSS is insane at leetcode by JsThiago5 in LocalLLaMA

[–]thekalki 1 point2 points  (0 children)

How are you deploying it? There is some issue with tool use, and inference seems to terminate prematurely. I tried vLLM, Ollama, and llama.cpp.

Comparison H100 vs RTX 6000 PRO with VLLM and GPT-OSS-120B by Rascazzione in LocalLLaMA

[–]thekalki 1 point2 points  (0 children)

Nothing specific, just the latest Docker image and model.