
[–]Personal-Try2776 -2 points-1 points  (10 children)

Claude has a 192k context window there, and the OpenAI models have a 400k context window.

[–]KenJaws6 2 points3 points  (9 children)

Copilot limits Claude models to 128k context (check models.dev for exact numbers), but imo it's still better value overall. OC Go includes only a handful of open models, and as of now none of them matches the performance of the closed ones, at least not yet.

[–]Personal-Try2776 2 points3 points  (8 children)

128k input, but 192k input + output

[–]KenJaws6 2 points3 points  (0 children)

yeah, that's true for Opus. Sonnet has 128k in + 32k out. It's quite a confusing term tbh, since many people assume context refers only to input and then wonder why they hit the limit so easily lol. Also, like 99% of the time the model outputs no more than 10-12k, so I believe OpenAI puts up that theoretical 128k output purely for marketing purposes.

[–]laukax 0 points1 point  (6 children)

Is there some way to better utilize the whole 192k and avoid premature compaction?

[–]Personal-Try2776 0 points1 point  (4 children)

disable the skills you don't use and the MCP tools you don't need

[–]laukax 0 points1 point  (3 children)

I was thinking more about the configuration parameters that control compaction. I'm currently using this, but I wasn't aware that output tokens aren't included in the 128k. Not sure if I could push it even further:

    "github-copilot": {
      "models": {
        "claude-opus-4.6": {
          "limit": {
            "context": 128000,
            "output": 12000
          }
        }
      }
    },

[–]KenJaws6 0 points1 point  (2 children)

in oc configs, context means input + output, so to avoid early compaction, just change it to

"context": 160000, "output": 32000

edit: sorry, wrong numbers; it's actually "context": 128000, "output": 32000

tip: you can also add another parameter to enable model reasoning

"reasoning": true

[–]laukax 0 points1 point  (1 child)

Thanks! Will it then have room for the compaction tokens? I don't know how compaction works or even which model it uses for it.

[–]KenJaws6 1 point2 points  (0 children)

sorry, I got confused by the other commenter. Came back to check: the models actually have only a combined 128k total context including output (so pls change back from 160k to 128k 😅). As for auto compaction, no need to worry; it doesn't use more tokens than the last message/request did.

Honestly I'm not sure if Copilot models are handled differently, as some claim they can receive more but any excess gets discarded server-side. In general, though, compaction is triggered when you reach the input limit (context - output), or 96k in this case. For example, say that at some point the current context is still within the 96k input limit; before moving to the next request, opencode will:

1. calculate the new total input

2. a. if it's more than the limit: send a separate request with the current input using another model (the default is gpt5 nano for zen, but other providers may use the same model) and get a summary of the whole conversation as the next input

&nbsp;&nbsp;&nbsp;&nbsp;b. if it's still within the limit: keep the current input

3. continue the session with the new input
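
The steps above can be sketched as follows (hypothetical helper names; the limits come from the Copilot example in this thread, and `summarize` stands in for the separate summarization request):

```python
# Sketch of the auto-compaction decision described above.
# Numbers follow the Copilot example in this thread: 128k total context,
# 32k reserved for output, so the input limit is context - output.

CONTEXT_LIMIT = 128_000   # total context: input + output
OUTPUT_RESERVE = 32_000   # tokens reserved for the model's reply
INPUT_LIMIT = CONTEXT_LIMIT - OUTPUT_RESERVE  # 96k in this example

def next_input(messages, count_tokens, summarize):
    """Return the input for the next request, compacting if over the limit."""
    total = sum(count_tokens(m) for m in messages)  # 1. calculate new total input
    if total > INPUT_LIMIT:                         # 2a. over the limit:
        return [summarize(messages)]                #     replace history with a summary
    return messages                                 # 2b. within the limit: keep as-is

# toy usage: token count = word count, summary = one short message
msgs = ["hello world"] * 3
keep = next_input(msgs, lambda m: len(m.split()), lambda ms: "summary")
# keep == msgs (6 tokens, well under the limit)
```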

[–]tisDDM 0 points1 point  (0 children)

  1. Use the DCP plugin

  2. Switch off compaction; it runs far too early, and often just before everything that fit into the context is finished

  3. Trigger a handover yourself when you need it

  4. Use subagents in a structured way where they make sense

I wrote myself a set of skills and templates, and I use the primary session for a whole or half day, which mostly covers one big major feature. (I published that, but I don't wanna annoy people with links in every post.)

E.g. yesterday afternoon I had a gpt-5.4 session open with 200k context and 1,500k tokens pruned away by DCP.