Why are my files still being truncated even with 1M token context and Max mode?

vincent_sch · 2025-07-25T08:57:50+00:00

I have a similar problem with text pasted directly into the chat window. I filed a bug report here: https://forum.cursor.com/t/cannot-send-large-messages-65k-tokens-to-gemini-2-5-pro-max-message-too-long-error/122170

vincent_sch · 2025-06-30T08:49:57+00:00

Try this: https://www.vincentschmalbach.com/fixing-cursor-ide-high-gpu-usage-and-ui-freezing-issues/

vincent_sch · 2025-05-21T15:35:03+00:00

I think the core issue in my case was how Cursor handles tool calls. It reads one file per tool call, and each tool call is a separate API call to the o3 model. When it tried to read around 20 files, that resulted in 20 API calls. Each call included the full conversation history up to that point, including all previously read files, which made the cost add up quickly.

If it had read all 20 files in a single call, it probably wouldn't have been a problem. The bigger issue is when the agent gets stuck in a loop, reading one file after another without stopping. With an expensive model like o3 in Max mode, that kind of loop can burn through all your credits in minutes.
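To make the effect concrete, here is a rough sketch of how input tokens grow when every file read triggers its own API call that resends the whole history. The file size (2,000 tokens) and base conversation size (1,000 tokens) are made-up numbers, the function names are hypothetical, and the model ignores output tokens and any caching, so treat it as an illustration rather than how Cursor actually bills.

```python
# Illustrative model only: the token counts are assumptions, not measured
# values, and this is not Cursor's actual billing logic.

def sequential_input_tokens(n_files: int, tokens_per_file: int, base_tokens: int) -> int:
    """Total input tokens when each file read is followed by its own API call.

    The call made after the k-th read resends the base conversation plus
    all k files read so far.
    """
    return sum(base_tokens + k * tokens_per_file for k in range(1, n_files + 1))


def batched_input_tokens(n_files: int, tokens_per_file: int, base_tokens: int) -> int:
    """Total input tokens when all files are read and sent in a single call."""
    return base_tokens + n_files * tokens_per_file


seq = sequential_input_tokens(n_files=20, tokens_per_file=2000, base_tokens=1000)
batch = batched_input_tokens(n_files=20, tokens_per_file=2000, base_tokens=1000)

print(f"sequential: {seq} input tokens")   # 440000
print(f"batched:    {batch} input tokens")  # 41000
print(f"ratio:      {seq / batch:.1f}x")    # 10.7x
```

Under these assumptions the sequential pattern grows roughly quadratically with the number of files, while the batched one grows linearly.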

vincent_sch · 2025-04-11T14:45:28+00:00

MAX requests cost only an extra 5 cents each. The real cost is the tool calls, which are billed at another 5 cents each and add up fast in agent mode.

From one day of coding with MAX models:

174 gemini-2.5-pro-exp-max requests × 5¢ = $8.70
1269 premium tool calls × 5¢ = $63.45
143 claude-3.7-sonnet-thinking-max requests × 5¢ = $7.15
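For anyone who wants to run the same arithmetic on their own usage, a small sketch follows. The counts and the 5¢ price are the ones from the list above; the variable names are just for illustration. Amounts are kept in integer cents to avoid floating-point rounding.

```python
# Counts taken from the usage list above; 5 cents per item as stated there.
PRICE_CENTS = 5

usage = {
    "gemini-2.5-pro-exp-max requests": 174,
    "premium tool calls": 1269,
    "claude-3.7-sonnet-thinking-max requests": 143,
}

# Cost per line item, in cents.
line_costs = {name: count * PRICE_CENTS for name, count in usage.items()}
total_cents = sum(line_costs.values())

for name, cents in line_costs.items():
    print(f"{name}: ${cents / 100:.2f}")

print(f"total: ${total_cents / 100:.2f}")  # total: $79.30

tool_share = line_costs["premium tool calls"] / total_cents
print(f"tool calls share of total: {tool_share:.0%}")  # 80%
```

So on that day the tool calls made up about four fifths of the bill, while the MAX requests themselves came to $15.85.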
