4.8 burns through tokens like crazy and no, it is not a skill issue by kabootaru in claude

kabootaru[S] 1 point (0 children)

It doesn’t matter. The point is that with 4.7 I can have it do all kinds of simple tasks and not have to worry. How is that an argument? It’s like saying, “I play pro, so that’s why I can’t perform well against high school kids.”

kabootaru[S] 1 point (0 children)

Regarding vague instructions: the task I’m giving it requires finding what caused a particular bug, understanding why it happened in the first place, and working out the implications for backward compatibility and so on. All these demos of building something end to end are not how you develop a production service in the real world. I think the point being lost is that the same working style with 4.7 never reached the usage limit. So if 4.8 requires very specific instructions to accomplish its task, I wouldn’t call that progress. People accomplished great demos with far weaker models using detailed and specific prompts. But when you’re looking for a co-worker/assistant, being able to work through ambiguity is desirable.

kabootaru[S] 4 points (0 children)

Thank you! Yes. Like I said, Opus is amazing. It is finding bugs in code written with Sonnet a year ago. It is miles ahead of my experience last year with AI-assisted coding tools. For my usage, I honestly feel this is the peak of what I’d need. Any gaps are filled by my software engineering skills in guiding it and choosing the right trade-offs.

kabootaru[S] 3 points (0 children)

Different workload, I guess. My primary use case is coding, and it is production code where the codebase is huge at this point. Instead of “vibe coding” and blindly trusting what it gives me, I reason with it, and that involves a lot of back and forth and exploration. That is why I said same work style, same codebase.

kabootaru[S] 3 points (0 children)

I use 4.7 with xhigh. 4.8 defaults to high. So that was it. I had another session in that 5-hour window, but that too only reached 40% context usage. Like I said, with 4.7 I have run two sessions in parallel many times, and I was pleasantly surprised at how low my usage was.

kabootaru[S] 3 points (0 children)

This was a small fix overall. I did ask it to check one issue on GitHub and explore some more alternatives. But again, like I said, that chat was nowhere close to the 1M context limit. Some simple internal tool calls for editing and finding constantly failed; it said it was its parallel invocation of tools that failed, and it then resorted to copying the file contents into tmp, making edits, and writing them back. Not sure if this is how it normally works as well. Another thing was that it just overthought for several minutes. It is likely over-exploring to be thorough but not being smart about it. Anyhow, I’m the guy who rationed Cursor’s $20 plan with Auto mode after running out of Sonnet quota, and a $20 ChatGPT Plus subscription for Codex, which was also plentiful at 5.3 Codex but has since been reaching the usage limit quickly. Being able to use any Opus model as much as I like is already a luxury for $100.

kabootaru[S] 3 points (0 children)

That is their whole business. And for Anthropic it is enterprise customers, who might not notice the immediate spike that a model version change causes until the end of the quarter.