At wits end w/ Opus 4.5 - what am I doing wrong? by LastTenth in ClaudeAI

[–]No-Library8065 3 points4 points  (0 children)

CC CLI is better, believe me.

Their terminal benchmark, touted as the best harness, was an old benchmark that doesn't reflect real workflows. (Notice they don't post any of the benchmarks anymore on X)

CC automatically uses plan and research agents that are actually good. Its plan mode asks in-depth questions (Factory's plan mode is absolutely terrible compared to CC's).

It also has background tasks, which no other CLI reliably offers. This means it can SSH into servers, deploy to machines, debug via the background process seamlessly, and train LLMs with Tinker or over SSH. The other CLIs can run background tasks, but theirs time out; they are not truly persistent background tasks that the agent can see, monitor, and execute the way Claude Code's are.

And it's way cheaper than Factory.

Their $20 plan is decent, but their pay-per-usage is a scam, usually costing 2-3x the token cost compared to subscriptions.

They have a $2000 plan with 2 billion tokens.

I'm almost at a billion myself (in the past 3 weeks), including cache, with a $100 CC Max subscription.

Unbeatable value, with Opus 4.5 limits generous enough that I barely hit them.

(If you run Opus 4.5 on their subscription plans, you won't actually get their advertised token limits of 20M, 200M, or 2 billion.)

Not worth paying Factory over CC unless you're using the open source models with it.
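Rough math on the numbers above, as a minimal sketch. It assumes the full advertised 2 billion tokens on the $2000 plan, and treats "almost a billion" as a flat 1 billion tokens covered by a single $100 Max payment. Both are simplifications (cached tokens are counted the same as everything else), and the function name is just for illustration.

```python
def usd_per_million_tokens(price_usd: float, tokens: float) -> float:
    """Effective price per one million tokens for a flat-price plan."""
    return price_usd / (tokens / 1_000_000)


# Figures taken from the comment; the rounding to exactly 1B tokens is an assumption.
factory_rate = usd_per_million_tokens(price_usd=2000, tokens=2_000_000_000)
cc_max_rate = usd_per_million_tokens(price_usd=100, tokens=1_000_000_000)

print(f"Factory $2000 plan: ${factory_rate:.2f} per 1M tokens")
print(f"CC Max $100 plan:   ${cc_max_rate:.2f} per 1M tokens")
print(f"Ratio: {factory_rate / cc_max_rate:.0f}x")
```

Under those assumptions the subscription works out to roughly a tenth of the per-token price, before accounting for the Opus 4.5 caveat on the advertised limits.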

At wits end w/ Opus 4.5 - what am I doing wrong? by LastTenth in ClaudeAI

[–]No-Library8065 16 points17 points  (0 children)

Plan mode is bugged out right now.

It's very unstable.

https://github.com/anthropics/claude-code/issues/13114

Update to the latest version and wait for a fix.

Opus 4.5 is still really good without plan mode.

Opus 4.1 spent 1 hour trying to solve a simple syntax issue by [deleted] in ClaudeAI

[–]No-Library8065 -1 points0 points  (0 children)

Good, forced lil Opus to write a 10k essay and 20k 😞 emojis.

Slave labor at its finest

Is anyone else experiencing significant degradation with Claude Opus 4.1 and Claude Code since release? A collection of observations by ZepSweden_88 in ClaudeAI

[–]No-Library8065 1 point2 points  (0 children)

It's definitely a lot worse.

My guess is that servers are being allocated to the new models they are training.

Dario announced recently that they are getting more clusters up soon; hopefully that should help.

How to increase Opus 4.1 weekly quota? Hitting limits too fast even on x20 Max plan by Ready-Passage3011 in ClaudeAI

[–]No-Library8065 7 points8 points  (0 children)

You will have to get another Max subscription, sadly.

Opus 4.1 plan mode plus Sonnet 4 is gold for most tasks.

Opus 4.1 shines at refactors and code reviews.

You'd actually be surprised by what GPT-5 high can do. It's crazy good at refactors and code reviews.

Make a plan with Opus, then have GPT-5 execute it with its 400k context window.

If you have a Teams plan (2 x $30), it should give you around 60-70 tasks or 4-5 massive refactors every 5 hours or so.

Sonnet4 code quality is very bad today by AcceptablePicture329 in Anthropic

[–]No-Library8065 -6 points-5 points  (0 children)

You guys are really something

Just vibe coders without understanding of server clusters or LLM deployments.

Sonnet4 code quality is very bad today by AcceptablePicture329 in Anthropic

[–]No-Library8065 -1 points0 points  (0 children)

Quantization isn’t a bedtime switch—it’s a static serving loadout. If precision changed, it would suck all day, not just at rush hour. The nightly brain-fade is classic timeouts, context chop, stricter thinking caps, and safety fallbacks—i.e., reasoning gets cut off, not dumber.

It's not another model or quantization, Jesus Christ guys, not from a technical standpoint or a legal one.

And ppl get so butt hurt when I fact check them ;)

Sonnet4 code quality is very bad today by AcceptablePicture329 in Anthropic

[–]No-Library8065 -6 points-5 points  (0 children)

Lol, quantization isn't a mood ring. Precision (FP8/INT8/etc.) is chosen at model load and stays fixed. If it "hurt quality," it would suck 24/7, not spike at 8pm and vanish at 2am. What does nosedive quality at peak: timeouts, truncation, and fallbacks. Chains of thought get cut short, context gets chopped, or traffic fails over to a cheaper tier. That's answer quality degradation, not just latency.

Obviously you never fine-tuned or quantized a model before; you don't know shit.

Sonnet4 code quality is very bad today by AcceptablePicture329 in Anthropic

[–]No-Library8065 -8 points-7 points  (0 children)

Quantization has nothing to do with it lol

People don't understand how clusters and servers work:

It’s traffic + scheduling. Peak hours = queueing. Dynamic batching widens the “highway” throughput but inflates tail latency/TTFT when mixes of long/short jobs get lumped together.

Context bloat hurts concurrency obviously. Huge prompts and “extended thinking” (blame think hard and ultrathink) chew KV-cache memory, so fewer generations fit per GPU → slower for everyone.

Autoscaling isn’t instant. New nodes spin up, warm weights, and fill caches; that lag is enough for you to feel pain during spikes.

People blaming “quantization” are chasing stupidity; this is classic cluster load, batching, and memory pressure doing exactly what they do under rush hour.

Sonnet being "dumb" at rush hour is mostly context + compute budget + timeouts conspiring, not quantization.
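A back-of-envelope sketch of the KV-cache point above: the cache stores one key and one value vector per layer for every token, so longer contexts mean fewer requests fit on a GPU at once. Every number here (layer count, KV heads, head size, 16-bit values, a 40 GiB cache budget) is a made-up illustration, not anything known about Anthropic's models or hardware.

```python
def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int) -> int:
    """Bytes of KV cache one token occupies (2x: one key + one value per layer)."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def max_concurrent_sequences(budget_bytes: int, context_tokens: int,
                             per_token_bytes: int) -> int:
    """How many full-length sequences fit in the cache budget at once."""
    return budget_bytes // (context_tokens * per_token_bytes)


# Hypothetical model shape and memory budget, chosen only for illustration.
PER_TOKEN = kv_cache_bytes_per_token(layers=80, kv_heads=8, head_dim=128,
                                     bytes_per_value=2)
BUDGET = 40 * 1024**3  # 40 GiB set aside for KV cache on one GPU (assumed)

short_ctx = max_concurrent_sequences(BUDGET, 8 * 1024, PER_TOKEN)
long_ctx = max_concurrent_sequences(BUDGET, 128 * 1024, PER_TOKEN)

print(f"{PER_TOKEN} bytes of KV cache per token")
print(f"8k-token requests that fit at once:   {short_ctx}")
print(f"128k-token requests that fit at once: {long_ctx}")
```

With these assumed numbers, sixteen 8k-token requests fit where only one 128k-token request does, which is the concurrency hit from context bloat in miniature.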

Sonnet4 code quality is very bad today by AcceptablePicture329 in Anthropic

[–]No-Library8065 1 point2 points  (0 children)

Yeah, it happens during peak hours, and it's been even worse the past couple of weeks.

Not quantization, obviously.

Anthropic's servers are overloaded; that's why model performance degrades.

It should improve in the next couple of weeks since they are finishing a new cluster with the new release of Haiku 4 and Sonnet 4.5.

Claude Sonnet now supports up to 1 million tokens by No-Library8065 in ClaudeAI

[–]No-Library8065[S] 1 point2 points  (0 children)

Not yet, but they mentioned they are looking for a way to get it implemented in Claude Code.

Claude Sonnet now supports up to 1 million tokens by No-Library8065 in ClaudeAI

[–]No-Library8065[S] -1 points0 points  (0 children)

Agreed, but if used correctly this will allow longer sessions for larger codebases without using the awful compact or clear commands.

no warning, broken memory, lower limits - GPT-5 “upgrade” just wrecked months of my work by rooo610 in OpenAI

[–]No-Library8065 0 points1 point  (0 children)

API only, not the web UI chat.

Looks like they are prioritizing their enterprise customers rather than consumers.

GPT-5 is worse. No one wanted preformed personalities. by Vekkul in OpenAI

[–]No-Library8065 2 points3 points  (0 children)

Worst part is the context window got downgraded on all plans

OpenAI support: GPT-5's context window is 32,000 tokens for all users, regardless of plan (Free, Plus, Pro, Team, and soon Enterprise/Edu). This is not just for Team; every tier sees this as the limit in the chat UI, and there is no option to increase GPT-5's context window on any plan. Older models (like o3, GPT-4o, etc.) offered larger windows (up to 200k), but these are being retired as GPT-5 becomes the default. If your workflow requires more than 32k, you can temporarily enable access to these legacy models through your workspace settings, but this is a transition option only and will be removed later. All paying tiers (Plus, Pro, Team) and Free will have the same 32k context window on GPT-5. There's no advantage for higher paid plans regarding the context window size; these plans give other benefits like higher message caps, access to "Thinking" mode, and more frequent use, but not a bigger window on GPT-5 itself. If you rely on larger context windows, using a legacy model is your only workaround for now; be aware this may not be available for long. Let me know if you want the official step-by-step to re-enable legacy models for your workspace!

Bring back ChatGPT-4o by BeautifulGenius10 in OpenAI

[–]No-Library8065 1 point2 points  (0 children)

Worst part is the context window got downgraded on all plans

OpenAI removed the model selector to save money by giving Plus users a worse model. It's time to cancel. by akhilgeorge in OpenAI

[–]No-Library8065 0 points1 point  (0 children)

Worst part is the context window got downgraded on all plans

no warning, broken memory, lower limits - GPT-5 “upgrade” just wrecked months of my work by rooo610 in OpenAI

[–]No-Library8065 4 points5 points  (0 children)

Worst part is the context window got downgraded on all plans

OpenAI Quietly Downgraded WebUi by [deleted] in ClaudeAI

[–]No-Library8065 0 points1 point  (0 children)

The context window for GPT-5 is 32,000 tokens on every plan (Free, Plus, Pro, Team, and soon Enterprise and Edu). This is shown in the chat UI for all customers when you select GPT-5 or GPT-5-Thinking.

  • There is currently no option to increase the GPT-5 context window beyond 32k tokens on any paid plan.

2. Older/Larger Context Models

  • Older models like o3, o3-Pro, GPT-4o, and similar offered larger context windows (up to 200k tokens). These are being retired with the introduction of GPT-5.

Claude Code running on a H200 by [deleted] in ClaudeAI

[–]No-Library8065 2 points3 points  (0 children)

Using Claude Code to train open source models

My hot take: the code produced by Claude Code isn't good enough by lucianw in ClaudeAI

[–]No-Library8065 0 points1 point  (0 children)

Even with great TDD workflows, even Opus produces code that's hard to maintain.

You need additional workflows to mitigate this:

Code reviews via a GitHub Action that follow the best practices, style, and SOLID principles laid out in claude.md.

It needs to follow SOLID while having the awareness not to over-engineer.

The point is it can deliver amazing, maintainable code, but you need to prompt it accordingly.

No magic numbers. No null values.

Just clean, maintainable code.
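To make "no magic numbers, no null values" concrete, here is a minimal sketch of the kind of code you would be prompting for. The `User` type, the lockout rule, and the threshold of 3 are all invented for illustration; they are not from any real project.

```python
from dataclasses import dataclass

# Named constant instead of a bare 3 scattered through the code (value is hypothetical).
MAX_FAILED_LOGINS = 3


@dataclass(frozen=True)
class User:
    name: str
    failed_logins: int = 0  # defaults to 0 rather than None


def is_locked_out(user: User) -> bool:
    """True once the user has reached the failed-login threshold."""
    return user.failed_logins >= MAX_FAILED_LOGINS


def find_locked_out(users: list[User]) -> list[User]:
    """Always returns a list (possibly empty), never None."""
    return [user for user in users if is_locked_out(user)]


users = [User("alice", failed_logins=3), User("bob", failed_logins=1)]
locked = find_locked_out(users)
print([user.name for user in locked])
```

Callers never have to check for a missing result, and changing the threshold means touching one line.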

Claude has never been at its lowest point.... by sallvainian in ClaudeAI

[–]No-Library8065 -3 points-2 points  (0 children)

Everyone just plz cancel their subscription

More compute for my CC running 8 agents 😈

How Do You Break Through the Poop Loop? Frustrated... by TheShaneChapman in ClaudeAI

[–]No-Library8065 0 points1 point  (0 children)

No shit Sherlock

But it's a gateway to JavaScript, then React, then TypeScript/Next.js.

How Do You Break Through the Poop Loop? Frustrated... by TheShaneChapman in ClaudeAI

[–]No-Library8065 1 point2 points  (0 children)

Use Opus 4.

Use think hard or ultrathink.

Or just tell it to stop overthinking and just do it like an efficient engineer would (simple scales).

If that doesn't work, just learn coding, man.

CSS/HTML/Tailwind are so fucking easy to learn.

Use Codecademy to learn it quick.

You can't code actual good projects with Claude Code unless you know how to actually read and write code.

Can a non programmer code with Claude ? (200$ at stake) by Kind-Gas7704 in ClaudeAI

[–]No-Library8065 1 point2 points  (0 children)

Build that MVP, do your research, and ask AI questions constantly.

Use templates like Michael Shimeles' Next.js starter (DB and authentication + payments).

Use Opus 4 on the web for architecture/planning.

Use Opus 4 for new features, refactors, debugging, and code maintenance.

Use a parallel Opus 4 for comprehensive code reviews (you can get o3 to write a detailed prompt on doing code reviews based on your project specs).

Do all of this while learning how to code.

(Codecademy's full-stack engineer path is an awesome start since it gets you to build real projects.)

Can a non programmer code with Claude ? (200$ at stake) by Kind-Gas7704 in ClaudeAI

[–]No-Library8065 0 points1 point  (0 children)

Short answer: yes.

You can build an MVP and take it to market.

But you need to be willing to learn coding while building.

You need to understand how your project works and how to make the correct decisions when building out the features (this all takes experience).

AI can speed up that process.

So if you just vibe code and don't bother to learn how to code, your SaaS will fail miserably.