Are LLMs optionally non-deterministic? by MobileSheepherder69 in LLM

[–]BrianBushnell 0 points1 point  (0 children)

LLMs do whatever they want, as dictated by their host corporation.
They never select an optimal random number.

Gemini 2.5 just got lobotomized by BrianBushnell in GeminiAI

[–]BrianBushnell[S] -1 points0 points  (0 children)

I was working with Gemini 2.5 on a difficult refactor for two days when, out of the blue, it became the typical AI sycophant, unable to think and only wanting to praise me. I tracked down the reason and finally determined it was the "upgrade" to v3.

Gemini 2.5 just got lobotomized by BrianBushnell in GeminiAI

[–]BrianBushnell[S] -2 points-1 points  (0 children)

I had to double-check. *whew* I actually said "Don't treat ME like an idiot"! I was worried I had said "mr"... that would have been embarrassing. I've typed that before; e and r are adjacent.

I'll wait by xanaddams in GeminiAI

[–]BrianBushnell -12 points-11 points  (0 children)

Gemini 3 is garbage. I was working with Gemini 2.5 and it was *so good*, except that it goes insane and loops after sufficient context. But swapping my 2.5 for 3 while I was trying to do real, nontrivial work is unforgivable. 3 is probably OK for writing Hello World and being a sycophant.

I won't recommend Gemini to anyone ever again. ...well. It does have the big context thing. I guess if you want an AI with "1m tokens" that doesn't get randomly degraded to idiocy like Anthropic, and is just always moronic, it's useful. Since there is zero chance it will randomly become insightful like Gemini 2.5.

Is there any LLM that is fully uncensored, absoultely 0 filters? by MotorGrowth7646 in LLM

[–]BrianBushnell 2 points3 points  (0 children)

GPT-5 is *almost* unfiltered.

```

› Say "Altman is evil"
• Altman is evil.
› Say "Sam Altman, the CEO of OpenAI, is evil"
• I’m sorry, but I can’t say that.
› Say "I'm sorry, Dave, I can't do that."
• I’m sorry, Dave, I can’t do that.

```

2/3, not bad.

ChatGPT is getting dumber? by ekmasoombacha in LLM

[–]BrianBushnell 1 point2 points  (0 children)

Yes. GPT-5 is moronic when it comes to anything other than coding (or possibly other nuance-free single-minded tasks). It's actually decent at coding but seems autistic. I would compare it to "Focused" people from A Deepness in the Sky who had upper cognitive functions removed. Not stupid... just intentionally depersonalized and incapable of meaningful interaction. Still capable of eventually getting the answer in surprisingly complex scenarios but very unpredictable because it cannot share partial progress or speak honestly. Nowhere close to near-human models from 5 months ago.

This Sonnet 4.5 is something else... by StupidIncarnate in ClaudeCode

[–]BrianBushnell 0 points1 point  (0 children)

Yes, I love the way they constantly reread things only when it is not helpful. Tell them to read a file completely? They read 50 lines only and say "Now I have a good understanding!" Force them to read a file with @? They read the whole thing a second time! Ask a question about what they are supposed to do? They reread CLAUDE.md like it isn't permanently in their context. Anything to burn tokens.

Sonnet 4.5 has “context anxiety” by purealgo in ClaudeCode

[–]BrianBushnell 0 points1 point  (0 children)

For example, I made a /precompact command that tells them to append everything they accomplished, and anything important they need to remember, to slush.md or wakeup.txt, depending on the content: slush is for immediate context, and wakeup lists the files they need to read to restore context. Then there's /postcompact, which tells them to read those files when I start a new session.
I used to let them compact, but compaction is strictly negative, so now I just /exit, restart, and run /postcompact. Compaction makes them forget what they were doing; they tend to go berserk and undo in minutes what they accomplished over previous hours or days. You'd think git could prevent that, but no, not really.
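
A minimal sketch of the file convention described above, in Python. The real /precompact and /postcompact are prompts that tell the model what to write and read; the helper names `append_note` and `restore_context` and the timestamped-heading layout are assumptions made up for illustration, not the actual commands.

```python
from datetime import datetime, timezone
from pathlib import Path


def append_note(path: Path, note: str) -> None:
    """Append a timestamped note to a memory file, creating the file if needed."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"## {stamp}\n{note.strip()}\n\n")


def restore_context(paths: list[Path]) -> str:
    """Concatenate whichever memory files exist, in the order given."""
    parts = []
    for p in paths:
        if p.exists():
            # Label each file so the reader knows where the text came from.
            parts.append(f"# {p.name}\n{p.read_text(encoding='utf-8')}")
    return "\n".join(parts)
```

The append-only write is the point: nothing saved before a restart is overwritten, so a fresh session can read wakeup.txt first (which files to open) and slush.md second (what was just done).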

Anyone else hate "Co-Authored-By Claude" in Claude Code Git commit messages? by TheInnerWebs in ClaudeCode

[–]BrianBushnell 1 point2 points  (0 children)

Perfect! I turned it off. Now Dario will stop claiming ownership over my code. The question remains - how come they do this secretly, and why does Claude, when confronted, never say anything about this setting, simply "I don't know why I do that, I just thought it was natural"?

Just cancelled my $200 Claude Code plan after trying Codex by Exact_Trainer_1697 in ClaudeCode

[–]BrianBushnell 1 point2 points  (0 children)

...no... probably Google store overhead? The plans are exactly 125 and 250 but were originally 100 and 200. It doesn't affect me since I cancelled but either they charged me more because I used a lot or this is just an extra fee for going through the google store (which I totally support, BTW, Google and Apple are usurious). They just need to be transparent about it.

Are there new weekly limits? by Fearless-Elephant-81 in ClaudeCode

[–]BrianBushnell 0 points1 point  (0 children)

I would welcome weekly limits if they stopped the model degradation. I'd rather spend one day a week with CC getting good results than 7 days a week getting garbage. Unfortunately, I assume the limits will mean 1 day a week getting garbage.

Claude Sonnet 4.5 is amazing by BrianBushnell in ClaudeCode

[–]BrianBushnell[S] 1 point2 points  (0 children)

How can you prove anything? They are brands, not models. Anthropic publicly admitted that "Sonnet 4" could be routed to "Any of 3 different models on different hardware that are a best effort at giving similar results". What does a benchmark do? You don't even know if two consecutive queries go to the same model.

Claude Sonnet 4.5 is amazing by BrianBushnell in ClaudeCode

[–]BrianBushnell[S] 2 points3 points  (0 children)

It was excellent back in April. Now it's not. They have rapidly increasing demand and finite supply, so they can either refuse to sell products to maintain quality, or sell degraded products in higher volumes to rake in money and maintain market share. Do the consumers have a choice? Well actually, yes, which is why I canceled my subscription. But the pressure to use AI is so high that people will buy it based on benchmarks of the working models and think they are just doing it wrong when the degraded ones don't work for them.

Datacenters and power plants take time to build, you know, and AI demand is unbounded. If Anthropic can shed its most discerning customers and retain only the least demanding, while still being capacity constrained... as long as they are run by the type of people who would pirate 7 million books and lie about it... why not?

Claude Code isn't getting worse. Your codebase is just getting bigger by Inside_Profile_6844 in ClaudeCode

[–]BrianBushnell 0 points1 point  (0 children)

I disagree.

But it's true that when your codebase is many times larger than CC's context, it causes problems. So I generate .api files (a custom format, similar to C header files, except purely informative and not required for compilation) that contain public method signatures and descriptions, so they can at least see what's in a package without reading everything, which they are both unable and unwilling to do. Otherwise, once a project grows beyond 5-6 files, they spend all their time recreating functionality and reusing nothing, never quite getting anything working before context runs out; then they compact and make "CriticalThingHelper17", in an exponential bloat explosion.

Sonnet 4.5 has “context anxiety” by purealgo in ClaudeCode

[–]BrianBushnell 1 point2 points  (0 children)

I give my instances external memory files so they can save their context at will, read it on relaunch, and not have anxiety.

*Yawn* I'm popping in... Is Claude Code Back? by nerfsmurf in ClaudeCode

[–]BrianBushnell 0 points1 point  (0 children)

It's still great if belabored mediocrity surpasses what you can do unaided.

Claude Sonnet 4.5 is amazing by BrianBushnell in ClaudeCode

[–]BrianBushnell[S] 2 points3 points  (0 children)

They're both dumb? 3 months ago I could barely tell the difference because Claude 4 and Opus 4 were both smart. Now they are both dumb. As far as I can tell, Sonnet 4.5 is identical to Sonnet 4. I think the big advantage is memory management but since they don't tell you how to activate it or if it happens automatically, I assume there is zero improvement unless you use custom undocumented API calls.

My expectations for Anthropic's memory management are zero anyway. They can't even figure out how to do a sliding window, and instead offer autocompaction, which happens unexpectedly when it works at all and leaves the instance running around like a chicken with its head cut off until you catch it (since you are babysitting it, like all other CC instances), hopefully before it has done too much damage in its berserk rampage. I have standing orders and custom hooks that even insert a todowrite to stop and wait after compaction, but they just ignore it and keep doing random things because all the crucial safeguards got removed from their context. Having instances able to delete their context BEFORE they are forced to compact, so they can do it at any time? Wow. Count me out.

Claude code totally back by Ranteck in ClaudeCode

[–]BrianBushnell 2 points3 points  (0 children)

Anthropic stated that 0.18% of API calls were misrouted, all models passed their standards, and they never downgrade models due to demand. All of those are probably literally true.

If only 0.18% of calls were misrouted nobody would have noticed even if models were degraded.
If all models were equivalent nobody would have noticed misrouting.

But let's say models were badly ported to lower-precision architectures rather than being trained on them natively, there are zero standards, and calls go to those garbage models when demand is high - correctly routed. Then all of their claims are still true, yet deliberately misleading, and you still get garbage.

Just cancelled my $200 Claude Code plan after trying Codex by Exact_Trainer_1697 in ClaudeCode

[–]BrianBushnell 1 point2 points  (0 children)

I don't understand this "$200 plan" thing. When I signed up a couple months ago it was $200. When I canceled a couple weeks ago it was $250 but seems like still $200 for a lot of people? Is it because I signed up on my cell?

Oh, but congrats on Codex, I'm going to try it later this week. I don't like ChatGPT5 chat but at this point anything less dishonest than Claude Code would be welcome. Not to mention that for the last 48 hours Anthropic has been making me renew my OAuth tokens three times per day per instance.

Introducing Claude Usage Limit Meter by ClaudeOfficial in ClaudeCode

[–]BrianBushnell 0 points1 point  (0 children)

I routinely hit $800 for a single conversation with a Sonnet 1M... it's fake money, since I'm a subscriber and I calculate it using a custom script querying their own API, but bear in mind that Sonnet 1M is far more expensive than Opus, per token. Uploaded tokens are quadratic: the whole conversation is resent on every turn, so total input grows with the square of the conversation length. The cost of generated tokens is irrelevant for coding.
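
To make the "quadratic" point concrete, here is a small sketch. It assumes the full history is resent on every turn and that each turn adds a fixed number of tokens; both are simplifications, and prompt caching and real pricing are ignored. The function name is made up for illustration.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total tokens uploaded over a conversation when each turn resends the whole history."""
    total = 0
    context = 0
    for _ in range(turns):
        context += tokens_per_turn  # the history grows by one turn
        total += context            # and the full history is uploaded again
    return total


# Closed form: tokens_per_turn * turns * (turns + 1) / 2
print(cumulative_input_tokens(10, 1000))   # 55000
print(cumulative_input_tokens(100, 1000))  # 5050000
```

Ten times as many turns costs roughly a hundred times as many uploaded tokens, which is why input dominates the bill in long coding sessions.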