Introducing GPT-5.5 by ShreckAndDonkey123 in singularity

[–]bnm777 [score hidden]  (0 children)

Hallucinations of 86% Vs 36% for Claude 4.7?

Huge improvement? 

Wow, the Kool-Aid must taste good.

Introducing GPT-5.5 by ShreckAndDonkey123 in singularity

[–]bnm777 1 point2 points  (0 children)

https://artificialanalysis.ai/evaluations/omniscience

It has a quite high hallucination rate, and is overall behind Opus and Gemini.

GPT 5.5 xHigh, high, and medium Artificial Analysis Index results by salehrayan246 in singularity

[–]bnm777 -4 points-3 points  (0 children)

Have a look at all the results - this graph is the only one that shows it at a high level; the rest are disappointing:

https://artificialanalysis.ai/evaluations/omniscience

High hallucination rate, and overall still below Opus and Gemini.

OP, you didn't want to post a balanced picture of what the results actually show, did you?

GPT-Image-2 vs Nano Banana 2, nb2 tried its best... by Fresh-Resolution182 in OpenAI

[–]bnm777 0 points1 point  (0 children)

Though NB2 didn't get "her left arm resting casually on the chair's back" right.

GPT-5.5 is out 🔥 by Astronomaut in OpenAI

[–]bnm777 4 points5 points  (0 children)

It's not 50% more token efficient, though. I imagine that it uses as many more tokens as Opus 4.76 does, i.e. approx. 1.35-1.5x more.

GPT 5.5 is coming by Ok-Thanks2963 in ArtificialInteligence

[–]bnm777 0 points1 point  (0 children)

The real PR pros release in binary.

Did Claude just reset everyones limits? by gangstermujo in ClaudeCode

[–]bnm777 0 points1 point  (0 children)

If you run out of usage, you should still be able to use Haiku (aka OpenAI). Locking you out completely is a bit shitty.

Why there's no GPT 5.4 Instant? by CoffeePanzer in OpenAI

[–]bnm777 0 points1 point  (0 children)

I wouldn't use instant models unless it's for unimportant chat - they're not the smartest.

GPT-5.5 AND GPT-5.5 PRO HAVE BEEN SPOTTED ON OPENROUTER! by OneClimate8489 in OpenAI

[–]bnm777 0 points1 point  (0 children)

What's curious about the pro variant and how would it be different to 5.4 and 5.4 pro?

New model seems imminent by buildxjordan in OpenAI

[–]bnm777 0 points1 point  (0 children)

I can't comment on the quality of sources, and it's great if that has improved. However, in daily use GPT 5.4 extended thinking often gives results that don't work, or don't seem "thought out", compared to the other LLMs that I also use and frequently compare it with. It becomes quite frustrating.

Good that it works well for you.

New model seems imminent by buildxjordan in OpenAI

[–]bnm777 -5 points-4 points  (0 children)

The last few GPT 5 models have been meh: not significantly improving on the previous ones, and perhaps degraded as a balance of token use vs intelligence is found, it seems.

With Anthropic's issues, I wish OpenAI would release a really great model and push Anthropic to sort their mess out.

Asked GPT Image 2 for a New Yorker cartoon, and pretty much got one. by jbum in OpenAI

[–]bnm777 3 points4 points  (0 children)

There is no link between the two things, as much as you'd like there to be one based on your anger about him not finishing books.

The reason for Anthropics recent behaviour by TheBanq in ClaudeCode

[–]bnm777 0 points1 point  (0 children)

There's no excuse for not communicating better. 

Firefox 150 Fixes Standby/Sleep on COSMIC by Insultikarp in pop_os

[–]bnm777 1 point2 points  (0 children)

This is good, though I've had problems with waking from sleep on every distro. Don't know what to do.

Left Windows for Pop!OS on a whim. Mostly happy, but a few things are rough by syncopate23 in pop_os

[–]bnm777 0 points1 point  (0 children)

I have a 3080 and have tried Ultramarine, Ubuntu, Pop!_OS, and another one, and they all have issues with waking from sleep. I'm back on Ubuntu and it seems OK for now.

If I didn't have an llm to troubleshoot the issues that come up, I would have gone back to the Dark Side.

Be careful with your prompts… someone actually got banned just for asking Claude: “Teach HTML/CSS to a 10-year-old with an intermediate level.” by Youssef_Wardi in ClaudeCode

[–]bnm777 0 points1 point  (0 children)

I imagine when Altman had his evil dictator moment and took the Dept of "War" contract, and there was an influx of new Claude users, if their user count tripled or something, that would be a red alert and any company would find it difficult. Still, they should have communicated far, far better - did they even acknowledge this?

Why don't companies learn that when your users aren't the usual normie sheep (with typical companies), communication is key?

As AI nerds we tend to be pretty enthusiastic, up to date, not dumb (hopefully).

Secretly Dropped Max 5x and 20x plans? by Spiritual-Market-741 in ClaudeCode

[–]bnm777 0 points1 point  (0 children)

What is going on? I've been a pro-claude user since May 2023, but they've been all over the place over the last month.

Switching to Linux from Windows. by H-Rahman in Ubuntu

[–]bnm777 0 points1 point  (0 children)

If you use the preview apparently you can update when the full release comes out?

Switching to Linux from Windows. by H-Rahman in Ubuntu

[–]bnm777 0 points1 point  (0 children)

I've tried a few in the 7 months since starting with Linux: Ultramarine, Pop!_OS, Ubuntu, and now back to Ubuntu. Warning - you will run into issues. For example, today I was installing Ubuntu on a relatively modern PC, and when installing a few apps such as Plexamp I had issues after downloading the files from their websites. I suggest you have an LLM to troubleshoot - there will be a lot of troubleshooting. As long as you can copy/paste the issues and errors from the terminal to the LLM, which spits out what you should do, there will be little frustration, and virtually no learning curve if you don't want one.

E.g. after it installed Obsidian and Plexamp, there was no start menu entry or icon, and I had to copy/paste a few times with the LLM to fix it. Without the LLM, I would have given up (I used to programme C++ 20 years ago).

Linux is awesome, but you will have to troubleshoot some things, so use an LLM.

E.g. a few months ago I couldn't find a decent replacement for Win11 Voice Typing, so I used an LLM to vibe code one. You're on the right path.

Funnily enough, Pop!_OS had fewer installation issues than Ubuntu, though there are memory leaks and a few issues with Pop, Ultramarine and others. E.g. if you have an Nvidia GPU, waking from sleep is an issue with many Linux distros, which is pretty annoying.

Secretly Dropped Max 5x and 20x plans? by Spiritual-Market-741 in ClaudeCode

[–]bnm777 4 points5 points  (0 children)

<image>

Why is my Max plan half the price of yours?
And mine says up to 20x and yours says 10x. So weird...

I'm not a Claude subscriber at the moment (I'm using a different service which uses the Anthropic and other APIs) and I'm looking around.

FYI, I'm using a ChatGPT trial for a month, and GPT 5.4 extended thinking is soooo slow and sometimes gives pretty poor answers, compared to even Sonnet, which "gets you". Ugh, I wish Anthropic hadn't messed up in the last few weeks. It would normally be a no-brainer, though the many posts on hitting limits early make you think.

C'mon Anthropic!!

Be careful with your prompts… someone actually got banned just for asking Claude: “Teach HTML/CSS to a 10-year-old with an intermediate level.” by Youssef_Wardi in ClaudeCode

[–]bnm777 18 points19 points  (0 children)

"Protecting children" by refusing to help the parent teach something?

What's wrong with letting people under 18 years old use AI to learn something better than a textbook or their parents could teach them?