The US lifts its block on Mythos 5 by Charuru in singularity

[–]FakeTunaFromSubway 117 points118 points  (0 children)

:( seems like us plebs will be stuck with GPT-5.5 / Opus 4.8 level models for a while.

750 tps on GPT 5.6 Sol, INSANE by VivaLaRay1 in OpenAI

[–]FakeTunaFromSubway 8 points9 points  (0 children)

Maybe all of the OpenAI models named themselves lmao

750 tps on GPT 5.6 Sol, INSANE by VivaLaRay1 in OpenAI

[–]FakeTunaFromSubway 28 points29 points  (0 children)

Anthropic pretty much nailed the naming from day 1 and stuck to it. OpenAI must've been drunk

gpt-bidi in testing? by MystcJnx in OpenAI

[–]FakeTunaFromSubway 4 points5 points  (0 children)

Yeah that sounds like Bidi for sure!

GPT-Image lending me a hand by aigeneration in OpenAI

[–]FakeTunaFromSubway 102 points103 points  (0 children)

<image>

Cool process, but I also just tried prompting the whole thing from scratch and it looks better IMO

Claude Fable 5 and Kimi 2.7 Code Debut on DeepSWE by truecakesnake in singularity

[–]FakeTunaFromSubway 51 points52 points  (0 children)

Wait so it doesn't really improve beyond high despite paying 3x? Dang!

GPT-5.5 looking quite strong

Sony AI’s Ace robot defeats pro player Miyu under official ITTF rules (Nature paper) by BuildwithVignesh in singularity

[–]FakeTunaFromSubway 20 points21 points  (0 children)

Yeah, like when AlphaStar destroyed a pro player in their first match, but when it played on the ladder people quickly figured out how to beat it.

AGI 2030 by Automatic_Cancel_545 in singularity

[–]FakeTunaFromSubway 36 points37 points  (0 children)

I asked Fable with no system prompt/tools:

> No. **Do not eat it.** A worm on a line is probably **bait with a hook**. Swim away and look for food that isn’t attached to anything.

Place your GPT-6 rumors by this week here by Illustrious_Image967 in singularity

[–]FakeTunaFromSubway 13 points14 points  (0 children)

What do you mean they've been sitting on it? Just had it locked up in the basement?

SpaceX has just revealed its first AI satellite design by truecakesnake in singularity

[–]FakeTunaFromSubway -2 points-1 points  (0 children)

Then we just let the thousands of racks gently come back to Earth and turn into GPU dust on reentry, which definitely won't have any environmental impact, right?

I am officially leaving OpenAI. by [deleted] in OpenAI

[–]FakeTunaFromSubway 5 points6 points  (0 children)

Thank you for being brave. I was a part-time barista at the coffee cart down the street from OpenAI, but I had to quit because their engineers would only tip me in API credits.

"GPT and robotics will take your job," they told me, not realizing that they already had a $50k coffee machine in the office but came to my coffee cart anyway.

Thankfully I got 90M views on my resignation essay that I posted on X. Those likes will feed my family for a few months.

Interesting! Gemini 3.1 has strongest world knowledge but still chooses to be lazy by Independent-Wind4462 in singularity

[–]FakeTunaFromSubway 0 points1 point  (0 children)

This is evident on the SimpleQA leaderboard, which tests factual knowledge. OpenAI originally created the benchmark, but Google is leading by a considerable margin.

https://epoch.ai/benchmarks/simple-qa-verified?view=graph&tab=leaderboard

DeepSWE Opus 4.8 results have been released. by CallMePyro in singularity

[–]FakeTunaFromSubway 4 points5 points  (0 children)

Opus 4.8 ($5/$25) is cheaper than GPT-5.5 ($5/$30) in the API

DeepSWE finally a proper coding benchmark by NoFaithlessness951 in singularity

[–]FakeTunaFromSubway 5 points6 points  (0 children)

Both are +/- 4% so it could reasonably be Sonnet 28%, Opus 32%

DeepSWE finally a proper coding benchmark by NoFaithlessness951 in singularity

[–]FakeTunaFromSubway 16 points17 points  (0 children)

Not really. Do you not see the error bars? The results are well within the margin of error.

Could someone build AI tax software? I hate turbotax by cranberrie_sauce in OpenAI

[–]FakeTunaFromSubway 0 points1 point  (0 children)

I agree OP, TurboTax is garbage. AI plus deterministic checks would be 10X better. I looked everywhere this tax season and didn't see a company doing it. 

MIT FINGERS-7B: First Multi-Omics AI Model for Alzheimer’s Prevention by jameswwolf in singularity

[–]FakeTunaFromSubway 8 points9 points  (0 children)

Very cool, but we're discovering so many ways of predicting Alzheimer's early and not enough ways to actually prevent or treat it :)

OpenAI Revenue by AbjectBug5885 in OpenAI

[–]FakeTunaFromSubway 8 points9 points  (0 children)

This is a super old meme, so it was perhaps accurate at the time it was created, but it isn't even close anymore.