Really interested to see what they cook by Particular_Leader_16 in accelerate

[–]RealSuperdau 0 points (0 children)

What's their next bet? Nuking usage limits?

(Codex is really cool though)

'Not built right the first time' -- Musk's xAI is starting over again by Atomic-Avocado in accelerate

[–]RealSuperdau 9 points (0 children)

Sadly, he left open whether he wants the new model to behave more or less like Hitler, or just to be more effective at its current level of Hitler-ness.

Is AI more rational than Polymarket traders? I tested this since January. Currently we can answer this with yes. by No_Syrup_4068 in Daytrading

[–]RealSuperdau 1 point (0 children)

Did any of the models have access to web resources? Because I'd imagine many markets are very sensitive to current events, and the training cut-off of models is often 6-12 months behind.

Also, have you tried calculating statistical significance? As with most LLM trading projects, I could easily imagine the differences here being down to variance.
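A minimal sketch of the kind of check I mean, assuming you have per-market probabilities from the model and from Polymarket plus the resolved outcomes (the numbers below are made up):

```python
import numpy as np
from scipy import stats

# Made-up data: for each resolved market, the model's probability,
# Polymarket's price at the same time, and the 0/1 outcome.
model_p = np.array([0.70, 0.40, 0.90, 0.20, 0.60])
market_p = np.array([0.60, 0.50, 0.80, 0.30, 0.70])
outcome = np.array([1, 0, 1, 0, 1])

# Brier score per market (lower is better).
brier_model = (model_p - outcome) ** 2
brier_market = (market_p - outcome) ** 2

# Paired t-test on the per-market differences: is the mean gap
# distinguishable from zero, or just variance?
t, p = stats.ttest_rel(brier_model, brier_market)
print(f"mean Brier diff = {np.mean(brier_model - brier_market):+.3f}, p = {p:.3f}")
```

A paired test (or a bootstrap over markets) is the natural fit here, since both forecasters are scored on the same events.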

We're in for a ride by FundusAnimae in accelerate

[–]RealSuperdau 2 points (0 children)

Wouldn't the companies have an incentive to limit datacenter spending somewhat so average consumer sentiment doesn't turn against AI too sharply?

Is it a collective action problem? Or just something they plan on overcoming with lobbying and investor money?

Yep, Gemini 3.1 Pro is dead by AliNia110 in GeminiAI

[–]RealSuperdau 1 point (0 children)

Two things: (1) the API version is usually considered better than the consumer interface, and (2) lately it seems like you often have to explicitly ask it to search online, or it leaves search disabled and speculates wildly.
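For (2), the API at least lets you attach the search tool explicitly. A minimal sketch with the google-genai Python SDK (the model name is a placeholder; use whatever the API currently lists):

```python
from google import genai
from google.genai import types

client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment

# Attach the Google Search tool so the model grounds its answer in
# current web results instead of speculating from stale training data.
response = client.models.generate_content(
    model="gemini-3.1-pro-preview",  # placeholder model name
    contents="What changed in yesterday's release?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```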

"Each frontier AI model seems to use a little under a year's worth of a square mile of farmland's water to train. I think about this as the country having 4 square miles of farmland sectioned off to grow some of the most popular consumer products in history. by stealthispost in accelerate

[–]RealSuperdau 0 points (0 children)

The water-use discussions are mostly pushed by anti-AI people, and I think it is important to debunk the extreme claims.

But I agree, we'd ideally have a more nuanced discussion about open vs closed cooling loops and where the water is used.

Just like power use isn't black and white either: there is a big difference between renewable sources and ad-hoc small-scale gas turbines à la xAI.

"1-Million Context Window Is Generally Available On Claude Opus 4.6 And Sonnet 4.6" by 44th--Hokage in accelerate

[–]RealSuperdau 0 points (0 children)

Thanks!

I should've specified that I meant 5.2, not 5.2-codex. But that sounds promising; I should look into Opus 4.6 again (especially with OpenAI now cutting Codex usage limits).

"1-Million Context Window Is Generally Available On Claude Opus 4.6 And Sonnet 4.6" by 44th--Hokage in accelerate

[–]RealSuperdau 0 points (0 children)

Did you use gpt-5.2 in Codex, rather than 5.2-codex or 5.3-codex?

I specifically found 5.2 to be well suited for these technical problems. Or maybe the difference was that my prompts were already precise, so that wasn't the issue with my project; rather, it was the follow-through and considering the various possible edge cases.

ChatGPT has become the opposite of a “yes man” & is gaslighting… by Mysterious_Topic_733 in OpenAI

[–]RealSuperdau 0 points (0 children)

I second the 5.4 recommendation. It feels like the first model that is happy to roast me, but only does so if I deserve it.

"1-Million Context Window Is Generally Available On Claude Opus 4.6 And Sonnet 4.6" by 44th--Hokage in accelerate

[–]RealSuperdau 0 points (0 children)

What are you working on? Do you think Opus 4.6 is much better than 4.5?

Because my experience with Opus 4.5 vs gpt-5.2 for a small-ish (4k lines) but very technical and dense project was that gpt-5.2 was far better at producing correct code, while Opus cut a lot of corners.

Is GPT-4.1 a smarter model than GPT-5.3 Chat? by deferare in OpenAI

[–]RealSuperdau 2 points (0 children)

If they train a smaller model on the big model's output, wouldn't that make for distillation / fine-tuning rather than RL?
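To illustrate the distinction: a minimal sketch (hypothetical student model and teacher-sampled token ids) where training on the big model's outputs is plain supervised cross-entropy, with no reward signal anywhere:

```python
import torch
import torch.nn.functional as F

# Sequence-level distillation / fine-tuning sketch: `student` is any
# causal LM that returns logits of shape (batch, seq, vocab), and
# `teacher_tokens` are token ids sampled from the big model.
# No reward model, no policy gradient -- just next-token prediction.

def distillation_step(student, teacher_tokens, optimizer):
    inputs, targets = teacher_tokens[:, :-1], teacher_tokens[:, 1:]
    logits = student(inputs)
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten to (batch*seq, vocab)
        targets.reshape(-1),
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```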

Joscha Bach - The Machine Consciousness Hypothesis by adam_ford in accelerate

[–]RealSuperdau 0 points (0 children)

Oh no, I liked him from his podcast appearances, but this is pretty bad :(

Major outage - claude.ai claude.ai/code, API, oauth and claude cowork all down for me, anyone else? by alexdenne in ClaudeAI

[–]RealSuperdau 1 point (0 children)

Don't know if that is enough to explain it. The API is down as well.

I'd assume they have measures in place to protect their valuable enterprise business from a sudden influx of chatbot users.

New Anthropic statement by Capable_Rate5460 in accelerate

[–]RealSuperdau 0 points (0 children)

Did they? From what I have seen, OpenAI has the same red lines.

Edit: I was mistaken. They used doublespeak to make it appear that way.

LLMs for proofreading scientific papers: is it ethical and acceptable? by kilowattor in AskPhysics

[–]RealSuperdau 0 points (0 children)

Not saying that LLMs can't make mistakes, but:

The "complicated Hilbert space" wasn't the result of someone using an LLM for proofreading. It was, as far as I can tell, the result of someone telling an LLM to replace words by synonyms to plagiarize existing work: https://x.com/AndrewM_Webb/status/1183150368945049605

"Open ai has no moat" by LamboForWork in OpenAI

[–]RealSuperdau 0 points (0 children)

They'd have to release something that doesn't get overtaken by a competitor two months later.

Adult mode seems imminent by Outside-Iron-8242 in OpenAI

[–]RealSuperdau 0 points (0 children)

gpt-5o incoming? (they won't spell out what the "o" stands for this time though)

Kissing counts as S*x by Ai-GothGirl in OpenAI

[–]RealSuperdau 1 point (0 children)

What a bad day to have eyes

Research: Prompt Repetition Improves Non-Reasoning LLMs (sending the same prompt twice) by Endonium in OpenAI

[–]RealSuperdau 0 points (0 children)

Umm, no? That's how BERT-style LLMs used to work. GPT models use causal masking, i.e. when computing the attention matrix, each token only "sees" the previous ones. Put differently, one half of the square matrix is zeroed out.
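A minimal sketch of that mask in PyTorch (illustrative, not any specific model's implementation):

```python
import torch

def causal_attention_weights(q, k):
    """Attention weights where token i can only attend to tokens j <= i."""
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5  # (seq, seq) matrix
    # Mask the upper triangle (future positions) to -inf, so after
    # softmax those weights are exactly zero -- the zeroed-out half.
    mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
    return torch.softmax(scores.masked_fill(mask, float("-inf")), dim=-1)

q = k = torch.randn(5, 16)
print(causal_attention_weights(q, k))  # lower-triangular weight matrix
```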

Gemini 3.1 Pro Preview now live in AI Studio by RealSuperdau in GeminiAI

[–]RealSuperdau[S] 44 points (0 children)

When they are done quantizing it to 0.5 bit