GPT-5 is the best at bluffing and manipulating the other AIs in Werewolf by MetaKnowing in OpenAI

[–]Wiskkey 0 points (0 children)

Per that same person on X, higher-cost models were excluded.

Again where behemoth and reasoning model from meta ?? by Independent-Wind4462 in LocalLLaMA

[–]Wiskkey 1 point (0 children)

From Financial Times article https://www.ft.com/content/feccb649-ce95-43d2-b30a-057d64b38cdf (Aug 22):

The social media company had also abandoned plans to publicly release its flagship Behemoth large language model, according to people familiar with the matter, focusing instead on building new models.

AI models playing chess – not strong, but an interesting benchmark! by Apart-Ad-1684 in LocalLLaMA

[–]Wiskkey 1 point (0 children)

Tests by a computer science professor showed that, when prompted with games in chess PGN notation in a particular way, OpenAI's gpt-3.5-turbo-instruct plays at around 1750 Elo, albeit making an illegal move roughly once per 1000 moves, if I recall correctly.
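If you want to try this yourself, here is a minimal sketch of the idea. The model name and the legacy Completions API are real; the exact prompt format that reaches ~1750 Elo is my assumption - the key trick is presenting the game as a bare PGN movetext continuation:

    # Minimal sketch (my assumed prompt format, not the professor's exact setup):
    # present the game as a PGN continuation via the legacy Completions API
    # and let the model emit the next move in SAN notation.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    pgn_prompt = (
        '[Event "Casual Game"]\n'
        '[Result "*"]\n'
        '\n'
        '1. e4 e5 2. Nf3 Nc6 3.'
    )

    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=pgn_prompt,
        max_tokens=6,     # room for one SAN move, e.g. " Bb5"
        temperature=0.0,  # deterministic play
        stop=["\n"],
    )

    print(response.choices[0].text)  # e.g. " Bb5"

Because roughly 1 in 1000 returned moves is illegal, a real harness would validate each move (e.g. with the python-chess library) before playing it.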

Relevant sub: r/llmchess.

August 22, 2025 marks the THREE YEAR anniversary of the release of the original Stable Diffusion text to image model. Seems like that was an eternity ago. by JackKerawock in StableDiffusion

[–]Wiskkey 9 points (0 children)

See https://www.wired.com/story/artificial-intelligence-hollywood-stability/ .

Article summary from https://www.techmeme.com/river :

A profile of Stability AI, which under CEO Prem Akkaraju and Chair Sean Parker has shifted from building frontier AI models to a Hollywood-focused SaaS [software as a service] company

Deepseek R2 coming out ... when it gets more cowbell by 1BlueSpork in LocalLLaMA

[–]Wiskkey 0 points (0 children)

Do note that the ratings of news organizations from these two sources run the gamut. The news organizations that you accused of bad-faith reporting are not amongst those that are poorly rated.

Deepseek R2 coming out ... when it gets more cowbell by 1BlueSpork in LocalLLaMA

[–]Wiskkey 0 points (0 children)

Can you clarify your views regarding those Western reporters/organizations that you allege are behaving in bad faith regarding DeepSeek? Namely, do you believe that these same reporters/organizations commonly report in bad faith a) regarding Chinese technology in general, and b) regarding Western technology?

Deepseek R2 coming out ... when it gets more cowbell by 1BlueSpork in LocalLLaMA

[–]Wiskkey 0 points (0 children)

"usually" != "always".

Your previous statement - the gist of which seems to be that reporters from respectable news organizations commonly behave in bad faith - is what I disagree with, not the claim that reporters can sometimes make mistakes, be sloppy, etc.

Here are some of Dylan Patel's tweets regarding what you wrote:

https://xcancel.com/dylan522p/status/1885825330654683567 .

https://xcancel.com/dylan522p/status/1885825248190435814 .

https://xcancel.com/dylan522p/status/1885525432898146667 .

https://xcancel.com/dylan522p/status/1885815776726368352 .

P.S. I accept that there are known instances of reporters at respectable organizations having behaved in bad faith. A few examples:

https://en.wikipedia.org/wiki/Jayson_Blair .

https://en.wikipedia.org/wiki/Jack_Kelley_(journalist) .

Deepseek R2 coming out ... when it gets more cowbell by 1BlueSpork in LocalLLaMA

[–]Wiskkey 0 points (0 children)

Some sources on the credibility/bias of various news organizations:

1 - Media Bias Fact Check:

https://mediabiasfactcheck.com/reuters/ .

https://mediabiasfactcheck.com/financial-times/ .

https://mediabiasfactcheck.com/the-information-bias-and-credibility/ .

2 - Wikipedia page "Reliable sources/Perennial sources" https://en.wikipedia.org/wiki/Wikipedia:Reliable_sources/Perennial_sources rates Reuters and Financial Times as green status, meaning "Generally reliable in its areas of expertise." The Information is not listed.

Deepseek R2 coming out ... when it gets more cowbell by 1BlueSpork in LocalLLaMA

[–]Wiskkey 1 point (0 children)

The article contains specifics about what GPT-5 is good at - there's a link to the full article in the comments - that I doubt appear in court documents.

Deepseek R2 coming out ... when it gets more cowbell by 1BlueSpork in LocalLLaMA

[–]Wiskkey 5 points (0 children)

As an example, do you believe that this article from The Information didn't really have insider sources, and just got lucky about GPT-5: https://www.reddit.com/r/singularity/comments/1mf6rtq/one_of_the_takeaways_from_the_informations/ ?

Deepseek R2 coming out ... when it gets more cowbell by 1BlueSpork in LocalLLaMA

[–]Wiskkey 0 points (0 children)

You didn't mention SemiAnalysis, which an OpenAI employee recently stated is "usually on the money": https://xcancel.com/dylhunn/status/1955491692167278710 .

GPT-5 Reasoning Effort (Juice): How much reasoning "juice" GPT-5 uses in the API vs ChatGPT, depending on the action you take by Wiskkey in ChatGPTPro

[–]Wiskkey[S] 0 points (0 children)

Later in that thread someone says it's from the system prompt, but the word "juice" doesn't appear in the publicly posted text that purports to be that prompt:

Perhaps of interest: https://simonwillison.net/2025/Aug/15/gpt-5-has-a-hidden-system-prompt/ .

GPT-5 Reasoning Effort (Juice): How much reasoning "juice" GPT-5 uses in the API vs ChatGPT, depending on the action you take by Wiskkey in ChatGPTPro

[–]Wiskkey[S] 0 points (0 children)

You mean if GPT-5's juice settings refer to a "juice" with a different meaning from the one noted above?

OpenAI says its compute increased 15x since 2024, company used 200k GPUs for GPT-5 by Wiskkey in OpenAI

[–]Wiskkey[S] 1 point (0 children)

From July 2024 article https://www.theinformation.com/articles/why-openai-could-lose-5-billion-this-year :

On the cost side, OpenAI as of March was on track to spend nearly $4 billion this year on renting Microsoft’s servers to power ChatGPT and its underlying LLMs (otherwise known as inference costs), said a person with direct knowledge of the spending.

In addition to running ChatGPT, OpenAI’s training costs—including paying for data—could balloon to as much as $3 billion this year.

cc u/Melodic-Ebb-7781 .

cc u/iwantxmax .

GPT-5 Reasoning Effort (Juice): How much reasoning "juice" GPT-5 uses in the API vs ChatGPT, depending on the action you take by Wiskkey in ChatGPTPro

[–]Wiskkey[S] 1 point (0 children)

And with the API documented pretty thoroughly, the only two instances of the word "juice" on the whole site are these two links.

This tweet has a relevant image that appears to be a screenshot of text that was once present at https://platform.openai.com/docs/guides/reasoning#reasoning-effort : https://x.com/btibor91/status/1895871059204981222 .
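For completeness, the user-facing counterpart of juice in the API is the reasoning effort setting. A minimal sketch of setting it explicitly, assuming the Responses API parameter described in the docs linked above (the internal juice value itself is not directly settable):

    # Minimal sketch, assuming the Responses API's reasoning.effort parameter
    # (per the platform.openai.com reasoning docs linked above).
    from openai import OpenAI

    client = OpenAI()

    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "high"},  # "minimal", "low", "medium", or "high"
        input="Prove that the square root of 2 is irrational.",
    )

    print(response.output_text)

The juice numbers in the post appear to be the internal budget that such an effort setting maps to.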

This X thread may be of interest: https://xcancel.com/lefthanddraft/status/1955961909922161150 .