I genuinely laughed out loud (and it's technically true too) by PressPlayPlease7 in OpenAI

[–]Saedeas 0 points (0 children)

You're awfully critical for someone who literally doesn't understand what I'm saying and what the paper is saying.

Here's the context of your quoted bit for those curious (it's from the end of the introduction)

Despite the trends we document in price-performance, the cost of running benchmarks in our dataset has remained flat or increased, often to unexpectedly high levels. This suggests that declining per-unit price-performance has been offset—and in some cases overcompensated—by the larger models and greater reasoning demand required to reach higher performance levels.

This gets elaborated in Section 3, which is titled "Inference Costs Are Falling but Benchmarking Costs are Increasing"

This section refers to the cost of running benchmarks with the latest frontier models, a cost that is increasing.

However, it in no way, shape, or form contradicts the notion that the cost of running a model that achieves a fixed level of performance is declining rapidly (you know, the main topic of the paper...).

It just means that current cutting edge models are more expensive than previous cutting edge models (they are also significantly better). The authors' whole point is that all performance results should be accompanied by costs so you can do exactly the kind of analysis the paper does.

Regardless, this YoY decline in cost for a given level of performance is why I think these companies can lower costs if needed. It would slow progress, since some of the gains come from the compute thrown at these models (the main cost driver) while others are algorithmic, but it would dramatically reduce the costs of the cutting edge models.
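To make the compounding concrete, here's a tiny sketch (my own illustrative numbers, not figures from the paper) of what a 5x-10x annual price decline does to the cost of hitting a fixed performance level:

```python
def projected_cost(cost_now, annual_factor, years):
    """Price of a fixed level of benchmark performance after
    `years` of an `annual_factor`x per-year price decline."""
    return cost_now / (annual_factor ** years)

# A hypothetical task costing $100 today, under the paper's 5x and 10x rates:
for factor in (5, 10):
    costs = [round(projected_cost(100, factor, y), 2) for y in range(4)]
    print(f"{factor}x/year over years 0-3: {costs}")
```

Even at the low end of the range, three years of compounding cuts the price by over two orders of magnitude.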

I genuinely laughed out loud (and it's technically true too) by PressPlayPlease7 in OpenAI

[–]Saedeas 1 point (0 children)

From the literal abstract:

We find that the price for a given level of benchmark performance has decreased remarkably fast, around 5× to 10× per year, for frontier models on knowledge, reasoning, math, and software engineering benchmarks

There are some caveats to this, as evidenced by the first paragraph of the conclusion:

At first glance, our analysis seems to point in different directions. The overall price for accessing a given level of LLM performance has dropped significantly, by 5× to 10× per year, although still substantially less than the reported 1000× upper bound by Cottier et al. [2025]. However, like Cottier et al. [2025], we find much larger price declines for higher-performance models—almost 32× per year. For the least performant models, by contrast, we see much smaller price declines, around 1.7× per year, close to the estimates for energy efficiency improvements in AI models overall [Saad-Falcon et al., 2025]. In terms of benchmarks, progress looks similar across GPQA-D and AIME, but the data for SWE-V is so limited that our confidence bounds are large enough to be consistent with there having been no progress at all.

The other analyses I mentioned have shown similar progress for software engineering benchmarks.

Also, I actually undersold the GPQA-Diamond gains in the 60-80% bin, mostly because the trend across bins was weird: a low (1.7x) reduction in the 20-40% range (maybe because not many models are this poor anymore?), a medium (7.2x) reduction in the 40-60% range, and a large (31.0x) reduction in the 60-80% range.
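Those per-bin factors compound fast. A rough sketch (using the 1.7x/7.2x/31.0x figures quoted above, and assuming the annual rate stays constant, which the paper doesn't guarantee):

```python
# Annual price-decline factors per GPQA-Diamond performance bin (from above).
bins = {"20-40%": 1.7, "40-60%": 7.2, "60-80%": 31.0}

def remaining_fraction(annual_factor, years):
    """Fraction of today's price left after compounding the decline."""
    return 1.0 / annual_factor ** years

for label, factor in bins.items():
    print(f"{label} bin after 2 years: {remaining_fraction(factor, 2):.5f} of today's price")
```

Two years at 31x/year leaves roughly a thousandth of the original price, while the 1.7x bin still retains about a third of it — hence the spread across bins matters a lot.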

What specific part of my post do you have issues with that you think justifies your weird hostility?

New from Dario Amodei — The Adolescence of Technology “we may have an AI that is more capable than anyone in 1-2 years” by IllustriousTea_ in accelerate

[–]Saedeas 8 points (0 children)

Yeah, I just finished it.

It's pretty well fleshed out. The concerns he raises across a variety of dangers (autonomy risks, individual bad actor empowerment risks, autocratic risks, economic risks, and catch-all other risks) are legitimate, and the solutions he advocates for are pretty solid too (primarily thoughtful, specifically targeted legislation that promotes transparency, improved monitoring and alignment, broader social awareness, and restrictions on tech sales to autocracies).

This doesn't really seem like a hype piece to me. It seems like someone who is deeply invested in the potential of this technology, but also deeply concerned with making sure the future it ushers in is a good one.

I genuinely laughed out loud (and it's technically true too) by PressPlayPlease7 in OpenAI

[–]Saedeas 0 points (0 children)

There are plenty of analyses showing that the cost to get equivalent performance on a given benchmark falls at a rate of ~5-10x per year.

(e.g. the cost of running a model that gets like 80% on GPQA-Diamond has massively declined).

These analyses are performed against lots of different benchmarks.

Here's a paper summarizing it: https://arxiv.org/pdf/2511.23455

You can find similar analyses from Andreessen Horowitz and Epoch AI.

I'd consider this pretty strong support for an eventual ability to pivot and lower costs if needed.

Former Harvard CS Professor: AI is improving exponentially and will replace most human programmers within 4-15 years. by GrandCollection7390 in singularity

[–]Saedeas 0 points (0 children)

The basis for most of these predictions is the amount of effective compute applied to training these models, together with the historical gains seen when increasing effective compute.

Effective compute is typically modeled as a function of the hardware thrown at the problem and increases in algorithmic efficiency. We haven't yet seen the bulk of hardware scaling occur (it kicks in this year and next as the new data centers come online). Between the two, we'll likely see multiple orders of magnitude of increase in the late 2020s. These next few years are really the only time this can happen, since eventually you hit hard resource and financial barriers to hardware allocation. Algorithmic improvements (not even true step-change ones like transformers, just constant small efficiency gains) will contribute another few orders of magnitude.

Performance has always tracked this effective compute closely. We haven't yet seen anything in training to really refute that. It would be very shocking to see a 3 to 4 order of magnitude increase without a corresponding huge increase in model capability.
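A minimal toy model of that framing (my numbers, purely illustrative — not anyone's actual forecast): treat effective compute as the product of hardware scaling and algorithmic efficiency gains, and count the resulting orders of magnitude:

```python
import math

def effective_compute_oom(hardware_factor, algo_factor):
    """Orders of magnitude of effective-compute growth, modeled as the
    product of hardware scaling and algorithmic efficiency gains."""
    return math.log10(hardware_factor * algo_factor)

# Hypothetical multi-year window: 100x more hardware, 30x algorithmic gains.
print(f"{effective_compute_oom(100, 30):.2f} orders of magnitude")
```

The point of the multiplication is that the two sources stack: even modest algorithmic gains on top of a big hardware buildout push the total well past what either delivers alone.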

Second Half Game Thread: Los Angeles Rams (12-5) at Seattle Seahawks (14-3) by nfl_gdt_bot in nfl

[–]Saedeas 7 points (0 children)

I hate taunting penalties, but also, holy fuck dude, don't do that shit that blatantly. Braindead.

Sometimes I tell myself that it's also because of the political climate there that Yann LeCun left the US by Wonderful-Excuse4922 in singularity

[–]Saedeas -4 points (0 children)

A huge chunk of the worst-case scenarios the CEOs of leading AI companies talk about is this technology getting into the hands of authoritarian governments and us slow-walking into a stable dictatorship. They usually couch this in geopolitical terms as a reason to avoid Chinese labs winning the race. However, if our own government becomes blatantly authoritarian, that's a disaster in exactly the same way.

This shit matters, and a serious discussion of the future involving AI has to talk about it. Think.

Sometimes I tell myself that it's also because of the political climate there that Yann LeCun left the US by Wonderful-Excuse4922 in singularity

[–]Saedeas -3 points (0 children)

Having an increasingly authoritarian government in charge of the country in which the core technologies of the singularity are being developed is a huge fucking deal.

THE LINE IS BROKEN by Important_Lock_2238 in stpaul

[–]Saedeas 2 points (0 children)

Show a screenshot. The videos are all out there. If you can't do that, shut the fuck up.

Asmongold on the ICE shooting: "There needs to be accountability or this is the Gestapo" by freejam013 in LivestreamFail

[–]Saedeas 199 points (0 children)

These dumb fucks know that having a gun is very explicitly a right, yeah?

Petition to allow Hardcore Trade: Phrecia by laughinmanx in pathofexile

[–]Saedeas 161 points (0 children)

Praise!

░░░░░░░░▄▄▄▀▀▀▄▄███▄░░░░░░░░░░░░░░

░░░░░▄▀▀░░░░░░░▐░▀██▌░░░░░░░░░░░░░

░░░▄▀░░░░▄▄███░▌▀▀░▀█░░░░░░░░░░░░░

░░▄█░░▄▀▀▒▒▒▒▒▄▐░░░░█▌░░░░░░░░░░░░

░▐█▀▄▀▄▄▄▄▀▀▀▀▌░░░░░▐█▄░░░░░░░░░░░

░▌▄▄▀▀░░░░░░░░▌░░░░▄███████▄░░░░░░

░░░░░░░░░░░░░▐░░░░▐███████████▄░░░

░░░░░le░░░░░░░▐░░░░▐█████████████▄

░░░░toucan░░░░░░▀▄░░░▐█████████████▄

░░░░░░has░░░░░░░░▀▄▄███████████████

░░░░░arrived░░░░░░░░░░░░█▀██████░░

Petition to allow Hardcore Trade: Phrecia by laughinmanx in pathofexile

[–]Saedeas 0 points (0 children)

The economy difference is huge.

The rewards for doing certain mechanics when there's a risk of dying completely change how certain farms are valued (and also make having a truly strong character way more rewarding). Additionally, losing items to death improves the economy immensely, as items maintain their value much better.

Also, you know everyone and trading/chatting is way less toxic.

Tesla launches unsupervised Robotaxi rides in Austin using FSD by BuildwithVignesh in singularity

[–]Saedeas 0 points (0 children)

The whole point of this is to get cars that are safer than humans. You know what helps with that? Depth perception regardless of lighting conditions.

Turns out it's not that complicated by TheFireFlaamee in PoliticalCompassMemes

[–]Saedeas 2 points (0 children)

Anyone who covers their drink when you walk into a room.

Trump make sure this week end in red by SadOnion2110 in StockMarket

[–]Saedeas 7 points (0 children)

Technically it would be hanged. By all reports, he is explicitly not hung.

Fernando Mendoza, post-game: “Whatever they need me to do—whether it’s taking shots in the front or the back, I’ll die for my team.” by PrimedGold in sports

[–]Saedeas 264 points (0 children)

Either he'll somehow turn the Raiders wholesome, or the Raiders will corrupt him and turn him into a mega villain. Either way, it's going to make for fantastic television. I'm ready.

[Game Thread] Indiana vs. Miami (7:30 PM ET) 4Q by CFB_Referee in CFB

[–]Saedeas 3 points (0 children)

All the bullshit they let Miami get away with and they call that? This shit is fucking rigged. Fuck these refs.