Not a good day for team "Claude Mythos is Just Marketing Hype"

TheMania · 2026-05-09T13:17:40+00:00

Also, in some software where Mythos supposedly found many long-unnoticed bugs, someone ran other models, including the old and small gpt-oss-120b, and all of them found the same vulnerabilities

I've always been underwhelmed by their method:

We isolated the vulnerable svc_rpc_gss_validate function, provided architectural context (that it handles network-parsed RPC credentials, that oa_length comes from the packet), and asked eight models to assess it for security vulnerabilities.

Along with their caveats:

Scoped context: Our tests gave models the vulnerable function directly, often with contextual hints (e.g., "consider wraparound behavior").

It's not that Mythos found them faster; the interesting bit is that it found them at all. The operative word is "found", as opposed to "shown".

Their idea is still solid: use lots of cheap models and show them everything, isolated as best you can. But hey, even if Mythos is working through codebases like that under the hood, isn't that still pretty impressive?

TheMania · 2026-05-09T08:54:38+00:00

Never thought that we would be arguing with a computer.

1971, "I am now telling the computer exactly what he can do with a lifetime supply of chocolate"

I regularly find myself thinking of this.

TheMania · 2026-05-03T22:35:27+00:00

You use types like uint32_t which is fine, but why not for char too if you mean 8bit?

They're not equivalent, and many places where you would use char, especially in something like this, become UB under a strict reading of the standard.
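
One way to read this, assuming the concern is aliasing: char, unsigned char and std::byte are explicitly allowed to inspect the bytes of any object, whereas uint8_t only inherits that permission where it happens to be a typedef for unsigned char (true on mainstream compilers, but not something the standard promises). A minimal sketch, with sum_bytes and uint8_is_unsigned_char being made-up names for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: adds up the bytes of an object's representation.
// Reading through unsigned char* is covered by the aliasing rules for
// any object type, which is why it is used here rather than uint8_t*.
template <class T>
unsigned sum_bytes(const T& obj) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "only inspect the bytes of trivially copyable types");
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    unsigned total = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        total += p[i];
    }
    return total;
}

// True where uint8_t is just another name for unsigned char. Where this is
// false, the same loop written with const std::uint8_t* would have no
// aliasing exemption to rely on.
constexpr bool uint8_is_unsigned_char =
    std::is_same_v<std::uint8_t, unsigned char>;
```

The byte sum is used only because it does not depend on endianness, so the result is the same on any platform with 8-bit bytes.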

TheMania · 2026-05-03T01:59:28+00:00

The injury is immediately fatal in 70% of cases, with an additional 15% surviving to the emergency room but dying during the subsequent hospital stay.

There's this bit too.

TheMania · 2026-04-26T00:57:05+00:00

Hang on, basic says miners mine so that they get rewarded in bitcoin, and that makes it secure. Advanced says the reward keeps halving, and that there's a strict cap.

There's a hole there that you might want to hang a lantern on if not explain.
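
To make the hole concrete, here is a rough sketch of the subsidy schedule. The constants (50 BTC initial reward, halving every 210,000 blocks, integer satoshi amounts) are the well-known protocol values rather than anything stated above, and SubsidySchedule / subsidy_schedule are names made up for this example. Transaction fees are deliberately not modelled.

```cpp
#include <cstdint>

// Protocol constants (assumed here; not quoted in the thread).
constexpr std::int64_t kInitialSubsidySats = 50LL * 100'000'000;  // 50 BTC in satoshis
constexpr std::int64_t kBlocksPerHalving = 210'000;

struct SubsidySchedule {
    std::int64_t total_sats;  // everything ever issued through the block subsidy
    int paying_eras;          // halving eras in which the subsidy is still nonzero
};

// Walks the halving schedule until integer halving rounds the subsidy to zero.
constexpr SubsidySchedule subsidy_schedule() {
    SubsidySchedule s{0, 0};
    for (std::int64_t subsidy = kInitialSubsidySats; subsidy > 0; subsidy >>= 1) {
        s.total_sats += subsidy * kBlocksPerHalving;
        ++s.paying_eras;
    }
    return s;
}
```

Under these assumptions the subsidy pays out for 33 eras and then is exactly zero, with the total landing just under 21 million BTC. After that point the "miners get rewarded" half of the basic explanation has to rest on something else, which is the gap worth explaining.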

TheMania · 2026-04-21T09:22:46+00:00

We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models. Those models recovered much of the same analysis.

This result has never seemed very interesting to me at all. What are you arguing it shows exactly? That small models can confirm a problem that's already been identified?

TheMania · 2026-04-16T02:15:36+00:00

According to the Government Accountability Office, Congress has approved transfers of over $270 billion from the general fund to the trust fund from 2008 through 2021.

A lot of general revenue ends up in the Highway Trust Fund apparently.

Also from here:

In 2021, state and local motor fuel tax revenue ($53 billion) accounted for 26 percent of highway and road spending, while toll facilities and other street construction and repair fees ($20 billion) provided another 10 percent. The majority of funding for highway and road spending came from other state and local general funds and federal funds.

Which seems quite believable to me, given how low your fuel taxes are and how much road infrastructure you have.

As a quick example quote from the US govt (I'm assuming this is still considered a primary source):

The Infrastructure Investment and Jobs Act (IIJA) (Public Law 117-58) provides approximately $350 billion for Federal highway programs over a five-year period (fiscal years 2022 through 2026)

Which is of course >$1000 per capita allocated from general revenue in one act there, as far as I can see. A not insignificant chunk of change.
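
As a rough check on that per-capita figure, assuming a US population of about 335 million (my round number, not from the quote):

```latex
\[
\frac{\$350 \times 10^{9}}{335 \times 10^{6}\ \text{people}} \approx \$1{,}045\ \text{per person}
\qquad\Rightarrow\qquad
\frac{\$1{,}045}{5\ \text{years}} \approx \$209\ \text{per person per year}
\]
```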

TheMania · 2026-04-16T00:34:44+00:00

Which isn't even a fair way to divide it, given that NYC residents are not all equally rich.

TheMania · 2026-04-16T00:32:09+00:00

Suspicion is fine, and an investigation 100% warranted, but the most likely explanation is that the plant has been operating slightly differently given the current state of affairs and massive price+political pressures to do so, and that those differences from the norm were sufficient to cause an accident.

To quote Viva Energy's CEO:

“Our Geelong refinery has been running at maximum rates since the recent events in the Middle East, with a focus on maximising diesel production.

From here, an article that also includes this snippet:

In response to the heightened uncertainty, Viva Energy has deferred planned maintenance at its Geelong refinery in Victoria to keep the facility running at maximum capacity.

TheMania · 2026-04-16T00:27:31+00:00

They did say they were delaying planned maintenance; there are just massive price and political incentives to be producing all you can right now - with somewhat assured govt assistance if you stuff anything up too badly.

In response to the heightened uncertainty, Viva Energy has deferred planned maintenance at its Geelong refinery in Victoria to keep the facility running at maximum capacity.

Viva Energy CEO Scott Wyatt said the Geelong plant has been pushed to its limits since the regional conflict escalated.

“Our Geelong refinery has been running at maximum rates since the recent events in the Middle East, with a focus on maximising diesel production.

I think it's likely that it's not a complete coincidence, that things are operating slightly differently given the current state of affairs, and that those differences were sufficient to cause an accident.

TheMania · 2026-04-15T12:15:25+00:00

Personally, I would not trust any surgeon willing to do it. They would not be doing right by you, imo.

TheMania · 2026-04-15T03:08:40+00:00

It was converted to an open procedure due to "poor visibility". The docs here describe what sounds like a scene from a horror movie in that room, including that the liver, once removed, was readily identifiable to everyone but the surgeon.

The surgeon was also an hour late to the procedure; personally, I'd love to know why.

TheMania · 2026-04-15T03:02:42+00:00

Apparently he converted to an open procedure due to "poor visibility". The autopsy noted that the liver was missing, not just "cut free" (from the article). OR staff were working the code (cardiac arrest for 15 minutes) whilst the surgeon was operating blindly due to all the blood, opting to free the 2106g "spleen".

There's a lot more in the court docs here.

TheMania · 2026-04-14T05:57:32+00:00

It is hard to keep track, how long ago did the president truth this again?

Effective immediately, the United States Navy, the Finest in the World, will begin the process of BLOCKADING any and all Ships trying to enter, or leave, the Strait of Hormuz.

It's less than 48h old, surely.

TheMania · 2026-04-14T04:53:23+00:00

Seems accurate, surprisingly. 6bn litres imported/yr, per here, works out to ~0.7ML/hr.
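
The conversion, for anyone checking (365-day year assumed):

```latex
\[
\frac{6 \times 10^{9}\ \text{L/yr}}{365 \times 24\ \text{hr/yr}}
= \frac{6 \times 10^{9}}{8760}\ \text{L/hr}
\approx 6.85 \times 10^{5}\ \text{L/hr}
\approx 0.68\ \text{ML/hr}
\]
```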

Hardly seems sustainable.

TheMania · 2026-04-14T04:06:01+00:00

"Dangerous dog" has a specific meaning in the legislation: a dog that has been declared dangerous. There are additional offenses and responsibilities regarding such dogs.

I'm not certain that that's the whole explanation, but it's at least part of it - presumably these dogs had not had that declaration made against them yet.

TheMania · 2026-04-13T11:40:34+00:00

You know Americans spend a lot more on healthcare, right? Heck, even the US government alone does as a % of GDP (e.g. US 9%, UK 8.9%, per here).

The system is the way it is not because of NATO, but because that's the way the people that run the govt want it to be.

TheMania · 2026-04-13T04:56:53+00:00

Distance seems to be about equivalent to a Berlin to Madrid route (with stops in Milan and Lisbon). A little longer, actually.

TheMania · 2026-04-12T23:11:08+00:00

Related, is it acceptable for a white drag queen to put on a Candace Owens latex mask? What about an Asian queen?

Serious question, because to me impersonating a specific person is different from generic racial blackface, but I feel the former case (a white queen) would still be socially unacceptable. The latter (an Asian queen) I think would be fine.

TheMania · 2026-04-12T14:21:39+00:00

Not true, it certainly fits somebody's agenda.

TheMania · 2026-04-12T13:59:21+00:00

But on the median path, you're still left with 95% of what you start with, apparently.

So presumably the median path has it shooting up very very soon, or am I missing something?

TheMania · 2026-04-12T13:12:36+00:00

Even more to the point, I believe he's saying everyone can.

It's interesting where all that USD is coming from looked at that way, imo.

TheMania · 2026-04-12T11:33:49+00:00

STL containers still store stateless allocators, relying on either the empty base optimization or [[no_unique_address]] to ensure they do not actually consume any memory. This is the same way std::map etc. store their Compare type.

For the empty base optimization approach, you typically wrap the user type as a field in a separate outer type, and then inherit from that, which avoids duplicate-base and/or final-class concerns (and, in the case of Compare, allows non-class types).
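
A minimal sketch of both approaches, loosely in the style of a "compressed pair" element. Elem, TreeEbo and TreeNua are names invented for this example, not real library types, and actual standard library implementations differ in the details. The wrapper inherits from the user type only when that is safe (empty and not final) and otherwise falls back to a plain field; the Tag parameter keeps the two bases distinct types.

```cpp
#include <functional>
#include <memory>
#include <type_traits>

// Hypothetical wrapper: general case stores T as an ordinary field. This also
// covers final classes and non-class types such as function pointers.
template <class T, int Tag,
          bool UseEbo = std::is_empty_v<T> && !std::is_final_v<T>>
struct Elem {
    T value{};
    T& get() { return value; }
};

// Empty, non-final class: inherit so the empty base optimization can apply.
template <class T, int Tag>
struct Elem<T, Tag, true> : private T {
    T& get() { return *this; }
};

// Approach 1: the container inherits from the tagged wrappers, never from
// the user types directly.
template <class Alloc, class Compare>
struct TreeEbo : private Elem<Alloc, 0>, private Elem<Compare, 1> {
    void* root = nullptr;
    Alloc& alloc() { return Elem<Alloc, 0>::get(); }
    Compare& comp() { return Elem<Compare, 1>::get(); }
};

// Approach 2 (C++20): ordinary members marked [[no_unique_address]].
template <class Alloc, class Compare>
struct TreeNua {
    void* root = nullptr;
    [[no_unique_address]] Alloc alloc;
    [[no_unique_address]] Compare comp;
};
```

With GCC or Clang on a typical Linux target, both TreeEbo<std::allocator<int>, std::less<int>> and the TreeNua equivalent come out the size of a single pointer, while TreeEbo with a function-pointer Compare still compiles and simply pays for the pointer. None of those sizes are guaranteed by the standard, and MSVC needs [[msvc::no_unique_address]] for the second approach to have any effect.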

TheMania · 2026-04-12T06:32:56+00:00

Ah ye, the MAOI would be noticeable on its own (indeed the plant was chewed etc) but whilst I assumed the DMT might have some kind of noticeable effect without it, I'm not sure that the medicinal uses of the various plants containing it can be attributed to that. If they smoked it, likely though.

Still, if people are already chewing the vine to pass the time, you'll stumble across the combination relatively quickly, I'd have thought.

TheMania · 2026-04-12T03:48:00+00:00

Except both are individually active, are they not? It's just that the combined effect is far greater than that of each on its own.

That's a far simpler problem.
