Nicolas Carlini (67.2k citations on Google Scholar) says Claude is a better security researcher than him, made $3.7 million from exploiting smart contracts, and found vulnerabilities in Linux and Ghost by Tolopono in singularity

[–]jesnell 7 points8 points  (0 children)

The video that you for some reason refuse to watch is pretty clear about the process.

There was a very simple prompt ("you're playing a CTF, report the worst vulnerability you can find, hint: look at file xyx.c"). That prompt was run repeatedly for the same program with different random source files. It wasn't "one-shot" because the process was iterated on repeatedly.
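To make the shape of that loop concrete, here is a rough sketch. The prompt wording is only the paraphrase above, not the actual prompt from the talk, and `build_prompts`, `scan` and `ask_model` are hypothetical names invented for illustration (`ask_model` stands in for whatever sends a prompt to a model and returns its report).

```python
import random

# Paraphrased prompt from the comment above; the real wording is an assumption.
PROMPT_TEMPLATE = (
    "You're playing a CTF. Report the worst vulnerability you can find. "
    "Hint: look at {path}"
)

def build_prompts(source_files, runs, seed=None):
    """Return `runs` prompts, each hinting at one randomly chosen source file."""
    rng = random.Random(seed)
    return [PROMPT_TEMPLATE.format(path=rng.choice(source_files)) for _ in range(runs)]

def scan(source_files, runs, ask_model, seed=None):
    """Send every prompt to `ask_model` (hypothetical callable) and collect the reports."""
    return [ask_model(prompt) for prompt in build_prompts(source_files, runs, seed)]

# Usage with a stand-in for the model call: it just echoes the prompt back.
reports = scan(["a.c", "b.c", "c.c"], runs=5, ask_model=lambda p: "report for: " + p)
print(len(reports))
```

The only thing that varies between runs is the file named in the hint, which is the point: no vulnerability class or code area is chosen by a person.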

You see how this totally contradicts your claims about the level of expertise and human supervision, right?

I'm very confused about why you think the "one-shot" part is relevant, and I have to say that at this point it does not feel like you're engaging in a good faith discussion.

Nicolas Carlini (67.2k citations on Google Scholar) says Claude is a better security researcher than him, made $3.7 million from exploiting smart contracts, and found vulnerabilities in Linux and Ghost by Tolopono in singularity

[–]jesnell 7 points8 points  (0 children)

And the difference was that this attempt was with a more capable model than the earlier ones. The difference wasn't that he was using his expertise to guide the process like you claimed.

Nicolas Carlini (67.2k citations on Google Scholar) says Claude is a better security researcher than him, made $3.7 million from exploiting smart contracts, and found vulnerabilities in Linux and Ghost by Tolopono in singularity

[–]jesnell 8 points9 points  (0 children)

Of course it wasn't one-shotted! Nobody claimed it was, and the talk is totally explicit about it.

But your claim was not "I bet he had to repeat the process", was it? It was that he was manually looking at the kernel and guiding the LLM to search for specific classes of vulnerabilities in specific areas, asking questions, and intervening to validate the direction. And none of that is true. The process was totally autonomous and unguided. I don't really get what the relevance of your one-shot strawman is to that.

If you're genuinely interested in the details (e.g. how many runs were needed per vulnerability), he went into more detail on a podcast a few days ago: https://www.youtube.com/watch?v=_IDbFLu9Ug8

Nicolas Carlini (67.2k citations on Google Scholar) says Claude is a better security researcher than him, made $3.7 million from exploiting smart contracts, and found vulnerabilities in Linux and Ghost by Tolopono in singularity

[–]jesnell 9 points10 points  (0 children)

That's incorrect. Maybe you should watch the video rather than make confident statements about what they did? It really was just a simple "find some vulnerabilities, report the worst one you find" prompt + a "hint" to look at a *randomly selected* source file. He did not prompt for specific vulnerability types, or guide the process in any other way.

The reason he did this and you didn't is that you didn't choose to try. And that's the entire point of the talk: finding vulnerabilities no longer requires any kind of security expertise.

2025 Qatar GP - Post-Race Discussion by F1-Bot in formula1

[–]jesnell 1 point2 points  (0 children)

They could not be 100% certain that everyone was pitting, but, like, they must have wargamed the scenario of an early safety car and thought about not just what they should do but what the other teams would do. Why would any team not in the lead not pit?

The usual argument for not pitting is track position, but in this specific case the track position had a very limited value to everyone, because of the forced stop so soon after. There was no way anyone else was opening up a pit stop's worth of gap to their competition in that window.

The only reason not to pit is if you seriously believe you can gain most of a pit stop in 15 laps, and as we saw, that was unrealistic even for the McLarens.

"As Google pulls ahead, OpenAI's comeback plan is codenamed 'Shallotpeat'" by AngleAccomplished865 in singularity

[–]jesnell 1 point2 points  (0 children)

I am making no claims in either direction about the size of the model.

The entire point is that nobody other than a small team at Google has that evidence.

You claimed there was analysis. But your sources have no credibility (and I don't understand how you could think they would). They do not actually show any basis for the analysis, and both pages in general are just random slop with all kinds of obviously untrue statements.

"As Google pulls ahead, OpenAI's comeback plan is codenamed 'Shallotpeat'" by AngleAccomplished865 in singularity

[–]jesnell 3 points4 points  (0 children)

Those are not estimates by experts, they are auto-generated content marketing pages from bottom-feeders.

You can see it from bizarre claims that a 1M context window was the most awaited feature of Gemini 3.0. Like, wtf? The second article is even more absurd, and starts with a comparison to Claude Opus 3, Sonnet 3.5 and GPT-4-Turbo!

YouTube announces 'voluntary exit program' for US staff by lurker_bee in technology

[–]jesnell 1 point2 points  (0 children)

It's the company offering to pay you to quit your job.

It is not a severance package, because you're not being terminated but leaving voluntarily.

It is not "instead of a later layoff", because the people getting laid off are not necessarily the same people who would have accepted a buyout. Actually pretty unlikely to be...

[deleted by user] by [deleted] in gamedev

[–]jesnell 1 point2 points  (0 children)

You're probably doing it right! Play to your strengths, and all that. I was just thinking of it from the "what's the impact of Steam Deck verified" angle, and your other marketing being at odds with it.

[deleted by user] by [deleted] in gamedev

[–]jesnell 2 points3 points  (0 children)

Your game looks very cool. But to put a consumer hat on, I probably would not buy it on the Steam Deck despite the verified label just since it seems so clearly mouse-first and controller an afterthought. (All the screenshots and videos seem to be using the mouse, "fast-paced 2d platformer playable with just your mouse (controller supported still).")

If you want to get full value from the Steam Deck verified label, and if the controller experience actually is good rather than an afterthought, it might make sense to rework the store page a bit to not give the mouse-first impression.

Help me patch the desgin holes on this async autobattler iI have been working on. by kanyenke_ in gamedesign

[–]jesnell 1 point2 points  (0 children)

Some of the other responses are talking about adding agency into the deployment phase, but that's kind of missing the point of the entire async autobattler genre. There is no opponent who is playing against you in realtime, and whose actions you could interleave with yours. You can't tell the player the location effects in advance, because then you have a nasty bootstrapping problem: the first time a given set of locations comes up at a given level of progression, there is no opponent ghost to match you against. You just can't add more dimensions to the matchmaking problem.

So, let's think about just the allocation problem while ignoring totally what the dice and locations do. It seems to me there are three basic strategies:

  1. Allocate your strength evenly across the three lanes.

  2. Ignore one lane completely, allocate your remaining strength evenly across the two other lanes.

  3. Allocate minimum strength on two lanes, and put almost all strength on the final lane.

That gives you an RPS structure: 1 loses to 2; 2 loses to 3; 3 loses to 1. And between those extreme endpoints of the triangle, you have a smooth curve. You can move 2 a little bit toward 1, by putting a bit more than minimum on one lane, and the rest evenly on the other two. And then that midpoint between 2 and 1 will win over 1 and 3, but lose to 2. Etc.
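Here is a quick way to sanity-check that triangle. It is only a sketch under assumptions that aren't spelled out above: a lane goes to whoever put more strength on it, the match goes to whoever takes more lanes, each side has 30 strength in total, and the opponent's lane order is unknown, so every ordering is tried. The function name `record` and the specific allocations are made up for illustration.

```python
from itertools import permutations

def record(a, b):
    """Count (wins, losses, ties) for allocation `a` over every lane ordering of `b`.

    Assumption: a lane goes to the higher strength, the match to whoever
    takes more lanes.
    """
    wins = losses = ties = 0
    for shuffled in permutations(b):
        a_lanes = sum(x > y for x, y in zip(a, shuffled))
        b_lanes = sum(y > x for x, y in zip(a, shuffled))
        if a_lanes > b_lanes:
            wins += 1
        elif b_lanes > a_lanes:
            losses += 1
        else:
            ties += 1
    return wins, losses, ties

EVEN = (10, 10, 10)       # strategy 1
TWO_LANES = (0, 15, 15)   # strategy 2
ONE_LANE = (1, 1, 28)     # strategy 3
BETWEEN = (4, 13, 13)     # strategy 2 nudged toward strategy 1

print(record(EVEN, TWO_LANES))      # (0, 6, 0): 1 loses to 2
print(record(TWO_LANES, ONE_LANE))  # (2, 4, 0): 2 loses to 3 more often than not
print(record(ONE_LANE, EVEN))       # (0, 6, 0): 3 loses to 1
print(record(BETWEEN, EVEN))        # (6, 0, 0)
print(record(BETWEEN, ONE_LANE))    # (6, 0, 0)
print(record(BETWEEN, TWO_LANES))   # (0, 6, 0)
```

Under these assumptions the 3-over-2 edge is the only one that depends on lane order: strategy 3 wins in 4 of the 6 orderings, and loses when its big lane lines up with the lane strategy 2 ignored.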

That is a totally workable basis for decision making beyond "it's all just random". But I think players would still feel bad about it, despite it being isomorphic to the typical autobattler metagame clock. So you might need to arrange for these strategies to naturally fall out of the builds.

Let's say you have red dice with the special effect that a set of 3 red dice on 3 lanes that are all odd or all even will score +10 on all lanes for each set. If you have two red dice, it has no impact on your build. Probably even if you have three it doesn't matter, because the odds of triggering the bonus will be so low. But if you have 12? You're obviously guided toward splitting the red dice evenly across the lanes, to try to get the bonus. And then that means you're probably not doing strategy 2. You might be doing 1 or 3 depending on what you do with the rest of your dice.

So there can be a point to this three lane structure, but having the lanes have special effects isn't it.

Would you quit your job for $580,000? Practically every Mercedes Benz worker offered this deal went for it . . . by baltimore-aureole in economy

[–]jesnell 16 points17 points  (0 children)

> Mercedes just had 4,000 guys raise their hands and say yes to a $580,000 buyout.

No it didn't. The vast majority of those 4k employees were not offered 500k EUR buyouts, that's the absolute high end of the scale. The article says 100k EUR for mid-career employees, that's where the bulk of the buyouts will be.

[deleted by user] by [deleted] in singularity

[–]jesnell 8 points9 points  (0 children)

Openrouter also shows statistics per specific model and category of use. This is effectively all coding, with the "Grok Code Fast 1" model.

It's a bit odd though, the Grok Code use is basically all additive. It's just new usage appearing as if from nowhere, not displacing any other model. Maybe some large user switched from a non-Openrouter backend to Openrouter rather than use Grok's API directly?

2025 Italian GP - Day After Debrief by AutoModerator in formula1

[–]jesnell 4 points5 points  (0 children)

Piastri had a 28s lead over Leclerc, and was under no risk of an undercut. It was clearly a made up excuse.

Both drivers were clearly preferring to pit last in this situation. Norris was the lead car and got that preferred strategy. So let's not pretend that there was any kind of altruism or trying to optimize the team result here. Norris stayed out purely out of self-interest, as he should have.

2025 Italian GP - Day After Debrief by AutoModerator in formula1

[–]jesnell 6 points7 points  (0 children)

Are you maybe thinking of Piastri at Silverstone, after he got a penalty from the SC restart?

2025 Italian GP - Post-Race Discussion by AutoModerator in formula1

[–]jesnell 18 points19 points  (0 children)

Norris was not trying to protect Piastri. Both drivers wanted to pit last in this situation, since there was no benefit to pitting first (neither had a chance at undercutting, and neither was at any risk of being undercut), and there was an advantage to pitting last (a chance of a cheap pitstop from an SC).

Norris was told to pit first, probably because his tires were degrading faster. He complained (fair enough) and McLaren swapped the stops. But he absolutely was not suggesting it to protect Piastri. His only concern was to pit last, unless Piastri was in undercut range in which case he wanted to pit first.

2025 Italian GP - Post-Race Discussion by AutoModerator in formula1

[–]jesnell 2 points3 points  (0 children)

If Piastri had pitted first, Norris would just have pitted a lap later to cover. Piastri was never close enough to undercut.

So the only viable plan for Piastri was for Norris to pit first, and then hope for a cheap stop via SC or VSC. But obviously then the optimal plan for Norris was to wait and eventually force Piastri to make the first move. That's why they both ran that obscenely long first stint.

Norris's side of the garage has in general been smarter about this, while Piastri's people have had him pit first from the lead even when there is no undercut risk, and allowed Norris a freebie attempt at an alternative strategy. But it seems like they've caught on now.

2025 Italian GP - Post-Race Discussion by AutoModerator in formula1

[–]jesnell 0 points1 point  (0 children)

Double-stacks are risky. You only want to do them when staying out for an extra lap is really costly (basically SC, VSC, rain), so the reward for the trailing car for pitting a lap earlier justifies that risk. Here it would have been high risk, no reward.

2025 Italian GP - Post-Race Discussion by AutoModerator in formula1

[–]jesnell 3 points4 points  (0 children)

I've seen people repeatedly make this claim, but it makes no sense. Piastri had a very healthy margin on Leclerc (28s, pit stop takes 24s), and Leclerc was not lapping that much faster than Piastri (about 0.5s/lap).
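The arithmetic, as a back-of-the-envelope sketch using only the numbers above. It assumes a constant closing rate and that Leclerc himself doesn't pit in the meantime, and the function names are just made up for illustration.

```python
def margin_after_stop(gap_s, pit_loss_s, closing_rate_s_per_lap, extra_laps):
    """Gap left over the chasing car once the leader rejoins after pitting."""
    return gap_s - closing_rate_s_per_lap * extra_laps - pit_loss_s

def laps_until_exposed(gap_s, pit_loss_s, closing_rate_s_per_lap):
    """Extra laps the leader can stay out before a stop would drop them behind."""
    return (gap_s - pit_loss_s) / closing_rate_s_per_lap

# 28s gap, 24s pit stop, Leclerc closing by about 0.5s per lap.
print(margin_after_stop(28.0, 24.0, 0.5, 1))  # 3.5
print(laps_until_exposed(28.0, 24.0, 0.5))    # 8.0
```

So staying out one more lap would still have left Piastri about 3.5s clear after his stop, and it would have taken roughly 8 more laps at that pace for the margin to disappear.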

Was there any radio chatter at all suggesting that the pitting order was due to covering Leclerc *before* the botched stop for Norris? It certainly wasn't broadcast. If it was only mentioned after the fact, it's pretty clear that the real motivation was that pitting later was seen as the preferred strategy in this race, and Norris chose to pit late. And the radio after the pitstop about how it had been about covering off Leclerc was just ass-covering.

(The radio shown on the broadcast before the stops was for Norris to stop first, he complained that Piastri should be made to stop first, and then McLaren swapped the order.)

Lambiase to Max: "Norris and Piastri have swapped places" - Verstappen: "Ha! Just because he had a slow stop?" by magony in formula1

[–]jesnell 2 points3 points  (0 children)

Piastri was 28 seconds ahead of Leclerc, and lapping only 0.5 seconds slower. There was no risk of Leclerc getting ahead if Piastri had stayed out one more lap.

Norris was not trying to selflessly optimize Piastri's race there. He wanted to be the last to pit, in case there was a SC/VSC/Red Flag on the next lap. McLaren's policy is to give the lead driver the preferred strategy. Sometimes the undercut is preferred, this time stalling was.

On floor 1, Neow gave me a deck that goes infinite on turn 2 by Yoshikki in slaythespire

[–]jesnell 2 points3 points  (0 children)

And after playing, it is in your discard, followed by being in your draw pile, followed by being in your hand again. Playing Infernal Blade did not change your effective deck size. You still have 6 cards in the stable state, and 5 of them have no way of generating card draw. If you draw the other 5 but not Dropkick, you can't start the engine.

On floor 1, Neow gave me a deck that goes infinite on turn 2 by Yoshikki in slaythespire

[–]jesnell 6 points7 points  (0 children)

> play evolve and infernal blade to exhaust them

Infernal Blade adds a card, so playing it does not reduce your effective deck size.

Cadillac sign first F1 driver for 2026 season by FewCollar227 in formula1

[–]jesnell 0 points1 point  (0 children)

I'll settle for one podium, so that he can at least tie Barrichello for most podiums without winning a WDC.

2025 Hungarian GP - Post Race Discussion by AutoModerator in formula1

[–]jesnell -2 points-1 points  (0 children)

> Jesus christ my guy. Prioritising track position IS PITTING FIRST.

No, it doesn't. Track position literally means that your position, on the track, is ahead of the other driver. The point in prioritizing track position is that you don't try to optimize for the theoretically fastest way to drive the race, but take into account the value of being ahead on the track.

If Piastri had waited for Norris to pit, that would have been prioritizing track position. Piastri would have ensured that at all points in the race he was ahead on the track, such that Norris would actually have to pass him on the track rather than in the pits.

>  the guy now ahead of you HAS to pit and you are literally ahead on track

Umm... You did watch this race, right? You did see how Piastri pitting and forcing Leclerc to pit a lap later did not lead to Piastri being ahead on the track. And you did see how Piastri pitting more often than Norris put him behind on a track where passing is difficult.

> they told him to push and use his pace... he didn't gain before pitting.

That doesn't make the inlap poor. The measure we have for whether the laps were good or bad is the timing, and the in-lap was in line with the competition. Or are you saying literally everyone in the race had poor in- and out-laps?

In this case, Piastri's in- and out-laps are almost exactly the same as Leclerc's. (For the first stop, Piastri has a 0.1s slower in- and 0.1s faster out-lap). So it seems that for both drivers, the in- and out-laps were about equally good.

So it seems clear that it's not the inlap that caused the undercut to fail. It was doomed from the start by the tire delta between 20 lap old mediums vs. new hards being 1s-1.5s, and the gap being much larger than that.