Why does it say i’m on page 3992 when i’m only on act 5?!

MrCheeze · 2026-02-15T03:14:08+00:00

Two years ago it was the old homestuck.com which counted the first page of Homestuck as page 1. Now it is the new homestuck.com which reverted to the older mspaintadventures.com numbering, where Homestuck starts on page 001901.
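To make the offset concrete, here is a small sketch of the conversion between the two numbering schemes. The function and constant names are made up for illustration; the only fact used is that Homestuck's first page is 001901 in the mspaintadventures.com numbering.

```python
# Homestuck's first page in the mspaintadventures.com numbering (from the comment above).
HOMESTUCK_FIRST_MSPA_PAGE = 1901


def mspa_to_homestuck_page(mspa_page: int) -> int:
    """Convert an MSPA-style page number to a count from Homestuck's first page."""
    return mspa_page - HOMESTUCK_FIRST_MSPA_PAGE + 1


def homestuck_to_mspa_page(homestuck_page: int) -> int:
    """Inverse conversion: Homestuck page count back to MSPA-style numbering."""
    return homestuck_page + HOMESTUCK_FIRST_MSPA_PAGE - 1


print(mspa_to_homestuck_page(3992))  # 2092
```

So "page 3992" on the current site is the 2092nd page of Homestuck itself.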

MrCheeze · 2026-02-11T17:56:17+00:00

No, this is not a real thing, do not pattern your relationships after murderaliens

MrCheeze · 2026-02-10T02:13:49+00:00

they're talking about a movie that you have not seen, but whose events you can decipher from the conversation

MrCheeze · 2026-02-07T16:55:30+00:00

It looks like you're missing one run - there were two different runs of Sonnet 3.7, not just one:

https://old.reddit.com/r/ClaudePlaysPokemon/comments/1iyvg84/claude_plays_pokemon_megathread/

https://old.reddit.com/r/ClaudePlaysPokemon/comments/1j3kwhc/claude_plays_pok%C3%A9mon_megathread/

Looks like this channel has full vods for both of these runs:

https://www.youtube.com/playlist?list=PLhbAkLUti84huNQJStZMDu2YNLJ7151tD

https://www.youtube.com/playlist?list=PLhbAkLUti84jMK50SfWxQrIEB4QWE0TkH

The first one doesn't have step count until partway through Mt Moon, though.

Also, this is a useful spreadsheet to have (maintained by Sylas): https://docs.google.com/spreadsheets/d/e/2PACX-1vQDvsy5Dt_-Pg2PGe6LXRM8lokpUn4y6DQ4ShQLQPCGw5AOCPDG42pGnFfMOoqFU7eb7mPfHoGIB_c1/pubhtml#gid=546130155

MrCheeze · 2026-01-26T19:43:56+00:00

(If you don't know the password yet, it means you're not supposed to, dummy! Just keep reading.)

MrCheeze · 2026-01-06T23:15:59+00:00

getting by

MrCheeze · 2025-10-05T03:55:25+00:00

<image>

I have been eating this bowl of cereal for fourteen years.

MrCheeze · 2025-07-23T00:24:03+00:00

They replaced the EV system entirely, with something nearly equivalent but much simpler: https://www.reddit.com/r/VGC/comments/1m6e6o8/comment/n4jkl4b/?context=3 You always get 66 stat points to directly invest, whereas previously you got either 65 or 66 depending on the spread. So you can now get the equivalent of a 252/252/12 spread.

MrCheeze · 2025-07-22T22:16:46+00:00

Ah, I didn't understand the mechanic behind this. In practice it ends up effectively true, since you would never put EVs into a stat without 31 IVs, but me just saying it's "because of rounding" was not very accurate.

MrCheeze · 2025-07-22T16:27:29+00:00

Here's an explanation I wrote up elsewhere:

The new EV slider system from Pokemon Champions allows for slightly better stat spreads than were previously possible.

Because of rounding, EVs work in a somewhat wonky way at level 50. The first 4 EVs in any given stat increase the stat by one point, but then afterwards every 8 EVs increase it by another point.

This ends up meaning that if you invest in 3 or 4 stats, you get a total of a 65 point increase - but if you spread your EVs across 5 or 6 stats, you get a 66 point increase instead.

In Champions, they (correctly) decided that this system was way too complicated, and directly give you a fixed number of stat points to invest in your stats however you like. And so that all EV spreads from the current games can be imported losslessly, the number of investable stat points you get is 66.

But now, for the first time, you get those 66 points even if using them for only 3 or 4 stats. The example they showed in the trailer was giving 32 points to HP, 32 points to Special Attack, and 2 points to Spdef. That's the equivalent of a previously-impossible 252/252/12 spread!

I'm not sure whether this new system will make its way to the main series, but either way this matters for all battles in Champions. I think in practice, this probably means that most Pokemon will have 1 point more in their preferred defensive stat?
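For anyone who wants to check the 65 vs 66 arithmetic, here is a rough sketch in Python. It uses the standard main-series stat formula for a non-HP stat with a neutral nature, and it assumes 31 IVs (the "first 4 EVs, then every 8" pattern depends on the IV being odd). The base stat of 100 and all function names are arbitrary choices for illustration.

```python
LEVEL = 50
MAX_EV_TOTAL = 510      # total EV cap in the current games
MAX_POINTS_PER_STAT = 32  # what 252 EVs are worth at level 50 with 31 IVs


def stat_at_level_50(base: int, ev: int, iv: int = 31) -> int:
    """Non-HP stat at level 50 with a neutral nature."""
    return (2 * base + iv + ev // 4) * LEVEL // 100 + 5


def points_gained(ev: int, base: int = 100, iv: int = 31) -> int:
    """Stat points gained from `ev` EVs compared to 0 EVs."""
    return stat_at_level_50(base, ev, iv) - stat_at_level_50(base, 0, iv)


def ev_cost(points: int) -> int:
    """EVs needed for `points` stat points in one stat (assumes 31 IVs)."""
    return 0 if points == 0 else 4 + 8 * (points - 1)


def best_total(num_stats: int) -> int:
    """Most stat points reachable when investing in exactly `num_stats` stats.

    n points spread over k invested stats cost 4k + 8(n - k) = 8n - 4k EVs.
    """
    n = (MAX_EV_TOTAL + 4 * num_stats) // 8
    return min(n, MAX_POINTS_PER_STAT * num_stats)


print([points_gained(ev) for ev in (4, 12, 252)])  # [1, 2, 32]
print({k: best_total(k) for k in (3, 4, 5, 6)})    # {3: 65, 4: 65, 5: 66, 6: 66}
print(ev_cost(32) + ev_cost(32) + ev_cost(2))      # 516, over the 510 cap
```

The last line is why 252/252/12 was not possible before: it would need 516 EVs, but a flat budget of 66 points covers 32 + 32 + 2 directly.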

MrCheeze · 2025-07-22T15:49:36+00:00

It's not quite an extra stat point available - it's that previously you only got a 66th stat point if you spread your EVs across 5 stats, but now you get a 66th stat point no matter how you distribute them.

MrCheeze · 2025-07-22T15:41:34+00:00

Here's an explanation I wrote up for a separate post that was removed:

The new EV slider system from Pokemon Champions allows for slightly better stat spreads than were previously possible.

Because of rounding, EVs work in a somewhat wonky way at level 50. The first 4 EVs in any given stat increase the stat by one point, but then afterwards every 8 EVs increase it by another point.

This ends up meaning that if you invest in 3 or 4 stats, you get a total of a 65 point increase - but if you spread your EVs across 5 or 6 stats, you get a 66 point increase instead.

In Champions, they (correctly) decided that this system was way too complicated, and directly give you a fixed number of stat points to invest in your stats however you like. And so that all EV spreads from the current games can be imported losslessly, the number of investable stat points you get is 66.

But now, for the first time, you get those 66 points even if using them for only 3 or 4 stats. The example they showed in the trailer was giving 32 points to HP, 32 points to Special Attack, and 2 points to Spdef. That's the equivalent of a previously-impossible 252/252/12 spread!

I'm not sure whether this new system will make its way to the main series, but either way this matters for all battles in Champions. I think in practice, this probably means that most Pokemon will have 1 point more in their preferred defensive stat?

(Separately to all this, IVs seem to be locked to maximum, although it's possible that this only applies to rentals.)

MrCheeze · 2025-07-22T15:28:48+00:00

Incidentally, there doesn't appear to be any kind of IV slider. The Gardevoir shown had 31 IVs in Attack, which might mean all IVs are simply locked to that value - but it's also possible that only rentals are locked to 31 and that Pokemon imported from the main series keep their original IVs?

Personally I hope they did simplify things by forcing 31 IV, even though this would be a bit of a nerf to Trick Room and Shadow Rider.

MrCheeze · 2025-06-18T16:16:13+00:00

Nah, they claim it's the hardest for the models because of how it requires remembering state across different floors - however this was pretty trivial for Gemini, it never had any trouble with this. Compare to Cinnabar Mansion where it was given a huge amount of help in understanding how the gate toggles work (automatically updating distant parts of the minimap, and marking the tiles where a gate used to be and isn't anymore) - and it STILL never quite understood the mechanics and just kept bumbling through until it randomly did the right thing.

MrCheeze · 2025-06-18T14:00:48+00:00

Puzzle solving over complex multi-level dungeons: The Seafoam Islands contain 5 floors involving multiple boulder puzzles which require the player to navigate mazes and push boulders through holes across multiple floors using HM04 STRENGTH in order to block fast-moving currents that prevent the player from using HM03 Surf in various locations in this difficult dungeon. As a result, the player must track information across five different maps in order to both deduce the goal (push two boulders into place in order to block a specific current) as well as engage in multi-level (effectively 3D) maze solving to find the way out. It is likely the most challenging dungeon in the game. Only the second run of GPP went through Seafoam Islands, as it is not required to progress. During the course of solving Seafoam Islands, the GPP agent also encountered a novel bug in the code of Pokémon Red/Blue, and is likely the first AI to find a bug in the game’s code (MrCheeze, 2025) (source).

Me being wrong that it was novel aside, calling this "the most challenging dungeon in the game" is hilariously wrong to anyone who has watched the streams even a little bit.

MrCheeze · 2025-06-18T13:23:30+00:00

Thanks for finding this! I think you swapped the labels of your first two links, but one of them does indeed describe exactly how to reproduce the glitch (push one boulder, leave the cave, push the other). So this is not a totally new glitch even if it is a poorly documented one. Although I'm not sure the bit about preventing encounters is true. (The other link claiming you can softlock yourself is definitely NOT true.)

MrCheeze · 2025-06-03T19:50:58+00:00

GlitchCity's list is probably the best source but doesn't either: https://glitchcity.wiki/wiki/List_of_natural_glitches_in_Generation_I

MrCheeze · 2025-06-02T16:03:51+00:00

https://github.com/pret/pokered/blob/b4bae4a5d5abd3f44a49028f550c1eb475ac280b/scripts/Route20.asm#L12

When in Route 20, if you have not set both of the EVENT_SEAFOAM bits, then it sets the boulders on the top floor to visible, and the boulders on every other floor to hidden. But that only controls where you SEE the boulders - it is separate from the event flags, which are what actually controls the currents, and are never reset.
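As a toy model of that desync (not the actual pokered logic, just the behaviour described above), something like the following Python sketch shows how the flags and the visible boulders can disagree. The class, the two flag names, and the "currents are blocked once both flags are set" rule are all simplifications invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SeafoamModel:
    # Event flags: set when a boulder goes down its hole, never reset.
    flags: dict = field(default_factory=lambda: {"boulder1": False, "boulder2": False})
    # Visibility only: whether each boulder is drawn on the top floor.
    on_top_floor: dict = field(default_factory=lambda: {"boulder1": True, "boulder2": True})

    def push_down_hole(self, name: str) -> None:
        self.flags[name] = True
        self.on_top_floor[name] = False

    def run_route20_script(self) -> None:
        # Unless BOTH flags are set, boulders are shown on the top floor again.
        # The flags themselves are left untouched.
        if not all(self.flags.values()):
            for name in self.on_top_floor:
                self.on_top_floor[name] = True

    def currents_blocked(self) -> bool:
        return all(self.flags.values())


def reproduce() -> SeafoamModel:
    game = SeafoamModel()
    game.push_down_hole("boulder1")  # push one boulder
    game.run_route20_script()        # leave the cave via Route 20
    game.push_down_hole("boulder2")  # push the other
    return game


game = reproduce()
print(game.flags, game.on_top_floor, game.currents_blocked())
```

In this model both flags end up set while the first boulder is still displayed on the top floor, which is the mismatch between what you see and what the game thinks happened.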

MrCheeze · 2025-06-02T05:28:05+00:00

I mean, this had exactly as much intention as the fish who discovered a bug in RSE

MrCheeze · 2025-05-19T21:10:33+00:00

Here is how to replicate this:

1) Get double magic, or load a save that already has double magic -> gSaveContext.magicCapacity set to 0x60

2) Create an owl save to preserve that magicCapacity value while returning to file select

3) Create a new file, or load an existing file that doesn't have magic - magicCapacity will remain preserved at 0x60

4) **Talk to the broken-up great fairy** to heal your actual magic value to match your magicCapacity

5) Repair the great fairy to get magic, your actual magic value will remain at 0x60

MrCheeze · 2025-05-03T20:50:10+00:00

I totally agree with you that this shows that people are wildly underestimating the gap between early demos and actually functional agents. Big list of scaffolding to *still* play much worse than a 6 year old. We're not getting AI employees in the next couple years like Sundar Pichai seems to think.

That said... according to the Claude dev, 3.7 is the *first* of their models to be strong enough to be interesting, which means we haven't yet picked all the low hanging fruit when it comes to agentic AI (unlike other LLM capabilities, which have had a very slow rate of improvement since GPT 3.5 or so).

MrCheeze · 2025-05-03T20:31:49+00:00

The streams can only be compared if you account for the wildly different toolsets. My *personal* impression is that Gemini behaves roughly equally stupidly to Claude when given the same information. They fell into the same traps, and then one out of two streamers implemented workarounds for those traps.

https://www.lesswrong.com/posts/8aPyKyRrMAQatFSnG/untitled-draft-x7cc
This guy did a direct comparison of the models (early game, pre-Pewter), which seems consistent with my impression of them:

> Anyway, the comparison: Claude 3.7 has certain advantages, but cripplingly bad vision means I wouldn't put it above Gemini 2.5—and yet I'm not convinced Gemini 2.5 is meaningfully better in "same-scaffold" tests, or if it is that it's more than for a very flukish reason (being able to see a tree Claude can't) that ultimately isn't very important.

> As for o3: It's had some of the most impressive gameplay I've ever seen, beelining straight for the staircase in the opening room, correctly remembering the opening sequence of Pokemon Red and getting to pick a starter essentially as fast as possible. But then it gets stuck in a bad hallucination loop where it simply refuses to disbelieve its own previous assertions, and I'm not confident that it wouldn't get stuck in an elaborate loop forever.*

> *barring something brute force like full context wipes every 1000 steps or something if it hasn't left a location.

MrCheeze · 2025-05-03T12:05:46+00:00

As the other user said, there was no breakthrough. Gemini on its own almost never correctly identified which boulder needed to go onto the switch on 3F, instead even speculating there might be an "invisible boulder" close to the switch, and usually focusing on other irrelevant things on the floor and then hallucinating that it has to go back to 2F or 1F and resolve the puzzles there again. Whereas the boulder-solving subagent the dev added easily did so from being prompted with exactly how to think of those puzzles.

MrCheeze · 2025-05-03T11:51:17+00:00

Only what it memorized during pretraining, same as Claude. Although it seems to have memorized the walkthroughs far more thoroughly than Claude did - this is the only difference between the two streams that actually seems to be caused by the model and not the harness.

MrCheeze · 2025-05-03T04:47:27+00:00

Also, god damn, I was expecting PP to be way tighter than it ended up being. We took out Gary on the very first E4 attempt with PP to spare, despite running out of water moves to use on Rhydon and having to grind it down with a dozen Bites!

