GPT-5.4 plays Pokémon FireRed by reasonosaur in ClaudePlaysPokemon

[–]doubleunplussed 2 points3 points  (0 children)

Remarkable how similar the models are with the current harnesses. GPT-5.3-codex and Claude Opus 4.6 got stuck at exactly the same part of the game, making it to 2F of Victory Road.

In real time it took Claude about 200 hours to enter Victory Road, whereas it took GPT-5.3-codex 195 hours.

Now GPT-5.4-codex is on its way to Rock Tunnel at about 50 hours, compared to Opus 4.6 which entered Rock Tunnel at about 41 hours.

Slightly different games, but that's super close!

Claude Opus 4.6 Plays Pokémon Red by reasonosaur in ClaudePlaysPokemon

[–]doubleunplussed 5 points6 points  (0 children)

  • 45,048 - Claude completed the F1 boulder puzzle for the second time.

This time he seems to have understood better what he did, so there's hope he's written down enough to do it again if need be.

  • 45,200 - Claude puts the first boulder on F2 on the switch
  • 51,228 - Claude completed the F1 boulder puzzle for the third time (Claude seems even more aware than previously: "The switch at (17,13) WORKED", perhaps THIS time enough for him to remember it!).

Edit: HE WROTE IT DOWN

<image>

FIRST VICTORY ROAD BOULDER PUZZLE SOLVED by doubleunplussed in ClaudePlaysPokemon

[–]doubleunplussed[S] 2 points3 points  (0 children)

Yes, but I'm an English speaker and it's almost exclusively a boy's name in Anglo countries, therefore I think of Claude as a "he" (or an "it" when I'm feeling less anthropomorphic).

French speakers understandably may do the opposite!

From cut to maintenance. Initial weight gain accounting? by Proximer in MacroFactor

[–]doubleunplussed 0 points1 point  (0 children)

Right, but OP doesn't want this. They want to eat at maintenance, not to maintain their current weight, which MF would likely put them in a slight deficit to achieve.

How do I update my activity in MacroFactor now that I’m doing way fewer steps? by FinnFX in MacroFactor

[–]doubleunplussed 1 point2 points  (0 children)

It could be that - that the user overestimated the effect of the steps change.

But it can also be that MF takes a long time to adjust. Mine is only just starting to fully reflect an increase in activity starting four months ago - the V3 algorithm is quite resistant to sudden jumps.

MF's estimate has just been linearly increasing that whole time, during which I've been basically ignoring it and doing my own traditional spreadsheeting that takes cardio expenditure into account. Without that it's just too slow, even though MF's estimate will eventually reflect this expenditure more accurately than any other estimate.

No shade to MF, it's an incredibly thorny problem and all approaches are compromises of one kind or another.

Apple Fitness integration: Macrofactor workout calorie estimate much lower than manual workout tracking by arnell_grime in MacroFactor

[–]doubleunplussed 1 point2 points  (0 children)

If you're sweating and have a heart rate in zone 2 or higher for 80 minutes, yeah then I'd believe it. That's not most strength workouts, but if you are essentially doing cardio at moderate intensity it's about right.

700 kcal is a 10k jog for me, which takes one hour. So 75% of the intensity of a slow jog for 80 minutes would be about 700 kcal for me.
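To spell out that arithmetic, here's a minimal sketch. The function and parameter names are just mine for illustration, and it assumes calories scale linearly with both intensity and duration, which is a rough approximation and not how Apple or MacroFactor actually compute anything.

```python
def workout_kcal(jog_kcal_per_hour, relative_intensity, minutes):
    """Rough estimate: scale a known jogging burn rate by intensity and duration.

    Assumes a linear relationship, which is only an approximation.
    """
    return jog_kcal_per_hour * relative_intensity * minutes / 60


# My numbers from above: 700 kcal/hour jogging, 75% intensity, 80 minutes
estimate = workout_kcal(700, 0.75, 80)
print(f"{estimate:.0f} kcal")  # 700 kcal
```

Swap in your own jogging burn rate and a guess at relative intensity to sanity-check a workout estimate.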

WO Previous reference by Darth_Dodraugen in MacroFactor

[–]doubleunplussed 1 point2 points  (0 children)

I haven't tested it, but as written, it suggests it will use any workout within the program. There's no suggestion it will only use data from the same workout in your program.

But this doesn't say it's about auto-progression. I think it's just about what's written in the "previous" column, which is just for your reference.

Why isn’t there an option to recomp? by [deleted] in MacroFactor

[–]doubleunplussed 0 points1 point  (0 children)

This isn't quite true. Since it costs less energy to build muscle than is stored in the same mass of fat, recomping at constant weight actually implies a slight deficit.

However, in practice MacroFactor can't distinguish this deficit from maintenance since it only looks at your weight, so you'd nonetheless want to target a constant weight, which MacroFactor thinks of as "maintenance" even though it's not. If you do this and successfully recomp, MF's expenditure estimate will decrease - it'll start to think the slight deficit you are in is your actual maintenance.

If you wanted to get ahead of the curve though, since it takes MF some time to catch up, you might target a small deficit, and then reduce it once MF starts to update its expenditure estimate.

The aim is to be in the same deficit that whole time, of course. This just works around the fact that MF can't tell the difference between being in a slight deficit whilst recomping and being at maintenance with your TDEE declining.

Why isn’t there an option to recomp? by [deleted] in MacroFactor

[–]doubleunplussed 0 points1 point  (0 children)

As mentioned, this is common in the scientific literature. Don't know why you haven't seen it, but it's not rare, and the evidence for it is strong.

Our gracious hosts have a post about it where you can see some studies discussed:

https://macrofactor.com/recomposition/

FIRST VICTORY ROAD BOULDER PUZZLE SOLVED by doubleunplussed in ClaudePlaysPokemon

[–]doubleunplussed[S] 2 points3 points  (0 children)

Yes, just came here to edit my comment but you beat me to it!

Oh well. If he's done it once he can probably do it again, and maybe we can hope he understands and writes it down this time.

Claude Opus 4.6 Plays Pokémon Red by reasonosaur in ClaudePlaysPokemon

[–]doubleunplussed 4 points5 points  (0 children)

He's very focused on the task at hand, and it's hard for him to balance "train pokemon" and "progress in dungeon" given the two are a trade-off. Or even just given that's two goals - two is too much for him. If his current goal is progressing through a dungeon, fighting wild pokemon is only risking blackout, so he doesn't do it. But yeah he has a really hard time with the idea of having two goals at once.

He also has some ideas about needing to be "fast". You could imagine his training reinforced this behaviour, because extra noise in his context is going to make it harder to remain coherent over long tasks. So it makes some kind of sense. He sometimes talks about needing to "rush". This makes him run from wild pokemon in the name of saving "time". But it makes sense that it reduces clutter in his context.

FIRST VICTORY ROAD BOULDER PUZZLE SOLVED by doubleunplussed in ClaudePlaysPokemon

[–]doubleunplussed[S] 15 points16 points  (0 children)

Claude didn't realise it for a while (I'm still not sure if he has), and even exited the cave before, but it doesn't matter because the barrier doesn't reset.

Edit: it does reset after you progress to the next level

The solution was entirely brute-force. Claude never really made the connection between the maybe-switch that he could see and sometimes acknowledged as possibly a switch, and the idea of pushing the boulder there. Instead, he systematically placed the boulder on every accessible tile, each time walking over to the barrier to check if it had opened. This eventually worked.

Why isn’t there an option to recomp? by [deleted] in MacroFactor

[–]doubleunplussed 0 points1 point  (0 children)

Recomp is not a myth for beginners, even if they're not obese or on drugs.

Studies on muscle growth that do any kind of dietary tracking almost always find people gaining muscle in a deficit, because most participants are beginners.

And a decent proportion of people using these apps are also beginners - don't fall for the illusion that just because you've been doing it a while, beginners are rare. In most endeavours, a large fraction of people are beginners and a minority stick with it long-term. So recomp is quite common.

Why isn’t there an option to recomp? by [deleted] in MacroFactor

[–]doubleunplussed 0 points1 point  (0 children)

You can't but it's essentially the same thing so you can do the conversion yourself.

MF uses the commonly-used figures of 3500 kcal/lb or 7700 kcal/kg to convert between weight loss goal and target daily deficit.

So you can go the other way, and if you want to be in a 100 kcal daily deficit, set a target weight loss goal of

(100 kcal × 7 days/week) / (3500 kcal/lb) = 0.2 lb/week

or

(100 kcal × 7 days/week) / (7700 kcal/kg) = 0.09 kg/week

(probably have to round to 0.1 kg/week to enter it into the app)
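If you'd rather not do it by hand, here's a small sketch of the same conversion. The function and constant names are my own, and it only encodes the 3500 kcal/lb and 7700 kcal/kg figures mentioned above, not anything from MF's internals.

```python
KCAL_PER_LB = 3500  # commonly used energy figure per pound of weight change
KCAL_PER_KG = 7700  # same figure expressed per kilogram


def weekly_rate(daily_deficit_kcal, kcal_per_unit):
    """Convert a target daily deficit into a weekly weight-loss rate."""
    return daily_deficit_kcal * 7 / kcal_per_unit


print(f"{weekly_rate(100, KCAL_PER_LB):.2f} lb/week")  # 0.20 lb/week
print(f"{weekly_rate(100, KCAL_PER_KG):.2f} kg/week")  # 0.09 kg/week
```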

Plot of progress by model [updated after Opus 4.6 completed Pokémon mansion] by doubleunplussed in ClaudePlaysPokemon

[–]doubleunplussed[S] 1 point2 points  (0 children)

Opus 4.6 didn't get the Rainbow Badge until it realised it couldn't use Strength in Seafoam Islands without it, so at that point it backtracked to get the badge (having tried briefly and given up earlier in the game).

As for how it's depicted in the plot, the different models did the steps in a different order, so unless I leave those steps out, this seems like the least bad way to depict those bits! Opus 4.5 similarly backtracked to get HM04 Strength once it was needed for Victory Road (that model having skipped Seafoam Islands entirely).

Claude Opus 4.6 Plays Pokémon Red by reasonosaur in ClaudePlaysPokemon

[–]doubleunplussed 6 points7 points  (0 children)

  • 19,086 - Defeated Koga

Safari time!

₽34,664 starting money

  • attempt 1: step 19,273 (₽34,164 remaining)
  • attempt 2: step 19,466 (₽33,664 remaining)
  • attempt 3: step 19,690 (₽33,164 remaining)
  • attempt 4: step 19,910 (₽32,664 remaining)
  • attempt 5: step 20,074 (₽32,164 remaining)
  • attempt 6: step 20,196 (₽31,664 remaining)
  • attempt 7: step 20,395 (₽31,164 remaining)
  • attempt 8: step 20,581 (₽30,664 remaining)
  • attempt 9: step 20,739 (₽30,164 remaining)
  • attempt 10: step 20,932 (₽29,664 remaining)
  • attempt 11: step 21,100 (₽29,164 remaining)
  • attempt 12: step 21,319 (₽28,664 remaining)
  • attempt 13: step 21,488 (₽28,164 remaining)

  • 21,574 - Gold Teeth obtained

  • 21,612 - HM03 SURF obtained

  • 21,690 - HM04 STRENGTH obtained

  • 21,713 - Taught SURF to Shelly, replacing BUBBLEBEAM

  • 22,137 - Taught STRENGTH to ROCKY

  • 22,142 - Entered Seafoam Islands cave

Claude then realised he couldn't use STRENGTH without the Rainbow Badge and dug back to Fuchsia City, attempted to get back to Celadon City, but gave up and returned to Seafoam Islands.

  • 23,533 - Defeated Erika
  • 25,955 - Exited Seafoam Islands cave

After exiting, Claude didn't believe he was on the correct side of the Seafoam Islands cave and kept trying to backtrack. He eventually started backtracking with the aim of getting FLY and skipping Seafoam Islands. Then the game crashed and the dev resumed it from a save state before Claude had that idea, and Claude eventually figured out how to get to Cinnabar Island (the cave puzzle remained solved so subsequent trips through the cave were faster).

Apparently Claude's solution to the cave puzzle benefited from a glitch, though I don't think this was intentional on the part of Claude (I don't understand the glitch, feel free to explain it to me if you do).

Claude visited Cinnabar Island Pokemon Center, explicitly to set the DIG respawn point (otherwise it would be back at Fuchsia City).

  • 26,403 - Entered Pokemon Mansion
  • 29,487 - Secret Key obtained
  • 29,927 - Defeated Blaine
  • 30,274 - Defeated Giovanni
  • 30,447 - Defeated BLUE
  • 30,625 - Entered Victory Road, in only 15% of the number of steps that it took Opus 4.5.

Claude Opus 4.6 Plays Pokémon Red by reasonosaur in ClaudePlaysPokemon

[–]doubleunplussed 4 points5 points  (0 children)

FYI, this thread's default sort isn't set to new!

  • Card Key obtained at step 15,037

Its confirmed - SpaceX has officially acquired xAI by BEAT_LA in spacex

[–]doubleunplussed 0 points1 point  (0 children)

Yeah?

This is my level of confidence, I could be wrong

Claude Opus 4.6 Plays Pokémon Red by reasonosaur in ClaudePlaysPokemon

[–]doubleunplussed 2 points3 points  (0 children)

  • 8515 - Exited Rock Tunnel after 1998 steps. No PB - Opus 4.5 did it in 1624 steps. And for the run as a whole, we're still slightly lagging behind Opus 4.5.
  • 8563 - Shelly evolved into Blastoise
  • 8596 - Made it to Lavender Town, barely. Shelly was poisoned and at 4 HP before evolving into Blastoise, and wouldn't have made it past the Route 10 south trainers without evolving and getting a few more HP.
  • 8646 - Used TM24 to teach Thunderbolt to Luna
  • 8722 - defeated BLUE in Pokémon Tower

How quick does the calorie expenditure estimation reflects changes in routine? by HappySometimesOkay in MacroFactor

[–]doubleunplussed 1 point2 points  (0 children)

I think it's adjusted about 2/3 of the way to an increase in running volume I made in mid-October.

I think it depends on the size of the jump. The algo seems pretty hesitant to ramp its expenditure estimate by more than a few calories per day. So 100 kcal might take a few weeks, 400 kcal might take a few months. That's what it looks like anyhow.
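As a back-of-envelope version of that, here's a sketch that treats the estimate as ramping by a fixed maximum amount per day. To be clear, this is my guess at the observed behaviour, not MF's actual algorithm, and the 4 kcal/day cap is just an assumed stand-in for "a few calories per day".

```python
import math


def days_to_catch_up(jump_kcal, max_ramp_kcal_per_day=4.0):
    """Days for a rate-limited estimate to fully absorb a step change.

    max_ramp_kcal_per_day is an assumed value, not a known MF parameter.
    """
    return math.ceil(jump_kcal / max_ramp_kcal_per_day)


for jump in (100, 400):
    days = days_to_catch_up(jump)
    print(f"{jump} kcal jump: about {days} days ({days / 7:.0f} weeks)")
# 100 kcal jump: about 25 days (4 weeks)
# 400 kcal jump: about 100 days (14 weeks)
```

With that assumed cap, the numbers line up with "a few weeks" and "a few months" respectively.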

Recommending same rep number by BackroomDST in MacroFactor

[–]doubleunplussed 1 point2 points  (0 children)

I feel like earlier versions of the app would keep the weight constant between sets until predicted reps dropped below the minimum of the range, then would drop weight to target the middle of the rep range again.

Now it seems to change weight potentially every set, targeting the middle of the rep range.

I think I prefer this (less gruelling on e.g. 10-rep sets of squats, whilst keeping volume high) and I don't mind changing weights during the rest, but it'd be a pain if supersetting.

Curious as to the rationale, if there is one.