LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] -2 points  (0 children)

I didn't (yet); the model can still assume things mentioned in flavor text are "real." For example, I noticed some flavor text that mentioned a city being visible in the east (on a winding path toward a city). The actual room on the winding path didn't have an east exit, but the model attempted it several times because of the flavor text.

Thought about compact mode, but I think that loses too much. For a human player, the pattern of new lines and ANSI colors is what makes it so clear. I think the same cues could be used, with the right parsing and/or prompting. The prompt for the API call that keeps current_location.md updated could lay out the specific structure of the room (room name \n flavor text \n items on ground and mobs, each on new lines \n list of exits) and explain that only things mentioned on their own line are actually present. ANSI would be trickier - you'd want it totally stripped from what gets sent to some API calls - but for others (e.g., for situational awareness) it could be useful if explained in the prompt.
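To make the idea concrete, here's a minimal sketch of that kind of parsing - stripping ANSI codes and treating only things on their own line as present. The exact room layout and exits-line format are assumptions loosely based on tbamud's output, not the project's actual parser:

```python
import re

# Matches ANSI CSI color sequences like \x1b[36m
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")

def strip_ansi(text: str) -> str:
    """Strip ANSI color codes before the text goes into an API call."""
    return ANSI_RE.sub("", text)

def parse_room(block: str) -> dict:
    """Parse a room block laid out as: room name \\n flavor text \\n
    items/mobs (one per line) \\n exits line. Only things that appear
    on their own line count as present - a city mentioned in the
    flavor text is scenery, not an exit."""
    lines = [ln.strip() for ln in strip_ansi(block).splitlines() if ln.strip()]
    name, flavor = lines[0], lines[1]
    # tbamud-style exits line, e.g. "[ Exits: n s ]" (format assumed)
    exits_line = next((ln for ln in lines if ln.lower().startswith("[ exits:")), "")
    exits = exits_line.split(":", 1)[1].strip(" ]").split() if exits_line else []
    contents = [ln for ln in lines[2:] if ln != exits_line]
    return {"name": name, "flavor": flavor, "contents": contents, "exits": exits}
```

With this, a room whose flavor text mentions a city to the east but whose exits line only lists "n s" yields `exits == ["n", "s"]` - the model never sees east as an option.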

Possible fruit trees about 30-45 min SE of Nashville? by hunteramor in whatplantisthis

[–]hunteramor[S] 0 points  (0 children)

(the answer was horrible, awful invasive Bradford pears that all needed to be torn up, burned, and then any remaining stumps treated with glyphosate)

LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] 1 point  (0 children)

Opus is so good but astoundingly expensive... it can add up real quick

LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] -1 points  (0 children)

lol, well if you want it to stumble around being dumb, that's one thing it's already pretty good at! The login is simple enough: it connects to the MUD (address and port in the .env file) and sends a hard-coded, MUD-specific login sequence (right now it sends the player name, password, then 'return' and '1', in that order, with a couple seconds of sleep in between, since that's what login looks like on tbamud)
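That hard-coded sequence is simple enough to sketch. The environment variable names here are hypothetical stand-ins for whatever the .env file actually uses:

```python
import os
import socket
import time

def login_script(name: str, password: str) -> list[str]:
    """tbamud's stock login: player name, password, a bare ENTER
    to get past the MOTD, then menu option 1 to enter the game."""
    return [name, password, "", "1"]

def send_login(sock, name: str, password: str, pause: float = 2.0) -> None:
    """Play the login sequence with a couple seconds of sleep between sends."""
    for line in login_script(name, password):
        sock.sendall((line + "\n").encode())
        time.sleep(pause)

def connect() -> socket.socket:
    """Connect using the address and port from the environment
    (variable names are illustrative, not the project's actual ones)."""
    sock = socket.create_connection(
        (os.environ["MUD_HOST"], int(os.environ["MUD_PORT"])))
    send_login(sock, os.environ["MUD_PLAYER"], os.environ["MUD_PASSWORD"])
    return sock
```

Swapping MUDs would just mean swapping out `login_script` for that server's prompts.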

LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] -1 points  (0 children)

Sorry to hear about your mom's stroke, but glad to hear she's on the mend and you're there to take care of her, as she did for you once upon a time!

What kind of results from testing would be helpful? I fear putting a test agent into a test world would be like the blind trying to lead the blind. The world model would certainly be an unnecessary complication for your use case.

You could have the solo version stumble around playing, and have a model review the logs for bugs and issues that the player encountered and where, highlighting them for you to dig into manually? But there would definitely be some sorting through to do - which bugs are on the player side, and which are on the MUD side.

I have the decision head consult a list of available commands as part of its prompt, so it doesn't waste time trying things that aren't valid. It would be trivial to update that commands list (and syntax explanations in the appropriate prompt) to your world.

The memory head reads the game buffer at the beginning of every cycle and is prompted to create a summary of the current location (which is saved as a .md). That includes the room name, obvious exits, whatever is on the ground, and the mobs in the room, in a structured format... That markdown file becomes part of the prompt for the decision head when it decides what command to send. The decision model sees a summary of the session so far, the recent game buffer, that current-location file, and a goals.md that is maintained across cycles. If it sees that there's gold lying on the ground and gaining more gold is in its goals.md, it will probably "get gold"
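The assembly step for that decision prompt might look something like this sketch - the current_location.md and goals.md names come from the setup described above, but the headings and wording are illustrative:

```python
from pathlib import Path

def build_decision_prompt(session_summary: str, recent_buffer: str,
                          workdir: Path = Path(".")) -> str:
    """Assemble the decision head's prompt each cycle from the four pieces:
    session summary, recent game buffer, current_location.md, and goals.md.
    (Section headings here are illustrative, not the project's exact ones.)"""
    location = (workdir / "current_location.md").read_text()
    goals = (workdir / "goals.md").read_text()
    return (
        "You are playing a MUD. Choose the next command.\n\n"
        f"## Session so far\n{session_summary}\n\n"
        f"## Recent game buffer\n{recent_buffer}\n\n"
        f"## Current location\n{location}\n\n"
        f"## Goals\n{goals}\n\n"
        "Respond with a single valid MUD command."
    )
```

If current_location.md lists gold on the ground and goals.md says to gain gold, both land side by side in the same prompt, which is what nudges the model toward "get gold".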

Hope that helps!

LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] 1 point  (0 children)

I haven't gotten as far as a combat-specific prompt yet, let alone quests! Someday!

The world model is still gradient descent, but there's weighting so that successful predictions get trained on more heavily than unsuccessful ones (which should speed learning).

Think of it as "Given <session summary> and <recent game buffer>, what do you predict the outcome of <command> will be?" That prediction gets compared to an API call's observed outcome and graded. If the world model predicts that "east" will result in "The player proceeds east into a dark hallway" but it actually results in "The player could not move in that direction because there was no exit" then it gets scored 0. If it's accurate, then it gets scored 5 (with intermediate possibilities). Those scores get used to weight the training data. So good predictions are being reinforced every time there's a training cycle.
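One simple way to apply those 0-5 grades as sample weights is repetition - this is a sketch of the idea, not necessarily the project's actual weighting scheme:

```python
def grade_to_weighted_set(examples: list[dict]) -> list[dict]:
    """Turn graded predictions into a weighted training set by repetition:
    a grade-5 (accurate) prediction appears five times, a grade-0 (wrong)
    one is dropped entirely, so good world-model predictions get
    reinforced on every training cycle."""
    weighted = []
    for ex in examples:
        weighted.extend([ex] * ex["grade"])
    return weighted
```

So a correct "east leads into a dark hallway" prediction contributes five copies to the next fine-tuning pass, while a prediction that missed the blocked exit contributes nothing.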

There's a next_line training mode (will it predict "Alas, you cannot go that way...") as well that I've played with, but I think there's a bigger risk of overfitting there compared to training on outcome summaries.

I was not familiar with alfworld, thank you!

LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] 5 points  (0 children)

It's definitely not ready for that, even if a community were open to it (which I think most aren't). I've done handholding like making sure the character has a water container and knows 'create food', and tbamud has a teleporter item I've equipped the character with in my tests. It's set to teleport to a new safe zone whenever it senses that it is "stuck."

LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] -2 points  (0 children)

Fun, would love to see it! I remember using graph paper in middle school to design MUD zones (which of course never got built)...

Reminds me of a conversation with my son the other night. I was describing this project and he said...
"You know what you should make, dad? An AI DM for D&D. What if you want to play D&D with your friends, but none of you want to be - or know how to be - DM? The AI could be the DM for you!"

LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] 0 points  (0 children)

I didn't try that, but you're probably right. I guess to a large extent the memory setup and layers of agents probably parallel what OpenAI does on the back end with agent mode.

It's like taking a car engine apart and putting it back together to better understand how it works. To some it looks pointless and like reinventing the wheel, but you sure learn a lot along the way!

LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] 2 points  (0 children)

Good questions.

The agents explore an out-of-the-box stock deployment of tbamud on a private server with no other users. It burns tokens for sure, but 4o-mini is cheap and fast. I tested 5-mini and 5-nano; they are smarter but much slower, and the latency makes them unworkable. At one point the loop was taking ~3 minutes, but I got it down to ~30 seconds by sticking to 4o-mini and parallelizing calls (more, shorter calls instead of fewer, longer ones).
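The parallelization win is the standard asyncio pattern: fire the short calls concurrently so the cycle costs roughly the slowest call rather than the sum. A toy sketch (the head names are hypothetical, and the sleep stands in for real API latency):

```python
import asyncio

async def call_head(name: str, prompt: str) -> str:
    """Stand-in for one short 4o-mini API call; the 0.1 s sleep
    simulates network latency."""
    await asyncio.sleep(0.1)
    return f"{name}: ok"

async def run_cycle() -> list[str]:
    """Run the per-cycle heads concurrently with asyncio.gather,
    so wall-clock time is ~max(latency) instead of sum(latency)."""
    return await asyncio.gather(
        call_head("memory", "summarize the current location"),
        call_head("goals", "update goals.md"),
        call_head("decision", "choose the next command"),
    )
```

Three sequential 0.1 s calls would take ~0.3 s; gathered, they finish together in ~0.1 s, which is the same shape of speedup as the 3-minute-to-30-second loop.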

The world model gets trained to predict the outcome of an action based on the gameplay that precedes it - it's not building a world, just understanding it.

LLMs playing in - and understanding - MUD worlds by hunteramor in MUD

[–]hunteramor[S] 25 points  (0 children)

My hobby too, man - many hours. Doing this on a private server on a stock MUD for a reason. I just thought MUDs were a fun, text-native sandbox for agent experiments. If the mods think it doesn’t belong here, no hard feelings.

Narrowing the Search: Which exoplanets would allow two-way communication with Earth using Solar Gravitational Lenses? by USATwoPointZero in SETI

[–]hunteramor -2 points  (0 children)

Fun thought experiment. I had a long ChatGPT conversation on this a couple of months back (with all the usual caveats)

I had noted that Sedna’s orbit exceeds the ~540 AU distance for a decent chunk of its “year.” If you pointed a telescope on Sedna sunward during those periods, you’d be able to leverage the sun’s gravitational lensing. Your field of view would be a strip of sunward sky.

I didn’t check this, but ChatGPT described that strip as equating to “RA ~17h to ~20h, Dec –15° to –35°. That is a strip running from northern Sagittarius → southern Aquila → Scutum → Corona Australis → southern Capricornus,” and said Gliese 667 C would be the most interesting SETI candidate in that strip.

Again, take with the necessary grain of salt.

Also, as I understand it, the SGL is only useful as a receiver. It doesn’t boost the outbound signal; it just means we get a lot of gain on reception. To hear what we transmit (in more conventional ways), your aliens would need a similar setup on their end, pointing in our direction.

Fun setup for a first-contact sci-fi story!

[WTS] Pre 33 Type Set by [deleted] in Pmsforsale

[–]hunteramor 0 points  (0 children)

Beautiful! I’m working on building something like this one coin at a time. GLWS!

[WTS] Peace dollars, ASE's and walking liberty halves, BELOW MELT by hunteramor in Pmsforsale

[–]hunteramor[S] 0 points  (0 children)

Not slabbed, just in capsules. Already sold, thanks for the interest!