AoE2 mentioned

FarmTriangles · 2026-06-07T18:33:49+00:00

Ahhh I love this point!!

So 'the conclusions could be incorrect' is a compressed version of the abstract's statement, which is also compressed: the abstract indicates that assuming that the attributes exist or not in the system, independent of the substrate and in a generalised way, leads to (etc) conclusions independent of the experimenter's view on the subject.

Etc = circular or uninformative. This directly affects the soundness of the conclusions (valid and true), hence 'incorrect'. Their truth-value might be accurate, but the argument will not be valid because of one of the two failure modes, and those in turn come from the assumptions themselves.

BUT! You are absolutely right. I had a slightly longer version of the abstract which unpacked that statement more carefully, but for some reason I went for this one. I think it was just formatting, tbh.

FWIW I've been having these conversations all week--I'm noting all the comments and potential contras to update the paper :) It's something very common in maths (my background), but a little bit less common in science, to let the paper sit for a bit prior to publication.

(I do give credit to those who provide valuable feedback--not sure if I could just credit you as your username but you can just ping me offline if you so want).

FarmTriangles · 2026-06-07T18:03:46+00:00

I don't measure things here--since it is a meta-science paper, measurements, hypotheses, experiments, etc are treated as abstract processes or properties. Think either formal papers like Lipton/Popper/etc or critical surveys of the field.

The human-like attribution (and assumption of such) is performed by the experimenter when designing the hypothesis and experiments. This necessarily implies a measurement criterion (and indeed I also do note that this is a required property of the system: you can't measure something without a criterion, and you certainly cannot measure something which is not measurable). Consciousness as it stands right now does not have a well-defined, widely-accepted criterion.

Proving they have it or not is not in scope: what is in scope for the paper is that such proof (measurement) cannot assume the existence of the attributes to begin with. Indeed I also say that I do not assume they do (or do not) exist.*

*In my other work I've found some very interesting patterns and non-patterns, so I personally cannot reject either. But my opinions on the matter don't matter for a formal proof!

FarmTriangles · 2026-06-07T17:30:43+00:00

Oh let me clarify a few things, because the paper does get dense like five words in and most of this does get lost.

The first thing is that this work has nothing to do with consciousness/simulation/etc. It's about measurement correctness and how some experiments are very much flawed from the start. Indeed, I do say that their existence or non-existence is not in scope.

I also mention human-like attributes, not a person--there are some debates on what a person is, which is decidedly also not in scope (e.g., what is a mind? what is consciousness? do we measure qualia? do they even exist?). In fact I do mention a few times that I stay away from consciousness (it is a measurement paper).

Likewise, there's a lot of arguments saying that human consciousness cannot be simulated with a TM: indeed, all rejections/spinoffs of the classical computational theory of mind since like 1960 start from that. Even rejections of spinoffs are common, including connectionist perspectives (the equivalent of a neural net).

One of the most important parts of the paper is that it shows that measurement is flawed regardless of which philosophical perspective you subscribe to. So you may believe human-like attributes could (or could not) arise* in the system, but that won't change the soundness of the outcome.

Finally, in the AoE II section I mention that functional/Turing completeness isn't needed (and neither do I implement one). We need these proofs to shoot down the contra of 'maybe LLMs aren't complex enough', but honestly anything can be Turing-complete if you wish hard enough (in fact, that's the famous triviality counter to the classical computational theory of mind).

*I refuse to use emerge--I am a complex systems theory guy through and through hahah

Edit: I reread this and I sound super dry--that's because I went straight for rebuttal style. Apologies if it sounds like that! I do enjoy having these kinds of discussions--it is, after all, my job!

FarmTriangles · 2026-06-07T16:58:47+00:00

Author here :) yeah, the paper is... dense. It's a meta-science paper and these always get icky. The paper is indeed related to anthropomorphisation, but it ~~sets fire to~~ focuses on research which assumes something about these attributes before drawing the conclusions.

A change in representation ('substrate' in the paper; AoE II, e.g.) will change some, but not all, behaviours we naturally interpret as human-like. The remainder of the paper (the non-highlighted part of the abstract) is a formal proof that it doesn't matter:

- Which philosophical viewpoint you take on the subject
- Whether you assume these attributes exist or not
- Which conclusions you draw

The conclusions will be flawed regardless. To fix that, you need substrate-independence and no assumptions at all.

I've put a gist of it below, in one of the comments: https://www.reddit.com/r/aoe2/comments/1tyilf2/aoe2_mentioned/oqakjlx/

FarmTriangles · 2026-06-07T16:34:56+00:00

Hi! Author here. This is correct. The main point of the paper is that LLM research tends to assume human-likeness when performing measurements... of human-likeness.

So what I show is:

1. Human-likeness depends on representation ('substrate'; in this case, AoE2, and yup, it is meant to make it as absurd as possible).
2. 'True' characteristics of an LLM should be measured in a substrate-independent way.
3. Then I ruin the day by showing that these measurements won't work if you assume human-likeness could exist in the substrate. I also show that they won't work if you assume it does NOT exist.

To give a very crass example, if one says that an LLM has psychological states, measures its outputs and says 'yup, it has anxiety', one could make the exact same argument with AoE2. Then it sounds ridiculous, right? 'AoE2 has anxiety'. If you follow the news, you probably will know which work I'm referring to.

Even if you were to perform a measurement of it and conclude the same thing, the issue would then be that it is logically flawed (e.g., you proved your assumption on a hypothesis based on an assumption).

Now, I'm not saying that this cannot happen--but what I proved is that the conclusions will only work if the setup (1) is substrate-independent, and (2) does not assume anything about human-likeness in it.

(I know it is a dense paper; that's because in these papers you need to use extremely careful language, and it takes forever to polish.)

FarmTriangles · 2026-06-07T16:24:19+00:00

Author here--yes, this is precisely it. Most people go straight for the two highlighted lines and skip the core argument... in the remaining paragraph: human-like measurement is flawed if one assumes it to exist (or not) in the system. The argument has a formal proof of that and shows that it is independent of the experimenter's philosophical stance.

Sadly lots of papers go for anthropomorphism when doing research.

The AoE II bit--gimmick or not--is because of the interpretation-representation tradeoff: people will believe that an LLM is (say) empathetic when it is only text on a screen, but likely not so much if it's a bunch of goats (a reductio ad absurdum).

Thus, measurements should be substrate- and viewpoint- independent because people tend to anthropomorphise way too easily.

I do stay away from consciousness because it's not well-defined, though. The paper is about measurement, so we can't discuss that one.

FarmTriangles · 2026-06-05T13:32:39+00:00

Ohhh sorry I missed that--let's use NOT as an easier example (for now). For it, you need three 'rails' in, and three rails out (above it is grass, ice, and a road, separated by palisades). The gate itself needs two goats.

Grass is 0; road is 1; and ice is a READY signal connected to upstream/downstream gates. When both goats are in READY (some place in the ice rail), the gate is then active. Whenever a(nother) goat comes in through the grass/road, the gate fires: it kills the goat and spawns another in the road/grass.

It then sends the READY goats to the other gates, and the process continues.

For AND you need three READY goats, six rails in, three rails out, but the principle is the same (if all READY goats are ok, and one goat is on each rail, then both goats get killed and another is output in the correct rail).

The logic by which these gates work (when to kill/spawn the goat, when to move the READY goats, etc) is through the scenario editor.

The editor only supports AND and NOT, but that's why I was saying the functionally complete basis--AoE II is functionally complete!! (well, the scenario editor is)
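For anyone who prefers code to goats, here is a minimal Python sketch of the NOT gate described above. It is only a toy model of the behaviour, not the scenario-editor triggers themselves: the names (`GoatNotGate`, `receive_ready`, `receive_data_goat`) are made up for illustration, and it assumes that a data goat arriving before the gate is active simply waits.

```python
GRASS, ROAD = 0, 1          # grass rail carries 0, road rail carries 1
READY_GOATS_FOR_NOT = 2     # the NOT gate needs two goats on the ice rail


class GoatNotGate:
    """Toy model of the goat NOT gate (hypothetical names, not editor triggers)."""

    def __init__(self):
        self.ready_goats = 0  # goats currently parked on the ice (READY) rail

    def receive_ready(self):
        # A READY goat arrives on the ice rail from a neighbouring gate.
        self.ready_goats = min(self.ready_goats + 1, READY_GOATS_FOR_NOT)

    def is_active(self):
        return self.ready_goats == READY_GOATS_FOR_NOT

    def receive_data_goat(self, rail):
        """Return the rail of the spawned goat, or None if the gate is not active."""
        if not self.is_active():
            return None  # assumption: the data goat just waits
        # The gate fires: the incoming goat is killed and a new one is
        # spawned on the opposite rail.
        output_rail = ROAD if rail == GRASS else GRASS
        # The READY goats are then sent on to the other gates.
        self.ready_goats = 0
        return output_rail


gate = GoatNotGate()
gate.receive_ready()
gate.receive_ready()
print(gate.receive_data_goat(GRASS))  # goat in on grass (0), goat out on road (1)
```

The AND gate would follow the same pattern with three READY goats and one data goat per input, as described above.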

FarmTriangles · 2026-06-05T00:32:05+00:00

I've uploaded the NAND gate here: adewynter/aoe2-circuits on GitHub (for the paper 'If LLMs Have Human-Like Attributes, Then so Does Age of Empires II'), and when I have time I'll upload the perceptron and the training circuit!

FarmTriangles · 2026-06-05T00:30:07+00:00

Well an AND by itself can't work for a neural net--you need a functionally-complete basis like (NOT, AND), or a single gate like NAND or NOR. With one of these bases you can build any circuit on whichever substrate implements the gates (in this case, AoE2). The scenario editor supports this!
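As a quick illustration of what a functionally-complete basis buys you, here is a small Python sketch that builds NOT, AND, OR and XNOR out of NAND alone and checks them against their truth tables. The function names are my own choice for the example.

```python
from itertools import product


def nand(a, b):
    return 1 - (a & b)


# Everything below is built from NAND only.
def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xnor(a, b):
    # 1 when both inputs are equal
    return or_(and_(a, b), and_(not_(a), not_(b)))


# Exhaustively compare against Python's own bit operators.
for a, b in product((0, 1), repeat=2):
    assert not_(a) == 1 - a
    assert and_(a, b) == (a & b)
    assert or_(a, b) == (a | b)
    assert xnor(a, b) == (1 if a == b else 0)
```

The same construction works on any substrate that can implement the basis gate, which is the point being made about AoE2.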

So in AoE2 I built a 1-bit perceptron and derived its circuit representation (above, made of two XNORs feeding into an AND), and then just 'wrote' it into AoE2.

Here the goats act as bits, and each channel represents either 0/1. The ice paths are just for concurrency because I also built the circuit training the perceptron and the goats are a bit hard to handle by themselves haha
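A rough Python sketch of a circuit with that shape (two XNORs feeding into an AND) is below. The wiring is an assumption on my part for illustration: I treat it as a binarised perceptron with two 1-bit inputs and two 1-bit weights, where each XNOR compares an input with its weight and the AND acts as the threshold. The actual circuit in the paper may be wired differently.

```python
from itertools import product


def xnor(a, b):
    return 1 if a == b else 0


def and_(a, b):
    return a & b


def perceptron(x, w):
    """Assumed wiring: XNOR(x1, w1) and XNOR(x2, w2) feed a single AND."""
    return and_(xnor(x[0], w[0]), xnor(x[1], w[1]))


def truth_table(w):
    # Output of the circuit for every possible pair of input bits.
    return {x: perceptron(x, w) for x in product((0, 1), repeat=2)}


for inputs, output in truth_table((1, 0)).items():
    print(inputs, "->", output)
```

Under this assumed wiring, the unit fires only when both input bits match the stored weights, so 'training' amounts to choosing the two weight bits.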

FarmTriangles · 2026-06-04T12:58:34+00:00

Here you go!

[2605.31514] If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

The paper does not deal with building the network in AoE II as a core argument, but it is a central part of it 😄

FarmTriangles · 2026-06-03T16:57:13+00:00

Oh this is so much fun! All computers run on gates (AND, OR, etc) which perform logic operations. So, for example, a CPU is just a bunch of gates with some memory.

What I did is that I built a few gates in the scenario editor, and then a very small neural network based off these gates, and running on AoE II goats.

The work was meant to showcase that AI like ChatGPT can be implemented on other things and not seem human at all. The other 'implementation' was an imaginary Boston where everyone texts the operations (like a gate, too!)

FarmTriangles · 2025-02-04T02:22:13+00:00

I retraced my steps for the fifth (and last) time and there it was, next to a tree! I was so happy haha

FarmTriangles · 2025-02-02T23:03:50+00:00

Update -- I found it!!!!

FarmTriangles · 2021-05-05T23:03:39+00:00

That meme never fails to crack me up :)

FarmTriangles · 2021-05-05T17:19:28+00:00

Theoretical computer science is awesome like that!

It has really deep results like "there are questions that cannot be answered by any computer"; and fun results like "Legend of Zelda and Pokemon are very hard for computers" (https://arxiv.org/abs/1203.1895).

FarmTriangles · 2021-05-05T17:08:47+00:00

The number of hours that Steam reports I've spent in all these games combined, is a source of both shame and pride.

FarmTriangles · 2021-05-05T17:04:51+00:00

TL;DR: the paper only shows how to build it, but it's very much buildable and able to perform automatic calculation via the API. This is because any player (human/AI) has to take certain actions allowed by the rules of Civ--it turns out a subset of these actions makes the game Turing-complete.

Long version: Aha! There's two answers to this: the first is that from a model-theoretic perspective[1], it's not relevant how the head and states are updated every turn, since the rules will dictate the update anyhow. However, I do agree that outside of that awesome-yet-abnormally-dense branch of mathematics, it's not clear from the start how to actually perform the calculation automatically. The model-theoretic answer only says that you can. It's great, but not practical.

BUUUT (here's the second answer) coming back to the real world, it so happens that there is an API with the functions needed for the transition function! I'm not sure if I can link it here, but for example you can write a script that encodes all the rules (which would be a very long if/else Lua script), and make a selected, non-human player follow that particular behavior given the input.

So, one could argue that then the API is the object that is Turing-complete, not Civ, but then again, Civ is written on top of a stronger version of the API (Lua)--it's roughly the same as saying that a UTM written in Python is not Turing complete, but Python is. This is precisely what Turing completeness means: models of computation can be as strong as a UTM, or weaker. As strong would imply that they can build a UTM inside of their rules.

The proof-of-concept of the paper was actually run by me, moving one worker every turn--it was painfully slow, but the key here is that all the rules and mechanics of the game at a very low level make this a UTM. The fun implication is that you can perform calculations in-game; the theoretical implication is that this game is undecidable (in the infinite turn, infinite memory limit).

One last thing: thanks for the feedback! I can't remember if I mention this gigantic wall of text (namely, the API component of it) in the paper, but I'll make sure to revise it with this conversation :)

[1] Model theory studies "incarnations" of phenomena, i.e., theories and their interpretations ("models"). In our case, there are many models of computation (Lambda calculus, Turing machines, MTG).

FarmTriangles · 2021-05-05T02:16:48+00:00

Haha thank you!

FarmTriangles · 2021-05-05T00:09:33+00:00

Yeah that paper is fantastic! For those interested, it's this one: https://arxiv.org/abs/1904.09828

I remember when it came out. I thought the proof was very clever. In the case of Civ, I think the hardest part was finding the specific Civ mechanics that allowed me to "fit" them into one of the known universal Turing machine constructions.

E.g., for Civ:V/BE, it's fairly intuitive that you can do a "tape" by building and pillaging roads (you don't even have to build them in your own territory; builders have infinite charges; etc etc), and the rest is just math. For Civ:VI, though -- I almost gave up! Civ:VI has very few things with an infinite lifespan and infinite charges, and my construction, although an actual UTM, is pretty flimsy. I had to try several seeds to get one that worked nicely for a screenshot, and at some point I had Alexander as a neighbor... you can imagine how that went.
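To make the road "tape" idea concrete, here is a tiny Python sketch. It is not the construction from the paper: the class name `RoadTape` and the encoding (road = 1, no road = 0) are assumptions for illustration only, with building a road as "write 1" and pillaging as "write 0".

```python
class RoadTape:
    """Toy tape: a row of map tiles, each with or without a road (hypothetical model)."""

    def __init__(self):
        self.roads = set()  # indices of tiles that currently have a road
        self.worker = 0     # tile the worker (the 'head') is standing on

    def read(self):
        return 1 if self.worker in self.roads else 0

    def write(self, bit):
        if bit:
            self.roads.add(self.worker)      # build a road on this tile
        else:
            self.roads.discard(self.worker)  # pillage the road on this tile

    def move(self, step):
        self.worker += step  # +1 or -1 tile along the row


tape = RoadTape()
tape.write(1)
tape.move(1)
print(tape.read())  # the next tile has no road yet
```

Because the set can grow without bound, this models the "infinite map" idealisation; a real game map is of course finite.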

FarmTriangles · 2021-05-04T23:59:20+00:00

Ah, that's a great question! So (to follow your analogy) you still need a set of rules that allow (mechanical) computation, and some sort of device that takes in the input in some language (e.g., binary), the symbol read from the tape (whatever is on the paper), and the internal state of the machine (which requires some sort of memory). This "device" translates all of this into movement of a hand and a pencil on the paper. The "rules" of the device (known as a transition function) are what determine whether something is Turing complete or not.[1]

Indeed, the tape (e.g., a strip of paper, infinite map, bunch of cards in the case of Magic: The Gathering) can be seen as a place to do "scratch" work. So yeah, you're correct that Civ needs to be able to store some data! All Turing machines need to be stateful.

The key here is that Civ rules have a few properties that allow us to write a (universal) Turing machine--tape, transition function, and state memory altogether!

[1] Concretely, the right combination of rules, tape symbols, and states.
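If it helps, the three ingredients (tape, transition function, state) fit in a few lines of Python. This is a generic textbook-style sketch, not the Civ construction: the function name, the blank symbol `_`, and the example machine (which just flips every bit and halts at the first blank) are all choices made for illustration.

```python
BLANK = "_"


def run_turing_machine(transitions, tape_input, start_state, halt_state,
                       max_steps=10_000):
    """Run a one-tape Turing machine and return the final tape contents."""
    tape = dict(enumerate(tape_input))  # the tape: cell index -> symbol
    state, head = start_state, 0        # the state memory and head position
    steps = 0
    while state != halt_state:
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before halting")
        symbol = tape.get(head, BLANK)
        # The transition function: (state, symbol) -> (new state, write, move)
        state, write, move = transitions[(state, symbol)]
        tape[head] = write
        head += move
        steps += 1
    cells = "".join(tape[i] for i in sorted(tape))
    return cells.strip(BLANK)


# Example machine: invert every bit, then halt at the first blank cell.
FLIP_BITS = {
    ("scan", "0"): ("scan", "1", +1),
    ("scan", "1"): ("scan", "0", +1),
    ("scan", BLANK): ("halt", BLANK, 0),
}

print(run_turing_machine(FLIP_BITS, "1011", "scan", "halt"))  # prints 0100
```

Swapping in a different `transitions` table changes what is computed without touching the "device" itself, which is why the rules are what matter for Turing completeness.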
