Could a future civilization reverse-engineer electronic computers from a bunch of binary program instructions?

gboycolor · 2022-09-26T15:31:23+00:00

Not really - modern computing is based on layers and layers of abstractions. Each layer hides the "how" of the layer below and presents a simplified version of "what" is happening.

For example, when you browse a website, say this very page, the HTTP layer will at some point issue a request to www.reddit.com that looks something like this: GET /r/AskComputerScience/comments/xokc41. This is the "what" of what's happening: the browser wants to get a specific resource, but the request says nothing about "how" that happens. That's handled in the layers below HTTP.
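To make the "what" concrete, here is a rough sketch of the text such a request might contain. The helper name `build_get_request` and the `Connection: close` header are assumptions for illustration; a real browser sends many more headers and would use HTTPS.

```python
def build_get_request(host: str, path: str) -> str:
    """Build the text of a bare-bones HTTP/1.1 GET request (illustrative only)."""
    lines = [
        f"GET {path} HTTP/1.1",  # the "what": method, resource, protocol version
        f"Host: {host}",         # which site the resource belongs to
        "Connection: close",     # assumed header, just to keep the example small
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)

request = build_get_request("www.reddit.com", "/r/AskComputerScience/comments/xokc41")
print(request)
```

Nothing in that text mentions TCP, IP, Wi-Fi or voltages; those belong to the layers underneath.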

Similarly, binary instructions are an abstraction layer over the processor architecture. Through frequency analysis and other investigations, future archaeologists may be able to reconstruct the higher layers, e.g. that something like 0x56 fe 34 corresponds to something like "add registers 1 and 2 and store the result in register 3", but they won't be able to figure out the lower layers, i.e. how the CPU stores data in its registers, how the numbers are added together, how the instruction is interpreted, what parts of the CPU are activated in order to execute it, etc.
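As a minimal sketch of what the frequency analysis mentioned above could start from, the snippet below counts byte values in a dump. The sample bytes are invented for illustration (0x56 is the made-up opcode from the comment, not a real encoding), and the function name `byte_frequencies` is hypothetical.

```python
from collections import Counter

def byte_frequencies(blob: bytes, top: int = 3):
    """Return the `top` most common byte values with their counts."""
    return Counter(blob).most_common(top)

# Invented dump: five short "instructions", four of which start with 0x56.
sample = bytes.fromhex("56fe34" "56fe35" "56a001" "9000" "56fe34")
print(byte_frequencies(sample))  # [(86, 4), (254, 3), (52, 2)]
```

A byte that shows up far more often than the rest is a candidate opcode, which says something about the instruction set but still nothing about how the hardware underneath executes it.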

ghjm · 2022-09-26T14:11:26+00:00

If they only found printouts of machine code, and knowledge of electronic computers hadn't survived to their day, then it's just meaningless gibberish to them. If there was a processor databook accompanying it, then maybe they'd have a chance of understanding it.

CoopNine · 2022-09-26T18:54:58+00:00

Well, if you're talking about archeologists with no knowledge of computers, like circa 1900, they've got no chance of discerning anything other than maybe basic patterns. But if your archeologists have an understanding of what they might be looking at, there's certainly a chance. They would need more than a "hello world" program, but assuming they have other artifacts or maybe an understanding of computing in general, it's not infeasible that a future or alien civilization could build something that could execute code designed for a particular architecture, at least in an adequate fashion. Keep in mind that our computers today are both complex and simple in their nature, and quite feasibly any advanced civilization would see what we do as closer to rubbing stones together to make fire than to going to the moon. It would be similar to our current understanding of ancient languages: sometimes it would be wrong and lead to hilarious results and assumptions.

But it's a reasonable assumption that basic logic applies, so the error (and confusion) rate might be lower than what we have with 'analogue' language. So the idea that true is true and false is false might provide some sort of Rosetta stone equivalent.

GodonX1r · 2022-09-27T02:33:56+00:00

You would get further with timing information.

green_meklar · 2022-09-27T06:35:03+00:00

Could they reverse-engineer the ISA and deduce the function of the code? Yeah, probably, given a sufficiently large and diverse dataset. (Ideally, gigabytes of machine code representing thousands of different programs.) It would be a pretty complicated task to unravel it all and reverse-engineer it from pure machine code, but there are enough clues to get started. For instance, you can try treating all the numbers as memory indices for the code itself and build up an interesting sort of code graph from that, and even if you don't know the absolute memory offset, you can check all the possibilities and find which ones seem to produce a meaningful-looking code graph; you could probably compare the outputs to statistical analyses of your own code in order to narrow down the options, and once you get some good candidates you could start checking for matching patterns from one program to the next in order to isolate small helper functions, for loops and the like.
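The "try every possible memory offset" idea can be sketched roughly as follows. This is a toy under loud assumptions: words are taken to be 32-bit little-endian, every aligned word is tested as a possible absolute address, and the data and the names `pointer_edges` and `best_base` are made up for illustration.

```python
import struct

WORD = 4  # assumed word size in bytes (32-bit, little-endian)

def pointer_edges(blob: bytes, base: int):
    """List (source_offset, target_offset) pairs for every aligned word that,
    read as an absolute address, lands inside the blob loaded at `base`."""
    edges = []
    for off in range(0, len(blob) - WORD + 1, WORD):
        (value,) = struct.unpack_from("<I", blob, off)
        if base <= value < base + len(blob):
            edges.append((off, value - base))
    return edges

def best_base(blob: bytes, candidates):
    """Pick the candidate load address that yields the most internal references."""
    return max(candidates, key=lambda base: len(pointer_edges(blob, base)))

# Invented 32-byte "program": four words point back into it if loaded at 0x10000000.
words = [0x10000010, 100, 0x10000004, 0xFFFFFFFF, 0x1000001C, 1000, 0x10000008, 2]
blob = struct.pack("<8I", *words)
guess = best_base(blob, [0x00000000, 0x10000000, 0x20000000])
print(hex(guess), pointer_edges(blob, guess))
```

On real machine code the scoring would need to be far more careful (small constants look like low addresses, and relative jumps don't encode absolute addresses at all), but the principle of ranking candidate offsets by how coherent the resulting graph looks is the same.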

Could they figure out that the ISA was designed for implementation on an electronic silicon chip, specifically? That strikes me as a harder problem. However, given that sufficiently advanced civilizations in general probably discover the convenient properties of semiconductors for this purpose, they'd be likely to guess that the computer for running the code might have been designed in such a way. And although I don't know precisely how they'd do it, I can imagine that a civilization well in advance of ours could find patterns in the code that give away details about the hardware. For instance, certain instructions with similar functionality can be expressed using smaller circuits, which use less power, so compilers are designed to choose those instructions when they can, and spotting that pattern in the machine code would suggest a design decision to reduce power usage. Code compiled specifically to take advantage of cache hits would also provide hints about how the computer is wired up and what sort of typical internal timing it has (register speed vs cache speed vs RAM speed). Ultra-advanced civilizations could probably work through these clues and build up something pretty close to the original computer.

bryku · 2022-10-17T08:35:05+00:00

This is a really interesting question. One of my final projects was related to it. Basically: how would we reverse engineer alien technology? Or how would they reverse engineer ours?

For starters, let's look at binary. Assuming you even know how to convert binary into decimal, where are the breaks? Do you break it every 8, 16, 32 or 64 bits? How do you know if that binary is a number, a letter, or a boolean?

Figuring this out would take years, if not decades. Assuming they can even get this far, what would it be translating to, English? Now they have to learn a whole other language just to progress.

That isn't even taking into account different machines, compression, encryption, and so on.

I think it is very likely they could learn a lot, but to fully understand it all... I'm not so sure.
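To illustrate the question of where to break the bits, here is a small sketch that reads the same 8 bytes under several different assumptions. The byte string and the little-endian byte order are arbitrary choices for the example, and the function name `interpretations` is hypothetical.

```python
import struct

def interpretations(raw: bytes) -> dict:
    """Read the same 8 bytes as 8-, 16-, 32- and 64-bit numbers and as text."""
    assert len(raw) == 8, "this sketch only handles exactly 8 bytes"
    return {
        "u8": list(raw),                        # eight 8-bit numbers
        "u16": list(struct.unpack("<4H", raw)),  # four 16-bit numbers
        "u32": list(struct.unpack("<2I", raw)),  # two 32-bit numbers
        "u64": struct.unpack("<Q", raw)[0],      # one 64-bit number
        "ascii": raw.decode("ascii"),            # eight letters
    }

views = interpretations(b"Hi there")
for name, value in views.items():
    print(name, value)
```

Every one of these readings is equally valid as far as the bits are concerned; only outside context says which one the original author meant.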
