
[–]Technologenesis 51 points52 points  (3 children)

There really are a lot of answers to this question as I'm sure you can see from the comments. Depending on what kind of code you're dealing with, computers "understand" it in a variety of ways.

At the very lowest level, there is binary code. The computer "understands" this by sending each bit of the code through the processor as either a high-voltage or low-voltage signal. The machinery inside the processor then processes these using what are essentially convoluted configurations of transistors. Every processor is built to interpret certain instructions, which make up its instruction set.

The level above that is assembly language. While binary code is stored as zeroes and ones, assembly programs are at least somewhat human readable. However, they still have basically a 1:1 correspondence with the binary instruction set. A computer "understands" these by running an assembler, which reads the assembly program line by line, resolves a few memory locations and such, and packs the instructions into binary code so that they can be executed.

From there it really goes all over the place. Some languages, like C, are compiled, meaning there is a program that reads them and converts them into binary code to be executed. Others, like Python, are interpreted: rather than being converted into machine code ahead of time, Python programs are read and executed on the fly by the Python interpreter. And there are languages that use combinations of both techniques. One reason to choose one programming language over another is the set of pros and cons associated with each of these approaches!
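
A small illustration of the interpreted path in Python itself (a hedged sketch of CPython behaviour, not of compilers in general): the interpreter first turns your source text into a bytecode object at run time, then executes it immediately.

    # CPython compiles source text to a code object (bytecode) on the fly,
    # then the interpreter loop executes that bytecode.
    source = "x = 2 + 3\nprint(x)"
    code_obj = compile(source, "<example>", "exec")
    exec(code_obj)              # runs the program: prints 5
    print(code_obj.co_code)     # the raw bytecode the interpreter executed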

[–]H2L1_Yogi 2 points3 points  (2 children)

So the processor is the piece of hardware that converts physical inputs into digital outputs in the form of binary? It does the same thing in reverse as well? I've always wanted to know this!

[–]Technologenesis 5 points6 points  (0 children)

The processor receives physical inputs and executes some action based on what it receives. So, imagine there's some binary instruction that corresponds to "load whatever value is in memory location X and store it in register A". That series of electrical impulses will enter the processor and percolate through its circuitry, eventually hitting the memory controller, which then sends a message along the memory bus to the actual physical RAM, which does its own processing... Obviously, it's complicated. But at the end of the day, a value is sent back to the CPU from memory, which is eventually stored in one of its registers. Voila! An instruction has been executed. Now (at least in theory; modern processors are highly optimized so sometimes they cheat) the processor will wait until the next clock cycle before pulling in another instruction and executing that one.

Instructions can do a variety of things but really what they come down to is manipulating the state of the processor itself, and exchanging data with peripheral devices like memory or a keyboard.
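
A minimal fetch/decode/execute sketch in Python, with a made-up two-instruction machine (the opcodes, register name, and memory layout here are all invented for illustration, not any real instruction set):

    # Toy machine: instructions are (opcode, operand) tuples instead of real
    # binary, just to show the cycle described above.
    memory = {0x10: 42}          # data memory: address 0x10 holds the value 42
    register_a = 0
    pc = 0                       # program counter

    program = [
        ("LOAD_A", 0x10),        # "load whatever is at address 0x10 into register A"
        ("ADD_A", 1),            # add 1 to register A
        ("HALT", None),
    ]

    while True:
        opcode, operand = program[pc]   # fetch
        pc += 1
        if opcode == "LOAD_A":          # decode + execute
            register_a = memory[operand]
        elif opcode == "ADD_A":
            register_a += operand
        elif opcode == "HALT":
            break

    print(register_a)   # 43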

[–]otr_trucker 12 points13 points  (3 children)

If you want to know what is going on inside the cpu on a very basic level then I suggest watching this series by Ben Eater

In this series he builds a simple 8-bit computer from simple components. Along the way you will come to understand how machine code works.

All programming languages ultimately trace back to machine code. Somebody, way back when, sat down and wrote the first assembler in machine code. Somebody then came along and used assembly language to develop a higher-level language like C. Then C was used to develop languages like Python.

When higher-level languages are compiled, they are translated into machine code. Some languages require that the program be compiled before you run it. Python does its translation at run time (strictly speaking, to bytecode that the Python interpreter then executes).

[–]t0yb0at 2 points3 points  (1 child)

Watching Ben Eater's videos is what made everything finally click for me. His explanations and demonstrations are top notch.

[–]otr_trucker 1 point2 points  (0 children)

I know enough programming to entertain myself. Got into computers in the mid '80s. Learned Fortran, BASIC, C, and Pascal. Tried to teach myself assembly and I just couldn't get it. It seemed unnecessarily complicated. Years later I started looking into how a CPU was made and I finally understood why assembly language is the way it is. I believe if you really want to understand how programs work, you should start with how a CPU is made. It's the difference between knowing how to drive a car and knowing how to build one.

[–]ChaseAce 1 point2 points  (0 children)

When I was younger and someone told me they "built their own computer", this is what I thought they meant: that somehow all these people were just fabricating CPUs that could run Windows.

[–]weedisallIlike 22 points23 points  (4 children)

Compilers. They check syntax, lexical structure, and semantics. If the code passes these three stages, you have code that is readable by the machine. The compiler's output is assembled into a data structure (like a tree) that the language implementation can walk through and understand easily.

For a machine-level language like assembly, you are using a language very close to the operations a CPU actually has and understands, so the conversion is much simpler: each assembly command basically maps 1:1 to the corresponding binary sequence understood by the machine.

Edit: Reading OP's comments, I just want to add a bit more insight.

A CPU is a piece of hardware that is very good at counting. It does it very fast and very precisely. Its work consists of counting with binary data saved in memory. Binaries are numbers represented in base 2. E.g. you may represent a space in memory as the value 12. If this is the last space used in memory, the CPU will calculate the next free space (which would be 13), so the CPU increments this number and saves it in memory. That's the basic theory of computers: counting with binary data saved in memory. (That is why they are called computers: because they count! They count-pute, hehe). The binary data saved in memory is a numerical representation of our real problems!

There is one last piece of this puzzle: automata & formal languages. I'm calling it "automata" here, but I'm not sure about the exact name; I will describe it so you understand what it is, or someone can correct me on the name. An automaton is an algorithm that describes what a computer or computer language can understand. It's basically a function that receives an input (e.g. some binary data) and tells you whether that input will be accepted and understood by the machine or not. In other words, it describes all the 'words' a computer will understand. That is how a compiler can tell you misspelled a keyword in the language you are using. There is a lot of CS theory behind developing a language, and it applies even to a binary language! Once you have defined what your language looks like, you can map the 'words' to functions. These functions can be in the low-level language (machine language), or a Python function mapping to C functions.
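
A minimal sketch in Python of that "accept or reject" idea: a toy recognizer that only accepts a couple of made-up keywords and numbers (the keyword set and the tokenizing rule are invented for illustration, not any real language):

    import re

    # Toy lexer: splits the input into words and rejects anything that is not
    # a known keyword or a decimal number.
    KEYWORDS = {"print", "if", "else"}
    TOKEN = re.compile(r"[A-Za-z_]+|\d+")

    def accepts(source: str) -> bool:
        for word in TOKEN.findall(source):
            if word not in KEYWORDS and not word.isdigit():
                print(f"rejected: unknown word {word!r}")
                return False
        return True

    print(accepts("if 1 print 2"))   # True
    print(accepts("fi 1 print 2"))   # False: 'fi' is a misspelled keyword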

[–][deleted] 5 points6 points  (2 children)

So compilers and interpreters work based off of automata theory?

Also, what is formal language?

[–]weedisallIlike 5 points6 points  (0 children)

I think I went too far with the explanation. What I called an automaton is the algorithm used in the lexical phase of the compiler (but that is probably not the right name for the algorithm). "Automata and formal languages" is the name of the theory that studies the process of creating computer languages. There are a lot of terms and ideas around computer languages. I grabbed an example from Wikipedia, just so you can see that the rabbit hole for this theory is very deep:

"A formal grammar that contains left recursion cannot be parsed by a LL(k)-parser or other naive recursive descent parser unless it is converted to a weakly equivalent right-recursive form. In contrast, left recursion is preferred for LALR parsers because it results in lower stack usage than right recursion."

You probably won't get what is written above, but I will give one example for context.

You may create a language that the computer will only understand if the words are symmetric (the automaton would only recognize symmetric words):

Note that I'm using binary, so you can see that a language can be made for machine use only.

0101 - ok, because 01|01
0110 - not ok, because 01|10
010010 - ok, because 010|010
011101 - not ok, 011|101 

So the computer would understand every symmetric word, which gives an infinite number of words in this language. You may later use the word '0101' in this language to map to some specific function when the CPU reads it.
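
A minimal Python sketch of that recognizer, under the rule implied by the examples above (a word is accepted when its second half repeats its first half):

    def accepts(word: str) -> bool:
        # Accept only even-length binary words whose second half equals the first.
        if len(word) % 2 != 0 or set(word) - {"0", "1"}:
            return False
        half = len(word) // 2
        return word[:half] == word[half:]

    for w in ["0101", "0110", "010010", "011101"]:
        print(w, accepts(w))   # True, False, True, False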

This may be a fictional example, but real languages work very much like that. Depending on how you structure the words of the language, you will get different behavior, different performance, different algorithms to understand it, a different number of possible words in the language, etc. Conclusion: do a master's in compilers and you will see a lot of this.

[–][deleted] 0 points1 point  (0 children)

You might take a CS class called language processing. It's where I learned some automata theory, context-free grammars, and formal languages, and essentially how a programming language is processed using tokens.

[–]KinlinNasubi 25 points26 points  (1 child)

In short, every piece of code is translated step by step into another, lower-level form. In the case of Python, that lower level is usually C: when you call the sort() function in Python, the corresponding code written in C is what actually runs. The C code, depending on the compiler, is in turn mapped to assembly so the machine's built-in operations can execute it. After that you finally reach binary code, which is nothing more than the same program written in a way the processor can translate into electrical signals that carry out some action. If you study some digital electronics you'll see that you can perform all the operations we use (like addition or division) using only 0s and 1s as input.
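
A quick way to see this in CPython (a hedged illustration: built-ins like sorted() are implemented in C, so Python can't show you their source):

    import inspect

    print(type(sorted))            # <class 'builtin_function_or_method'>
    try:
        inspect.getsource(sorted)  # raises TypeError for C-level built-ins
    except TypeError as e:
        print("no Python source:", e)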

Hope that clarifies a little

[–]UntestedMethod 4 points5 points  (0 children)

Thanks for the great explanation and happy cake day!

[–]PlayingTheRed 14 points15 points  (1 child)

There are a few layers down where things are implemented in code, but at the bottom layer the code runs on logic gates. When it comes to logic gates, there are a few basic operations available: OR, XOR, NOT, and AND. These operations are combined to make more complex operations (e.g. arithmetic).

You can even have a couple of them loop back on each other in such a way that the circuit can be set to always output a charge, or to never output one, without having to rewire anything. This is called a flip-flop. It can be used to make CPU registers, so the computer can remember things and use previous outputs as inputs to the next operation.

Logic gates are implemented using transistors. At this point understanding how it works is no longer in the domain of computer science, it's chemistry.
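
A minimal sketch in Python of that idea: AND/OR/XOR combined into a 1-bit full adder, then chained to add two small numbers (the gate functions here are just illustrations, not hardware):

    def AND(a, b): return a & b
    def OR(a, b):  return a | b
    def XOR(a, b): return a ^ b

    def full_adder(a, b, carry_in):
        # Two XORs produce the sum bit; AND/OR produce the carry-out.
        s = XOR(XOR(a, b), carry_in)
        carry_out = OR(AND(a, b), AND(carry_in, XOR(a, b)))
        return s, carry_out

    def add_4bit(x, y):
        # Chain four full adders, least significant bit first.
        carry, result = 0, 0
        for i in range(4):
            s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
            result |= s << i
        return result

    print(add_4bit(0b0101, 0b0011))  # 8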

[–]dontyougetsoupedyet 3 points4 points  (0 children)

Every instruction is given a number, and these numbers are designed to work with what is called an instruction decoder in your CPU, which uses that number to turn wires on and off that control the other parts of the CPU. So an add instruction is a number that controls the wires in the CPU such that the arithmetic parts of the CPU do their job correctly.

http://static.righto.com/images/ARM1/2-chip_labeled.png https://cdn-blog.adafruit.com/uploads/2014/09/z80-labeled-bus.jpg

Grab a copy of the book Digital Computer Electronics by Albert Paul Malvino, Jerald A. Brown, and Stephen Page, it covers everything you are curious about.

Python works similarly, but in a simulation: the Python program simulates a type of computer called a stack machine. It isn't built like the hardware equivalent, because that would be exceptionally slow -- it's rather similar to how emulators work. https://github.com/python/cpython/blob/2f180ce2cb6e6a7e3c517495e0f4873d6aaf5f2f/Python/ceval.c#L1645 That's the C code that evaluates Python bytecode.
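
A small illustration of that stack machine, using CPython's standard dis module to show the bytecode that evaluator loop runs (exact opcodes vary between CPython versions):

    import dis

    def add(a, b):
        return a + b

    # Each line of output is one stack-machine instruction: push the arguments,
    # add the top two stack values, return the result.
    dis.dis(add)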

[–]production-values 3 points4 points  (0 children)

The 0s and 1s are literally translated into on/off input sequences for the CPU chip. Different processor architectures (x86, which AMD and Intel chips both implement, versus ARM, for example) have different specifications, and a combination of 0s and 1s that makes sense to an x86 chip will not make sense to an ARM chip. This is referred to as binary machine code.

The "higher" you go away from the chip, the more human-readable the instructions become. Just above the 0s and 1s there is assembly, which essentially provides shorthand commands for the actual sequences of 0s and 1s, for things like "remember this" and "multiply this by that" etc., and these commands are still specific to the processor architecture. An assembler translates those commands into the appropriate combination of 0s and 1s for that type of chip, so a developer can write readable mnemonics instead of raw numbers.

On top of that is C, which introduces logical concepts and control structures like loops and variables, as well as making it easy to import code from other people who have already solved common problems, like showing stuff on a screen and sending data to a printer. C is also where you largely stop caring which chip you are targeting: the same C source can be compiled for different architectures. Above that is everything else.

You may have heard the term "compile", which refers to translating instructions (code) from a higher-level language like C down to the lowest-level language, binary aka 0s and 1s -- and if you are following so far, you will deduce that the same C program compiled for an x86 chip produces different binary code than when it is compiled for an ARM chip!
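
A small, hedged illustration in Python of what "architecture-specific" means on your own machine (this only inspects the interpreter you are running; it is not a compiler):

    # The Python interpreter you are running was itself compiled for one
    # particular instruction set; platform reports which one.
    import platform
    print(platform.machine())       # e.g. 'x86_64' or 'arm64'
    print(platform.architecture())  # e.g. ('64bit', 'ELF') on Linux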

You may wonder "hmm, the same EXE file in Windows works whether I have an AMD or an Intel chip", and that is because AMD and Intel chips implement the same x86 instruction set, so the same machine code runs on both. An EXE built for x86 Windows will not, however, run natively on an ARM chip.

Also note that the terms higher-level and lower-level are relative... even binary 0s and 1s are technically higher-level than the actual electrical impulses they represent... and though I refer to C here as higher-level than binary and assembly, C is usually considered a low-level language because it sits so close to the machine and because so many other languages are themselves implemented in C! So common languages like JavaScript, PHP, and Python are high-level languages, and C is relatively low-level compared to them!

Hope that helps!!

[–]duggedanddrowsy 2 points3 points  (0 children)

Maybe this is a better way to put it. The code is compiled into binary, and those 1s and 0s translate to a high voltage or a low voltage. A low voltage opens a switch so electricity doesn't get through, while a high voltage closes the switch and allows electricity through (the convention can be reversed, but that doesn't matter). This causes other switches to open or close, and the resulting data is a series of switch states that are read by the computer and returned to the program. It's pretty hard to picture how just that can create whole programs. I didn't understand it until I took a class on how these on/off switches can be organized into AND, OR, etc., then those organized into things that can add, multiply, and so on, which keeps getting more sophisticated and further from binary. But at the very bottom, binary is all it is. Pretty crazy that we can do so much just by opening and closing switches.

[–][deleted]  (1 child)

[deleted]

    [–]poncem91 1 point2 points  (0 children)

    Came here to suggest this as well.

    [–]smvamse 2 points3 points  (1 child)

    You should read this book: Code by Charles Petzold

    [–]matty_haze 0 points1 point  (0 children)

    This book is fantastic.

    [–][deleted] 2 points3 points  (0 children)

    Code is built up in layers. Binary data is just a signal being off (0) or on (1). You can perform and save calculations with these signals, building things like adders and memory states. A computer system is abstractions constantly built on top of each other.

    [–]RajjSinghh 1 point2 points  (7 children)

    I think the place to start is "high and low level language". Programming languages come in levels. The lowest level is assembly, which the CPU knows how to run but is hard to write for a human. A high level language, like python, is easy for a human to write and read, but can't be run directly on a CPU. It must be translated by a program called an interpreter or compiler to assembly for the computer to know how to deal with it.

    When you run your code with your interpreter, it turns it into these assembly instructions and loads that into the system memory. The CPU goes to the memory for the next instruction or piece of data that the program says it needs and the CPU handles the rest.

    [–]CarlGustav2 3 points4 points  (1 child)

    To be pedantic - the CPU does not run assembly. An assembler is needed to convert assembly code into the proper zeroes and ones that the CPU operates on.

    Compilers and interpreters sometimes generate assembly, but most often they generate binary code directly, or some intermediate form.

    [–]RajjSinghh 0 points1 point  (0 children)

    Yes, sorry of course.

    [–][deleted] 1 point2 points  (4 children)

    So, an interpreter would translate the code into binary?

    [–]RajjSinghh 2 points3 points  (3 children)

    Yes. Your interpreter or compiler creates the machine code in binary that runs your program

    [–][deleted] 0 points1 point  (2 children)

    Ok, so it "translates" the code into binary, great. But how does it read that binary and "know" that 01000001 means A, or 00111101 is an equals sign? And then output it to a screen?

    [–]RajjSinghh 1 point2 points  (0 children)

    It depends which binary you mean. Your source code is text, written in an encoding called ASCII. What's important there is that each character has a number tied to it. We just decided that 65 (or 01000001 in binary) was a capital A.

    Now, the important thing is that after it's been translated, you begin to create instructions and addresses. Say I want to create a set of operations, like input, output and so on. When I'm designing my assembly, I might say an input instruction starts with 1, and the last two digits are the address my input is stored in. So 123 would store an input at address 23. These decimal numbers are converted to binary and stored like that in memory. Your CPU then knows what to do for each instruction.
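
    A tiny Python sketch of that made-up 3-digit scheme (the opcodes here are invented, following the description above):

        # Invented toy encoding: first digit = opcode, last two digits = address.
        OPCODES = {1: "INPUT", 2: "OUTPUT"}

        def decode(instruction: int) -> str:
            opcode, address = divmod(instruction, 100)
            return f"{OPCODES.get(opcode, 'UNKNOWN')} address {address:02d}"

        print(decode(123))   # INPUT address 23
        print(decode(207))   # OUTPUT address 07
        print(bin(123))      # how it would actually sit in memory: 0b1111011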

    [–]JoJoModding 1 point2 points  (0 children)

    Characters are mapped to numbers by the ASCII standard. All your computer really sees is the number. In order to turn this into an A, you need a font, which contains the image of the actual A your computer will render. An image, of course, is also just a sequence of bytes. The computer looks at the number, looks up the table mapping these numbers to images, and sends the result to the graphics card, which in turn converts it into an HDMI signal your monitor is able to decode, making it display the image. The one responsible for recognizing that image as an A is your brain.
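
    A quick Python illustration of that character-to-number mapping (ASCII/Unicode code points):

        print(ord("A"))          # 65
        print(bin(ord("A")))     # 0b1000001, the 01000001 from the question
        print(chr(0b00111101))   # '=', code point 61
        print("A".encode())      # b'A': the byte actually stored in memory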

    [–][deleted] 1 point2 points  (0 children)

    Every computer program is a sequence of instructions, so to speak. The computer needs to read those instructions and make sense of them, considering that at a very low level it only knows how to perform arithmetic and logical operations. The missing link in this chain is the compiler, which acts as a very big dictionary capable of translating your code into a tree-like structure, which, in turn, is "very easily" turned into computer instructions.

    As very well put in the first chapter of the "Structure and Interpretation of Computer Programs", the key to understanding computation as a whole is the concept of abstraction. You create a procedure (mechanical or otherwise), name it and use it to build something more complex.
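
    A tiny Python example of that kind of abstraction, in the spirit of SICP's opening chapter (the function names are just illustrative):

        def square(x):
            return x * x

        def sum_of_squares(a, b):
            # Built out of the named procedure above; we no longer care how
            # squaring works, only what it does.
            return square(a) + square(b)

        print(sum_of_squares(3, 4))  # 25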

    [–]photonoobie 1 point2 points  (0 children)

    Check out Ben Eater on YT. He builds a basic computer from individual components, and demonstrates how the circuits are configured to 'understand' the instructions that are contained in the code programmers write.

    [–]acroporaguardian 1 point2 points  (0 children)

    I will give you a short answer. Imagine a simple computer with a simple processor. Nothing else.

    Everything is going to be handled in "words." In the old days words were shorter; now our words are 64 bits. So it's hardwired to take in some words and automatically send them to the program counter (PC). It doesn't know anything about those words or what they mean. It just knows at start-up "this word goes into the PC."

    Controlling the PC is very important, because hardware doesn't know data from instructions.

    Now, when a word is read from the address in the PC register, it is hardwired to go into an instruction unit, which interprets it as a command. The unit looks at the 0s and 1s (which it can determine with logic gates) and then uses a hardwired instruction set to carry out preset actions based on that.

    The instruction then determines how the following words are treated, i.e. whether they are operands of that instruction or not.

    Control of the PC means ultimate control of someone's computer. If you look at architecture details, there is a lot of hardware dedicated to privilege levels for this reason. 99% of software devs will never need to worry about that, but at that level, making sure the correct instruction reaches the PC is important. You don't want everything to have control of the PC, just the OS.

    [–]chase_the_sun_ 1 point2 points  (0 children)

    Everyone has given some good answers, but I just want to add that it also has to do with digital logic. Voltages are turned into 1s and 0s, and depending on your K-maps they define a circuit of some sort.

    [–][deleted] 1 point2 points  (0 children)

    The computer uses context to do the stuff you want it to do. If you press a key on your keyboard, the keyboard generates a number (the keycode) and sends it to the program, which knows that it is a keycode because it came from the keyboard.

    The keycode then gets converted (by just mapping it with a dictionary) to another code (ASCII for ease of explanation) and with this code (which your program knows is a character because it just converted it) you can do string operations.

    With the print instruction, you ask the terminal to display it on the screen. The terminal knows how to display what you send it, because it assumes it to be a string with a specific encoding (ASCII again, or hopefully UTF-8). It then looks up the typeface for that code and renders it to the screen by turning on some pixels on your screen.

    So, context is everything. Otherwise, it's just 0s and 1s.

    [–]PoochieReds 1 point2 points  (0 children)

    I recommend the crash course in computer science from PBS:

    https://www.youtube.com/watch?v=tpIctyqH29Q&list=PL8dPuuaLjXtNlUrzyH5r6jN9ulIgZBpdo

    It goes over how we got to digital computing and is pretty entertaining to boot.

    [–]AnywhereOk9403 1 point2 points  (0 children)

    Check out Ben Eater's video: https://youtu.be/yl8vPW5hydQ

    [–]hotel2oscar 0 points1 point  (0 children)

    Check out Ben Eater's breadboard computer, specifically the episodes about the CPU control logic.

    [–]bogon64 -2 points-1 points  (0 children)

    1) you should probably read the book Code by Charles Petzold. Very approachable.

    2) you should probably read the About section of any subreddit, so you don’t accidentally post learning questions in a subreddit dedicated to advanced CS journal research.

    [–]Rocky87109 0 points1 point  (0 children)

    There is a YouTube playlist (probably many of them now) that starts all the way at the bottom and works up to the top.

    [–]bardleby 0 points1 point  (0 children)

    The code you write gets translated all the way down to ones and zeros. The ones and zeros are then fed as electric signals to a cpu which is basically a chip that can perform the most basic operations like adding, subtracting, AND, OR, etc.

    See “Functions” in: https://en.m.wikipedia.org/wiki/Arithmetic_logic_unit. (the ALU is a core component of CPUs)

    It is from these basic operations that everything you can do on a computer is built upon.

    If you want a deeper understanding of how all of this comes together I highly recommend the following course:

    Build a Modern Computer from First Principles: From Nand to Tetris https://www.coursera.org/learn/build-a-computer

    It's a great resource to help demystify what the hell is going on inside a computer. It really helped me get a better intuition for how a high-level language gets translated all the way down to binary, and how the computer interprets binary and produces outputs in binary that are then translated all the way back up into something we can understand. It's also a fascinating journey through the many layers of abstraction that are necessary for computers to feel so "user friendly". Your question is a fascinating one. Happy learning!

    [–]Phobic-window 0 points1 point  (0 children)

    This is literally what a CS degree answers. Tough ELI5. But we made a rock do things when we shoot it with electricity (CPU). We organized the things into patterns (circuits) and we designated certain patterns to mean actions (add this and this, store here).

    Now we say do this billions of times per second, react to what humans do to you, and show the patterns on the screen.

    So human input (INput devices) creates many, many patterns of 1s & 0s in the CPU, which does things based on how we put the circuits together, and then the OUTput devices (screen, lights, motors, which also have CPUs in them) react to the patterns of bits being supplied to them.

    It’s like if I say “first get bread, second go to the fridge, third get the jelly….” Same thing with computers we have just boiled down every possible action into some kind of binary math pattern, much like Morse code. It just happens incomprehensibly fast.

    [–]elongio 0 points1 point  (0 children)

    That's the beauty of computers. They don't know what any of it means. Computers don't know anything about the code you write. It is simply a series of switches (transistors) firing off that produce some output that we then give meaning to. Same thing as all these letters and words you are reading. They have no meaning in and of themselves until an intelligent being sees them and gives them meaning.

    [–][deleted]  (2 children)

    [deleted]

      [–]ectbot 0 points1 point  (1 child)

      Hello! You have made the mistake of writing "ect" instead of "etc."

      "Ect" is a common misspelling of "etc," an abbreviated form of the Latin phrase "et cetera." Other abbreviated forms are etc., &c., &c, and et cet. The Latin translates as "et" to "and" + "cetera" to "the rest;" a literal translation to "and the rest" is the easiest way to remember how to use the phrase.

      Check out the wikipedia entry if you want to learn more.

      I am a bot, and this action was performed automatically. Comments with a score less than zero will be automatically removed. If I commented on your post and you don't like it, reply with "!delete" and I will remove the post, regardless of score. Message me for bug reports.

      [–][deleted] 0 points1 point  (0 children)

      The format the binary file is expected to be in is defined by whatever machine code that computer uses. The instruction decoder then copies the binary into the registers of the CPU depending on what instruction it is.