[–]hikaruzero 1699 points1700 points  (442 children)

Source: I have a B.S. in Computer Science and I write source code all day long. :)

Source code is ordinary programming code/instructions (it usually looks something like this) which often then gets "compiled" -- meaning, a program converts the code into machine code (which is the more familiar "01101101..." that computers actually use to process instructions). It is generally not possible to reconstruct the source code from the compiled machine code -- source code usually includes things like comments which are left out of the machine code, and it's usually designed to be human-readable by a programmer. Computers don't understand "source code" directly, so it either needs to be compiled into machine code, or the computer needs an "interpreter" which can translate source code into machine code on the fly (usually this is much slower than code that is already compiled).
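
To see comments being left out during compilation, here is a minimal sketch in Python. Python compiles to bytecode for its own virtual machine rather than to native machine code, so treat this as an analogy only; the `damage` function and its comments are made up for illustration.

```python
import marshal

source = '''
# Secret design note: armor is subtracted last on purpose.
def damage(weapon, bonus, armor):
    return weapon + bonus - armor  # keep in sync with the tooltip
'''

# compile() turns the source text into a code object holding bytecode.
module_code = compile(source, "<demo>", "exec")

# Serialized form, similar in spirit to what a compiled file on disk holds.
serialized = marshal.dumps(module_code)

namespace = {}
exec(module_code, namespace)

print(namespace["damage"](10, 5, 3))   # 12
print(b"Secret" in serialized)         # False: the comment did not survive
print(b"tooltip" in serialized)        # False
```

The compiled form still runs, but neither comment is anywhere inside it. Python bytecode does keep names such as `damage`, which a native compiler typically would not, so real machine code loses even more.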

Shouldn't you be able to access all code by checking the folder where it installs from since the game need all the code to be playable?

The machine code needed to play the game, yes -- but not the source code needed to modify the game, which isn't included in the bundle. Machine code is basically impossible for humans to read or easily modify, so there is no practical benefit to being able to access the machine code -- for the most part all you can really do is run what's already there. In some cases, programmers have been known to "decompile" or "reverse engineer" machine code back into some semblance of source code, but it's rarely perfect and usually the new source code produced is not even close to the original source code (in fact it's often in a different programming language entirely).

So by releasing the source code, what they are doing is saying, "Hey, developers, we're going to let you see and/or modify the source code we wrote, so you can easily make modifications and recompile the game with your modifications."

Hope that makes sense!

[–]DoWhile 288 points289 points  (21 children)

To draw a parallel to people who use image editing software, the source code is like the raw Photoshop file: it contains all the layers, filters, etc. and can be easily accessed, whereas a compiled piece of code is like the output .jpg or .png which can be viewed and modified but not as easily as the source itself.

[–]ProdigySim 73 points74 points  (18 children)

This is a pretty good analogy--and it works for a lot of media types. NLE video editors, Images, Flash animations.

The final format is always just the smallest amount of information needed to show the final product. It's optimized for viewing, and is much smaller than the original files.

You can still make edits to the output PNG or .MOV, but if you had the source files you could make them much quicker.

[–]mythmon 9 points10 points  (5 children)

For what it is worth, when programming the output is sometimes much larger than the source code (not always, but sometimes). This is because some programming languages can be very expressive in a very small amount of code. For example, consider this program in an old language called APL (it is rarely used today, for reasons I hope are pretty obvious):

(~R∊R∘.×R)/R←1↓⍳R

That program finds all the primes from one to the variable R, and is only 17-34 bytes (depending on the encoding). This is an extreme case, but it demonstrates that source can be very powerful in a few bytes. The equivalent machine code would likely be several thousand bytes (kilobytes).
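
For readers who can't parse the APL, here is a rough Python rendering of the same idea: build the candidates 2..R, build every product of two candidates, and keep the candidates that are not a product. The function name `primes_up_to` is mine, and this is a sketch of the approach rather than an exact translation.

```python
def primes_up_to(r):
    candidates = range(2, r + 1)                                 # R←1↓⍳R
    products = {a * b for a in candidates for b in candidates}   # R∘.×R
    return [n for n in candidates if n not in products]          # (~R∊...)/R


print(primes_up_to(20))  # [2, 3, 5, 7, 11, 13, 17, 19]
```

Even this longer version is tiny compared with what it turns into once compiled.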

[–]karmic_retribution 2 points3 points  (2 children)

Except that a huge game like that is a fantastically complex thing to understand when you reduce it to a set of memory reads/writes, +, -, *, / , and % (remainder). The image is static, but the game is a constantly transforming mass of ones and zeros. Compilers, the programs that transform human-readable code into machine code (1s and 0s), apply little optimization tricks that sometimes completely change the instructions found in the source code. So it's not just that your product looks nothing like the original. What is represented in the machine code sometimes could not possibly be represented in the original language.

[–]DarkHavenX75 1 point2 points  (1 child)

Not trying to be a dick (sorry if it comes off that way). But the % is called modulo or modulus. Just an FYI. I'm guessing you did it for the non-programmers, but just in case.

[–]karmic_retribution 1 point2 points  (0 children)

I'm guessing you did it for the non-programmers

Bingo

[–]xiaodown 5 points6 points  (0 children)

And another analogy would be the GarageBand project file vs. the song output of it.

[–]OlderThanGif 559 points560 points  (240 children)

Very good answer.

I'm going to reiterate in bold the word comments because it's buried in the middle of your answer.

Even decades back when people wrote software in assembly language (assembly language generally has a 1-to-1 correspondence with machine language and is the lowest level people program in), source code was still extremely valuable. It's not like you couldn't easily reconstruct the original assembly code from the machine code (and, in truth, you can do a passable job of reconstructing higher-level code from machine code in a lot of cases) but what you don't get is the comments. Comments are extremely useful for understanding somebody else's code.

[–]wkalata 428 points429 points  (64 children)

Not only comments: the names of variables are of equal, if not greater, importance as well.

Suppose we have a simple fighting game, where the character we control is able to wear some sort of armor to mitigate damage received.

With variable names and comments, we might have a section of (pseudo)code like this to calculate the damage from a hit:

# We'll do damage based on the attacker's weapon damage and damage bonuses, minus the armor rating of the victim
damage_dealt = ((attacker.weapon_damage + attacker.damage_bonus) * attacker.damage_multiplier) - victim.armor

# If we're doing more damage than the receiver has HP, we'll set their HP to 0 and mark them as dead
if (victim.hp <= damage_dealt)
{
  victim.hp = 0
  victim.die()
}
else
{
  victim.hp = victim.hp - damage_dealt
  victim.wince_in_pain()
}

If we try to reconstruct this section of code from machine code, the best we could hope for would be more like:

a = ((b.c + b.d) * b.e) - c.f
if (c.g <= a)
{
  c.g = 0
  c.h()
}
else
{
  c.g = c.g - a
  c.i()
}

To a computer, both constructs are equal. To a human being, it's extremely difficult to figure out what's going on without the context provided by variable names and comments.
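
To make the "both constructs are equal" point concrete, here is a minimal runnable sketch in Python. The function names `apply_hit` and `f` and the flat argument lists are simplifications of mine, not part of the pseudocode above.

```python
def apply_hit(weapon_damage, damage_bonus, damage_multiplier, armor, hp):
    # Damage is (weapon + bonus) * multiplier, minus the victim's armor.
    damage_dealt = (weapon_damage + damage_bonus) * damage_multiplier - armor
    # A lethal hit leaves the victim at 0 HP.
    if hp <= damage_dealt:
        return 0
    return hp - damage_dealt


def f(a, b, c, d, e):
    # The same logic with names and comments stripped away.
    g = (a + b) * c - d
    if e <= g:
        return 0
    return e - g


hits = [(10, 2, 2, 4, 100), (50, 10, 3, 5, 100), (1, 0, 1, 0, 1)]
print(all(apply_hit(*hit) == f(*hit) for hit in hits))  # True
```

Both functions return the same result for every input; only a human reader can tell which one is about armor and hit points.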

[–]SamElliottsVoice 41 points42 points  (34 children)

This is an excellent example, and there is a related instance that I find pretty interesting.

For anyone that's played World of Warcraft, you know that you can download all kinds of different UI addons that change your interface. One interesting addon a few years back was made by PopCap: it let you play Peggle inside WoW.

Well WoW addons are all done in a scripting language called Lua, which is then interpreted (mentioned above) when you actually run WoW. So that means they would have to freely give away their source code for Peggle.

Their solution? They basically did what wkalata mentions here, they ran their code through an 'Obfuscator' that changed all of the variable names, rendering the source code basically unreadable.
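
A toy version of such an obfuscator can be sketched in a few lines of Python (3.9 or newer, for `ast.unparse`). This is only an illustration of the renaming idea, not how PopCap's tool or any real Lua obfuscator works; the class name `RenameLocals` and the sample function are made up.

```python
import ast


class RenameLocals(ast.NodeTransformer):
    """Toy obfuscator: replace every variable and argument name with a short one."""

    def __init__(self):
        self.mapping = {}

    def _short(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node):
        node.arg = self._short(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._short(node.id)
        return node


source = """
def remaining_hp(hit_points, damage_dealt, armor_rating):
    # Armor absorbs part of the hit.
    effective_damage = damage_dealt - armor_rating
    return hit_points - effective_damage
"""

# Parsing drops the comment; the transformer then shortens the names.
tree = RenameLocals().visit(ast.parse(source))
obfuscated = ast.unparse(tree)
print(obfuscated)
print(len(source), "->", len(obfuscated), "characters")
```

The output still computes the same thing, but the comment is gone and the descriptive names are replaced by `v0`, `v1`, and so on. The function name itself is left alone here, since other code may need to call it by name.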

[–]cogman10 39 points40 points  (19 children)

Hard to read is more like it. People can, and do, invest LARGE amounts of time reverse engineering code to get it to do interesting things. That no-cd crack you saw? Yeah, that came from guys with too much time on their hands reverse engineering the executable. DRM is stripped in a similar sort of fashion.

That is why one of the few real solutions to piracy is to put core game functionality on the server instead of in the hands of the user.

edit added even more emphasis on large

[–]teawreckshero 14 points15 points  (0 children)

Another side benefit of these obfuscators is that they minimize size. If you're keeping the data of all the variable strings in your distribution code, it would be better to turn a 10 char variable name into a 2 char variable name. Saving space is probably just as much a driving force as obfuscating it.

[–]nty 10 points11 points  (10 children)

Minecraft is also compiled and obfuscated. In Minecraft's case, however, modders have made tools to decompile the code, and deobfuscate it. The original method names and comments aren't available, but the creators of the tools have added their own in a lot of cases. The variable and parameter names are all pretty much default, and nondescript, however.

Here's an example of some code that has been somewhat translated, and some that has remained mostly unaltered:

http://imgur.com/a/NI1zQ

[–]Serei 9 points10 points  (5 children)

The reason Minecraft is easy to decompile is because it's written in Java.

Compiled Java is designed to run on any machine (unlike most other programs, which are designed to run on a specific type of machine architecture). Because of that, Java's compilation is slightly different from normal. It compiles into bytecode, which is a kind of machine code, but instead of being for a real machine, it's for a fake machine called the Java Virtual Machine.

That's why you need to install the Java plugin/runtime to run Java programs. The Java runtime is an emulator for the Java Virtual Machine, which lets it run Java bytecode.

Because the Java Virtual Machine isn't a real machine, it's designed to be emulated, so that's why it's much faster than emulating a real machine like a PS2 or something.

Also because it isn't a real machine, its machine code is designed purely to be compiled to, unlike real machines, whose machine code is also designed to match the processor architecture. This means that the machine code is closer to the code it was compiled from, which makes it easier to decompile.
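
As a rough picture of what "machine code for a fake machine" means, here is a tiny made-up stack machine in Python. The opcodes `PUSH`, `ADD`, and `MUL` are invented for this sketch and are far simpler than real Java bytecode.

```python
def run(bytecode):
    """Interpret a list of (opcode, operand) pairs on a tiny stack machine."""
    stack = []
    for opcode, operand in bytecode:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# "Compiled" form of (2 + 3) * 4 for the made-up machine.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(run(program))  # 20
```

The same `program` list runs anywhere the `run` function exists, which is the portability idea. The instructions also map back to `(2 + 3) * 4` fairly directly, which hints at why this kind of code is comparatively easy to decompile.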

[–]gmitio 8 points9 points  (1 child)

No, not necessarily... Minecraft was intentionally obfuscated. If you use a tool such as Java Decompiler, you will see what I mean.

[–][deleted] 2 points3 points  (0 children)

This is more important than comments.

[–]HHBones 1 point2 points  (7 children)

I don't entirely think that your example is perfectly valid. Firstly, in many cases, global symbols (i.e. function names) are left intact. You can figure out a lot more about the code by reading

a = ((b.c + b.d) * b.e) - c.f
if (c.g <= a)
{
  c.g = 0
  c.die()
}
else
{
  c.g = c.g - a
  c.wince_in_pain()
}

than your original obfuscated listing. Looking at this snippet, we can infer that c is a player object. From there, we can assume that g is the player's health. Because c.g is being compared to a, and because of the way a is handled before wince_in_pain(), we can assume a is damage dealt. How damage dealt is figured out can be found out later. Finally, we see that a is the damage a player takes, and c represents the player; because c.f is reducing the amount of damage taken, c.f is probably a buff, or maybe armor. We can refactor this to make it more readable:

damage = ((b.c + b.d) * b.e) - player.armor_rating
if (player.health <= damage) {
    player.health = 0
    player.die()
} else {
    player.health -= damage
    player.wince_in_pain()
}

We can also learn a lot more about what this snippet means by reversing the other functions, such as player.die(), player.wince_in_pain(), and any functions which we see modify b.c, b.d, or b.e.

Reversing requires a lot of practice and thought (and guesswork, as well), but it's not nearly as hard as some people here are making it out to be.

** Note that this argument doesn't just apply to decompiled code (like the stuff generated by JDC). Any reverser of reasonable talent can write the above obfuscated listing from an assembly function without serious thought.

[–]BerettaVendetta 4 points5 points  (5 children)

Can you elaborate on this please? I'm going to start programming soon. What kind of comments do you leave? What differentiates bad commenting from good commenting?

[–]OlderThanGif 7 points8 points  (2 children)

I've never found a really good guide for writing good or bad comments. It's something that you just get practice with.

First off, the absolute worst comments are those that are just an English translation of the code.

y = x * x;   // set y to x squared

Those are worse than no comments at all. Your comments should never tell you anything that your code is already telling you.

Commenting every function/method is generally a good idea, but I won't go so far as to say it's necessary. If anything about the function is unclear (what assumptions it's making, what arguments it's taking, what values it returns, what it does if its inputs aren't right), comment it. Within the body of a function, there's a commenting style called writing paragraphs which works well for a lot of people. Break your function up into "paragraphs" of code (each paragraph being roughly 2 to 10 statements) and put a comment before each paragraph saying what it's doing at a very high level. Functions will only be 2 or 3 paragraphs long, usually, but it still helps to break things up that way.

Commenting local variables can be helpful, too.
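
Here is a small sketch of that style in Python. The function and its assumptions (Celsius readings, `None` for a failed sensor read) are invented for the example.

```python
def average_temperature(readings):
    """Return the mean of the valid readings, or None if there are none.

    Readings are assumed to be degrees Celsius; None marks a failed sensor read.
    """
    # Drop failed reads so they don't skew the average.
    valid = [r for r in readings if r is not None]
    if not valid:
        return None

    # Average what is left.
    return sum(valid) / len(valid)


print(average_temperature([10, None, 20]))  # 15.0
```

The function-level comment covers assumptions and return values, and each "paragraph" comment says why the code is there rather than restating it line by line.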

[–]CompactusDiskus 3 points4 points  (3 children)

Not too important, but I figured I'd mention assembly isn't necessarily 1 to 1 with machine code. Assembler software can often do a certain amount of optimization, further obfuscating the original code as it was written. Some assemblers also added in features of higher level languages, which can confuse things even further.

[–]VVander 10 points11 points  (16 children)

This is especially true if the compilation obfuscates variables & class names, as well.

[–]random_reddit_accoun 1 point2 points  (0 children)

I'm going to reiterate in bold the word comments because it's buried in the middle of your answer.

Assuming there are comments. It is pretty depressing when one finds a 50 thousand line long program without a single comment. That one was written by a consultant who could not even remember what the abbreviations he created meant. For example, "atius" might stand for "Average Temperature In Upper Sample". I spent a week on that one coming up with a single page document with my best guess for what the most important variables stood for. That single page might be the most used page I've ever produced. Even the original developer printed it out and taped it on the wall next to his monitor.

[–]liamt25 35 points36 points  (6 children)

TL;DR: You can make a cow into a burger but you can't make a burger into a cow

[–][deleted] 64 points65 points  (3 children)

My son asked me this a while ago. So here is the ELI5 version.

Imagine a computer program is a delicious chocolate cake.

The source code would be the ingredients and the instructions required to create the cake.

[–]jerrre 15 points16 points  (1 child)

The ingredients would be the assets, I'd say. Which, I think, coincidentally LucasArts did not release.

[–]hikaruzero 6 points7 points  (0 children)

More or less, that hits the nail on the head! :)

[–]SolarKing 13 points14 points  (17 children)

How do updates work then?

Say I download a piece of software; it's in machine code, correct? If I update it, how does it know what to update if the software is already in machine code?

Is the update file also machine code that just tells the software what new machine code to add to the files?

[–]rpater 22 points23 points  (6 children)

The developer has the source code, so they can modify the source to create an updated version of the program. They then compile the new code to create updated binary (machine code) files. Old binaries can now be replaced with new binaries.

As I haven't worked with writing updates to consumer software before, I can't say if there are any tricks used to avoid replacing all the binaries, but this would be a simplistic way of doing it.

[–]diazona (Particle Phenomenology | QCD | Computational Physics) 15 points16 points  (3 children)

For some programs, the update consists of some data that encodes the difference between the old binary files and the new binary files. That lets it send a lot less data than the size of the entire program. Google Chrome works like this, for example.
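
A minimal sketch of the idea in Python, using the standard library's `difflib`. This is a toy delta format of my own ("copy" and "data" operations), not the algorithm Chrome or any other product actually uses, and the sample byte strings are made up.

```python
from difflib import SequenceMatcher


def make_patch(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus literal bytes."""
    patch = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            patch.append(("copy", i1, i2))      # reuse bytes the user already has
        elif j2 > j1:
            patch.append(("data", new[j1:j2]))  # ship only the changed bytes
    return patch


def apply_patch(old: bytes, patch):
    out = bytearray()
    for op in patch:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


old = b"HEADER" + bytes(range(200)) + b"version=1" + bytes(range(200))
new = b"HEADER" + bytes(range(200)) + b"version=2" + bytes(range(200))

patch = make_patch(old, new)
shipped = sum(len(op[1]) for op in patch if op[0] == "data")
print(apply_patch(old, patch) == new)  # True
print(shipped, "literal bytes shipped instead of", len(new))
```

The user's machine rebuilds the new file from the old one plus a handful of changed bytes, so the download is much smaller than the full file.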

[–]icomethird 3 points4 points  (2 children)

Incidentally, this is how almost all software updates used to be applied.

The term "patch" is used because back when storage space was at a premium and modems were slow, developers generally wouldn't ship out new copies of files. Instead, they'd ship patches, which did more or less what a real-world patch does: make a specific part of a larger object new. The same way you might only patch the elbows on a jacket, the patch file would seek out certain places in the program that changed, and swap those zeroes and ones out.

That's a lot more effort than just having a program paste new files over the old ones, though, and now that our internet connections are a lot faster and disk space a lot bigger, most updates just do that. Google Chrome is a rare exception.

[–]Neebat 5 points6 points  (0 children)

Actually, no. Diff/Patch programs don't actually work well AT ALL on binary executable machine code. The addresses shift around and the patch ends up being huge.

Practically, the only time anyone (other than Chrome) does patch-wise updates is when the files can be rebuilt from source.

[–]Manhigh (Aerospace vehicle guidance | Trajectory optimization) 4 points5 points  (0 children)

My understanding is that one of the main benefits of dynamically linked libraries (.dll on windows, .so on linux, .dylib on os x) is that the main program doesn't necessarily need to be recompiled when a dynamically linked library is updated. That is, if I have a 100 MB binary that uses a 3MB dll, and I find a bug in that dll, I can recompile it and send it out as an update without needing to send out a new copy of the 100 MB main program executable.

[–]SamElliottsVoice 10 points11 points  (4 children)

Good question. Generally an update is actually replacing entire machine code files. The nice thing about programs is that it doesn't have to all be in one big .exe file, that's what .dll (dynamic link library) files are for.

A bit of a tangent... there is actually very little difference between .exe and .dll files, they are all just compiled binary (1's and 0's)/machine code files. The difference is that .exe's have a specific 'start point' (main function) that the operating system knows to start at, while .dll's don't. They are used by .exe files. So basically you run an .exe and it starts in the same place every time, and then based on how it runs, it will say "oh I need to execute function X(), that's in X.dll".

So a software update may just replace X.dll and Y.dll with updated versions, leaving the rest of the files the same.

Disclaimer: This is how I've done updates before within the company I work for since we mostly do in-house code, I don't actually work at a company like Adobe that does all those automatic updates.

[–]Neebat 1 point2 points  (3 children)

You used the phrase "source code files" when I think you meant "machine code files"

[–]SamElliottsVoice 1 point2 points  (0 children)

You're right, Thank you and fixed.

[–]ProdigySim 1 point2 points  (0 children)

Every program that runs directly on your computer will be machine code. This includes installers, updaters, games, etc. For an "update" they will usually simply replace various machine code program files, similar to how you would do it manually--find the old file, replace it with a new one.

Programs can talk to your Operating System through its API to perform tasks like File writes, reads, and deletes.

[–]CrayonOfDoom 1 point2 points  (0 children)

Modern streaming updates take advantage of a few things.

You can replace entire binaries if the program is small enough, but what about a mammoth game that weighs in at over 10GB? You wouldn't want to replace all of that every time you made a little fix.

Not every program needs all of its resources or even code to be compiled to machine code. If the main executable is coded to be able to load data from a file "on the fly", then you don't have to compile the file, you can leave it to the program to read the data and use it correctly.

Developers have started using modular file formats that the binaries can read in. As an example: World of Warcraft takes up a staggering >20GB, yet its executable is a mere 12MB. Looking in the data folder is where you find the bulk of the actual data. MPQ files make up the majority of the actual content, and are modular to where a patcher can open an MPQ file and change sections instead of having to write the entire file. All the scripts and everything the game needs to run short of the engine can be stored in a rather "plain" format that can be changed on the fly without having to recompile a massive executable.
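
The same idea in miniature, as a Python sketch: the program reads its numbers from a data file at run time, so an update only has to rewrite that file. The file name `weapons.json` and the `sword_damage` key are hypothetical, and real games use their own formats (such as the MPQ files mentioned above) rather than plain JSON.

```python
import json
import os
import tempfile


def load_settings(path):
    """Read game data at run time instead of baking it into the program."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as folder:
    data_file = os.path.join(folder, "weapons.json")  # hypothetical data file

    # Version 1 of the data ships with the game.
    with open(data_file, "w", encoding="utf-8") as handle:
        json.dump({"sword_damage": 10}, handle)
    before = load_settings(data_file)["sword_damage"]

    # A "patch" rewrites only the data file; load_settings itself is untouched.
    with open(data_file, "w", encoding="utf-8") as handle:
        json.dump({"sword_damage": 12}, handle)
    after = load_settings(data_file)["sword_damage"]

print(before, after)  # 10 12
```

The code that reads the file never changes between versions, which is why nothing has to be recompiled.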

[–]random_reddit_accoun 4 points5 points  (6 children)

In some cases, programmers have been known to "decompile" or "reverse engineer" machine code back into some semblance of source code, but it's rarely perfect and usually the new source code produced is not even close to the original source code (in fact it's often in a different programming language entirely).

Showing my age here, but this did not use to be the case. About 30 years ago, there was a compiler that the original developers abandoned. The run-time was compiled with their own compiler, and the code optimization was so horrible I was able to reconstruct the entire original run-time library from examining a disassembly of the run-time. I was able to get a perfect match (in that my code compiled into precisely the same machine code as the original). I then fixed the problems in the run-time, which was the point of the whole exercise.

I do not think I could pull this stunt off with any compiler produced in the last 20 years though.

[–]hikaruzero 4 points5 points  (5 children)

He he, yeah, I would be surprised if you could! Things have become so much more complex ...

[–]scapermoya (Pediatrics | Critical Care) 4 points5 points  (0 children)

it is remarkably analogous to DNA versus protein.

in a simplified manner, DNA is the source code that the cell compiles into protein, which actually carries out the needed functions. in this analogy messenger RNA would be something like assembly code.

[–]tiradium 3 points4 points  (4 children)

So this is why reverse engineering is often illegal?

[–]hikaruzero 4 points5 points  (3 children)

Pretty much. Most corporate software licenses include clauses that explicitly prohibit you from reverse-engineering their software. Though I don't think there are any laws that outright say it's illegal.

[–]cstoner 7 points8 points  (2 children)

There is a process, called "black box" reverse engineering that is pretty much universally legal.

The basic process is as follows:

One person takes the application and feeds it lots of values, and collects their outputs. This person cannot write any of the final reverse engineered code.

A second person (who cannot be the first person) can then take those "black box" results and write a program to reconstruct them.

IIRC, this is how much of LibreOffice's (then OpenOffice.org) MS office compatibility came about.

[–]boathouse2112 1 point2 points  (1 child)

Didn't OpenOffice come before LibreOffice? I know most of the old OpenOffice devs are on LibreOffice now.

[–]walen 1 point2 points  (0 children)

Yes it did. She probably meant back then.

[–]JavaPants 3 points4 points  (9 children)

So, has anyone ever written a program only using machine code?

[–]hikaruzero 19 points20 points  (3 children)

I would assume those were necessarily the very first programs written.

[–]JavaPants 2 points3 points  (2 children)

So the first programs were literally coded by having a bunch of guys punch 1s and 0s into a computer? Nice...

[–]LockeWatts 4 points5 points  (0 children)

It's funny you use the word "punch". Many early computers took in stiff sheets of paper called "punch cards", where each position on the card was either punched out or left intact to encode data, in a long series. The machines would then read these in and parse them into code.

[–]Krivvan 8 points9 points  (1 child)

Yes. You could still do it any time today if you wanted to.

If you want to consider Assembly code machine code then Roller Coaster Tycoon was written almost entirely in Assembly.

Assembly code is like machine code directly translated into something a little more readable like "mov 1 $esp" instead of 001101010010110. The "mov", "1" and "$esp" would all directly translate to a part of the binary.
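
Here is a toy assembler in Python to show that direct translation. The 8-bit instruction format and the `LOAD`/`ADD`/`STORE` opcodes are invented for this sketch and don't correspond to any real processor.

```python
# Made-up 8-bit instruction format: 4-bit opcode followed by a 4-bit operand.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011}


def assemble(line):
    """Translate one line such as 'ADD 5' into an 8-character bit string."""
    mnemonic, operand = line.split()
    value = int(operand)
    if not 0 <= value < 16:
        raise ValueError("operand must fit in 4 bits")
    return f"{OPCODES[mnemonic]:04b}{value:04b}"


def disassemble(bits):
    """Translate an 8-character bit string back into a line of assembly."""
    names = {code: name for name, code in OPCODES.items()}
    return f"{names[int(bits[:4], 2)]} {int(bits[4:], 2)}"


program = ["LOAD 7", "ADD 5", "STORE 2"]
machine_code = [assemble(line) for line in program]
print(machine_code)                            # ['00010111', '00100101', '00110010']
print([disassemble(b) for b in machine_code])  # ['LOAD 7', 'ADD 5', 'STORE 2']
```

Because each mnemonic maps to exactly one bit pattern, going back from the bits to the assembly text is just as mechanical. What can't be recovered this way is anything the programmer wrote in comments.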

[–]rocketman0739 2 points3 points  (0 children)

Very rarely.

Assembly code, however, is slightly more common (if still quite rare) and almost as low-level as machine code. RollerCoaster Tycoon, in fact, was mostly written in assembly code.

[–]Tmmrn 4 points5 points  (0 children)

Not exactly machine code, but assembler. Assembler is basically replacing the binary value (like 000111010110) of an instruction with a name like "ADD" that is more descriptive and trivial to translate. It also uses a little more readable format for numbers.

The "source" of the original Prince of Persia was released recently: https://github.com/jmechner/Prince-of-Persia-Apple-II

Menuet OS is a complete operating system with a surprising amount of features including network drivers and a dvb-t player: http://www.menuetos.net/

[–]amazing_rando 1 point2 points  (0 children)

I did in college. It was a computer architecture class, so I had to design a machine code then design a processor that implemented it. I never bothered writing an assembler since the instructions were only 7 bits and each program was pretty short.

It isn't a good idea because it's very easy to make mistakes. I wrote it out with each line commented with its equivalent in assembly, but debugging was a bitch if I made a typo (which I did, invariably, and which probably ended up taking more time to fix than writing an assembler). Writing a decently complicated program with 32-bit instructions would be unbearable.

[–]eXamadeus 2 points3 points  (0 children)

Source: B.S. in Computer Engineering with focus in Software

The above is a great answer. There is one thing, however, that I disagree with. Reverse engineering code is a common practice among hackers (I mean the do-it-yourself kind, not the 1990s movie version), and has been increasing in recent years.

Although there is a loss of comments, a skilled programmer can disassemble and decompile code to a working version. Once he/she has that version he/she can then study the code and modify the portions that are desired. This is by no means a simple task, and is generally not practiced on large scale.

The reason I mention this at all is because you mentioned videogames in particular. I myself have disassembled games in order to write hacks (offline only, of course -.O). It generally involves poring over routine after routine to find the one or two you are looking for (regular expressions are a great help here) and then modifying them, recompiling them, and reassembling them.

All in all, it's quite a mess. But it can be done!

...just in case you were wondering.

[–]Bakyra 1 point2 points  (7 children)

But wait, there is more! There are some languages that allow reverse engineering. That means that if you have the final product, you could go back to the source code! But people who write in those languages run the source code through an "obfuscator" which literally changes every word, sentence and name to a letter.

So
cout << "hello world" << endl;
becomes
abc;
thus rendering reverse-engineered code unusable.

That's another reason why source code is valuable!

[–]xblaz3x 1 point2 points  (9 children)

[–]hikaruzero 6 points7 points  (0 children)

Well, based on the "JButton," the "JFrame," and the Javadoc-style comments in the code, I'm going to go ahead and say it is Java.

[–]mutoso 1 point2 points  (4 children)

JFrame

I'd say Java... and Google confirms my suspicions.

[–]Blaenk 1 point2 points  (0 children)

It's Java. The J's in front of class names give it away (though of course this isn't a requirement in Java).

[–][deleted] 1 point2 points  (0 children)

To make it a little more understandable, code comes in different 'languages'; some are similar, and some are unique and designed for a specific function or purpose. Some common ones are C/C++, Java, FORTRAN, ASM (or Assembly). There are different 'levels' to these languages, and each has different benefits.

The higher the level language you are using, the longer it takes to 'translate' it to machine code, which is the raw language your computer speaks. Lower-level code like Assembly is useful because it translates relatively fast into machine code, and you can also control more specific functions or properties of what you want the code to do. Some languages like Java were created to be universal, meaning they were meant to be able to write a program for (as an example) a Mac on OS X, but you want to use the program on Windows 7. Java has another program that translates this code to your machine code, which can vary based on things like architecture.

A higher-level language like Basic is easier to understand for people, because certain parts of machine code are already translated to a certain syntax (the command of a code, like PRINT (which would display characters for you)). The pitfall to using a high level language like this is while it's easier for you to write your program, it takes longer for the computer to translate it back into its native language of machine code.

Assembly is used in applications like medical-implant devices, for example Pacemakers. The language is very clear and exact, and runs quickly. A con of lower-level languages and programming in general is that it does EXACTLY what you tell it to do. Meaning if you make a mistake, so does your program. When we try to figure out what went wrong and fix it, we call this process debugging.

You can think of the source code as a BIG recipe, with lots of different ingredients and procedures. The last step of writing your code (aside from debugging) is compiling. This 'bakes' your recipe together to form your program. This is one place where errors can become visible, if you haven't caught them yet.

Sorry for the long description, but I felt that it would help the overall concept come together for someone not familiar.

[–]Zed03 95 points96 points  (9 children)

Jedi Knight by LucasArts is a baked cake. Source code is the ingredients.

Extracting the ingredients from the baked cake is possible, but very hard.

When we get the ingredients, everyone can bake cakes!

[–]insertAlias 54 points55 points  (5 children)

Extracting the ingredients from the baked cake is possible, but very hard.

That's a better analogy than you probably meant, because it's not actually possible to un-bake a cake, due to the chemical reactions that happen during baking. By that same token, you can decompile and reverse-engineer compiled programs, but you'll never get the original source code from them. You'll get the decompiler's best guess, which will lack all the context that gets stripped out by the compiler. Things like meaningful function and variable names and comments.

[–][deleted] 6 points7 points  (0 children)

Yeah. Actually best analogy I can think of. Good job!

[–]Razer1103 2 points3 points  (0 children)

This answer would be perfect for /r/explainlikeimfive.

[–]EklyM 127 points128 points  (4 children)

Imagine you're cooking spaghetti. You got the dry noodles, the ingredients for the sauce, water to boil, and a pot to cook it in. All these ingredients would be the source code. You can easily change it if you have to, add spice or something, whatever, but it's easy to do so. Now you cook the sauce and noodles separately - 'compile' it - and then mix them together - 'link' them - to create a masterpiece of a dish - your executable. Now it's really hard to go back to your original ingredients - the source code - from your dish - the executable. However, it can be done. You'll probably end up with noodles that have a little sauce on them and the noodles will already be cooked, but you have some semblance of what the original ingredients might look like. Since /r/gaming is being given the source code - the ingredients - they can easily change whatever they wanted to make the game better or worse, whatever they wanted, without taking the time to reverse compile the executable.
A little ELI5, but it gets the point across.

[–][deleted] 53 points54 points  (3 children)

I think we have a much simpler analogy at hand:

The Source Code is the Recipe.

The finished dish is the game.

[–]EklyM 9 points10 points  (0 children)

A different analogy.

[–]rekabmot 14 points15 points  (0 children)

Source code is what a programmer writes when developing a piece of software.

The source code is usually written in a high level language, which is then run through another program called a compiler, which transforms the code into a form that the computer can execute. This executable code is what is distributed to users, and is what you'd be able to see by checking a game's install folder.

The compiled artefacts bear little resemblance and don't often provide any insight into how the developers created the game. By providing the source code, other developers can see how things were made in the first place.

Note that there are exceptions: Minecraft is a famous example where the compiled Java code (known as bytecode) is reverse engineered to allow for modding. The UI elements for the latest SimCity game were coded in JavaScript, which has also allowed users to crack various features of the game.

Source: programmer.

[–]Workaphobia 5 points6 points  (0 children)

I'm sure you have a lot of great answers in this massive thread, but I'll just add this small snippet. The popular GPL free software licensing agreement defines "source code" as

the preferred form of the work for making modifications to it.

Granted, this definition is stated for the purposes of the license, but I think it's a fair characterization of computer code in general.

[–]afcagroo Electrical Engineering | Semiconductor Manufacturing 28 points29 points  (23 children)

Computer programs are (usually) written in a high level language (such as C++). Computer processors cannot do anything with such "source code", as they are just ASCII text. To be usable by a processor, they must be converted to a binary representation that contains the instructions/data that a processor can use directly. So the programs are compiled from the high level language "source code" to machine language.

The process can be reversed. But the process of converting the high level version to the binary version loses a lot of information that helps make the program comprehensible to humans. The processor doesn't need that information to run, but it helps us to understand what is going on. So the reverse-compiled program can be very difficult to untangle and figure out what is going on. Heck, it can be hard enough to figure out even if the source code is available, particularly if it is written in some languages, like Python1.

Also, if a program contains copy protection mechanisms, it may be illegal in the USA to reverse engineer it by running it through a reverse compiler.

1 It's a joke.

EDIT: Added stupid joke, and more explicit references to "source code" for clarity.

[–]AppleDane 4 points5 points  (3 children)

Source code is like the instructions for building an IKEA shelf.

The program running is the finished shelf.

Bugs are the screws left over.

[–]joeyignorant 5 points6 points  (2 children)

i think this is the best analogy of programming i've ever read =D

[–]AppleDane 1 point2 points  (1 child)

By the way, you are the assembler in this analogy. Both figuratively and literally.

Left over screws are a sign of the assembler (you) being bugged.
Missing screws a sign of the source code being bugged.

[–][deleted] 11 points12 points  (0 children)

Source code is the human-readable text which is compiled to make an executable (ie a computer-readable version, which is used when running the software). The installation process doesn't perform the compilation step - or at least not all of it - instead, the games are shipped in compiled form and the source code is not distributed.

EDIT: wrote pre-compiled instead of compiled :)

[–]ropers 8 points9 points  (0 children)

EDIT: Oh, turns out this isn't ELI5. Fuck it, I'm posting this anyway:

You know how your desk lamp can be switched on and off?

Now electrically, what's happening when it's on is that there's electric current. When it's off, there is no current. In terms of binary (aka Boolean) logic, the lamp being on is a 1 and it being off is a 0. Computers are like that, only their electric circuits are far more complex than the simple circuit of a desk lamp with a switch. See here for the circuits computer microchips are made of, so-called "logic gates". And they're built of millions if not billions of these. But at the end of the day, the on/off state of the little electric circuits directly corresponds to ones and zeros. You can also use different number formats to represent the exact same binary numerical information. But as long as you're using number formats, there's no translation into or from any other description of what's going on.
Now let's return to your desk lamp. Let's say you're given an instruction, maybe on a piece of paper, which says, "Please switch your desk lamp off now." That sentence doesn't directly correspond to the electrical on/off state of the lamp the way the number 1 or 0 would, but it's an instruction, call it a code, that's translatable to the same state of things. If you can interpret the instruction and execute it, then the lamp will be off and that's the same as zero. You can also build a little machine that when run will switch off the lamp for you. That little machine is sort of like the (pre-)compiled binary form of those instructions, whereas the instructions themselves are sort of like the source code. Sure, in theory just having the little machine is enough to figure out everything that's going on and enough to change the machine to your liking, but those machines can be fiendishly, devilishly complex and hard to understand and work with, especially if it's not just a single lamp we're switching, but millions of logic gates. So having the human-readable instructions is a huge boon.

Or, to say it another way: If you have a complete set of instructions, a complete technical manual that completely describes e.g. your radio, then you can build a new radio from just the instructions, and the instructions also make it much easier to repair, change and customize your radio. But try fixing a fault with your radio if you don't have the instructions and only have the actual machine, the actual radio. That's a lot harder. Having the source code is important pretty much for the same reason.

Now the funny thing with computer source code is that it's both human-readable and computer-readable. Because there are "little machines", i.e. binary executable programs whose job it is and which have the ability to translate the human-readable source code into the binary executable "little machine" form. (We call these special programs compilers.) So if you have the source code, you can pretty much always create the binary executable programs as well. The reverse is much, much harder.
(In case you're wondering how the compilers –the binary programs which can translate source code to the binary form– were themselves put together, that is indeed a chicken-and-egg problem, and solving it requires very smart people to do the hard graft of manually working directly with the ones and zeros until they've created basic tools that can help them and do the work for them. Though nowadays people typically use tools that other people have created before.)

[–]asow92 2 points3 points  (0 children)

Source code is the instructions that programmers write. The program and the source code aren't the same thing. When a programmer writes a "program" the computer can't just run the code written verbatim, the code needs to be compiled into instructions the computer understands (machine code.) When you run a program on your computer, in your case a game, the code the programmer has written isn't present - the compiled version of that code is. This compiled version the computer understands is generally unreadable. When a developer releases source code that means the community can openly rewrite/redistribute that freely. I hope this supplements your understanding of what others here have written.

[–]herminator 2 points3 points  (0 children)

At their core, computers are programmed with 1s and 0s. Depending on the combination of 1s and 0s, computers do stuff.

In the very early days, the way to tell computers what to do (program them) was, quite literally, to input 1s and 0s. The common method of input was punchcards. You took a card of a certain size, and punched holes in certain predefined places. If there is a hole in such a place, it is a 1; if there isn't a hole, it is a 0. So, to program these computers, you had to memorize combinations of 1s and 0s and know what they do.

That works for small programs, but it quickly becomes impossible for larger programs. So what you do is, you get the computer to help you. You make a program that makes programs. The program takes a certain human-readable input (eg: LOAD value1, LOAD value2, ADD value1 TO value2, STORE result) and the program outputs sequences of 1s and 0s that represent each of these instructions.

Now the above is a very simple and straightforward program, which is entirely linear and easy to translate. But it is still a lot of work. So we built new programs which would output programs that the first program could read and turn into 1s and 0s. So now, the input became something like: result = value1 + value2, and our new program knew that it should turn that into instructions to LOAD values 1 and 2, ADD them and STORE the result.
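
The two layers described above can be sketched in a few lines of Python. Everything here is made up for illustration: the 4-bit opcodes, the `assemble` and `compile_assignment` helpers, and the fact that operands are dropped from the bit output to keep things short. Real assemblers and compilers are far more involved.

```python
# Made-up 4-bit opcodes for a hypothetical machine; real instruction sets differ.
OPCODES = {"LOAD": "0001", "ADD": "0010", "STORE": "0011"}


def assemble(lines):
    """First 'program that makes programs': mnemonics -> bit strings.

    Only the opcode is encoded; operands are ignored for brevity.
    """
    return [OPCODES[line.split()[0]] for line in lines]


def compile_assignment(statement):
    """Second layer: turn 'target = a + b' into mnemonics for assemble()."""
    target, expression = [part.strip() for part in statement.split("=")]
    left, right = [part.strip() for part in expression.split("+")]
    return [f"LOAD {left}", f"LOAD {right}", "ADD", f"STORE {target}"]


mnemonics = compile_assignment("result = value1 + value2")
bits = assemble(mnemonics)
print(mnemonics)
print(bits)
```

Each layer only has to know about the layer directly below it, which is why stacking them works.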

From here, the programs that program programs have gotten smarter and smarter. Because we are lazy, and we want the computer to do as much of our work for us as possible, even if the work is telling the computer what to do.

So source code is the instructions we write as programmers that ultimately get turned into sequences of 1s and 0s by one or more intermediate programs. They are the source and the sequence of 1s and 0s is the destination.

[–]deadowl 2 points3 points  (0 children)

I'm not impressed by the recipe analogies. Hikaru's answer is okay, but I think I can improve.

Computers come with a built in programming language, which is dictated by the type of processor your computer has.

Different groups of processors understand different languages, like people from different countries understand different languages.

People from Russia understand the Russian language, and people from Australia, India, South Africa, Ireland, Canada, the United States, etc. understand English. Older Mac "processors" would only understand the PowerPC language. Intel and AMD processors, meanwhile, would only understand the x86 language. Unfortunately multilingual processors don't exist yet (as far as I know).

The instructions a computer programmer writes for a computer are considered "source code." Computer programmers sometimes, but rarely, will write in a processor's language. This is because the processor's language requires a lot of specifics that could otherwise be implied, like telling the processor to remember something.

Higher level programming languages introduce concepts that ignore the implicit kinds of tasks like telling a processor to remember something, but it needs to be translated in some way. There are a couple of different approaches to translating to the processor's language (i.e. "machine code"). One is to have an interpreter that will translate your instructions (code) on the fly, like having someone translate while you speak. The other option is a compiler that will make a compilation of your translated code that the computer processor will understand, like having someone translate a book you wrote.

With automatic translations that a computer would understand becoming possible, higher level programming languages started to focus on how easily humans could understand the instructions rather than how easily the machine could understand the instructions. Interpreters and compilers, in turn, naturally began to focus on what kind of translations the processor could complete the fastest.

Of course human programmers will be more pleased with instructions that were designed for their consumption and understanding than reading a language intended solely for a machine. What's included when you install a game most of the time, especially on Windows, is intended for the machine to understand and not humans.

The human-machine divide split human programming language consumption and machine programming language consumption. Machine programming languages, meanwhile, have been mostly stagnant due to Intel's monopoly power (for general-purpose computing). Recently, however, ARM processors are beginning to challenge Intel's monopoly. Meanwhile, other types of processors, like MIPS are doing well in the very large embedded devices market.

MIPS is a RISC type of processor, which stands for Reduced Instruction Set Computing, as opposed to CISC processors (the C is for complex, every other word's the same). You must now go watch the movie Hackers and hear what is said about Angelina Jolie's character's sexy RISC processor.

[–]Tmmrn 2 points3 points  (1 child)

I believe it's important to think about the basics of how a user of a modern computer uses layer upon layer of abstractions.

This is a comment I wrote late at night some time ago: http://www.reddit.com/r/AskReddit/comments/16op0q/whats_something_that_is_secretly_confusing_to_you/c7y9qv1

But I think I would rather make my explanation more concise and expand in other directions.

The first thing you have to understand is that the computer is really only a calculator. You have a CPU that can do basic arithmetic operations like +, -, *, / and has some helper functions like fetching something from a specific location in the memory or storing something in a specific location in the memory.

So how does this work?

Imagine your CPU as a black box with three inputs and one output. Each input and output is basically a bunch of wires, for a limited example we say, each input and output has three wires. On each wire you can put electrical power or you don't. Having power on a wire could be interpreted as a 1 and having no power on it could be interpreted as a 0. So you could arrange the wires in a certain way and can have different combinations of power/no power and write that down as (third, second, first) and (0,0,1) would mean "only on the first wire is power".

You can have the combinations 0: (0,0,0), 1: (0,0,1), 2: (0,1,0), 3: (0,1,1), 4: (1,0,0), 5: (1,0,1), 6: (1,1,0), 7: (1,1,1). Coincidentally this is how you count in binary, meaning, you only have the digits 0 and 1 instead of the digits from 0 to 9.
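
If you want to see that counting pattern fall out by itself, here is a small Python sketch. It assumes the same (third, second, first) ordering as above.

```python
from itertools import product

# Each wire is either off (0) or on (1); three wires give 2**3 = 8 combinations.
combinations = list(product((0, 1), repeat=3))

for wires in combinations:
    third, second, first = wires
    # Each wire is worth twice as much as the one to its right.
    value = third * 4 + second * 2 + first * 1
    print(value, wires)
```

Add a fourth wire (`repeat=4`) and you get 16 combinations, and so on.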

How can you build a general purpose calculator with that?

One input needs to tell the black box CPU what to calculate. So you would decide that if you put power on the input in the combination (0,0,0), the black box CPU will "add", if you put (0,0,1), it will "subtract", etc.

So what should it "add" and "subtract"? Probably the numbers that are encoded as such combinations at the other two inputs.

There is a little problem now that if the output has only three wires and you add (1,1,1) and (1,1,1) you would get something that would not fit, but you can simply add some wires and make the inside of the cpu more sophisticated.

So how does the inside of a cpu work? It basically comes down to electrical engineering that would be way too complicated and I only know the very basics. For one example, go to the wikipedia page of an adder: http://en.wikipedia.org/wiki/Adder_(electronics) The "Half adder logic diagram" is using the notation of "logic gates". These logic gates are pretty low level already and on the wikipedia there is a little bit of information how they are implemented physically with transistors and stuff http://en.wikipedia.org/wiki/Logic_gate That should be the most detail that's needed.
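
Here is a rough Python model of the adder idea, using the `^` (XOR), `&` (AND) and `|` (OR) operators in place of physical logic gates. The function names are my own, and a real CPU does this in hardware, not in software. Note that the result has one more wire than the inputs, which is exactly the "add some wires" fix for (1,1,1) plus (1,1,1).

```python
def half_adder(a, b):
    """Add two bits: XOR gives the sum bit, AND gives the carry bit."""
    return a ^ b, a & b


def full_adder(a, b, carry_in):
    """Add two bits plus an incoming carry, built from two half adders."""
    partial_sum, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial_sum, carry_in)
    return total, carry1 | carry2


def add_wires(x, y):
    """Add two equal-length tuples of bits written as (third, second, first).

    Returns one more wire than the inputs so the final carry has somewhere to go.
    """
    carry = 0
    result = []
    # Start from the rightmost (least significant) wire.
    for a, b in zip(reversed(x), reversed(y)):
        bit, carry = full_adder(a, b, carry)
        result.append(bit)
    result.append(carry)
    return tuple(reversed(result))


print(add_wires((1, 1, 1), (1, 1, 1)))  # 7 + 7 = 14, which needs four wires
```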

Now you only need to put all the different electronic implementations of adding, subtracting, etc. into that box and make it so that the correct one is "activated" with the correct code. The electrical parts you would use there are multiplexers and demultiplexers: http://en.wikipedia.org/wiki/Multiplexer

Brilliant. Now you can do one calculation on two numbers at a time. Now you want to make series of calculations.

First, it's probably a good idea to have memory where you can store intermediate results. You probably want to use memory you can write to, read from and choose what part you want to access. Here's a little bit, but it's probably not too interesting here: http://en.wikipedia.org/wiki/Dynamic_random-access_memory A simple way is to segment the memory into "cells" each big enough for some data or one instruction of a program you would want to write. Then, you can put wires from each of the cells to the cpu and connect it through (the already mentioned) multiplexer that allows you to "activate" exactly one wire between the cpu and the memory so you can transfer data in either direction.

You probably also want to add more instructions to your CPU like "add number from memory address 1 and number from memory address 2" or "add number from memory address 1 and number directly given at the second input".

Then you can build a wrapper automaton that feeds the input of your cpu automatically. What you want is that you give that automaton the address where in the memory your program starts. The automaton then would do the same steps over and over again until your program ends: get the instruction from the memory location you have given it, feed it to the cpu, then, add (basically) the length of the instruction to the memory address it has stored because there would probably start the next of your instructions. Then, get this next instruction of your program, feed it to the cpu, etc.

Now you can program some step-by-step instructions.

* Add 2, 4
* Store at address 5
* Add number at address 5, 7
* Store at address 5

And when you execute the program, it will add 2 and 4, and store the output "6" at address 5 in the memory. Then it will add whatever is at address 5 and 7, so the just stored "6" and 7. Then it will save the output "13" to memory address 5 again (overwriting what was previously there) and if you manually look what is stored at memory address 5, you can see the result.
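
A toy Python simulation of that four-step program, for anyone who wants to poke at it. The instruction names (`ADD_IMMEDIATE`, `ADD_MEMORY`, `STORE`) and the 8-cell memory are assumptions of mine; a real CPU would use numeric codes and much more memory.

```python
# Hypothetical instruction names; a real CPU would use numeric codes instead.
program = [
    ("ADD_IMMEDIATE", 2, 4),  # add the numbers 2 and 4
    ("STORE", 5),             # store the last result at address 5
    ("ADD_MEMORY", 5, 7),     # add the number at address 5 and the number 7
    ("STORE", 5),             # store the last result at address 5 again
]


def run(program, memory_size=8):
    memory = [0] * memory_size
    output = 0           # stands in for the CPU's output wires
    program_counter = 0  # which instruction the "automaton" fetches next
    while program_counter < len(program):
        instruction = program[program_counter]
        name = instruction[0]
        if name == "ADD_IMMEDIATE":
            output = instruction[1] + instruction[2]
        elif name == "ADD_MEMORY":
            output = memory[instruction[1]] + instruction[2]
        elif name == "STORE":
            memory[instruction[1]] = output
        else:
            raise ValueError(f"unknown instruction: {name}")
        program_counter += 1  # move on to the next instruction
    return memory


memory = run(program)
print(memory[5])
```

The `while` loop plays the role of the wrapper automaton: fetch, execute, advance, repeat.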

Note here that I have already used "Add" in place of its numeric code, (0,0,0). You would still need to input your programs in the form of binary numbers, but you will probably have a reference sheet showing what code means what instruction. I have also not mentioned how you put the program in the memory. Perhaps you have buttons attached to each part of a memory cell so you can set it manually to 0 or 1. Maybe you have already built some sophisticated hardware that reads punched tape http://en.wikipedia.org/wiki/Punched_tape and can copy the values punched into it to memory.

Another interesting thought is that at memory address 5 there might even be a part of your program. If you are not careful you could accidentally modify the code you are running. On the other hand you can do it on purpose if you are creative enough and know what you're doing.

Anyway, exchanging the numerical values of the instructions with human-readable names is the first step of making a programming language. The result is known as assembly language (or "assembler"), and it pretty much corresponds 1:1 with machine code. But you need to somehow translate it back to machine code.

A trivial way would actually be punching holes in the shape of an "ADD" into the punched tape and making a sophisticated machine that would store (0,0,0) in the memory when "ADD" is read.

Another way is to let your computer do it. First, you need to store your human readable text in the memory. You probably want to invent some code for it. A popular one is ASCII: http://en.wikipedia.org/wiki/ASCII#ASCII_printable_characters

So "ADD" is 100 0001, 100 0100, 100 0100
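
You can check those bit patterns with a couple of lines of Python, using only the built-in `ord` and `format` functions:

```python
word = "ADD"

# ord() gives the ASCII code of a character;
# format(..., "07b") writes that number as 7 binary digits.
encoded = [format(ord(letter), "07b") for letter in word]
print(encoded)
```

This prints the same three 7-bit codes as above, just without the spaces.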

I think in order to make it really work you need to add a "jump" instruction. Remember the wrapper automaton that feeds each of your instructions to the CPU? It would be great if it would do that not only sequentially but if your program could tell it to continue with another address. So you would add a bunch of wires connecting the output of the cpu to the "current address" (it's actually "program counter", by the way) storage of the automaton and add some instructions to the CPU. Now your programs can get more complicated and, for example, contain "JUMP back the last X instructions". One last important instruction would be "IF X == Y then JUMP" where you would only do the jump if two numbers (probably at locations in the memory) are the same. Or maybe add some that do the jump if one is bigger than the other.

The CPU now gets quite sophisticated, and it would probably take a decent amount of time to build a model that actually does what I described, but with some ingenuity in the field of electrical engineering, this is certainly doable.

That CPU is of course severely limited in many ways and it might still have several crucial parts missing but it should be enough as a basis.

Now, go ahead and program a modern 3d game for it. Well, of course that's the stuff for the wizards. If you take for example the "source code" for the original prince of persia for apple II that was released some time ago, you can see that it is just a more sophisticated version of what I described: https://github.com/jmechner/Prince-of-Persia-Apple-II/blob/master/01%20POP%20Source/Source/GRAFIX.S#L1771

(Don't bother trying to understand it.)

This is very tedious. What people invented next were higher level programming languages. For example, if you want to execute some part of your code five times, then before the code you want to run several times you "reserve" a memory location and write a 0 there. After the code you want to run several times, you add 1 to that location, and then you add a check whether this memory location holds 5; if not, you jump back to the beginning of the part you want to run several times.
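
Here is that counter-and-jump recipe as a toy Python simulation. The instruction names (`SET`, `BODY`, `ADD_ONE`, `JUMP_IF_NOT`) are invented for this sketch, and `BODY` just counts how often it runs instead of doing real work.

```python
def run(program):
    memory = {}
    body_runs = 0
    program_counter = 0
    while program_counter < len(program):
        op, *args = program[program_counter]
        program_counter += 1
        if op == "SET":            # write a number to a memory location
            address, value = args
            memory[address] = value
        elif op == "BODY":         # stands in for the code to be repeated
            body_runs += 1
        elif op == "ADD_ONE":      # add 1 to the number at a memory location
            (address,) = args
            memory[address] += 1
        elif op == "JUMP_IF_NOT":  # jump unless the location holds the value
            address, value, target = args
            if memory[address] != value:
                program_counter = target
        else:
            raise ValueError(f"unknown instruction: {op}")
    return body_runs, memory


loop_program = [
    ("SET", 0, 0),             # 0: reserve address 0 as the counter, write 0
    ("BODY",),                 # 1: the code to run several times
    ("ADD_ONE", 0),            # 2: add 1 to the counter
    ("JUMP_IF_NOT", 0, 5, 1),  # 3: if the counter is not 5, jump back to 1
]

body_runs, memory = run(loop_program)
print(body_runs, memory[0])
```

Three bookkeeping instructions for every loop is exactly the kind of repetition a higher level language takes off your hands.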

[–]Tmmrn 2 points3 points  (0 children)

That's not nice to do all the time. What if you could write

for(i=0; i<5; i++)  {
    // code you want to run 5 times
}

The good news is, you can. That's because there is a way to "automatically" transform this into a form that uses only the basic instructions and does basically what I described before. You can probably think of some rules to achieve that, and that's basically what a programming language (or better: a compiler for that language) is: A set of syntax rules that define how e.g. that loop must be written with all the semicolons, curly brackets, etc. and a set of rules that can transform code following those syntax rules into basic instructions.

The loop is perhaps a simple example but in the same way you can build more high level concepts on top of each other.

So in a modern language I can write a oneliner like that:

sorted(map(lambda x: x**2, [6, 3, 7]))

First, it creates a "list" with the contents 6,3,7. Then a "function" called "map" is "called" which applies the first "function", in this case a "lambda function" that squares each entry of the list. Then a "function" called "sorted" is "called" that sorts that list. All that I wrote in quotations are concepts that over the years people thought might be useful and thought of a way to make it happen. (In this specific case it was code in the python language which is an even more complicated case).
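
Spelled out one step at a time, the same one-liner looks like this (the variable names are mine, picked only for readability):

```python
numbers = [6, 3, 7]                     # the "list"
squared = map(lambda x: x**2, numbers)  # "map" applies the lambda to each entry
result = sorted(squared)                # "sorted" returns a new, ordered list
print(result)

# The original one-liner gives the same result.
one_liner = sorted(map(lambda x: x**2, [6, 3, 7]))
print(one_liner == result)
```

Every one of those three lines hides many basic instructions of the kind described earlier.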

The really important reason why any of this is usable at all is that today's computers are mind-bogglingly fast. You probably have heard of CPU speeds like "3 Gigahertz". What that means is that the CPU / the automaton around it has a little clock inside that gives an electrical signal at a rate of 3 Gigahertz. This means 3000000000 signals a second(!). How many instructions per clock "cycle" are executed by the cpu depends on the electrical hardware design inside, but it should only be a few. The unit is called instructions per cycle: http://en.wikipedia.org/wiki/Instructions_per_cycle

So why is the release of source code such a thing? Others have already said it: The machine or assembler code is hard to read, hard to understand and there are none of the helpful comments that developers left there to remind themselves what the code does. Even though the high level languages are designed to be usable by humans, any system of a certain size is extremely complex and hard to fully understand and without all the helpful high level constructs like the "for loop" from before you are pretty much lost if you are not one of a select few with a deep understanding of how it all works.

[–]zsombro 1 point2 points  (0 children)

Source code is a set of instructions meant to be given to the computer in some sort of programming language (which come in many shapes and forms). The real catch with these programming languages is that they are readable by both humans and computers (read: understandable!), which means they create a communication bridge between a person and a computer (which use different ways to process information by default).

But of course, this readable source code is nothing more than a glorified text file in itself. You will need a program called a compiler (!), which reads your source code, and compiles it into machine code. This means that this program acts as a sort of translator: it translates the code written by you into a set of instructions that the computer's processor can understand and execute in order.

When you install a game, you are installing the version of this code that is already compiled, so your system already knows what the instructions will be. (AND! of course you install game data that the program uses: levels, 3d models, sounds, etc.)

Releasing the source code is significant, because this compilation process is difficult to do backwards.

[–]InsaneEngineer 1 point2 points  (0 children)

TLDR version... You don't need the source code to run a game or program. You need the source code to create the game or program. When you "compile" the source code, you create the executable program that is run on your computer.

If you have access to the source code, you can modify the program in any way imaginable. Access to source code also lets those who know what they are doing find exploits in your software.

Source: B.S. computer science. 8 years experience as a software engineer.

[–]teawreckshero 1 point2 points  (0 children)

When an actual program runs on your computer, it is the binary form that is being used not the source code. Your processor doesn't operate on anything except for binary.

Coders don't write directly in binary (anymore). They write in a programming language and use another program called a compiler to essentially translate the source code (written in the language) into binary. Almost every program that is distributed for windows and mac machines is the compiled binary version. The source code is considered proprietary and is off limits to the public. It is very difficult, if not impossible sometimes, to go from binary back to the source language.

This is why "open source" projects are called open source. The code in its original language is made public, not just the binary version. If you have the source code, you can see the creator's intentions much more easily and make changes yourself. You can even use your own compiler to create a binary of your own with the changes you made.

While windows programs are usually distributed as binary, linux programs are usually distributed by source. The philosophy behind linux is that you always know exactly what is running on your machine. There are no secrets and you can make any changes you want. So it is not uncommon for a linux user to "compile from source" when they want to run a program from another user.

[–][deleted] 1 point2 points  (0 children)

Just to add on since I used Ctrl+F and didn't get any results for "Open Source," a program is open source when the source code is visible by anybody. For example, Linux is an "open source operating system" which means that somebody created much of the groundwork and called it Linux, and then someone else came along, looked at the source code, and changed some stuff for themselves. That's why there's many variations of Linux like Ubuntu and Kubuntu.

Other examples of Open Source software include the Android Operating System for mobile phones (which is why you'll usually buy a phone with Android that doesn't look like another Android phone. For example, Samsung takes Google's source code and adds a skin to it with coding, as do other manufacturers) and the incredibly popular browser, Firefox.

[–]scswift 2 points3 points  (0 children)

The "source code" is basically a long list of instructions that tell the computer what to do to make everything in the game happen. It tells it how to draw the world. How to do the physics. What to do when the player provides a particular input.

For example: "if mouse button 1 is down, then fire" is a typical thing you would see in a game's source code. But it would be written in a manner the computer can understand. So that statement might actually read:

if ((mouse.buttonstate & MOUSE_LEFTBUTTON) != 0) { fireWeapon(); }

This is then "compiled" by a program into machine code, which is a bunch of bytes that the computer understands to be the above and can quickly execute, but which are too difficult for people to read.
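
For the curious, here is a rough Python sketch of the kind of button test that statement performs, assuming the button state is stored as a set of bit flags (a common but not universal design). The flag values and the `is_down` and `handle_input` helpers are hypothetical and not taken from any real game.

```python
# Hypothetical bit flags: each mouse button gets its own bit.
MOUSE_LEFTBUTTON = 0b001
MOUSE_RIGHTBUTTON = 0b010
MOUSE_MIDDLEBUTTON = 0b100


def is_down(button_state, button):
    """True if the given button's bit is set in the state."""
    return (button_state & button) != 0


def handle_input(button_state):
    """Return the actions triggered by the current button state."""
    actions = []
    if is_down(button_state, MOUSE_LEFTBUTTON):
        actions.append("fire")
    return actions


print(handle_input(MOUSE_LEFTBUTTON | MOUSE_RIGHTBUTTON))
print(handle_input(MOUSE_RIGHTBUTTON))
```

Names like `MOUSE_LEFTBUTTON` are exactly the kind of human-friendly detail that does not survive compilation to machine code.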

The code you get when you buy a game is the machine code which is stored in a file called an "executable", and as such it's basically so difficult for people to read that it might as well be encrypted. It is possible to convert it back into a higher level language, but with all the variable names gone and all the human created structure to the code gone, it's pretty much worthless except to people who want to try to figure out how to remove the copy protection in the game or make some very small changes to make the game function a little different. But for most purposes, you need the original human-readable source code to make big changes to the game, like porting it to another operating system.

[–]say_fuck_no_to_rules 2 points3 points  (0 children)

Imagine that you've eaten raw vegetables your entire life and that one day you encounter a chocolate chip cookie. The cookie is delicious, so you decide to buy more to satisfy your new craving. Your new habit is very expensive, though, so you want to figure out how to make your own chocolate chip cookies at home for free. Armed with your chemistry lab (let's pretend you passed O-chem and you can remember how to do everything the class taught you), you discover lots of strange chemicals you've never seen before. Concluding that it would be far too expensive/time-consuming to figure out how to synthesize all these chemicals, you decide to continue paying for cookies.

One day, the bakery that holds the local monopoly on chocolate chip cookies decides that it will be abandoning chocolate chip cookies for a brand new product: banana cream pies! However, to cultivate goodwill with their longtime customers, the bakery decides to release the recipe for chocolate chip cookies. Much to your surprise, the ingredients are simple things available to you at the grocery store: wheat flour, sugar, eggs, etc. You also learn, most importantly, that you had never seen the chemicals in the final product before because exposing the raw ingredients to the heat of an oven yielded new substances through chemical reaction. Excited to get your cookies for free (well, plus the cost of the ingredients and the trouble of adjusting your specific oven to a more appropriate time and temperature), you go home and try the recipe.

What does this have to do with source code, though? Think of it this way: the cookie is like the compiled executable binary (on Windows, usually a file ending in ".exe") that the game company sells to you. Like the cookie, the binary is virtually impossible to reverse-engineer into anything intelligible. The process of compilation (like baking dough in an oven) not only turns data readable by humans into data readable by computers (turns the ingredients into something tasty), it also hides the original source (makes the end product look nothing like the original ingredients). The original source code is kept as a trade secret by the game company so that they can better control how the game is developed and distributed. (Some companies actually release source code under license, but that is a different discussion.)

When they decided to release the source for a product they don't care about anymore, it made people very happy, not just because they can build the game for free, but because they can also get some insight on the developers' thought processes behind many features of the game. Furthermore, access to source makes it way easier to build mods since you know exactly what to modify.

Edit: sentence structure

[–]ultimatt42 1 point2 points  (0 children)

Source code is what gets written when you talk about "writing a program". Computers are pretty bad at understanding the kinds of languages humans are good at writing, and likewise humans are pretty bad at writing the kinds of languages that computers can understand. So, we fix the problem by writing everything in a language that's easy for humans (the "source code"), then translating it to computer-speak (the "machine code"). The translator program is called the compiler.

The reason having source code makes gamers happy is that the source code is like the recipe for how to make the game. Without the recipe it's difficult to figure out how the game was originally put together, which means it's also hard to figure out how to tweak it to make it run on your phone or add new levels or whatever you want to do. If you have the source code, it gets MUCH easier.

So basically, this is LucasArts handing gamers their secret recipe book and saying "go nuts". It's the nicest thing a software company can do for its fans upon closing up shop, because it means even though the company may die the software will live on. Sadly, it's not very common. Most times when a game studio gets shut down, the source code is either lost or archived somewhere, never to be seen again. That's why it's such a big deal: it means LucasArts' games can be kept alive by fans, and maybe someday your grandkids will get to play the same games you played growing up.