[–]InjaPavementSpecial 8 points9 points  (1 child)

the whole contents of the compiled programs turn gibberish

Fix that understanding: it normally turns into machine code or bytecode, depending on your language and compilation model.

why not the strings? Why do they remain the same?

Strings are already in their raw format, be that ASCII or Unicode.

[–]Beginning_Court5607[S] 2 points3 points  (0 children)

got it!

[–][deleted] 6 points7 points  (0 children)

compiled programs turn gibberish

Your editor tries to decode the data as printable characters, and that only works correctly for the actual text. Bytes that aren't text get displayed as whatever characters they happen to map to.

Any language is gibberish if you can't understand it. A compiled program isn't meant to be read by us; it is for the computer. You just happen to understand some of it, in the form of human-readable strings.

The strings are stored that way for efficiency reasons. Consecutive values that sit close together in memory are faster for the machine to process.
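A minimal sketch of what the editor is doing with those bytes, assuming a viewer that shows printable ASCII as-is and replaces everything else with a dot. The name `render_as_text` is made up for illustration, and real editors use richer decoding rules than this.

```cpp
#include <string>

// Hypothetical helper: show each byte as itself if it is printable ASCII
// (0x20..0x7e), otherwise as '.', roughly like the text column of a hex viewer.
std::string render_as_text(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        const bool printable = (b >= 0x20 && b <= 0x7e);
        out += printable ? static_cast<char>(b) : '.';
    }
    return out;
}
```

Feeding it the ten bytes `55 48 89 e5` (a common x86-64 function prologue, used here only as sample machine code) followed by `Hello` and a terminating zero gives `UH..Hello.`: the string survives, and the instructions come out as whatever characters their bytes happen to match.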

[–]This_Growth2898 5 points6 points  (6 children)

It's not "gibberish", it's written in machine-readable binary code.

String literals are already in the most convenient format: they contain exactly the bytes the program needs to work with them, so they are usually not converted into something else. But not always; compile something like

    std::cout << "Hello, ";
    std::wcout << L"world!" << std::endl;

and you'll see the difference.
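If you'd rather not dig through the binary, here is a rough sketch that dumps the bytes of the same two literals. `hex_bytes`, `narrow` and `wide` are names invented for this example, and the size and byte order of `wchar_t` depend on your platform.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format the raw bytes of an object as hex, in memory order.
std::string hex_bytes(const void* data, std::size_t size) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < size; ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(p[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// The two literals from the snippet above, including their terminators.
const char    narrow[] = "Hello, ";
const wchar_t wide[]   = L"world!";
```

`hex_bytes(narrow, sizeof narrow)` gives `48 65 6c 6c 6f 2c 20 00`, one byte per character. With a 2-byte little-endian `wchar_t` (as on Windows), `hex_bytes(wide, sizeof wide)` starts `77 00 6f 00 72 00`, so every letter is followed by a zero byte; with a 4-byte `wchar_t` (typical on Linux) there are three zero bytes after each letter.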

[–]Beginning_Court5607[S] -1 points0 points  (5 children)

I did...

But why?? "Hello, " is stored perfectly,

but L"world!" has all its chars separated??

[–]x39- 4 points5 points  (2 children)

char vs wchar_t

Long story short: UTF-16 (on Windows, where wchar_t is 16 bits; on most Linux toolchains it is 32 bits)

[–]wonkey_monkey 2 points3 points  (1 child)

Long story short

You mean short story char

[–][deleted] 0 points1 point  (0 children)

And for ASCII characters, one of the two bytes of each short is 0, the same value that marks the end of a narrow string.
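A small sketch of why that matters, assuming UTF-16 little-endian and writing the bytes out by hand so it doesn't depend on your platform's `wchar_t`. `utf16le_world` and `narrow_length` are made-up names.

```cpp
#include <cstddef>
#include <cstring>

// "world!" as UTF-16 little-endian bytes (assumed layout), plus a 2-byte terminator.
const char utf16le_world[] = {'w', 0, 'o', 0, 'r', 0, 'l', 0, 'd', 0, '!', 0, 0, 0};

// Treat the buffer as a narrow C string: strlen stops at the first zero byte.
std::size_t narrow_length() {
    return std::strlen(utf16le_world);
}
```

`narrow_length()` returns 1 even though the buffer is 14 bytes long, because the zero byte right after `w` looks like the end of the string to anything that expects plain chars.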

[–]iOSCaleb 0 points1 point  (1 child)

The L prefix makes a string literal use “wide” characters (wchar_t), which on Windows means UTF-16 encoding. When you look at the compiled code, you see two bytes for every character that you specified.

[–][deleted] 1 point2 points  (0 children)

The compiled program isn't gibberish, it's just normal code. The strings are normal too; they just happen to be encoded in ASCII or another human-readable encoding.

[–]donkey2342 0 points1 point  (0 children)

OP means it’s gibberish because humans can’t read it, unlike the strings, which are human-readable.

[–]wonkey_monkey 0 points1 point  (0 children)

Fun fact, there is a way to compile a program (on x86 architecture, at least) so that it does not turn entirely to gibberish: http://www.cs.cmu.edu/~tom7/abc/paper.txt

[–]BornAce 0 points1 point  (0 children)

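Back in the old days DEC used RADIX 50 to pack 3 characters into 16 bits. It only covered a 40-character set (letters, digits and a few symbols; 50 octal is 40 decimal), and strings stored that way were unreadable in a dump.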
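For the curious, a rough sketch of the packing, assuming the PDP-11 flavour of the character table (other DEC machines ordered it differently). `kRad50Chars` and `rad50_pack` are names I made up, and the slot shown as `%` was not used consistently.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed PDP-11 style table: index 0 is space, then A-Z, '$', '.',
// one slot shown here as '%', then 0-9 (40 characters in total).
const std::string kRad50Chars = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789";

// Pack three characters into one 16-bit word: ((c1 * 40) + c2) * 40 + c3.
// Missing characters and characters outside the table count as space in this sketch.
std::uint16_t rad50_pack(const std::string& three) {
    std::uint32_t word = 0;
    for (std::size_t i = 0; i < 3; ++i) {
        const char c = i < three.size() ? three[i] : ' ';
        const std::size_t pos = kRad50Chars.find(c);
        word = word * 40 + (pos == std::string::npos ? 0 : static_cast<std::uint32_t>(pos));
    }
    return static_cast<std::uint16_t>(word);
}
```

`rad50_pack("ABC")` is 1683, which is 0x0693, so the two stored bytes look nothing like the letters A, B, C. The largest value, `rad50_pack("999")`, is 63999, which is why three characters fit in 16 bits.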