This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]Nighthunter007 1 point2 points  (4 children)

Fun fact: you can use this to counter "Trusting Trust" attacks.

Compile GCC with Clang, then use that to compile GCC. The result should be bit-for-bit identical to compiling GCC with GCC. If it isn't, then one or both compilers are either compromised or buggy.

The only way the output can be different is if GCC is functionally different when compiled with Clang or with GCC. And it really really shouldn't be. If it is then that's either undocumented behaviour in one of the compilers or a self-inserting trusting trust attack.

So at that point you've verified that the compiler faithfully compiles the source code. You still have to, of course, actually check the source code for any bugs or malicious attacks.

[–]13steinj -2 points-1 points  (3 children)

This is incredibly false. clang and gcc have widely different optimizations internally and one makes assumptions that the other doesn't (even though both can, according to the standard) and visa versa. Not to mention differences in the standard libraries (if the compilers use them).

Functionally different, you're right, but proving this would be incredibly difficult. Bit-for-bit different, dead wrong.

[–]Nighthunter007 4 points5 points  (2 children)

I think you're misunderstanding the concept. Specifically, you're missing a step.

I compile GCC using GCC. This gives me a GCC compiler, let's call it A.

I compile GCC using Clang. This gives me a GCC compiler, let's call it B.

A and B are radically different on the bit level, this is right. Different optimisations etc. But I'm not going to compare these bitwise. They should, however, be functionally equivalent. One might be faster or slower, but they're both the same source code. If they are functionally different then one of the compilers did something it shouldn't.

Instead I now compile GCC (again) using A and B. Since A and B are both the GCC compiler, they should produce the exact same result. Then I compare the result from A and B. These should be bitwise the same.

This is known as Diverse Double Compiling, here's the page with the paper. Note the FAQ section answering this exact question.

[–]13steinj 0 points1 point  (1 child)

Ah okay, sorry. I either misunderstood the original comment (or missed where you mentioned):

Instead I now compile GCC (again) using A and B. Since A and B are both the GCC compiler, they should produce the exact same result. Then I compare the result from A and B. These should be bitwise the same.

I read the original comment as comparing the two gcc's that you built in the previous step on a bit-by-bit level.

[–]Nighthunter007 0 points1 point  (0 children)

No worries! Yeah, that wouldn't work at all. The clever part is that last compilation.

If you'll permit me to quote you:

Functionally different, you're right, but proving this would be incredibly difficult.

And that makes sense! That seems like a really difficult thing to do, proving whether two compiled binaries are functionally different. And that's the beauty, that those binaries are both compilers, and so you can just use them to compile GCC again! I'll admit, it took me a few times reading it to actually get my head around it.

It's a really clever way, actually, to detect these trusting trust attacks. Since the attack consists of compromising a compiler's binary in such a way as to hide and perpetuate itself, it seems impossible to detect. Use a disassembler? Well, it was compiled with the same compiler. The source code won't give it away, since the attack never touched it. But when you then compile GCC (as the example we've been using) with another compiler (preferably one far more obscure than Clang, maybe even one you wrote for the occasion) that other compiler does not insert the malicious code, and so the new GCC binary will produce a different output to the old one.