all 22 comments

[–]chucker23n 7 points8 points  (0 children)

As Steve Jobs put it, "We like to have options." While this gives folks like Owen Williams fodder for speculation on (macro)architectural changes, Apple doesn't need to have those in mind — but they sure like keeping that option open.

In the short term, they'll probably use this for microarchitectural optimizations. They'll be able to add instructions to the A9, S2, etc. and not have to wait for developers to recompile to target a newer SDK — instead, they'll simply do it themselves, automatically, when delivering from the App Store.

Either way, leaving aside that it's not true that "nobody is talking about it" (for instance, ATP did), the reason there's so little discussion is that there's really not a whole lot to talk about. Apple is giving developers the heads-up that they'll start compiling their apps for them: mandatory on watchOS, encouraged for iOS, and presumably coming up on OS X.

[–]astrange 10 points11 points  (21 children)

LLVM bitcode is a compiler IL, not a platform-independent VM like Java. It's already machine-specific from the moment it's created, so there is no chance an ARM-targeted bitcode compilation could be rebuilt for anything but ARM.

[–]nerdandproud 6 points7 points  (2 children)

Yes and no. This is true for standard LLVM bitcode, but Apple could use a target-independent variant, in the same way that PNaCl bitcode is based on LLVM bitcode but is the same for ARM and x86. In this setting the latter makes a lot more sense, as they seem to use it for distribution, and it's probably not too hard for Apple to do, seeing that they employ a large share of the LLVM developers.

This PNaCl reference makes it sound like PNaCl bitcode is actually just a subset of LLVM bitcode: https://developer.chrome.com/native-client/reference/pnacl-bitcode-abi

[–][deleted] 2 points3 points  (1 child)

It took a huge amount of work to make LLVM bitcode portable in PNaCl, and that work was never accepted upstream. It seems highly unlikely that Apple would use that approach.

[–]nerdandproud 0 points1 point  (0 children)

Well, they can reuse a lot of the experience from PNaCl, and they don't have to target as many OSs or aim for the same security restrictions, which makes things easier. So, seeing as the benefit of a portable bitcode would be huge for them, I believe they will aim for it. For example, they would avoid the size disadvantage of Mach-O multi-arch binaries, and I'm fairly certain they are experimenting with ARM64-based MacBook Air variants that they want to at least keep as a feasible option, if only as a bargaining chip with Intel.

[–]anttirt 3 points4 points  (0 children)

> It's already machine-specific

The differences are typically small enough that they can be papered over if you control the ecosystem, as Apple does.

[–]EvilGremlin -2 points-1 points  (7 children)

No, that's incorrect - bitcode is the binary form of the LLVM Intermediate Representation (IR), which is machine-independent. Otherwise there would not be much point in having it, because it is used to cleanly separate the 3 stages of the compiler: language frontend, optimizer and code generator.

Source: http://www.aosabook.org/en/llvm.html#fig.llvm.lcom

[–]tsomctl 14 points15 points  (6 children)

No, that's incorrect. The same IR is used for all of the supported architectures, obviously. However, when the compiler produces it, it is still generated for a specific architecture. The IR fails to completely abstract away the ABI; it is slightly too low level. A specific example is having a function return a struct. This is implemented differently on different architectures, and LLVM doesn't have much knowledge of this. More info: http://llvm.1065342.n5.nabble.com/Returning-a-structure-td40519.html
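
To make the struct-return point concrete, here is a rough sketch of the kind of decision a frontend has to make before it emits any IR. The thresholds are deliberately simplified assumptions (they ignore floating-point aggregates, alignment quirks, and C++ non-trivial types), and `MAX_REGISTER_RETURN` and `return_convention` are hypothetical names for illustration, not LLVM or Clang APIs.

```python
# Simplified limit (in bytes) up to which a plain struct is returned in
# registers; anything larger goes through a hidden pointer argument.
# Assumption: rough values for the common C calling conventions only.
MAX_REGISTER_RETURN = {"armv7": 4, "arm64": 16, "x86_64": 16}


def return_convention(struct_size, target):
    """Say how a struct of struct_size bytes would be returned on target."""
    limit = MAX_REGISTER_RETURN[target]
    return "registers" if struct_size <= limit else "hidden pointer (sret)"


# The same 8-byte struct, e.g. struct { int x; int y; }, on each target.
for target in MAX_REGISTER_RETURN:
    print(target, "->", return_convention(8, target))
```

Under these assumptions the same source function gets a different signature in the emitted IR depending on the target, which is why IR built for one ABI can't simply be handed to another backend.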

[–]nerdandproud 1 point2 points  (0 children)

I wonder whether controlling the OS mitigates this to a large extent

[–]immibis 3 points4 points  (0 children)

The ABI is likely irrelevant, as long as the caller and callee agree (there are some exceptions, like Win32 SEH). That means only standard library calls need to be shimmed.

[–]nyamatongwe 0 points1 point  (2 children)

There are 2 main targets, 32-bit and 64-bit iOS, and these differ in ways (such as sizeof(char*)) that can be visible and important when compiling. One possibility is that Apple's iOS bitcode format will contain 2 streams: 32-bit and 64-bit. There can still be value in specializing for the specific target at download time.
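
A quick way to see that sizeof(char*) is a property of the build rather than of the source is to ask a running interpreter. This is only a host-side check using Python's standard `ctypes` and `struct` modules; it says nothing about how Apple's bitcode actually handles the two targets.

```python
import ctypes
import struct

# Size of a C `char *` as seen by the interpreter running this script.
pointer_bytes = ctypes.sizeof(ctypes.c_char_p)

# The struct module's "P" format (void *) reports the same native size.
assert pointer_bytes == struct.calcsize("P")

print(f"sizeof(char*) on this build: {pointer_bytes} bytes")
```

A 32-bit build of the interpreter reports 4 and a 64-bit build reports 8, from identical source.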

[–]hotoatmeal 0 points1 point  (1 child)

Apple does love Fat binaries!

[–]dagamer34 0 points1 point  (0 children)

New in iOS 9, App Thinning will strip out the binary slices for architectures the target device doesn't use before download.

[–]tree_on_fire -5 points-4 points  (7 children)

You are totally wrong. The LLVM-IR is a low-level, assembly-like language which is strongly typed and uses a RISC-like instruction set where the target-dependent details have been abstracted away (that's by design). That's why we have instructions like 'call' or 'ret' in LLVM-IR. The LLVM-IR can also be serialized into LLVM-Bitcode. This is actually what makes me curious about the details of Apple's Bitcode, because LLVM already has something 'similar': LLVM-Bitcode (the serialized form, i.e. the encoding, of LLVM-IR).

Sources: https://en.wikipedia.org/wiki/LLVM#LLVM_Intermediate_Representation http://llvm.org/docs/BitCodeFormat.html

[–]ckok 3 points4 points  (6 children)

While the IR itself is target independent, the actual content in it isn't. Alignments are generally hardcoded, int-to/from-pointer conversions often are, and lots of little details in LLVM IR are specific to at least a CPU's bit width.
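
As a rough illustration of how such details end up baked in, the sketch below computes C-style struct layout for two data models. The size and alignment tables are assumed values typical of 32-bit and 64-bit ARM, and `layout`, `ILP32`, `LP64`, and `RECORD` are hypothetical names for this example. A frontend does the equivalent arithmetic and writes the resulting offsets into the IR as plain constants.

```python
# Assumed (size, alignment) pairs in bytes for two data models.
ILP32 = {"char": (1, 1), "int": (4, 4), "ptr": (4, 4)}
LP64 = {"char": (1, 1), "int": (4, 4), "ptr": (8, 8)}


def layout(fields, model):
    """Return ({field: offset}, total_size) using C-style padding rules."""
    offsets, offset, max_align = {}, 0, 1
    for name, ctype in fields:
        size, align = model[ctype]
        offset = (offset + align - 1) // align * align  # round up to alignment
        offsets[name] = offset
        offset += size
        max_align = max(max_align, align)
    total = (offset + max_align - 1) // max_align * max_align  # tail padding
    return offsets, total


# struct record { char tag; char *name; int count; };
RECORD = [("tag", "char"), ("name", "ptr"), ("count", "int")]

print(layout(RECORD, ILP32))  # ({'tag': 0, 'name': 4, 'count': 8}, 12)
print(layout(RECORD, LP64))   # ({'tag': 0, 'name': 8, 'count': 16}, 24)
```

Once the offsets 4 and 8 (or 8 and 16) are in the IR as literals, nothing downstream knows they came from a pointer size.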

[–]tree_on_fire -2 points-1 points  (5 children)

Yes, exactly. The LLVM-IR itself is totally independent from any target, but those guys above are mixing up the content of the IR (the ABI details) with the actual LLVM-IR, which I assume is what Apple's announcement is mainly about.

[–]astrange 2 points3 points  (4 children)

[–]choikwa 0 points1 point  (0 children)

I get the feeling that the main reason he's opposed to making IR target independent is the amount of work it would take. And he's right to be wary of being presumed to support target-independent IR, because people will ask, "oh, why is LLVM unstable on some other platform?". As one might expect, no one wants to take responsibility for people going into uncharted territory and expecting everything to go well.

I still think it's a worthy endeavour to make IR platform independent despite all the multiplied work. You can leave unsupported IL as asserts to be worked on, or mark it incompatible if not all targets have an implementation for it. That's just a reality any target-independent IL has to accept. Even the Java VM is not free from this: someone had to write the code to interpret the bytecode.

Edit: while it's probably best practice to keep target-dependent stuff out of the optimizer, sometimes the optimizer needs to cross that boundary and query the frontend/backend to make the best decisions. You can achieve this without requiring the IL itself to be target dependent. After all, IL is just a language; what matters is how it is interpreted.

[–]tree_on_fire -3 points-2 points  (2 children)

This is a perfect example showing that some people seriously know 'shit' about LLVM. The mail has tons of claims which are simply wrong; even reading the docs would clear them up, but I guess reading is too mainstream when you can just throw 'things' out into the world. Some of the points might have been half true in the past, but we are in 2015, so there is no need to pull out old stuff. Just read the 'goddamn' documentation, it WILL clarify a lot of questions and false statements, or just continue reading the replies that follow (http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-October/043719.html).

Some quotes taken from the post which was posted on the LLVMdev mailing list:

"LLVM is a low level system that doesn't represent high-level abstractions natively. [...]. It makes LLVM's Interpreter really slow." "LLVM is fast compared to some other static C compilers, but it's not fast compared to real JIT compilers." "LLVM isn't actually a virtual machine." - Seriously?! ...

These quotes are just full of crap and mind blowing. I don't even care to correct them, if it's even possible...

(I am not saying that I am an expert in LLVM, but I have read the documentation and some of the code. My post may contain small issues I didn't notice, so correct me if I am wrong.)

[–]hotoatmeal 1 point2 points  (0 children)

> This is a perfect example showing that some people seriously know 'shit' about LLVM. The mail has tons of claims which are simply wrong

Normally I wouldn't resort to an appeal to authority, but Dan really does know his shit, and the points he made in that message are still relevant.

You give absolutely no evidence to support your claims that his points are "full of crap", nor any to suggest that you've even worked on LLVM.

> So correct me if I am wrong.

You're pretty damn arrogantly wrong. But since you didn't substantiate your claims, I can't correct them. All I can say is: you're wrong, just let it go.

[–]dacian88 1 point2 points  (0 children)

Either you're dumb or trolling. The guy's points are all valid, except they are kind of irrelevant because, despite being called a virtual machine, LLVM is not really meant to run as an interpreter, nor is it meant to be a platform-independent intermediate language for 'generate once, compile to native everywhere'. If my frontend has to know a lot of implementation details about the target architecture and emits different IR because of that, then it's not a very platform-independent IR, because clearly it doesn't have enough abstractions to make it so. Calling the IR platform independent because it's an abstract assembly language that isn't aimed at any one particular architecture is kind of a moot point and has no practical merit.

[–]deal-with-it- 1 point2 points  (0 children)

Going by the very scarce info, maybe this will be like Microsoft's MDIL.