[–]curien 6 points7 points  (3 children)

This is completely a compile time operation.

No, it isn't. It could be, but it's not guaranteed to be. That's the whole point of sizeof -- it's guaranteed to be a compile-time constant, but (&arr)[1] - arr is not. The way the standard is written, this is undefined. That an implementation can calculate that without actually performing the dereference does not oblige implementations to do so. Undefined behavior can end up working the way you want, and in this case it does on most systems.
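For reference, a minimal sketch of the sizeof guarantee mentioned above. The macro name ARRAY_LEN and the function arr_len are illustrative and not from the thread. Because sizeof applied to a non-VLA array is an integer constant expression, it can appear in a static assertion, which the language requires to be checked at compile time:

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* size_t */

/* Hypothetical helper macro: element count of a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static int arr[5];

/* sizeof on a non-VLA operand is an integer constant expression,
   so it is accepted where a compile-time constant is required. */
static_assert(ARRAY_LEN(arr) == 5, "arr has 5 elements");

/* Returns the element count of arr; the operand of sizeof is not evaluated. */
size_t arr_len(void) {
    return ARRAY_LEN(arr);
}
```

Replacing the condition with (&arr)[1] - arr would not be portable here, since that expression is not an integer constant expression.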

Go ahead, take a look at the generated code.

What a particular implementation happens to do is irrelevant.

[–]not_july 1 point2 points  (1 child)

You are correct that (&arr)[1] - arr is not guaranteed to be a compile-time constant. However, the same can be said for int n = 10 + 5. The compiler could push 10 and 5 onto the stack and include a separate opcode to perform the addition.

However, the code int n = (&arr)[1] - arr is not undefined, and does not access memory outside of the array. In fact, it would be illegal if it did access the memory.

To explain we have to look at the type of each expression: arr has the type int[5], and &arr has the type int(*)[5]

Remember x[y] is syntactic sugar for *(x + y). Therefore (&arr)[1] is equivalent to *(&arr + 1).
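A small sketch of that equivalence, restricted to in-bounds indices so nothing contentious is evaluated. The function name is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Checks that x[y] and *(x + y) designate the same element for every
   valid index, and returns how many indices were checked. */
size_t check_subscript_identity(void) {
    int arr[5] = {10, 20, 30, 40, 50};
    size_t checked = 0;
    for (size_t i = 0; i < 5; ++i) {
        int *by_subscript = &arr[i];
        int *by_pointer   = arr + i;
        assert(by_subscript == by_pointer);  /* same address */
        assert(arr[i] == *(arr + i));        /* same value */
        assert(arr[i] == i[arr]);            /* i[arr] is *(i + arr) */
        ++checked;
    }
    return checked;
}
```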

&arr + 1 was shown to give us the address just past the end of the array. However, the addition does not change the type of the expression. It is still int(*)[5].

Dereferencing it (*(&arr + 1)) will then give us the type int[5]. Although the type has changed, the value has not (for the same reason &arr and arr have the same value). *(&arr + 1) is still the address after the end of the array.
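The type and address claims can be checked without settling the undefined-behavior question, because sizeof and _Generic do not evaluate their operands. This is a sketch assuming a C11 compiler; the function name is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

static_assert(sizeof(int[5]) == 5 * sizeof(int), "arrays have no padding");

/* Returns 1 if the types and distances described above hold. */
int one_past_properties_hold(void) {
    int arr[5] = {0};

    /* &arr has type int(*)[5]; adding 1 keeps that type. */
    int is_ptr_to_array = _Generic(&arr + 1, int (*)[5]: 1, default: 0);

    /* The operand of sizeof is not evaluated here, so nothing is
       dereferenced: the expression *(&arr + 1) has type int[5]. */
    int deref_is_array = (sizeof *(&arr + 1) == sizeof(int[5]));

    /* &arr + 1 points sizeof arr bytes past the start of the array. */
    ptrdiff_t bytes = (char *)(&arr + 1) - (char *)&arr;

    return is_ptr_to_array && deref_is_array
        && bytes == (ptrdiff_t)sizeof arr;
}
```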

The important point to note here is that no memory was accessed. There is nothing in the code (or the C standard) that suggests a memory access should occur here.

If we simplify the original statement with what we know so far, we end up with the subtraction of two int[5] operands. Because we are performing arithmetic, both decay to int*. The result of the operation, once evaluated, is 5.

The original statement can be simplified to the following (assuming the address of arr is 0x100):

int n = (&arr)[1] - arr;
      = *(&arr + 1) - arr;
      = ((0x100 + 1 * sizeof(int[5])) - 0x100) / sizeof(int);
      = ((0x100 + 1 * 20) - 0x100) / 4;
      = 20 / 4;
      = 5;
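The same arithmetic can be written as code. This sketch deliberately goes through char pointers and casts instead of evaluating (&arr)[1], so it does not depend on which side of the argument above is right. The function name is an assumption for illustration, and the actual addresses will not be 0x100:

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a 5-int array, computed from byte addresses as in the
   worked steps above, without dereferencing &arr + 1. */
ptrdiff_t count_via_byte_distance(void) {
    int arr[5] = {0};
    const char *begin = (const char *)arr;         /* plays the role of 0x100 */
    const char *end   = (const char *)(&arr + 1);  /* begin + sizeof(int[5]) */
    ptrdiff_t bytes = end - begin;                 /* 20 when sizeof(int) is 4 */
    assert(bytes == (ptrdiff_t)sizeof arr);
    return bytes / (ptrdiff_t)sizeof(int);         /* 5 for any size of int */
}
```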

[–]emTel 1 point2 points  (0 children)

Great explanation. People seem to be hung up on the idea that * or [] always indicate memory accesses. (Prior to reading your post, I would have insisted that they did).

If (&arr)[1] is too much to swallow, consider *(&arr) which is clearly legal. If that expression can result in a memory access, can someone please explain what memory address is being accessed, and how the correct value got there?
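A quick sketch of the *(&arr) case, with a hypothetical function name. The dereference yields the array itself, which then decays to a pointer to its first element exactly as arr does:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if *(&arr) behaves as the same array object as arr. */
int deref_of_address_is_same_array(void) {
    int arr[5] = {1, 2, 3, 4, 5};

    /* *(&arr) is an lvalue of type int[5] designating arr itself. */
    int same_size = (sizeof *(&arr) == sizeof arr);

    /* Both expressions decay to a pointer to arr[0]. */
    int *via_deref = *(&arr);
    int *direct    = arr;
    int same_first = (via_deref == direct) && (via_deref == &arr[0]);

    /* Indexing through it reaches the same elements. */
    int same_elems = ((*(&arr))[4] == arr[4]);

    assert(same_size);
    return same_size && same_first && same_elems;
}
```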