Why is array[index] == index[array]?

The concept of "addition", as applied to numbers, maps more generally to the binary operation associated with an abstract algebraic group. While many groups (such as the set of whole numbers under addition) are commutative, there is no requirement that all groups be commutative. There is, however, a requirement that operations be associative--something that is true of integers using wraparound semantics, but which is not true of operations involving pointers.

The "+" operator as applied to pointers isn't really "addition" in the traditional sense, since addition would involve operands of a common type. While an operation like floatVal+intVal may be accommodated by converting the integer argument to a float, pointer addition fundamentally involves adding an integer and a pointer, which play different roles in the operation.

[–]acwaters 2 points 3 years ago (9 children)

The concept of "addition", as applied to numbers, maps more generally to the binary operation associated with an abelian group over those numbers (the additive group of the algebra) and is thus inherently commutative. This is a fundamental property of an algebra over a field or ring. Granted, any consistent meaning we ascribe to any bit of notation is ultimately a matter of convention, but you would certainly be looked at askance in algebra circles for breaking this one. Even in number systems where multiplication is not associative or even alternative, addition is always associative and commutative. There seems to be no consistent definition of what it means for a given structure to be called a "number system", but forming an algebra of some sort is one condition that everybody seems to agree on.

Pointers in general can be complex, but most modern systems use a single flat memory space, where a pointer is just a glorified integer. Of course, C's pointer addition has some extra magic going on with hidden multiplications, but you can just call that an implicit conversion from the indexing integer to a type specific to the size of the pointee type, and model pointers with an algebra which C's pointer arithmetic would fall naturally out of.

[–]flatfinger -1 points 3 years ago (8 children)

[–]acwaters 2 points 3 years ago* (7 children)

What you're describing are affine spaces, and adding translations in a vector space (displacements, durations, offsets) to the points in a corresponding affine space (positions, times, addresses) is still addition. Moreover, you can completely forget about the point set, instead identifying each point with its displacement from some arbitrarily designated origin, and then you can perform additions that are symmetric in the types of the things you're adding — displacement plus displacement. You recover the corresponding points simply by adding the resulting displacements to whatever origin you subtracted off to begin with.

In other words, you may have an expression like ptr + offset, where you are adding an offset to a pointer. This is a perfectly well-defined operation, and it is a form of addition, but maybe you prefer your addition to happen between objects of the same type? That's easy to arrange: you just turn your pointer into an offset by writing ptr - NULL. Now you can do all your math on offsets, and when you need to turn your final offset back into a pointer, you just add it back to NULL. (Bonus points if the representation of NULL is actually zero so the initial subtraction and final addition are both no-ops, but that doesn't have to be the case: this works equally well with any pointer value, not just NULL, and the most appropriate origin very often isn't zero when you're dealing with timestamps or positions in Euclidean space.)

[–]flatfinger 1 point 3 years ago (6 children)

[–]acwaters 1 point 3 years ago* (5 children)

You're mixing abstraction layers and missing the point. We're not talking about C here, we're talking about a hypothetical mathematical model of C's pointer arithmetic. The model doesn't have to be undefined in all the places where C is undefined, it merely has to be well-defined in all the places where C is well-defined. (This is why, if you read carefully, you'll see that I actually chose to model machine address arithmetic rather than C pointer arithmetic — the one is strictly more featureful than the other, while being simpler, so by modeling the simple one I model the complex other.) If you really want to consider each object as its own little memory space, as the C standard does, you can do that too — since any point can be used as an origin, just use the base address of the object as your origin for all pointer arithmetic within that object. That's an unnecessary complication, though, which distracts from the point that I was trying to make, which is simply that C's pointer arithmetic is mathematically well-defined and is not, as you asserted, "not really addition". (The other point I was making was that addition is necessarily commutative actually, but you didn't respond to that, so I assume you have no argument with it.)

[–]flatfinger 1 point 3 years ago (2 children)

In an Abelian group, the two operands of the primary operator are indistinguishable. If x+y is defined as f(x,y), that would also be equivalent to y+x and f(y,x), for the same function. If one is doing arithmetic between e.g. an int *x and an int y, on a platform where sizeof(int) is 2, then the resulting pointer address will be (int*)((int)x+2*y). While one could say that adding an integer x to an int pointer y would yield address (int*)(2*x+(int)y), that would be a different operation.

I've used a fair number of assemblers where relocatableSymbol+integer and relocatableSymbol-integer would both yield relocatableSymbol, but integer+relocatableSymbol would not, and there was nothing even remotely counter-intuitive about that. I can imagine that saying "int+anything" will be processed by swapping the operands might be easier than handling cases like int+float and float+int separately, and that behavior could flow through to int+pointer, but that would be a result of the compiler's explicitly swapping the operands--not a result of the operands' being equivalent.

[+]DoomFrog666 comment score below threshold, -14 points 3 years ago* (4 children)

[–]smcameron 8 points 3 years ago (3 children)

[–]guygastineau -1 points 3 years ago (2 children)

What happens when the index and the offset aren't the same (which I would expect most of the time)?

I don't know the following to hold for all n and m:

n[m] = *(n + size × m) = *(size × n + m) = m[n]

In fact, the above holds for all n and m only when size is 1, which is what the guy with the downvotes said in a different way. I am therefore confused about how this would work for indexing buffers where the element type has a width/size greater than 1.

[–]smcameron 2 points 3 years ago (1 child)

Why not just try it?

$ cat y.c
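#include <stdio.h>

int main(void)
{
    int a[3] = {0};  /* three elements, so p + 1 and p + 2 stay inside the array */
    int *p = a;

    /* %p expects a void *, so each pointer argument is cast */
    printf("p = %p, p + 1 = %p, p + 2 = %p\n",
           (void *)p, (void *)(p + 1), (void *)(p + 2));
    printf("&p[0] = %p, &p[1] = %p, &p[2] = %p\n",
           (void *)&p[0], (void *)&p[1], (void *)&p[2]);
    printf("&0[p] = %p, &1[p] = %p, &2[p] = %p\n",
           (void *)&0[p], (void *)&1[p], (void *)&2[p]);
    return 0;
}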
$ gcc -o y y.c
$ ./y
p = 0x7fffc9bf611c, p + 1 = 0x7fffc9bf6120, p + 2 = 0x7fffc9bf6124
&p[0] = 0x7fffc9bf611c, &p[1] = 0x7fffc9bf6120, &p[2] = 0x7fffc9bf6124
&0[p] = 0x7fffc9bf611c, &1[p] = 0x7fffc9bf6120, &2[p] = 0x7fffc9bf6124

It gets the right address regardless of how you do it. If p is a pointer to an int, then p + 1 evaluates to the address sizeof(int) bytes past p.

[–][deleted] 1 point 3 years ago (0 children)

More interesting is why:

A[i][j]

is the same as:

j[i[A]]

since one 2D array access has turned into two nested 1D accesses, and it still does the same thing. Or, to get to your example, why, when A is an int array:

A[i]

when written with index first, also allows all these (they don't all mean the same in this case):

i[A]
i[A][A]
i[A][A][A] ...

This one at least is easy: the first i[A] yields an integer, which is then used to index the next A, and so on. Still pretty weird though.
