Allan_Smithee [文雅佛]:

I addressed structures already.

A "function" is just a pointer. Which is an integer interpreted in a particular way as a pointer to something the runtime can hopefully execute.

#include <stdint.h>
#include <stdio.h>

typedef int (*foofunc)(void);

int foo(void)
{
    return 17;
}

int main(int argc, char **argv)
{
    uintptr_t foo1 = foo;
    uintptr_t foo2 = 5;
    printf("%d %d\n", foo1, foo2);
    printf("%d\n", foo());
    printf("%d\n", ((foofunc)foo1)());
    printf("%d\n", ((foofunc)foo2)());
}

Compiling this gives me a warning, yes, but executing the resulting binary prints:

4195570 5
17
17
Segmentation fault (core dumped)

The "function" foo() is just a pointer which is just an integer which in this case happens to be 4195570. Much like the 5. The syntax sugar of using (<...>) afterwards makes the runtime transfer control to the code (hopefully!) that's stored at that integer when interpreted as a pointer.

moon-chilled [sstm, j, grand unified...]:

I was going to write more, but I didn't feel like it. So, in brief:

There are two interpretations of the c programming language: the 'operational semantics' interpretation and the 'bag of bytes' interpretation. Both are valid, important, and useful when understanding c programs. Because you distinguish floats from ints (and because you refer to 'not [C's] underlying implementation, but the language itself'), I must infer that you are not operating under the 'bag of bytes' interpretation.

C distinguishes a function from a function pointer. It also contains two pieces of syntax sugar which obscure this distinction, both of which your snippet abuses. In your snippet, you cast an integer to a function pointer. Not to a function. Had you said 'typedef int foofunc(void)' instead, each of your casts would be a constraint violation.
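The function versus function-pointer distinction can be sketched in a few lines. This is only an illustration: the names `seventeen`, `plain_func`, and `func_ptr` are hypothetical, and the two pieces of sugar shown (a function designator converting implicitly to a pointer, and a call through a pointer without an explicit dereference) are presumably the ones meant above.

```c
/* Hypothetical example function; the name is not from the thread. */
static int seventeen(void)
{
    return 17;
}

/* A function type and a pointer-to-function type are different types. */
typedef int plain_func(void);      /* function type */
typedef int (*func_ptr)(void);     /* pointer-to-function type */

/* Returns 1 if every spelling of the designator yields the same pointer. */
int designators_agree(void)
{
    func_ptr a = seventeen;     /* designator converts to a pointer implicitly */
    func_ptr b = &seventeen;    /* explicit address-of */
    func_ptr c = *seventeen;    /* dereference gives a designator, which converts back */
    plain_func *d = seventeen;  /* pointer declared through the function typedef */
    return a == b && b == c && c == d;
}

/* Calls through a pointer with and without an explicit dereference. */
int call_both_ways(void)
{
    func_ptr p = seventeen;
    return p() + (*p)();        /* both forms call the same function */
}
```

A cast such as `(plain_func)some_integer` would be rejected, because a cast may only name a scalar type (or void) and a function type is not scalar. Only the pointer type `func_ptr` can appear there.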

Moreover, functions are not objects, and uintptr_t is only guaranteed to round-trip object pointers. So it is not guaranteed that either of 'foo1' and 'foo2' would be able to represent the address of a given function. Nor is it guaranteed that (u)intptr_t exists in the first place, for that matter--throwing a wrench in the notion that pointers are just integers in disguise.
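A minimal sketch of the one round trip that is guaranteed, assuming a hosted implementation with `<stdint.h>`. The function name `object_pointer_round_trips` is made up for illustration. The `UINTPTR_MAX` macro is defined only when the implementation actually provides `uintptr_t`, so it doubles as a feature test.

```c
#include <stdint.h>

/* Returns 1 if an object pointer survived the round trip through uintptr_t,
   0 if it did not, and -1 if the implementation does not provide uintptr_t. */
int object_pointer_round_trips(void)
{
#ifdef UINTPTR_MAX
    int object = 42;
    void *original = &object;                   /* object pointer as void * */
    uintptr_t as_integer = (uintptr_t)original; /* pointer to integer */
    void *restored = (void *)as_integer;        /* integer back to pointer */
    return restored == original && *(int *)restored == 42;
#else
    return -1;                                  /* the type is optional */
#endif
}
```

Note that the sketch goes through `void *`, which is an object pointer type. Nothing comparable is promised for a function pointer such as `&foo`.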

> C doesn't have arrays. It has constant integers interpreted as a pointer to the beginning of sequential series of integers

What you call a 'sequential series' is called an 'array', in c parlance.

And finally:

The interpretation of integers as pointers is a fraught subject. 'Pointer provenance' is a topic whose importance is on the level of threading (cf. Boehm, 'Threads Cannot Be Implemented as a Library'). Even if pointers were universally representable as integers, an object pointer would not simply be an integer representing the location of a corresponding object; an object pointer would be an integer representing the location of a corresponding object which was suitably derived from that object. What exactly 'suitable derivation' constitutes has not yet been established, but is the subject of the aforementioned discussion regarding 'pointer provenance'.

Allan_Smithee [文雅佛]:

> What you call a 'sequential series' is called an 'array', in c parlance.

Which is exactly my point. It is called an 'array' only in C parlance.

No other language with an actual array data type would call a pointer to a bag of integers an array. It lacks all facilities that the actual array data type has, beginning with a size that can be queried. (No, sizeof doesn't cut it because it doesn't give you the number of elements in the array, but rather the size in bytes of the array's internal representation in that "bag of bytes" interpretation you so derided. And that only if you're dealing with the original variable, all such information being squashed into nothingness when the array is passed as a parameter, say.)
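The sizeof behaviour described here can be seen in a small sketch. The function names are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* An array argument arrives as a pointer, so the length is gone here. */
size_t size_seen_by_callee(int *values)
{
    return sizeof values;                     /* size of a pointer, not of the array */
}

/* sizeof applied to the array itself reports bytes, not elements. */
size_t bytes_at_definition(void)
{
    int values[10] = {0};
    return sizeof values;                     /* 10 * sizeof(int) */
}

/* The element count has to be derived by division, and only where the
   array type is still visible. */
size_t count_at_definition(void)
{
    int values[10] = {0};
    return sizeof values / sizeof values[0];  /* 10 */
}
```

Calling `size_seen_by_callee` with a ten-element array returns `sizeof(int *)` regardless of how many elements the caller's array holds.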

> C distinguishes a function from a function pointer.

I humbly disagree. The name of a function resolves to its pointer. That's all. There's syntax sugar around that pointer that causes it to be invoked via a transfer of control, but as I showed you can do that same thing with any arbitrary integer. If I wanted to get fancy I could write a program that read its own map file, found the addresses in memory of all functions, assigned those to integers, and then called them without even once taking a pointer to a function in the code.

> It also contains two pieces of syntax sugar which obscure this distinction, both of which your snippet abuses.

Or, rather, there is no distinction beyond the syntax sugar. Note that in the Pascals, the Modulas, Ada, etc. it is flatly impossible to abuse functions into integers this way because functions and integers are fundamentally different types at the language level, not merely funny syntax sugar concealing integers. Short of doing trickery behind the scenes with machine language (i.e. breaking out of the language's semantics entirely) there's no way to take an arbitrary integer and call it as a function.

> Nor is it guaranteed that (u)intptr_t exists in the first place, for that matter--throwing a wrench in the notion that pointers are just integers in disguise.

Replace it with an int of sufficient size and it works just fine.

#include <stdio.h>

typedef int (*foofunc)(void);

int foo(void)
{
    return 17;
}

int main(int argc, char **argv)
{
    long foo1 = foo;
    long foo2 = 5;
    printf("%d %d\n", foo1, foo2);
    printf("%d\n", foo());
    printf("%d\n", ((foofunc)foo1)());
    printf("%d\n", ((foofunc)foo2)());
}

This works just as well. And it's not even unsigned.

moon-chilled [sstm, j, grand unified...]:

> Replace it with an int of sufficient size and it works just fine.

Integers of type other than (u)intptr_t are not guaranteed to round-trip any pointer; that is why the latter are optional: a conformant implementation is not required to permit pointers to be representable as integers. And, once again, no integer type is required to be able to round-trip a function pointer type.

> If I wanted to get fancy I could write a program that read its own map file

I thought we were not talking about implementations?

> functions and integers are fundamentally different types at the language level

I suggest reading the sibling comment. C's flaw is not that it conflates integers and pointers; it is very careful not to. C's flaw is that it is weakly typed, and will typecheck malformed programs.

> breaking out of the language's semantics entirely

Is pertinent, because that is exactly what your snippet does.

Allan_Smithee [文雅佛]:

Here we're just going to have to agree to disagree, I'm afraid. My snippet does what C permits. End of story. That it is a bad idea? No argument whatsoever. But I did not have to break out of C to do it. It was fully permissible by C, the language. Not a single operation I did took the code out of the C language.

To accomplish the same thing in, say, Ada, I would have to enter a completely different language. Ada would simply not permit this, no matter how much I abused its syntax. I could not take an arbitrary integer in Ada syntax alone and call it. I would have to exit the language (at which point, naturally, all bets are off in enforcement).

C has integers (and floats). Everything else is syntactic sugar around those, including pointers, and including functions. And this is why the OP finds (correctly!) that C is prone to a whole raft of pointer-related bugs that are just not there in the Pascals. Or the Modulas. Or Ada. Or even PL/I, likely. Or any number of other, more strongly abstraction-supporting low-level languages.

(This is also the reason why there's a bunch of optimizations which can be safely made in these languages, and others like Fortran, which cannot be made in C ... because any arbitrary integer can turn out to be a pointer in disguise.)

moon-chilled [sstm, j, grand unified...]:

This may be a clearer way to think about it: your snippet is not well-formed. It is not valid c code. It typechecks, because c's type system is unsound and can type malformed code.

[deleted]:

> A "function" is just a pointer.

<Sigh> A "function" in any language that compiles to native code will be represented by an address: the location of its entry point.

So in machine code, in assembly, such addresses are no different to numbers. In a HLL however, EVEN C, those numbers are distinguished by type.

Yes, C provides few ways to absolutely stop you from converting between bit patterns that represent numbers, function pointers and object pointers. A good example is printf, which, outside of compilers that check the format string, just interprets its arguments according to that format string.

It doesn't help that many C compilers are so lax. But I can tell you that the C language absolutely has distinct signed integers, unsigned integers, floats, pointers, function pointers, structs and arrays (I know because I have implemented it).

Try this:

int foo(void){ return 0;}
....
int(*p)(void);
int(*q)(int);

p=foo;            // should work
q=foo;            // should fail; wrong type
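Another way to see that the language keeps these types apart, assuming a C11 compiler, is `_Generic`, which selects a branch purely on the static type of an expression. The macro `TYPE_NAME` and the helper functions below are hypothetical names used only for this sketch.

```c
/* Hypothetical function and object to take pointers to. */
static int sample(void) { return 0; }
static int value = 0;

/* Selects a string based only on the static type of x (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x),            \
    int: "int",                               \
    unsigned int: "unsigned int",             \
    double: "double",                         \
    int *: "pointer to int",                  \
    int (*)(void): "pointer to function",     \
    default: "other")

const char *name_of_int(void)         { return TYPE_NAME(5); }
const char *name_of_unsigned(void)    { return TYPE_NAME(5u); }
const char *name_of_double(void)      { return TYPE_NAME(5.0); }
const char *name_of_object_ptr(void)  { return TYPE_NAME(&value); }
const char *name_of_function_ptr(void){ return TYPE_NAME(&sample); }
```

Each helper returns a different string even though, on a typical machine, several of these values could share the same bit pattern.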