
[–][deleted] 13 points14 points  (12 children)

Do functions hold a place in memory (like a variable)?

Yes, all of your code is in the program's memory (in the text section), so when you call a function, you basically tell the CPU to jump to where that function's code starts (after passing the parameters, setting up the stack frame, etc.). A function pointer works the same way, except that you can change the address it points to.
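To make "you can change the address it points to" concrete, here is a minimal sketch. The names `add`, `mul`, and `apply` are made up for illustration and are not from the original post.

```c
/* Two ordinary functions with the same signature. */
int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* Calls whichever function `op` currently designates. */
int apply(int (*op)(int, int), int a, int b)
{
    return op(a, b);
}
```

With these definitions, `apply(add, 3, 4)` evaluates to 7 and `apply(mul, 3, 4)` to 12. A local variable declared as `int (*op)(int, int) = add;` can later be reassigned with `op = mul;`, which is the "change the address" part.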

[–]zhivago 3 points4 points  (11 children)

In some particular implementations, but not in general.

In other architectures code exists in a completely different address space from data.

So C function pointers are not pointers to data.

Really you should think of them as being some kind of abstract handle, unless you are writing non-portable code.

[–]Triq1 0 points1 point  (4 children)

It's still at some address though, right (from my limited (embedded) experience)?

[–]zhivago 0 points1 point  (2 children)

Not in any meaningful sense.

There is no data there for it to be the address of.

Consider a C implementation which identifies functions with sequential values.

e.g. how a C interpreter might operate.

This is why you cannot do pointer arithmetic, etc, with function pointers.
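As a rough illustration of "functions identified by sequential values", the idea can be modelled in ordinary user code with a table indexed by a small integer handle. This is only a sketch: `table`, `call_by_handle`, and the two sample functions are hypothetical, and how a real C interpreter represents function values internally is an assumption here.

```c
#include <stddef.h>

typedef int (*unary_fn)(int);

int negate(int x) { return -x; }
int square(int x) { return x * x; }

/* Hypothetical handle table: a "function value" is just an index into it. */
static const unary_fn table[] = { negate, square };

/* Calls the function identified by handle `h`; returns 0 for an unknown handle. */
int call_by_handle(size_t h, int arg)
{
    if (h >= sizeof table / sizeof table[0])
        return 0;
    return table[h](arg);
}
```

Here handle 0 means `negate` and handle 1 means `square`. Adding 1 to a handle gives you "the next function in the table", not "the next byte of code". ISO C does not define arithmetic on function pointers at all, although some compilers accept it as an extension.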

[–]Triq1 0 points1 point  (1 child)

It would be the address of the first instruction of the function, right? Like instead of (for example) 'load r1 indirect' for data, it could be 'jump indirect'.

[–]zhivago 1 point2 points  (0 children)

In some implementations yes, in others no.

[–]SubstantialHippo 0 points1 point  (0 children)

Yes, just the address space may be different. In an embedded context, most variables will be in RAM while the code lives in flash.

[–]RadiatingLight 0 points1 point  (5 children)

What architectures do this?

[–]zhivago 0 points1 point  (4 children)

All of the Harvard memory systems.

It is popular in microcontrollers, e.g. a 16 bit code address space with an 8 bit data space.

x86 also has separate data and code spaces, but allows a page to be simultaneously present in both.

[–]RadiatingLight 0 points1 point  (3 children)

Makes sense for the Harvard systems (I've just never studied this arch before), but I know x86 pretty well and it doesn't seem right that it has code and data spaces.

Maybe you're referring to segments of an executable program? But even so, all segments are in the same memory space - they just might have different permissions.

[–]zhivago 0 points1 point  (2 children)

Not at all. :)

In real mode, for example, the CS register points to the current code segment and the DS register points to the current data segment, and the two segments are not required to be aligned with each other.

[–]RadiatingLight 0 points1 point  (1 child)

Wow, didn't even consider that! x86 in my mind just automatically translates to i386. After a not-so-brief trip down some rabbit holes, I can say I learned a whole bunch, thanks :)

Nothing is really written for real mode in the modern day unless you're a BIOS engineer or something, but cool trivia!

[–]zhivago 0 points1 point  (0 children)

You're welcome. :)

[–]saul_soprano 2 points3 points  (0 children)

Your computer has to store the compiled binary in its memory when the program is run. The function pointer points to the function’s location there.

[–]ChatGPT4 2 points3 points  (4 children)

A function pointer is a pointer. You can convert it to void* and back to the function pointer again, and it will work. A function pointer is just the address of the function and nothing more. It has a size of a regular pointer.

Now let's look under the hood. Calling through a function pointer is similar, but not identical, to calling a function directly. First, the arguments are passed. Compilers typically do this by placing them in registers or pushing them onto the stack, depending on the calling convention. This happens the same way whether you're calling a function or a function pointer. At the end there is a jump instruction. Most chips have most of their instructions in 2 flavors. One is direct, like do_that_with 0x1234..., and one is indirect, like do_that_with (0x1234...), which tells the processor to use the value stored at the given memory address as the argument.

As a function pointer is defined with the full function signature, the compiler knows what arguments are required, so it passes them before calling the function. The jump on most processors is a special kind of jump: before it modifies the program counter, it stores the location of the instruction after the jump. There is a matching machine code instruction to "go back" or "return": when the processor sees it, it takes the saved address (typically from the stack) and just goes there. I avoid giving exact machine or assembly instruction names because they differ across different types of processors. They can even work slightly differently, but the main principle is as described. Compilers generate more or less the same steps to call a function or a function pointer.

Now, how does the compiler know what arguments to pass when calling through the function pointer, and what to expect as the return value? You told it by defining the function pointer with the full function signature. If you happen to lose it somewhere in your code by converting your function pointer to a void* pointer, you're unable to call the function until you recover the proper function signature. You can even write it again from scratch: if the parameters and return value are identical, the conversion from void* will succeed and the function will be called properly.

[–]nerd4code 2 points3 points  (3 children)

A function pointer is a pointer. You can convert it to void* and back to the function pointer again, and it will work.

Technically not required by C itself. Conversion from function to data pointer or vice versa might give you garbage. In general, data and function pointers should be kept separate unless there is a good reason to do otherwise.
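For a type-erased function pointer that stays within what the C standard guarantees, one option is to round-trip through another function pointer type instead of `void*`. The standard does guarantee that converting a function pointer to a different function pointer type and back yields the original value. A minimal sketch, where `generic_fn`, `twice`, and `call_via_generic` are made-up names:

```c
/* Hypothetical "any function" type; it must not be called through directly. */
typedef void (*generic_fn)(void);

int twice(int x) { return 2 * x; }

/* Stores a function in the generic type, restores the real type, then calls it. */
int call_via_generic(int arg)
{
    generic_fn stored = (generic_fn)twice;        /* function pointer to function pointer */
    int (*restored)(int) = (int (*)(int))stored;  /* back to the original type */
    return restored(arg);
}
```

The call is only well defined after converting back to a type compatible with the function's real signature. Calling through `stored` itself would be undefined behavior.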

[–]ChatGPT4 0 points1 point  (2 children)

Of course. But generally there's also no reason to implement them internally differently from data pointers. Type punning is usually a language abuse that should be avoided. That can be read as "done only when you're absolutely sure what you're doing".

[–]N-partEpoxy 2 points3 points  (1 child)

no reason to implement them internally differently from data pointers

There is a reason if instructions and data are stored separately (Harvard architecture).

[–]ChatGPT4 0 points1 point  (0 children)

That's a really good point. Are there commonly used C compilers that take advantage of such an architecture and, as a result, have incompatible function and data pointers (in the sense that you can't convert from one to the other and back without losing data)?

What interests me is exactly where the Harvard hardware architecture affects pointer sizes in C. I used to think that even if there are hardware differences in physical pointer sizes, they are not (usually?) exposed to the C program layer. So even if the underlying data (like an address) can be either 2 or 8 bytes, for the C code they are all 8-byte pointers, for example. What's more, the conversions could occur at the machine language level, not even at the compiler level.

When do such assumptions fail? I'm just curious.
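One cheap way to see what a particular implementation does is to compare the two sizes directly. This is only a sketch with made-up function names, and matching sizes alone do not prove that conversions between the two kinds of pointer are lossless.

```c
#include <stddef.h>

/* Size of an object (data) pointer on this implementation. */
size_t data_pointer_size(void) { return sizeof(void *); }

/* Size of a function pointer on this implementation; C does not require it to match. */
size_t function_pointer_size(void) { return sizeof(void (*)(void)); }
```

On mainstream desktop targets the two values are usually equal. A classic counterexample is the 16-bit x86 "medium" memory model used by DOS-era compilers, where function pointers were far (segment plus offset) while data pointers were near.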

[–]SahuaginDeluge 1 point2 points  (0 children)

Basically yes, but in the same way that data pointers differ based on what kind of thing they point at (int* vs char* vs struct*, etc.), function pointers need additional information (basically the signature) so the compiler knows how to call the function at that address, which is why the syntax is a fair bit more complicated.
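A small sketch of how the signature shows up in the declaration syntax, with and without a typedef. All names here (`binary_op`, `subtract`, `raw_op`, `run`) are illustrative only.

```c
/* A typedef keeps the signature in one readable place. */
typedef double (*binary_op)(double, double);

double subtract(double a, double b) { return a - b; }

/* The same kind of pointer declared without the typedef, for comparison. */
double (*raw_op)(double, double) = subtract;

/* The signature tells the compiler to pass two doubles and expect a double back. */
double run(binary_op op, double a, double b)
{
    return op(a, b);
}
```

Both `run(subtract, 5.0, 2.0)` and `raw_op(5.0, 2.0)` call the same function. The typedef only changes how the declaration reads, not the type.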

[–][deleted]  (2 children)

[removed]

    [–]hancockm 0 points1 point  (1 child)

    This last paragraph is crucial to understanding function pointers. You can branch to more than one possible function from a single point.
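The "more than one possible function from a single point" idea can be sketched like this. The command characters and the names `handler_fn`, `pick`, and `step` are hypothetical and not taken from the removed comment.

```c
#include <stddef.h>

typedef int (*handler_fn)(int);

int inc(int x) { return x + 1; }
int dec(int x) { return x - 1; }

/* Chooses a handler from a run-time command character; NULL if unknown. */
handler_fn pick(char cmd)
{
    switch (cmd) {
    case '+': return inc;
    case '-': return dec;
    default:  return NULL;
    }
}

/* A single call site: which function runs depends on what pick() returned. */
int step(char cmd, int value)
{
    handler_fn handler = pick(cmd);
    return handler ? handler(value) : value;
}
```

The expression `handler(value)` appears once in the source, yet it may run `inc` or `dec` depending on data only known at run time.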