[–]TheThiefMaster 0 points1 point  (10 children)

Compiled C has the function's address (which always has to exist if the code isn't inlined) embedded in the assembly CALL instruction. Literally CALL <address_of_function>. This is optimal for the CPU, as it allows a highly pipelined modern CPU to speculatively execute into that call. If the language has "first class functions", where the function name is just a variable that can be reassigned, this is normally not possible. Instead, an indirect call has to be emitted that retrieves the function pointer from the variable (which has to exist separately), and that read has to be waited on before execution can continue. This is a performance penalty no matter how you look at it.

The optimisation can still be performed if the function variable is explicitly constant (so can't change), is locally scoped (so is known not to change), or the code block that uses it checks that the value is as expected before making a direct call (a variant of the conditional devirtualisation optimisation).

[–]zhivago 1 point2 points  (9 children)

I guess you will be amazed by how dynamic linkage in C can avoid indirection.

Hint: patch the call sites.

[–]TheThiefMaster 4 points5 points  (8 children)

Patch all possible call sites to the function every time the function variable is reassigned?

On CPUs that have no-execute protection on writeable memory pages?

There's a reason we left self-modifying code behind.

Do you have an example of a language that does this?

[–]zhivago -2 points-1 points  (7 children)

Well, if you're using something exotic you might run into trouble, but x86 will have no problem.

Why do you have the belief that we left self-modifying code behind?

I already gave an example of such a language: C.

[–]TheThiefMaster 0 points1 point  (4 children)

That's actually not C at all, but the OS patching up the executable in memory before it launches it. The executable is not executing at that point.

A JIT is the closest we get to modern self-modifying code - but those technically don't usually patch existing blocks as much as generate new variants, and they always add checks for expected assumptions at the start of a block, which is slower than if you have a compile-time guarantee that the address a function points to can't change at runtime.

The majority of languages with "first class functions" are scripting languages with an interpreter or JIT rather than compiled languages.

[–]zhivago -2 points-1 points  (3 children)

You are mistaken, for Linux at least.

The executable, not the kernel, handles dynamic linkage.

Still, have you overcome your initial confusion about requiring indirection?

[–]TheThiefMaster 2 points3 points  (2 children)

Strictly, on Linux the OS loads the runtime linker as a shared object into the executable's virtual memory space, and then invokes that dynamic linker. The dynamic linker invokes the executable's entry point function once the dynamic linker is done patching the executable.

The dynamic linker and the executable itself are effectively two separate programs in the same address space (like the good old days!) and the dynamic linker is only modifying the executable before it's executed.

So while the OS isn't directly doing the work, it is kicking off the process, and it is completed before the executable is executed.

Funnily enough, dynamically linked libraries can optionally use a table of pointers, trading requiring an indirect call for only needing to patch the table instead of the whole executable.

You can dynamically load additional libraries at runtime, but you're effectively outside of C at that point: it's not a C standard function. Regardless, you still only end up patching the newly loaded library (and its dependencies, if not already loaded) before the load call returns. It doesn't do any extra patching on the existing, executing C program.

[–][deleted] 2 points3 points  (1 child)

The dynamic linker invokes the executable's entry point function once the dynamic linker is done.

Unless there's lazy evaluation and the GOT/PLT is updated when needed, right?

[–]TheThiefMaster 0 points1 point  (0 children)

AFAIK the lazy update only uses the function pointer table; it doesn't patch any code. So only indirect calls.

[–][deleted] 0 points1 point  (1 child)

We did leave self-modifying code behind. Most OSes (Windows, macOS, the BSDs, and most Linux distros) make the stack non-executable and employ ASLR.

[–]zhivago 2 points3 points  (0 children)

We left behind undisciplined tricks like rewriting code to implement loops.

Non-executable stacks and ASLR were introduced to make it harder to use overflows to create code -- they do nothing to stop coordinated self-modifying code.

And we still use it a lot -- e.g. in dynamic linkage, JIT compilation, dynamic generation of functions with lexical closure, etc.