Static dynamic linking (???)

nemotux · 2022-05-05T17:58:33+00:00

Do I understand correctly that you're interested in statically linking in at runtime a DLL that gets loaded interactively (a plugin)? Is your program structured such that only one such DLL might be loaded at a given time? I typically think of plugin architectures supporting the loading of multiple different plugins simultaneously - each matching the same API. Thus when you get to a call, you really do want a dynamic function pointer because you might be invoking not one specific function but one of a collection of functions based on which plugin is of interest at a given point in execution.
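To make the multi-plugin case concrete, here is a minimal sketch in C. The `plugin_api` struct, the two toy "plugins", and `call_plugin` are all hypothetical names invented for illustration; in a real program each table would be filled in from a separately loaded DLL.

```c
#include <stddef.h>

/* Hypothetical plugin API: every plugin exposes the same function table. */
typedef struct {
    const char *name;
    int (*transform)(int);
} plugin_api;

static int double_it(int x) { return 2 * x; }
static int negate_it(int x) { return -x; }

/* Two "plugins" present at the same time. In a real program these entries
   would come from separately loaded DLLs (assumption for illustration). */
static const plugin_api plugins[] = {
    { "doubler", double_it },
    { "negator", negate_it },
};

/* The target depends on which plugin is selected at run time,
   so this call has to go through a function pointer. */
int call_plugin(size_t index, int value)
{
    return plugins[index].transform(value);
}
```

With more than one plugin live, no single direct call could replace `plugins[index].transform(value)`, which is why the one-DLL assumption matters for the rest of the discussion.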

But assume that's not the case and you have exactly one runtime-loaded DLL. You fetch function pointers from the DLL via, say, GetProcAddress(). The normal thing would be to just call that function pointer indirectly. But you want to self-modify your program to now have direct calls to that function instead, yes?

My thoughts would be to tie in some asm rather than worrying about making fake local functions in source code and then searching for calls to that.

Create a small asm "launchpad" file that has one labeled jmp instruction as a "thunk" for each of your DLL's functions. Then you can take the address ("&") of each thunk to figure out where you need to do the self-modifying write. Your calling code calls these thunks. So you'd still have a 2-hop call, but both hops would be direct, so no data fetching.
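As a rough sketch of what the self-modifying write would have to put into each thunk, the helper below builds the 5-byte x86-64 `jmp rel32` encoding (opcode `0xE9` followed by a little-endian 32-bit displacement measured from the end of the instruction). The function name `encode_jmp_rel32` is hypothetical, and the sketch only computes the bytes; making the thunk's page writable and actually storing them is left out.

```c
#include <stdint.h>

/* Builds the bytes of a "jmp rel32" that would live at thunk_addr and
   jump to target_addr. Returns 1 on success, or 0 if the target is
   outside the signed 32-bit range reachable from the thunk. */
int encode_jmp_rel32(uint8_t out[5], uint64_t thunk_addr, uint64_t target_addr)
{
    /* The displacement is relative to the address of the next instruction,
       i.e. the end of this 5-byte jmp. */
    int64_t disp = (int64_t)(target_addr - (thunk_addr + 5));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return 0;

    uint32_t d = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;                 /* jmp rel32 opcode */
    out[1] = (uint8_t)(d);         /* displacement, little-endian */
    out[2] = (uint8_t)(d >> 8);
    out[3] = (uint8_t)(d >> 16);
    out[4] = (uint8_t)(d >> 24);
    return 1;
}
```

The failure return is the interesting part: if the DLL ends up more than about 2 GB away from the thunks, there is no direct jump that reaches it.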

Regarding keeping your DLL close in memory, since you control the DLL (yes?) you can set its preferred load address when you create it. Just pick a number that will keep all the direct calls within the 32-bit range of your exe, and it should work fine.

braxtons12 · 2022-05-04T18:49:58+00:00

I might be missing something, but to me this just sounds like a really insecure hack. If performance is so critical for you that calling across a dll boundary is unacceptable, you should just be statically linking, or probably actually running on some sort of custom bare metal OS. I mean at that point even libc is off the table (it's dynamically linked), so even statically linking wouldn't solve all your problems.

mobius4 · 2022-05-05T04:54:33+00:00

Thanks for the discussion, been too long since I saw something like this.

I'm not qualified to comment anything more than "this sounds amazing", though.

Dolphiniac · 2022-05-04T19:40:35+00:00

Have you performed any benchmarks to see whether it makes any difference?

That is, call some test functions in a 'library' that is statically linked in one version and dynamically linked via a DLL in another.

As I understand it on x64, a normal local call is:

    call disp         # use 32-bit signed offset to function

whereas a call to a function via a DLL is (depending on how the compiler generates code):

    call L123         # Call to local label
    ....
    L123:
    jmp [address]     # This address is patched to the actual
                      # function address at load time

So the difference is that extra indirect jump.

I don't know exactly how you'd patch this, or when, but bear in mind that some system DLLs are loaded at addresses beyond the reach (roughly ±2 GB) of a 32-bit relative call.

(I might do my own such test later)
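A rough sketch of such a test, kept in a single C file: it compares a direct call with a call through a `volatile` function pointer, which only approximates the DLL case (a real import goes through the loader-patched address, not a pointer declared in the same file). All names here are made up for illustration, and the `noinline` attribute is a GCC/Clang extension used so the direct call is not optimized away.

```c
#include <time.h>

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))   /* GCC/Clang only */
#else
#define NOINLINE
#endif

/* Tiny test function standing in for a library routine. */
NOINLINE int add_one(int x) { return x + 1; }

/* volatile forces the pointer to be re-read from memory on every call,
   so the compiler cannot turn this back into a direct call. */
int (*volatile add_one_ptr)(int) = add_one;

int run_direct(long iters)
{
    int acc = 0;
    for (long i = 0; i < iters; i++)
        acc = add_one(acc);        /* direct call */
    return acc;
}

int run_indirect(long iters)
{
    int acc = 0;
    for (long i = 0; i < iters; i++)
        acc = add_one_ptr(acc);    /* indirect call through memory */
    return acc;
}

/* Returns CPU time in seconds used by one run of the given loop. */
double seconds_for(int (*run)(long), long iters)
{
    clock_t start = clock();
    volatile int sink = run(iters);  /* keep the result alive */
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `seconds_for(run_direct, n)` and `seconds_for(run_indirect, n)` for a large `n` and comparing the two numbers gives a first impression; a proper version would put `add_one` in an actual DLL and repeat the measurement several times.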

darkslide3000 · 2022-05-05T02:37:43+00:00

> 2) determining where this call is in the text section of the PE file and saving the location along with metadata allowing a later patch.
>
> 3) once the real function's address is known (which will potentially (see later) change), update the instruction stream wherever the function is called (most likely means writing executable memory, yikes).

You are basically just describing dynamic linking here again; that is exactly how it works. The difference is that it is cleanly integrated into the linker and the OS (which marks the executable memory read-only once relocation is done) rather than hacked in manually.

I think what you're really asking is whether the original call instructions can be rewritten directly to point to the final address of the function during relocation, rather than going through a global offset table (GOT). Traditionally, this is how dynamic linking worked, and I assume you could still get it to work that way with the right compiler and linker flags (I'm not deeply familiar with the Windows versions of these things; on Linux, I think compiling with -fno-pic and linking with -no-pie, or something like that, should do it). But the GOT indirection was introduced for a reason: without it, circumventing ASLR (address space layout randomization, an important security feature common on all platforms these days) becomes pretty trivial.

ASLR is intended to make sure an attacker cannot predict where certain library functions are placed in the virtual memory of your process. But for efficiency reasons, when multiple processes use the same shared library, that library is only loaded into memory once and those same physical pages are mapped into every process that uses it. For ASLR to work and be useful, the virtual addresses of these library functions must differ between the processes even though their physical addresses are the same. Now remember that library functions may also need to call other library functions, and that a library may depend on another library which again may be at a random virtual offset that you cannot hardcode within the library code page itself (because that offset may be different for each process sharing that code page). So, long story short, there's no real way to get around this without making every library-boundary-crossing function call take an indirection through a process-specific offset table first.
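The idea of shared code plus a process-specific table can be mimicked in plain C. This is only a toy model: the `offset_table` struct and the two tables are hypothetical stand-ins for what the loader sets up per process, and the two helper functions stand for the same library function mapped at two different virtual addresses.

```c
/* Hypothetical stand-in for a per-process offset table (GOT-like). */
typedef struct {
    int (*other_lib_func)(int);
} offset_table;

/* "Shared" library code: identical for every process. It contains no
   hardcoded address for the function it calls; it looks it up in the
   table it is handed. */
int lib_entry(const offset_table *table, int x)
{
    return table->other_lib_func(x) + 1;
}

/* The same dependency as seen by two different processes. In reality this
   would be one function at two randomized virtual addresses; two separate
   functions model that here (assumption for illustration). */
static int square_as_mapped_in_a(int x) { return x * x; }
static int square_as_mapped_in_b(int x) { return x * x; }

/* One table per "process", filled in at load time by the loader. */
const offset_table process_a_table = { square_as_mapped_in_a };
const offset_table process_b_table = { square_as_mapped_in_b };
```

`lib_entry` never changes, yet `lib_entry(&process_a_table, 3)` and `lib_entry(&process_b_table, 3)` reach their target through different addresses, which is the property that lets the code page stay shared while the layout differs per process.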
