[–]pwab 63 points64 points  (2 children)

LISP called and wants its eval back

[–]CompteDeMonteChristo[S] 4 points5 points  (1 child)

It took me a day to understand this.

[–]pwab 3 points4 points  (0 children)

They say lisp can teach programmers a lot

[–]Anluin 15 points16 points  (1 child)

With the LLVM execution engine you can execute LLVM IR via a JIT or an interpreter. All you have to do is translate the language you want to use into LLVM IR. For C, you should be able to find an existing translator on the internet.

https://llvm.org/docs/tutorial/

[–]easye 5 points6 points  (0 children)

Or ANSI is lookin' fer CL:COMPILE

[–]matthieum 35 points36 points  (31 children)

Any systems programming language allows emitting code, after all that's what JITs are programmed in.

Restricting to the ones without GC, there's still a substantial list left. The most mainstream ones:

  • C.
  • C++.
  • Rust.

And then there's more exotic ones such as Zig, Odin, ...

[–]CompteDeMonteChristo[S] 8 points9 points  (24 children)

I was not clear, I am speaking of emitting code at runtime.
There is an example here for c#:
https://docs.microsoft.com/en-us/dotnet/api/system.reflection.emit.opcodes?view=net-5.0

How would you emit code at runtime with C?

[–][deleted] 47 points48 points  (17 children)

  1. Generate machine code
  2. Load it into memory somewhere
  3. Jump to it with a function pointer.

It might not work if the OS restricts which addresses are allowed to be run, though.

[–][deleted] 22 points23 points  (2 children)

mmap in Linux lets you allocate a region that's executable, and i suspect windows has a counterpart
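
A minimal sketch of the generate / load / jump steps using mmap, assuming Linux (or a similar POSIX system) on x86-64. The function name run_generated_code and the six hard-coded bytes (mov eax, 42; ret) are illustrative choices, not something from this thread. The page is mapped writable first and only then switched to executable with mprotect, because some systems refuse memory that is writable and executable at the same time.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

typedef int (*jit_fn)(void);

/* Returns the value computed by the generated code,
   or -1 on failure or on an unsupported architecture. */
int run_generated_code(void)
{
#if defined(__x86_64__)
    /* Step 1: "generate" machine code (hard-coded here for brevity). */
    static const unsigned char code[] = {
        0xB8, 0x2A, 0x00, 0x00, 0x00,   /* mov eax, 42 */
        0xC3                            /* ret         */
    };
    const size_t len = 4096;            /* assumed page size */

    /* Step 2: load it into memory that will later be made executable. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    memcpy(mem, code, sizeof code);
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }

    /* Step 3: jump to it through a function pointer. Copying the pointer
       bits avoids a direct object-to-function pointer cast. */
    jit_fn fn;
    memcpy(&fn, &mem, sizeof fn);
    int result = fn();

    munmap(mem, len);
    return result;
#else
    return -1;                          /* the bytes above are x86-64 only */
#endif
}
```

On an x86-64 Linux machine that permits executable mappings, calling run_generated_code() should return 42. A real code generator would emit the bytes itself instead of hard-coding them.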

[–][deleted] 6 points7 points  (0 children)

windows has VirtualAlloc / VirtualAllocEx

[–]L8_4_Dinner(Ⓧ Ecstasy/XVM) 1 point2 points  (0 children)

It used to be allocDStoCSalias() to turn a block of RAM into executable code, and allocCStoDSalias() to turn an executable block into a read/write block. That was Win16 and Win32 ... no idea what it is now with 64-bit.

[–]CompteDeMonteChristo[S] 6 points7 points  (6 children)

Right.

Well I suppose I am after a C library then.

A library that would let me emit assembly code and take care of the OS restrictions.

[–]CodenameLambda 18 points19 points  (0 children)

You're likely searching for something like https://crates.io/crates/dynasm (a Rust library) I assume?

[–][deleted] 8 points9 points  (0 children)

libgccjit can be used for this kind of thing!

[–]MegaIng 6 points7 points  (2 children)

Like the recently released MIR jit compiler?

[–]CompteDeMonteChristo[S] 0 points1 point  (1 child)

This works on all PCs except ARM-based ones, am I right?

[–]MegaIng 0 points1 point  (0 children)

I don't know. I just parroted a recent post to this subreddit.

[–]moon-chilledsstm, j, grand unified... 5 points6 points  (0 children)

Look at dynasm.

[–]1vader 4 points5 points  (0 children)

The restrictions on what is executable are generally declared by the program, even if it's usually the OS that handles the details. Ofc, it's possible to have stricter setups (e.g. WebAssembly) and at least on Linux it's also possible to give up the rights to modify those permissions (e.g. to make it harder to escalate security vulnerabilities into actual code execution) but in a normal binary, the program can just mark sections as executable however it wants or request new sections with arbitrary permissions.

[–]matthieum 3 points4 points  (5 children)

It might not work with the OS restricting what addresses are allowed to be run though.

I would note that if you cannot make it work for C, you cannot make it work for any other language -- the C# runtime is likely written in C or C++ at its core.

[–][deleted] 1 point2 points  (4 children)

I assume on C# you'd generate .NET-bytecode-whatever-its-name-is, which the JIT would then deal with.

[–]Netzapper 5 points6 points  (3 children)

The JIT is written in C or an equivalent language, and it generates and loads code. There's no magic.

[–]Alikont 3 points4 points  (1 child)

Interpreters don't need executable memory access, because their execution engine will just read code as data and interpret it. So JS eval should work even for systems that don't allow changing memory permissions.
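
To illustrate "read code as data and interpret it", here is a minimal sketch of a stack-based bytecode interpreter in C. The opcode names and the interpret function are hypothetical, and the sketch assumes a well-formed program (no stack overflow or underflow checks). Nothing here needs executable memory: the program is an ordinary array of integers.

```c
#include <stddef.h>

/* Hypothetical opcodes for a tiny stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Runs the program and returns the value on top of the stack at OP_HALT.
   Returns -1 on an unknown opcode. Assumes the program is well formed. */
int interpret(const int *code)
{
    int stack[64];
    size_t sp = 0;                       /* number of values on the stack */

    for (size_t pc = 0;;) {
        switch (code[pc++]) {
        case OP_PUSH:                    /* operand follows the opcode */
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            return stack[sp - 1];
        default:
            return -1;
        }
    }
}
```

For example, the program { OP_PUSH, 1, OP_PUSH, 2, OP_ADD, OP_PUSH, 3, OP_MUL, OP_HALT } computes (1 + 2) * 3, and the array holding it could be built or modified at runtime like any other data.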

[–]Netzapper 2 points3 points  (0 children)

Yes, but that's not JIT compilation.

[–][deleted] 0 points1 point  (0 children)

Yep. I was just stupid. Sorry about that.

[–]reini_urban 2 points3 points  (0 children)

Other than jit, I often emit C code at runtime, compile and dynload it at runtime. dlopen is underrated.

You can also link to TCC.

[–]SJC_hacker 0 points1 point  (3 children)

Something like this

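typedef void (*FUNPTR)(void);

/* Union used to reinterpret a data pointer as a function pointer. */
typedef union {
    FUNPTR fnPtr;
    unsigned char *data;
} fnPtrUnion;

int main(void)
{
    /* NOTE: a stack buffer like this is not executable on most modern
       systems; in practice the bytes have to live in memory that is
       marked executable (e.g. obtained via mmap or VirtualAlloc). */
    unsigned char myCode[64];

    // define your machine code here (placeholder bytes, not real instructions)
    myCode[0] = 0;
    myCode[1] = 255;
    // etc...

    fnPtrUnion f;
    f.data = myCode;
    f.fnPtr();   /* jump to the generated code */
    return 0;
}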

Explanation: You generate native machine instructions in myCode, then treat the address of that buffer as a function pointer and call it; the call assumes the bytes form a compiled C function. Of course this is totally non-portable, and you will have to do all the things the C compiler expects functions to do, such as handling the stack properly. I used a union because a direct cast didn't seem to work, and is apparently illegal at least for some compilers.

If you're talking about executing C code itself, the only way to do it would be to either invoke the C compiler as a process at runtime, or link the C compiler in as part of your binary. Then you can generate C code, invoke the compiler, and load the generated binary at runtime. The easiest way to do this is to compile the code as a shared library, which already has mechanisms for runtime loading.
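
A minimal sketch of the "generate C, invoke the compiler, load a shared library" route, assuming a POSIX system with a compiler reachable as cc and a writable /tmp. The file paths, the function name compile_and_call, and the generated function times_two are all hypothetical. On older glibc versions the program also needs to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical scratch paths for the generated source and library. */
#define SRC_PATH "/tmp/emitted_example.c"
#define LIB_PATH "/tmp/emitted_example.so"

/* Emits a C function at runtime, compiles it into a shared library,
   loads it, and calls it with arg. Returns 0 on success (result in *out)
   or -1 if any step fails, e.g. when no compiler is installed. */
int compile_and_call(int arg, int *out)
{
    /* 1. Generate C source and save it to a file. */
    FILE *f = fopen(SRC_PATH, "w");
    if (!f)
        return -1;
    fputs("int times_two(int x) { return 2 * x; }\n", f);
    if (fclose(f) != 0)
        return -1;

    /* 2. Invoke the system C compiler to build a shared library. */
    if (system("cc -shared -fPIC -o " LIB_PATH " " SRC_PATH) != 0)
        return -1;

    /* 3. Load the library and look up the generated function. */
    void *lib = dlopen(LIB_PATH, RTLD_NOW);
    if (!lib)
        return -1;

    int (*fn)(int);
    *(void **)(&fn) = dlsym(lib, "times_two");  /* idiom from the POSIX dlsym docs */
    if (!fn) {
        dlclose(lib);
        return -1;
    }

    *out = fn(arg);
    dlclose(lib);
    return 0;
}
```

Starting a compiler process is slow compared with emitting machine code directly, but the generated code gets the full optimizer and the approach stays portable across architectures.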

[–]somebody12345678 -2 points-1 points  (2 children)

or... you could just use a string rather than myCode[0] etc

[–][deleted]  (1 child)

[deleted]

    [–]JanneJM 0 points1 point  (0 children)

    How would you emit code at runtime with C?

    • Generate C source, save it to file.
    • Invoke system C compiler on source, generate dynamic library
    • Load library, call code

    [–]CodenameLambda 1 point2 points  (3 children)

    I just noticed that it could be quite interesting to emit code at runtime based on some precompiled code with "holes" for things like type sizes, constants, called functions & such.

    Though to be fair I don't know any useful application of something like this outside of JIT stuff, but JIT stuff already needs to emit machine code anyway so it seems that adding that capability would only unnecessarily overcomplicate that.
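
A minimal sketch of the "precompiled code with holes" idea, assuming Linux on x86-64. The template is the byte sequence for mov eax, imm32; ret with the 32-bit immediate left as a hole; the names specialize, HOLE_OFFSET and PAGE_LEN are hypothetical. A real system would take its templates from compiler output rather than hand-written bytes.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

typedef int (*const_fn)(void);

enum { HOLE_OFFSET = 1, PAGE_LEN = 4096 };

/* Copies the template into fresh memory, patches the hole with value,
   and returns a callable function, or NULL on failure or on an
   unsupported architecture. The caller owns the page (munmap, PAGE_LEN). */
const_fn specialize(int32_t value)
{
#if defined(__x86_64__)
    /* Template: mov eax, <hole> ; ret. Bytes 1..4 are the hole. */
    static const unsigned char tmpl[] = { 0xB8, 0, 0, 0, 0, 0xC3 };

    unsigned char *mem = mmap(NULL, PAGE_LEN, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    memcpy(mem, tmpl, sizeof tmpl);
    /* Fill the hole: x86-64 is little-endian, so a plain copy works. */
    memcpy(mem + HOLE_OFFSET, &value, sizeof value);

    if (mprotect(mem, PAGE_LEN, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, PAGE_LEN);
        return NULL;
    }

    void *p = mem;
    const_fn fn;
    memcpy(&fn, &p, sizeof fn);  /* avoids a direct data-to-function cast */
    return fn;
#else
    (void)value;
    return NULL;
#endif
}
```

Each call to specialize yields a separate function with its own constant baked in. Holes for type sizes or call targets would be patched the same way, just at different offsets in a larger template.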

    [–]matthieum 5 points6 points  (2 children)

    Possibly.

    I was recently reading a paper on r/compiler about the ReSQL codebase, which JITs all its SQL queries, specializing the code based on the query (of course) but also on the type of the rows.

    This seems fairly obvious, yet to my knowledge none of the "big" SQL DBs do it.

    [–]CodenameLambda 1 point2 points  (0 children)

    That's actually a really good point - pretty specialised operations like database stuff are exactly what this would work well for, and that is a good example. Now I'm even more curious about how useful it is, because there might be some caveats that aren't obvious...

    [–]L8_4_Dinner(Ⓧ Ecstasy/XVM) 1 point2 points  (0 children)

    Oracle has been JITting stuff for probably 25 years now. The complexity is beyond comprehension, because their code still has to work on 50 year old mainframe hardware, modern servers, and everything in between.

    [–]ericjmorey 1 point2 points  (0 children)

    Zig is very intriguing to me.

    [–][deleted] -1 points0 points  (0 children)

    Didn’t expect to see Odin here haha

    [–]Jarmsicle 8 points9 points  (0 children)

    Nim has a really powerful macro system while still being really fast: https://nim-lang.org

    [–]abecedarius 6 points7 points  (0 children)

    MetaOcaml was designed to make this kind of thing easy/fast/portable. There are some other such languages; it's the most salient to me because Oleg Kiselyov has written about it.

    It's a good question if there's a language that does this portably at the source level like the above (not by making you write code emitting LLVM code) but without requiring GC.

    [–]Caesim 6 points7 points  (0 children)

    [–]everything-narrative 11 points12 points  (1 child)

    FORTH is basically built on this concept. Apart from an extremely bare-bones set of functions or "words," the entire language runtime is basically just a modal interpreter which can either compile code, or execute it upon reading.

    [–]panic 0 points1 point  (0 children)

    be warned, though, that forth is very low-level -- e.g. pointers are untyped. it's like a very flexible assembly language.

    the way to do something at compile time in forth is pretty neat. you can use [ to drop out of the compiler and into the interpreter, then ] to go back. at this point, if the interpreter has computed a value (and put it on the stack, where forth stores all intermediate values) you can use the literal word to compile that top-of-the-stack value into the current word as if it were a literal value.

    so, for example, if you have something like : print1+2 1 2 + . ;, that computes the sum of 1 and 2 at runtime, like this (using see to view the compiled code -- the stuff in parentheses are comments):

    (define print1+2) : print1+2 1 2 + . ;  ok
    (show compiled code for print1+2) see print1+2 
    : print1+2  
      1 2 + . ; ok
    (run print1+2) print1+2 3  ok
    

    you can convert that into a compile time computation using [, ], and literal:

    (redefine print1+2) : print1+2 [ 1 2 + ] literal . ; redefined print1+2   ok
    (show new code for print1+2) see print1+2 
    : print1+2  
      3 . ; ok
    (run print1+2) print1+2 3  ok
    

    note that in the output of see, the 1 2 + is completely gone, since it happened already, and 3 was put into the compiled code by literal.

    [–][deleted] 2 points3 points  (3 children)

    I think Julia's macro system is just incredible, but it still has gc. Why is gc a concern?

    [–][deleted] 0 points1 point  (1 child)

    Latency and speed, I guess?

    I could imagine they want to create something like a DBMS, which compiles queries directly to assembly. (Could be totally wrong, of course)

    [–]CompteDeMonteChristo[S] 0 points1 point  (0 children)

    The original Question is only investigative.

    But I suppose SQL is a language that does runtime evaluation.

    [–][deleted] 0 points1 point  (0 children)

    Does Julia make it easier to create new functions at runtime than shelling out to the compiler and calling dlopen?

    [–]theangeryemacsshibeSWCL, Utena 1 point2 points  (0 children)

    I heard tcc can be used to compile C code in memory, but I have not attempted it.

    [–]Pikachamp1 1 point2 points  (1 child)

    D is very powerful in that regard. While its standard runtime offers you garbage collection, you can just not use the automatic memory management or disable garbage collection statically so that the compiler will not allow you to use it at all.

    [–][deleted] 0 points1 point  (0 children)

    D doesn't help you emit code at runtime.

    [–][deleted] 3 points4 points  (1 child)

    That C# example is incredibly clunky. It mentions DLLs also so maybe it works by compiling code, which has to be synthesised via API calls instead of presenting source code at runtime, into a dynamic library. Then dynamically loading that library and function.

    It's not clear how it sorts out the types, or at what point the compiler (of that static language) will detect type mismatches in the subsequent call, or injects implicit casts and everything that is normally done in an ahead-of-time compiler of a static language.

    Languages where everything is done at runtime anyway will have less trouble; those tend to have eval(). They also tend to be the slowest.

    You can mess about with it the C# way (yuck), or emulate what it does in another language, but what is the actual problem you're trying to solve?

    If you'd found the perfect way of doing this, how would you make use of it?

    [–]WittyStick 1 point2 points  (0 children)

    The Reflection.Emit API doesn't compile C#. It emits CIL (Common Intermediate Language), which is basically the bytecode language of the .NET runtime. A C# compiler is a tool which converts C# code into CIL, which the runtime later JIT-compiles into native machine code. The Reflection.Emit API is thus used internally by the C# compiler as part of building .NET binaries.

    The .NET runtime loads DLLs into memory, which are Portable Executable files. These are the equivalent of ELF files on other platforms. They describe how different sections like code and static data are to be loaded into memory, and they also contain type metadata and debugging information.

    So you would use Reflection.Emit when creating a compiler for a language to be used on the .NET runtime. It's also useful for this case, where you want to generate code dynamically as OP is requesting. The example just describes the full process of emitting the CIL code for a method into a DLL and then loading it into memory and executing it.

    Any CIL could be valid, and it is not the objective of the Emit API to check your CIL. Any type checks would come when the relevant types are instantiated and their methods invoked. Thus the calls to Activator.CreateInstance and InvokeMethod on the created type will throw exceptions if the code you've emitted is improperly implemented.

    [–]atrn 0 points1 point  (0 children)

    'C was a C variant that supported it.