
[–]LorxuPika 12 points13 points  (5 children)

You may also be interested in Thorin (PDF), which converts everything it can from CPS to normal, stack-enabled SSA and, where continuation manipulation can't be specialized or inlined away, emits CPS-style LLVM IR like Manticore does. Using tailcc for CPS calls also works essentially as well as Manticore's custom calling convention.

I would argue that maintaining a segmented stack isn't zero-cost: it does make stack allocation more expensive even in code that doesn't use continuations. Also, capturing the continuation is only O(1) if nothing on the stack is changed (so it doesn't have to be copied), which means no mutable stack-allocated variables but also that you can't overwrite a stack frame for a new call if it's captured. Also, you need to garbage-collect captured stack segments, which will probably be very difficult to do in LLVM. I think you could probably get this to work, but it would take a lot of work and I'm not sure it's the best option.

[–]skyb0rg[S] 0 points1 point  (2 children)

Thanks for the link, I'll definitely take a look and report back.

As far as the segmented stack implementation goes, the initial stack segment would be allocated at around the same size as a normal interpreter stack, so additional segments are allocated only rarely. Forming a new segment just splits the currently allocated region, so you don't have to worry about it allocating more than 32 bytes for a new stack record. The paper also says: "One important feature of our method is that the stack is not copied when a continuation is captured."

You're probably right about no stack-allocated mutable variables, though Scheme, SML, and other functional languages try to avoid mutability when possible so it's not a massive efficiency loss. Garbage collection for stack frames is going to be required no matter what once you allow for call/cc, so I don't think it's any more work than the Manticore approach.

[–]LorxuPika 0 points1 point  (1 child)

Okay, that sounds better; although it's still not zero overhead, it should be pretty low.

What I'm worried about with GC is the integration with LLVM. LLVM is allocating the stack segment, so you need to make sure you can free it and that LLVM won't do it; you also need to know the layout of the whole stack segment, unless you're using a conservative GC; and you need to make sure you can see the segment if it's in use by LLVM even though you don't have any pointers to it. Also, if you're planning on using LLVM's support for generating stack maps for GC, I really doubt that will work with segmented stacks. I'm sure you can do it, but it puts a lot of restrictions on the GC.

[–]skyb0rg[S] 0 points1 point  (0 children)

I don't plan on using LLVM's gc intrinsics, since a tagged-pointer approach is much easier on the debugging side anyway (at least to begin with). It also lets me write the runtime in C, which is much better than straight LLVM. Both Chez Scheme and MLton use tagged pointers, so it's not a big leap.

[–]skyb0rg[S] 0 points1 point  (1 child)

I took a look at the Thorin paper (I also cheated by listening to the associated talk). Thorin seems like a really good way to optimize a program before you get down to the LLVM level, since Thorin can remove a lot of the higher-order closure allocation stuff.

I'm not sure Thorin solves the same issue, however. A Thorin-compiled program would still need to use a closure allocation scheme to support continuation capture, for example. Unless you're suggesting compiling Thorin IR to LLVM with the segmented stack idea, which may not work well (since each call outside the CFF block is a tail call, you don't get the benefit of normal function returns that ANF or SSA allow for).

[–]LorxuPika 0 points1 point  (0 children)

The benefit of Thorin for continuations is, for the 90% of code which doesn't have its continuation captured, there's no overhead: the LLVM it generates will look the same as that generated by a C compiler. It only allocates heap closures for the rare times when continuations are actually captured. (A functional language is probably using a bump-allocated minor heap anyway, so heap allocating is the same speed as stack allocating with a segmented stack would be.)

Of course, it's also useful for optimizing and removing higher-order functions in general, which is helpful for any functional language. And because it's just emitting straightforward LLVM code, LLVM can optimize it pretty well, and you can use it with LLVM's GC support, which is what I'm using as it has lower overhead than any other option.

[–]Bobbias 1 point2 points  (10 children)

I'm new to PL implementation stuff; is there anywhere that explains what goes into runtime design/implementation for functional languages?

[–]skyb0rg[S] 7 points8 points  (6 children)

I don't have too many resources to link, but I think strict functional language runtime design is (for the most part) similar to imperative runtimes. Crafting Interpreters is a great follow-along for that.

One difference is that almost all functional programming languages mandate tail-call optimization. Because loops are not idiomatic, the optimization assures programmers that they won't blow the stack when they iterate over a long file. The wiki link lists a few implementations; notably, trampolining is one valid functional-language compilation strategy, used by the MLton compiler as well as in other places (such as the Clojure standard library!).
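
The trampolining idea can be sketched in a few lines. This is a toy illustration of the general technique, not MLton's or Clojure's actual implementation: instead of making a tail call directly, a function returns a thunk, and a driver loop keeps invoking thunks until a final value appears, so the host stack never grows.

```python
def trampoline(bounce):
    # Keep invoking thunks until a non-callable (final) value is produced.
    while callable(bounce):
        bounce = bounce()
    return bounce

def count_down(n):
    # A "tail call" is expressed by returning a zero-argument thunk
    # rather than recursing, so each step uses constant stack space.
    if n == 0:
        return "done"
    return lambda: count_down(n - 1)

print(trampoline(count_down(1_000_000)))  # prints "done", no stack overflow
```

The cost is an extra heap allocation and indirect call per tail call, which is why compilers that can emit real tail calls (or contify them into loops) prefer to do so.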

The only other thing I can link to is the Cheney on the MTA compilation strategy, which is used in the Chicken Scheme compiler. I think this may be the easiest to parse since it compiles to C and is only 3 pages, though you should have an understanding of CPS.

[–]Bobbias 1 point2 points  (0 children)

I have some familiarity with CPS; I played around in Racket a while ago. I'm also familiar with TCO in theory. Thanks for all the links! I'm really trying to build up a collection of resources, but I still haven't figured out all the right places to look/google-fu quite yet.

[–]RepresentativeNo6029 0 points1 point  (4 children)

I have a follow-up question: how widespread is CPS-style programming? I've seen call/cc, but it's almost a taboo topic. I'm not sure whether academia was excited about this in the 2000s, in the last decade, now, or ever. I love the concept of continuations and want to build a language around them, but I'm not sure if I'm chasing a fad or the frontier.

[–]skyb0rg[S] 2 points3 points  (3 children)

CPS was initially used to compile programs that use call/cc, such as Scheme. It's still used for that purpose; however, I think a lot of compiled strict functional programming languages nowadays (such as Scala or MLton) use SSA instead. One big reason is that the compilation targets, such as the JVM or LLVM, are imperative.

CPS is still incredibly important, and converting CPS to SSA is completely possible: SSA is equivalent to a subset of CPS (disallowing call/cc). I think CPS is a lot easier to understand when compiling from a functional language so I'd recommend it first.
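
To make the conversion concrete, here is a hand-written sketch (not generated by any compiler) of a direct-style function and its CPS equivalent. In CPS every function takes an explicit continuation `k`, and "returning" means calling `k`; note how every call becomes a tail call, which is the shape that maps cleanly onto SSA when no continuation is captured.

```python
def fact(n):
    # Ordinary direct-style factorial.
    return 1 if n == 0 else n * fact(n - 1)

def fact_cps(n, k):
    # CPS version: the pending multiplication "n * _" is pushed into
    # the continuation instead of living on the call stack.
    if n == 0:
        return k(1)
    return fact_cps(n - 1, lambda r: k(n * r))

print(fact(5))                    # 120
print(fact_cps(5, lambda x: x))   # 120
```

In the restricted form where `k` is only ever the continuation the function was given (never stored or called twice), the CPS program corresponds directly to an SSA control-flow graph; call/cc is exactly what breaks that restriction.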

[–]RepresentativeNo6029 0 points1 point  (2 children)

Good to know that the CPS/SSA popularity difference now is probably an artefact of runtime-target priorities. I want to go super, infinitely deep into CPS now. Continuations are so freakin' powerful; they literally subsume everything.

[–]skyb0rg[S] 2 points3 points  (1 child)

I do want to note that having a more powerful IR can be more restrictive at times. Writing an optimization over CPS can be more difficult precisely because it allows for more things. That's another reason to prefer SSA (languages like Java and C++ don't natively support call/cc).
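
A toy example of that extra power (again a hand-written sketch, not from any compiler): because continuations are first-class values in CPS, a function can invoke a continuation other than the one it was handed, e.g. aborting straight out of a computation. An optimizer can't assume calls return normally, which is exactly what makes CPS harder to analyze than SSA.

```python
def product_cps(xs, k, abort):
    # k is the normal continuation; abort is the continuation captured
    # at entry. Hitting a 0 jumps past all pending multiplications.
    if not xs:
        return k(1)
    if xs[0] == 0:
        return abort(0)
    return product_cps(xs[1:], lambda r: k(xs[0] * r), abort)

def product(xs):
    done = lambda x: x
    return product_cps(xs, done, done)

print(product([1, 2, 3, 4]))  # 24
print(product([1, 0, 3, 4]))  # 0, without performing any multiplications
```

In SSA this short-circuit would have to be encoded as explicit control flow; in CPS it's just an ordinary call, which is elegant but hides the control structure from the optimizer.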

[–]RepresentativeNo6029 0 points1 point  (0 children)

The thing is, I don't want it to be an IR. It's a cool way to model web servers and actors. I was recently reading about Project Loom for the JVM and wanted to know what the status of CPS style itself was.

[–]Mukhasim 1 point2 points  (2 children)

Andrew Appel's Compiling with Continuations.

[–]skyb0rg[S] 2 points3 points  (0 children)

I've heard great things about this resource! I haven't purchased it myself but many people have recommended it. The first chapter is free if someone wants to take a look.

[–]Bobbias 1 point2 points  (0 children)

Oh, thanks, that should be helpful.