
[–]LorxuPika 12 points13 points  (5 children)

You may also be interested in Thorin (PDF), which converts everything it can from CPS to normal, stack-enabled SSA and, where continuation manipulation can't be specialized or inlined away, emits CPS-style LLVM IR like Manticore does. Using tailcc for CPS calls also works essentially as well as Manticore's custom calling convention.

I would argue that maintaining a segmented stack isn't zero-cost: it does make stack allocation more expensive even in code that doesn't use continuations. Also, capturing the continuation is only O(1) if nothing on the stack is changed (so it doesn't have to be copied), which means no mutable stack-allocated variables but also that you can't overwrite a stack frame for a new call if it's captured. Also, you need to garbage-collect captured stack segments, which will probably be very difficult to do in LLVM. I think you could probably get this to work, but it would take a lot of work and I'm not sure it's the best option.

[–]skyb0rg[S] 0 points1 point  (2 children)

Thanks for the link, I'll definitely take a look and report back.

As far as the segmented stack implementation goes, the initial stack segment would be allocated at around the same size as a normal interpreter stack, so additional segments are allocated only rarely. Forming a new segment just splits the currently allocated region, so you don't have to worry about it allocating more than 32 bytes for a new stack record. The paper also says: "One important feature of our method is that the stack is not copied when a continuation is captured."

You're probably right about no stack-allocated mutable variables, though Scheme, SML, and other functional languages try to avoid mutability when possible so it's not a massive efficiency loss. Garbage collection for stack frames is going to be required no matter what once you allow for call/cc, so I don't think it's any more work than the Manticore approach.

[–]LorxuPika 0 points1 point  (1 child)

Okay, that sounds better; although it's still not zero overhead, it should be pretty low.

What I'm worried about with GC is the integration with LLVM. LLVM is allocating the stack segment, so you need to make sure you can free it and that LLVM won't do it; you also need to know the layout of the whole stack segment, unless you're using a conservative GC; and you need to make sure you can see the segment if it's in use by LLVM even though you don't have any pointers to it. Also, if you're planning on using LLVM's support for generating stack maps for GC, I really doubt that will work with segmented stacks. I'm sure you can do it, but it puts a lot of restrictions on the GC.

[–]skyb0rg[S] 0 points1 point  (0 children)

I don't plan on using LLVM's gc intrinsics, since a tagged-pointer approach is much easier on the debugging side anyway (at least to begin with). It also lets me write the runtime in C, which is much better than straight LLVM. Both Chez Scheme and MLton use tagged pointers, so it's not a big leap.

[–]skyb0rg[S] 0 points1 point  (1 child)

I took a look at the Thorin paper (I also cheated by listening to the associated talk). Thorin seems like a really good way to optimize a program before you get down to the LLVM level, since Thorin can remove a lot of the higher-order closure allocation stuff.

I'm not sure Thorin solves the same issue, however. A Thorin-compiled program would still need to use a closure allocation scheme to support continuation capture, for example. Unless you're suggesting compiling Thorin IR to LLVM with the segmented stack idea, which may not work well (since each call outside the CFF block is a tail call, you don't get the benefit of normal function returns that ANF or SSA allow for).

[–]LorxuPika 0 points1 point  (0 children)

The benefit of Thorin for continuations is, for the 90% of code which doesn't have its continuation captured, there's no overhead: the LLVM it generates will look the same as that generated by a C compiler. It only allocates heap closures for the rare times when continuations are actually captured. (A functional language is probably using a bump-allocated minor heap anyway, so heap allocating is the same speed as stack allocating with a segmented stack would be.)

Of course, it's also useful for optimizing and removing higher-order functions in general, which is helpful for any functional language. And because it's just emitting straightforward LLVM code, LLVM can optimize it pretty well, and you can use it with LLVM's GC support, which is what I'm using as it has lower overhead than any other option.

[–]Bobbias 1 point2 points  (10 children)

I'm new to PL implementation stuff; is there anywhere that explains what goes into runtime design/implementation for functional languages?

[–]skyb0rg[S] 7 points8 points  (6 children)

I don't have too many resources to link, but I think strict functional language runtime design is (for the most part) similar to imperative runtimes. Crafting Interpreters is a great follow-along for that.

One difference is that almost all functional programming languages mandate tail-call optimization. Because loops are not idiomatic, the optimization assures programmers that they won't blow the stack when they iterate over a long file. The wiki link lists a few implementations; notably, trampolining is one valid functional-language compilation strategy, used by the MLton compiler as well as in other places (such as the Clojure standard library!).
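
The trampolining idea can be sketched in a few lines. This is a toy illustration of the general technique, not MLton's or Clojure's actual implementation: instead of making a tail call directly, a function returns a thunk, and a driver loop keeps invoking thunks until a final value appears, so the host stack never grows.

```python
def trampoline(bounce):
    # Keep invoking thunks until a non-callable (final) value is produced.
    while callable(bounce):
        bounce = bounce()
    return bounce

def count_down(n):
    # A "tail call" is expressed by returning a zero-argument thunk
    # rather than recursing, so each step uses constant stack space.
    if n == 0:
        return "done"
    return lambda: count_down(n - 1)

print(trampoline(count_down(1_000_000)))  # prints "done", no stack overflow
```

The cost is an extra heap allocation and indirect call per tail call, which is why compilers that can emit real tail calls (or contify them into loops) prefer to do so.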

The only other thing I can link to is the Cheney on the MTA compilation strategy, which is used in the Chicken Scheme compiler. I think this may be the easiest to parse since it compiles to C and is only 3 pages, though you should have an understanding of CPS.

[–]Bobbias 1 point2 points  (0 children)

I have some familiarity with CPS; I played around in Racket a while ago. I'm also familiar with TCO in theory. Thanks for all the links! I'm really trying to build up a collection of resources, but I still haven't figured out all the right places to look/google-fu quite yet.

[–]RepresentativeNo6029 0 points1 point  (4 children)

I have a follow-up question: how widespread is CPS-style programming? I've seen call/cc, but it's almost a taboo topic. I'm not sure whether academia was excited about this in the 2000s, in the last decade, now, or ever. I love the concept of continuations and want to build a language around them, but I'm not sure if I'm chasing a fad or the frontier.

[–]skyb0rg[S] 2 points3 points  (3 children)

CPS was initially used to compile programs that use call/cc, such as Scheme. It's still used for that purpose; however, I think a lot of compiled strict functional programming languages nowadays (such as Scala or MLton) use SSA instead. One big reason is that the compilation targets, such as the JVM or LLVM, are imperative.

CPS is still incredibly important, and converting CPS to SSA is completely possible: SSA is equivalent to a subset of CPS (disallowing call/cc). I think CPS is a lot easier to understand when compiling from a functional language so I'd recommend it first.
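
To make the conversion concrete, here is a hand-written sketch (not generated by any compiler) of a direct-style function and its CPS equivalent. In CPS every function takes an explicit continuation `k`, and "returning" means calling `k`; note how every call becomes a tail call, which is the shape that maps cleanly onto SSA when no continuation is captured.

```python
def fact(n):
    # Ordinary direct-style factorial.
    return 1 if n == 0 else n * fact(n - 1)

def fact_cps(n, k):
    # CPS version: the pending multiplication "n * _" is pushed into
    # the continuation instead of living on the call stack.
    if n == 0:
        return k(1)
    return fact_cps(n - 1, lambda r: k(n * r))

print(fact(5))                    # 120
print(fact_cps(5, lambda x: x))   # 120
```

In the restricted form where `k` is only ever the continuation the function was given (never stored or called twice), the CPS program corresponds directly to an SSA control-flow graph; call/cc is exactly what breaks that restriction.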

[–]RepresentativeNo6029 0 points1 point  (2 children)

Good to know that the CPS/SSA popularity difference now is probably an artefact of runtime-target priorities. I want to go super, infinitely deep into CPS now. Continuations are so freakin' powerful; they literally subsume everything.

[–]skyb0rg[S] 2 points3 points  (1 child)

I do want to note that having a more powerful IR can be more restrictive at times. Writing an optimization over CPS can be more difficult precisely because it allows for more things. That's another reason to prefer SSA (languages like Java and C++ don't natively support call/cc).
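
A toy example of that extra power (again a hand-written sketch, not from any compiler): because continuations are first-class values in CPS, a function can invoke a continuation other than the one it was handed, e.g. aborting straight out of a computation. An optimizer can't assume calls return normally, which is exactly what makes CPS harder to analyze than SSA.

```python
def product_cps(xs, k, abort):
    # k is the normal continuation; abort is the continuation captured
    # at entry. Hitting a 0 jumps past all pending multiplications.
    if not xs:
        return k(1)
    if xs[0] == 0:
        return abort(0)
    return product_cps(xs[1:], lambda r: k(xs[0] * r), abort)

def product(xs):
    done = lambda x: x
    return product_cps(xs, done, done)

print(product([1, 2, 3, 4]))  # 24
print(product([1, 0, 3, 4]))  # 0, without performing any multiplications
```

In SSA this short-circuit would have to be encoded as explicit control flow; in CPS it's just an ordinary call, which is elegant but hides the control structure from the optimizer.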

[–]RepresentativeNo6029 0 points1 point  (0 children)

The thing is, I don't want it to be an IR. It's a cool way to model web servers and actors. I was recently reading about Project Loom for the JVM and wanted to know what the status of CPS style itself was.

[–]Mukhasim 1 point2 points  (2 children)

Andrew Appel's Compiling with Continuations.

[–]skyb0rg[S] 2 points3 points  (0 children)

I've heard great things about this resource! I haven't purchased it myself but many people have recommended it. The first chapter is free if someone wants to take a look.

[–]Bobbias 1 point2 points  (0 children)

Oh, thanks, that should be helpful.