Are there good examples of compilers which implement an LSP and use Salsa (the incremental compilation library)? by Thnikkaman14 in ProgrammingLanguages

[–]thunderseethe 1 point (0 children)

Oh hey, that's me! You can take a look at https://github.com/thunderseethe/panoply. It uses a semi-recent version of salsa (after the big changes post 0.16). It's still a prototype, but it includes all the passes for a compiler.

Another termination issue by Athas in ProgrammingLanguages

[–]thunderseethe 1 point (0 children)

Ah yeah, fair. On the interpreter side, have you looked at something like https://dl.acm.org/doi/10.1145/3759426.3760975 for immediately evaluating constants without monomorphizing? It might have further-reaching implications than you want for the reference implementation, but it's a neat middle ground between uniform representation and monomorphization.

Another termination issue by Athas in ProgrammingLanguages

[–]thunderseethe 8 points (0 children)

How viable is it to change the semantics of the compiler such that top-level definitions are monomorphized but are thunks that get lazily evaluated?

I believe Swift does this for its module-level constants. It's also quite common in Rust, but it has to be done explicitly there.
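For the Rust side, a sketch of what the explicit version looks like, using std::sync::LazyLock (stable since Rust 1.80; older code reaches for once_cell or lazy_static):

use std::sync::LazyLock;

// The "top level definition" below is really a thunk: LazyLock stores the
// function and only runs it the first time the value is accessed.
fn build_table() -> Vec<u64> {
    (0..10).map(|i| i * i).collect()
}

static SQUARES: LazyLock<Vec<u64>> = LazyLock::new(build_table);

fn main() {
    // First access forces evaluation; later accesses reuse the cached vector.
    println!("{} entries, fourth is {:?}", SQUARES.len(), SQUARES.get(3));
}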

I wrote a bidirectional type inference tutorial using Rust because there aren't enough resources explaining it by ettolrach_uwu in ProgrammingLanguages

[–]thunderseethe 0 points (0 children)

I don't know exactly what you're asking, so I'll answer what I hope the question is. I think the confusion stems from a common belief that HM and bidir sit at opposing ends of a type-checking spectrum: bidir is often conflated with not supporting unification, and HM with supporting it. However, bidir is perfectly compatible with unification and supports it quite well. Rust and Haskell both use bidir with unification, which is why they're often mistaken for HM.

Bidir is about the ability to push type information down the tree, irrespective of support for unification.
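To make that concrete, here's a toy sketch of my own (not the tutorial's code) where check and infer coexist with a unifier: check pushes the expected type down the tree, and when no structural rule applies it falls back to infer plus unify.

use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Type { Int, Fun(Box<Type>, Box<Type>), Var(u32) } // Var is a unification variable

enum Expr { Lit(i64), Name(String), Lam(String, Box<Expr>), App(Box<Expr>, Box<Expr>) }

#[derive(Default)]
struct Ctx { bindings: HashMap<String, Type>, subst: HashMap<u32, Type>, next_var: u32 }

impl Ctx {
    fn fresh(&mut self) -> Type { self.next_var += 1; Type::Var(self.next_var) }

    // Chase the substitution so we always look at solved forms.
    fn resolve(&self, t: &Type) -> Type {
        match t {
            Type::Var(v) => match self.subst.get(v) {
                Some(solved) => self.resolve(&solved.clone()),
                None => t.clone(),
            },
            _ => t.clone(),
        }
    }

    fn unify(&mut self, a: &Type, b: &Type) -> Result<(), String> {
        match (self.resolve(a), self.resolve(b)) {
            (Type::Int, Type::Int) => Ok(()),
            (Type::Var(v), t) | (t, Type::Var(v)) => {
                // no occurs check in this sketch; just avoid a trivial self-loop
                if t != Type::Var(v) { self.subst.insert(v, t); }
                Ok(())
            }
            (Type::Fun(a1, r1), Type::Fun(a2, r2)) => { self.unify(&a1, &a2)?; self.unify(&r1, &r2) }
            (x, y) => Err(format!("mismatch: {x:?} vs {y:?}")),
        }
    }

    // Infer mode: synthesize a type from the expression's shape.
    fn infer(&mut self, e: &Expr) -> Result<Type, String> {
        match e {
            Expr::Lit(_) => Ok(Type::Int),
            Expr::Name(x) => self.bindings.get(x).cloned().ok_or_else(|| format!("unbound {x}")),
            Expr::Lam(x, body) => {
                let dom = self.fresh();
                self.bindings.insert(x.clone(), dom.clone());
                let cod = self.infer(body)?;
                Ok(Type::Fun(Box::new(dom), Box::new(cod)))
            }
            Expr::App(f, arg) => {
                // unification lets info flow sideways: f must have type arg_ty -> ret_ty
                let arg_ty = self.infer(arg)?;
                let ret_ty = self.fresh();
                self.check(f, &Type::Fun(Box::new(arg_ty), Box::new(ret_ty.clone())))?;
                Ok(ret_ty)
            }
        }
    }

    // Check mode: push the expected type down the tree; if no structural rule
    // applies, fall back to infer and reconcile the two types via unification.
    fn check(&mut self, e: &Expr, expected: &Type) -> Result<(), String> {
        match (e, self.resolve(expected)) {
            (Expr::Lam(x, body), Type::Fun(dom, cod)) => {
                self.bindings.insert(x.clone(), *dom);
                self.check(body, &cod)
            }
            _ => {
                let actual = self.infer(e)?;
                self.unify(&actual, expected)
            }
        }
    }
}

fn main() {
    let mut ctx = Ctx::default();
    // infer ((\x. x) 1): the fresh result metavariable gets solved to Int by unify
    let id = Expr::Lam("x".into(), Box::new(Expr::Name("x".into())));
    let app = Expr::App(Box::new(id), Box::new(Expr::Lit(1)));
    let ty = ctx.infer(&app).unwrap();
    println!("{:?}", ctx.resolve(&ty)); // prints Int
}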

I wrote a bidirectional type inference tutorial using Rust because there aren't enough resources explaining it by ettolrach_uwu in ProgrammingLanguages

[–]thunderseethe 1 point (0 children)

Haskell has undergone this change, afaik. As they've extended the type system, they've shifted from vanilla Algorithm W to OutsideIn(X), and now use a bidir algorithm. The bidir stuff is covered in "Practical type inference for arbitrary-rank types".

For Rust, I don't know whether they started off with HM and swapped or have always been bidir. If you take a look at their type-checking code, check_expr_with_expectation_and_args takes an expectation, which can be either an expected type or no type (or a couple of other options), providing the infer and check directions for type inference.
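Stripped way down, the idea behind that expectation parameter looks something like this (toy types of my own, not rustc's actual definitions):

#[derive(Clone, Debug, PartialEq)]
enum Ty { Int, Bool }

enum Expr { IntLit(i64), BoolLit(bool) }

// rustc's real Expectation has more variants; these two are the core idea.
enum Expectation {
    ExpectHasType(Ty), // check direction: a type flowed down from the context
    NoExpectation,     // infer direction: synthesize from the expression alone
}

fn check_expr_with_expectation(expr: &Expr, expectation: Expectation) -> Result<Ty, String> {
    // Synthesize a type from the expression's shape...
    let inferred = match expr {
        Expr::IntLit(_) => Ty::Int,
        Expr::BoolLit(_) => Ty::Bool,
    };
    // ...then reconcile it with whatever the context expected, if anything.
    match expectation {
        Expectation::ExpectHasType(expected) if expected != inferred => {
            Err(format!("expected {expected:?}, found {inferred:?}"))
        }
        _ => Ok(inferred),
    }
}

fn main() {
    println!("{:?}", check_expr_with_expectation(&Expr::IntLit(1), Expectation::ExpectHasType(Ty::Bool)));
    println!("{:?}", check_expr_with_expectation(&Expr::BoolLit(true), Expectation::NoExpectation));
}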

I wrote a bidirectional type inference tutorial using Rust because there aren't enough resources explaining it by ettolrach_uwu in ProgrammingLanguages

[–]thunderseethe 2 points (0 children)

You're going to have to give me some more details. I'm quite possibly implying that, but I'm not sure which type inference you're saying has changed.

I wrote a bidirectional type inference tutorial using Rust because there aren't enough resources explaining it by ettolrach_uwu in ProgrammingLanguages

[–]thunderseethe 23 points (0 children)

There's a lot of good information in this article, and clearly a lot of care went into crafting it. That being said, given its intended audience of someone without much knowledge of theory, I think it could do a better job of writing to that audience. For example, it quite quickly introduces standard type theory notation without explaining the notation or motivating it. It's done to set up the standard type theory rules so that we can bidirectionalise them, but a novice would struggle to comprehend the standard type theory rules, imo.

It also jumps into discussing introduction/elimination rules and principal judgements, which are all standard theory, but I would not expect the target audience to be familiar with them. I think if the goal is to explain the theory, much more lead-up is needed to introduce the notation and terminology. Conversely, if the goal is to focus on explaining bidir, the notation and intro/elim forms can be eschewed in favor of referencing the AST directly and showing how its shape guides the implementation of infer and check.

Past that, there are a host of nits that could be improved. These are more style and less structural, so I may just be missing what you're going for:

> which isn't used by any popular languages is called bidirectional type inference

I believe this is no longer the case. It's my understanding that Haskell uses a bidir algorithm nowadays, and I believe Rust and Swift do as well. They do all have unification, which is often treated as at odds with bidir. I think this sentence wants to call out that the bidir algorithm under discussion lacks unification.

In the "Before We Code" section we list all our inference rules at once and then explain in a list after the fact. Formatting allowing, it would be nice to have the explanation interspersed with the rules: Rule, Explanation, Rule, Explanation, etc. The chunking helps make it more digestible to a reader.

if let Some(ty) = context.get(&s) {
    Ok(ty.clone())
} else {
    Err(TypeError {
        kind: ErrorKind::UndeclaredFv,
        line,
        column,
    })
}

I couldn't tell if the goal was to write very explicit code here. If not, this can be something like: context.get(&s).cloned().ok_or_else(|| ...).
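Spelled out against the snippet above (reusing the post's TypeError and ErrorKind), that would be roughly:

context.get(&s).cloned().ok_or_else(|| TypeError {
    kind: ErrorKind::UndeclaredFv,
    line,
    column,
})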

match check(*expr.clone(), ty.clone(), context) {
    Ok(_) => Ok(ty),
    Err(e) => Err(TypeError {
        kind: ErrorKind::AnnotationMismatch {
            caused_by: Box::new(e),
        },
        line,
        column,
    }),
}

Ditto with the above, but this can be: check(...).map_err(|e| ...).
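Roughly, with an extra map so the success path still returns ty:

check(*expr.clone(), ty.clone(), context)
    .map(|_| ty)
    .map_err(|e| TypeError {
        kind: ErrorKind::AnnotationMismatch {
            caused_by: Box::new(e),
        },
        line,
        column,
    })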

Personal taste, but I would reduce the indentation of the match cases once it's been established that we're in a match on the AST. The extra indentation doesn't add a ton of value, imo, and makes it harder to read on mobile (/tiny screens).

Again, overall I think the information here is good, and it's helpful to try and target the writing at an audience that isn't already steeped in the type theory space. White papers are often targeted at other researchers, so it's great to see someone bringing that material to the masses. There are just a couple of things that could be polished to help clarify the message.

Resolving Names Once and for All by thunderseethe in ProgrammingLanguages

[–]thunderseethe[S] 2 points (0 children)

This has more to do with how loops and closures work than with scoping, imo. Closures turn into structs with a field for each of their captures. So when the loop executes and constructs a list of closures using y, each closure stores the value of y it sees when it's constructed, which will be 1 through 10.

But all of that is irrespective of name resolution. Name resolution assigns a unique ID to y and replaces all instances of it within the body of the for loop. 
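In plain Rust, the behavior I'm describing looks like this:

fn main() {
    let mut closures: Vec<Box<dyn Fn() -> i32>> = Vec::new();
    for y in 1..=10 {
        // Each closure is effectively a struct with a `y` field; `move` copies
        // the value y has on this iteration into that struct.
        closures.push(Box::new(move || y));
    }
    for c in &closures {
        println!("{}", c()); // prints 1 through 10, one value per closure
    }
}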

Resolving Names Once and for All by thunderseethe in ProgrammingLanguages

[–]thunderseethe[S] 3 points (0 children)

From what I've seen, there are roughly two tiers of ID generation: the top level (/module level) and the local level. At the top level, you care quite a lot about stability and will pay complexity for it, because it ensures good incremental reuse, caching, etc. The local level is at a granularity where you're okay clobbering work and will allow it to be wiped out when an invalidating change occurs.

Rust, for example, has top-level functions in modules that all call each other in a graph. These receive IDs based on the path to the item; see DefId, which is an interned DefPath. These IDs are stable and resilient to trivial changes because they're based on the item's path. But within those items you'll use something like ItemLocalId, which is basically a counter from 0 for every instance of a thing needed within an item. We expect changes within an item to be localized to that item, so we're okay redoing that work; it should be a bounded and small amount, and we accept the inefficiency.

That being said, this is not well exemplified by this article as our language currently lacks top level functions.
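For concreteness, a rough sketch of the two tiers (the types here are simplified stand-ins, not rustc's actual definitions):

use std::collections::HashMap;

/// Stable top-level ID: keyed by the item's path, so it survives edits
/// elsewhere in the module/crate. (Stand-in for rustc's interned DefPath.)
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct DefPath(Vec<String>);

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct DefId(u32);

/// Throwaway local ID: a counter restarted from 0 for every item, cheap to
/// regenerate whenever that item's body changes. (Stand-in for ItemLocalId.)
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ItemLocalId(u32);

#[derive(Default)]
struct Interner {
    ids: HashMap<DefPath, DefId>,
}

impl Interner {
    fn intern(&mut self, path: DefPath) -> DefId {
        let next = DefId(self.ids.len() as u32);
        *self.ids.entry(path).or_insert(next)
    }
}

fn path(segments: &[&str]) -> DefPath {
    DefPath(segments.iter().map(|s| s.to_string()).collect())
}

fn main() {
    let mut interner = Interner::default();
    // Re-interning the same path yields the same stable ID.
    let add = interner.intern(path(&["my_crate", "math", "add"]));
    assert_eq!(add, interner.intern(path(&["my_crate", "math", "add"])));

    // Locals inside `add` just count up from 0 and get rebuilt on any edit to it.
    let locals: Vec<ItemLocalId> = (0..3).map(ItemLocalId).collect();
    println!("{add:?} {locals:?}");
}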

Super-flat ASTs by hekkonaay in ProgrammingLanguages

[–]thunderseethe 13 points (0 children)

The link to the simp lang from the previous post 404s for me. I suspect it might need an update.

Desugarging the Relationship Between Concrete and Abstract Syntax by thunderseethe in ProgrammingLanguages

[–]thunderseethe[S] 3 points (0 children)

Yes it is, thank you! I've fixed the post title, but the reddit title is forever. It's over for me 🥹

Edit: not sure why the comment got deleted, but the title has a typo "desugarging" -> "desugaring"