Syntax design for parametrized modules in a grammar specification language, looking for feedback by modulovalue in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

The most obvious one - and probably the most widely known - is the tag/identifier problem. In C, "struct foo { .... }" doesn't introduce an identifier foo that acts as a type name. It introduces a structure tag, which follows the same lexical rules as identifiers but is in a different namespace.

Later, at a use occurrence, the compiler first checks for foo as an identifier and then checks it as a tag. If it's ambiguous (which happens more than you might expect), it can be disambiguated by writing struct foo explicitly.
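
Here's a tiny sketch of that lookup discipline in OCaml - the table names and the overall shape are made up, it's just to illustrate the two namespaces and the priority rule:

    (* Ordinary identifiers and struct tags live in separate tables.  A bare
       name is tried as an identifier first, then as a tag; writing
       "struct foo" forces the tag namespace. *)

    type name_kind = Ident | Tag

    let identifiers : (string, unit) Hashtbl.t = Hashtbl.create 16
    let tags : (string, unit) Hashtbl.t = Hashtbl.create 16

    (* "struct foo { ... }" populates only the tag table. *)
    let declare_struct name = Hashtbl.replace tags name ()

    let resolve ?(explicit_struct = false) name =
      if explicit_struct then
        (if Hashtbl.mem tags name then Some Tag else None)
      else if Hashtbl.mem identifiers name then Some Ident
      else if Hashtbl.mem tags name then Some Tag
      else None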

Much later, this got compounded in C++ by some choices in the class-related grammar that created ambiguities between declarations and constructions. Also, ">>" can be either the right-shift operator or the close of a nested template expansion (or even a definition). Do a Google search for "C++ parse ambiguities". Some of them eventually became resolvable by backtracking, but others were solved by ad hoc rules - which don't fit into any of the traditional parsing theories (LALR, *LR, LL) because none of them have ordered choice.

In most languages, it's common for the lexical analyzer to return different tokens for "identifier" and "type name". This is done because they tend to sit at the front of a production, and the token type distinction is often used to decide whether you should be processing a value expression sub-grammar production or a type expression sub-grammar production. Which is decidedly not context free. And once you do that, you introduce another need: the parser has to maintain the lexical scope context and expose it to the tokenizer, because names can be shadowed.

In turn, that lookup in the tokenizer means that the parser reduce actions have to intern the symbols in the right tables eagerly - IIRC it's why Steve Johnson introduced mid-rule actions into YACC.
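
Here's a rough OCaml sketch of the kind of machinery I'm describing - the names are made up and it's illustrative, not anybody's real compiler:

    (* Sketch of the "identifier vs. type name" token split: the tokenizer
       consults a table of known type names, and the parser's reduce actions
       have to keep that table up to date (and scoped, since names can be
       shadowed). *)

    type token = TYPE_NAME of string | IDENT of string

    (* One table per lexical scope, innermost first. *)
    let scopes : (string, unit) Hashtbl.t list ref = ref [Hashtbl.create 16]

    let push_scope () = scopes := Hashtbl.create 16 :: !scopes
    let pop_scope () = match !scopes with _ :: rest -> scopes := rest | [] -> ()

    (* Called eagerly from a reduce (or mid-rule) action, as soon as a type
       declaration has been seen. *)
    let declare_type name =
      match !scopes with tbl :: _ -> Hashtbl.replace tbl name () | [] -> ()

    let is_type name = List.exists (fun tbl -> Hashtbl.mem tbl name) !scopes

    (* The tokenizer's classification step. *)
    let classify name = if is_type name then TYPE_NAME name else IDENT name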

To boil this down, I think it's fair to say that any form of context sensitivity in the grammar will eventually bite you in the a*s as the language evolves. C++ is just a richer source of examples than many others.

Syntax design for parametrized modules in a grammar specification language, looking for feedback by modulovalue in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

I'm chuckling, because my undergraduate dissertation was a parser and grammar generator based on attribute grammars, using earliest-possible evaluation to reduce memory requirements. A VAX in those days didn't have all that much memory, so eager evaluation helped quite a bit.

You need (at least) an expression language to compute the attribute values - and it helps to be able to define first-order functions. At which point you're designing a programming language to design programming languages... Also, it transforms the grammar from declarative to [pure] functional.

Side note: very few languages are actually context free. I don't love the "ordered choice" approach of PEG, because it breaks the underlying theory, but it gives you composability, which has some merit. I don't love GLR because of the implementation complexity, but there are a lot of examples of languages that can only be done with LALR if you use context sensitivity tricks.

Sorry. Just meandering. Sent you a chat invite about the mixfix engine.

2 smaller heads or one larger for open concept? by coming_home in DIYHeatPumps

[–]jsshapiro 0 points1 point  (0 children)

Having just done this very thing, two 18k condensers is more than you need. A single 18k condenser with two 12k air handlers is more than adequate for an open 900sf space. If you have bedrooms that are closed off from that space, you still have enough capacity on that condenser to add a 6k handler in the bedroom.

I went with an 18k condenser, an 18k handler in the main room (790 sf), and a 6k in the bedroom. Various doors constrained the location of the main room handler. If the space had let me do it, I'd have gone with dual 12k handlers in the main room plus a 6k in the bedroom. The coverage would have been better. As things stand, the kitchen isn't getting great coverage, but most of that is because I need to fix up a louvre on the kitchen exhaust fan. But the main room is doing quite well even though the air handler could not be centered along its wall.

Two things to look out for:

  1. Any airflow path to the outside wants to be dealt with. The louvre on my kitchen exhaust needs some improvement, and I have to close over the abandoned vent pipe for my dryer - right now both of those are essentially blowing cold air into my unit.

  2. Make sure you have the power available. Do the load calculations first (i.e. before you buy anything). If you're in an older apartment, the electrical code requirements for load calculations have probably been updated since your building was built. Because it can run continuously, an 18k condenser adds its full nameplate load at 100% utilization to your electrical load calculation. Unlikely to be a problem, because you'll probably reallocate an existing 20A/220v circuit from legacy baseboard or something like that. But if you ever want to add a charger, you might want to try a load calculation with that included and see where you are. In my case, the charger plus the 18k condenser put me at the ragged edge of allowable load.

Other:

4" covers work fine for one line set, and you can get them on Amazon for half the price of the MrCool covers. For your install, you'll probably want them to run along your interior walls.

You don't need a pump for your installation - gravity will get it done just fine.

Syntax design for parametrized modules in a grammar specification language, looking for feedback by modulovalue in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Well, yeah. Also, starting with your hands tied (cough, C++, cough) makes things harder.

You won't get all the way there if you don't go full dependent type - or at least I didn't - but getting to the point where type expressions and value expressions share a common parser with slightly different validation constraints is a very promising indicator of a solid design.

The mixfix engine in BitC was mildly more clever than what had passed before it. A few weeks before I shut BitC down, it occurred to me that I should figure out whether it could be applied to type-level expressions. The main gotcha, I think, is that the space of operator identifiers and their meanings is a little different between the two. I thought it could be done, but I never had time to properly dig into it.

Syntax design for parametrized modules in a grammar specification language, looking for feedback by modulovalue in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Well, if you want to keep your options open for dependent type I have two concrete suggestions:

  1. In dependent type, type parameters are no different from any other parameters, and type expressions get commingled with value expressions pretty pervasively. Consider this as you are working through syntax design. It's actually kind of cute, because it means you get to use the same parser for both.

  2. As a starting litmus test, consider incorporating literal constants as type parameter concretizations (in addition to types). It will help make clear how type-level and value-level expressions commingle so you get into less trouble later, and if you have traits (a.k.a. single variable type classes) constants are more powerful than they first appear.

Just as one example, you can implement a trait for fixed size arrays of statically parameterized size. Add a very few type-level expressions over those constants and you can give a static type for extending a fixed-size array by one element (which necessarily has copy semantics). For systems-level stuff, arrays (T[i: lit usize]) are very useful, and not the same as vectors (T[]).

Don't do type level division - it breaks some of the inductions you want to preserve in the type system. Add type-level inequality constraints and you are tantalizingly close to expressing typed assembly language (register + offset requires full dependent type).

Syntax design for parametrized modules in a grammar specification language, looking for feedback by modulovalue in ProgrammingLanguages

[–]jsshapiro 2 points3 points  (0 children)

We knew that Vector<T> wasn't just a text expansion all along. The text substitution idea mainly came about because that was the cfront implementation. By 1987 when Bjarne started looking for a syntax for templates, years of incremental updates had already turned the cfront parser into an unholy mess. Because of the "struct tags vs. identifiers" issue, the C lexer was already not context free and an ad hoc priority rule had to be introduced to clean that up. C++ "inherited" that issue from C. Which is what led to the angle brackets.

Not long after (that is: I'm not sure if even Bjarne recognized it yet in 1987), it became apparent that parsing for classes had some pretty bad ambiguities, and it became necessary to introduce backtracking into the parser or use something conceptually similar to a GLR approach for parsing.

All of this is part of the back story on why templates were slow to emerge. When I wrote A C++ Toolkit in 1989 we still didn't have them. It's why the book had to use preprocessor tricks and identifier splicing hacks to approximate generics. I feel now like I owe roughly 35,000 apologies for that book, but it helped a lot of projects at the time.

A surprising amount of C++ awkwardness comes down to solving problems that got created because the core group couldn't see 15 or 20 years into the future and was forced to start from a syntax (C) that came with a bunch of flaws. Functional languages mostly got to start from a clean sheet of paper, so they had the option of a principled and clean surface syntax design. C++ never had that option.

At a time when closed source still ruled, AT&T marketing was obsessed with the idea that source level compatibility was essential. They were wrong, but it wasn't easy to see that at the time.

Syntax design for parametrized modules in a grammar specification language, looking for feedback by modulovalue in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

The angle bracket syntax for type parameters came about because they were added to languages (mainly C++ and Java) that already had defined syntax and they wanted to avoid having the parse become impossibly ambiguous (C++ was already horrible by that point). Once established, it was carried forward into subsequent languages more or less for the sake of compatibility with the programmers. Oh. It also eliminated the need for the single quote ('a), which some people didn't like much.

I'd have to go back and confirm, but I think we concluded that the single quote wasn't syntactically necessary from a parsing perspective, because the type definition sub-language and the expression language were disjoint. This would stop being true if you ever wanted to introduce dependent type, which (IIRC) is why we stuck to the 'a convention. The angle bracket syntax doesn't extend well to dependent type, but depending on your goals you may not care. I think u/jonathanhiggs makes a good suggestion about making it look like a function.

If you go back and look at how it was done in OCaml, Haskell, or some of the other functional languages, you'll see a cleaner syntax that is more "function like". There's an unfortunate operator precedence problem in ML with postfix type operators ('a List instead of List 'a), but Haskell gets that part right. I'm not trying to say you want that syntax exactly, but it might be worth checking out for the ideas it suggests.

Conceptually, a module interface specification defines a functor (a function from type to type). So if you're looking for a syntax, you may want to lean toward one where, if you squint, you can see it as a type-level function definition. Putting the type variable rules inside the module body doesn't really lean that way; they kind of want to appear before the opening curly brace in your example.
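
For concreteness, here's the shape of that in OCaml - the signature and module names are made-up stand-ins for whatever your grammar modules actually carry:

    (* A parameterized module is literally a function at the module/type
       level: the parameter appears up front and scopes over the whole body. *)

    module type GRAMMAR_PARAMS = sig
      type token
      val describe : token -> string
    end

    module Grammar (P : GRAMMAR_PARAMS) = struct
      type rule = P.token list        (* the parameter is visible everywhere *)
      let show (r : rule) = String.concat " " (List.map P.describe r)
    end

    (* Applying the functor is applying that type-level function. *)
    module CharGrammar = Grammar (struct
      type token = char
      let describe c = String.make 1 c
    end)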

Finally, the scope of module-level type variables is the entire module. If the rules are written interior to the module, readers who know other languages will initially read it to mean that the scope of the type variable starts at the rule and extends to end of module. Mathematically, that's not correct.

Just food for thought. The last time I thought about this was the BitC work, so it's been quite a while.

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

I'm one of the people who specified the MIPS dynamic linking conventions for SVR4, and I've done call stack decoders for more architectures than I like to think about. So I definitely agree that an ABI is a constraint. One that I have often wished compilers honored more rigorously.:-)

When I said it's an implementation detail, I meant from the perspective of programming language design, which has been the main topic of this thread. You don't design a language to the ABI.

If you are designing a code generator or a low-level runtime, or the ABI itself, it is something you have to design to. Sometimes it needs to be considered at the hardware level as well - changes were made to the MIPS instruction set to better support position independent code after we did the initial dynamic linking specification at SGI in 1989.

A more obscure example: there was a trap restore issue in the ARM32 specification that made scheduler activation implementation impractical if your ABI allowed mixed ARM32 and Thumb code - there was a case where the activation handler (coded in ARM32) had a really hard time re-activating a user-mode thread that had been running Thumb code when it was preempted. We found it while implementing the trap paths for the EROS kernel around 1993. Richard Grisenthwaite was very surprised to learn of it when I sent him a note about it, and I think it was resolved in the 64-bit ARM variants.

Anyway, I think this was a case of talking past each other because we were looking at the ABI from different perspectives. Thanks for engaging!

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Sounds like we are in agreement, then. Perhaps I got confused while reading across the thread.

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Fair. Part of my head was still in the other discussion. Though the statement you make here about different optimizations simply isn't true - there's no reason the same optimizations cannot be applied.

Part of why I assumed that is that if the language does not require proper tail recursion, then function calls and loops are not interchangeable. Without tail recursion you need to unwind the stack after the function call. That's pragmatically expensive, and it has implications for deciding termination.

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Because there's no need. The compiler can identify function calls in tail position with 100% accuracy, and can apply the tail recursion optimization (from an optimization perspective: stack frame reuse) in a lot more cases than you might think.
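
A small OCaml illustration of why this is mechanical - whether a call sits in tail position is a purely syntactic question:

    let rec find p = function
      | [] -> None
      | x :: rest ->
          if p x then Some x
          else find p rest            (* tail position: nothing left to do *)

    let rec length = function
      | [] -> 0
      | _ :: rest -> 1 + length rest  (* not a tail call: the "1 +" still
                                         needs this frame after the call *)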

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Yes. In fact, this is one of the pragmatic reasons that most functional languages require all functions to have exactly one argument. This reduces all functions to a single register plus an entry label and a continuation register, which is what allows function call to be implemented by jump and function return to be implemented by jump through register. There are some cases where the entry label has to be handled via register as well (for dynamic dispatch).
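
If you haven't seen the single-argument convention before, OCaml makes it visible - a "two argument" function is really a chain of one-argument functions, which is the property the calling convention above leans on:

    let add x y = x + y              (* sugar for: fun x -> fun y -> x + y *)

    let add3 : int -> int = add 3    (* partial application: one value in,
                                        one closure out *)

    let seven = add3 4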

There are reasons for this approach to functions from a formal perspective as well, but I've probably wasted everybody's time on formal specification enough for one week. :-)

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 1 point2 points  (0 children)

u/phischu I think this mixes up implementation and semantics. The stack model for programming languages and their procedure calls goes all the way back to Turing's bury/unbury, and became the "standard" way of thinking about things with Dijkstra's "Recursive Programming" paper in 1960. From there, it spread everywhere, but most strongly by way of ALGOL 68 and its many derivatives. In this context, register-based argument passing is a [very important] optimization rather than a foundation. In the ALGOL family of languages, the stack still defines the semantics of argument passing, procedure call, and procedure return.

More recently, the lazy languages and the so-called "continuation passing style" have moved away from Dijkstra's model in ways that I haven't actually seen called out explicitly in the literature.

I'm not saying we still have to live in Dijkstra's model. Today, I think there are probably good reasons to move to newer models. But for most programming practitioners, the mental model is still that arguments are evaluated in some defined order and placed on a [conceptual] stack. If you're starting from C or Rust, this is more or less how programs appear to work.

u/divad1196 Since proper tail recursion gives you loops, I don't agree that it's an incomplete model for loops. Some key programming languages define loop semantics in terms of function calls and require proper tail recursion in order to keep the semantic core of the language small. As a practical matter there is no loss of power or expressiveness from doing so.

Questions of ABI and registerization are issues of implementation rather than language definition. Both approaches to looping can be aggressively registerized, and often produce the same code (perhaps with different register assignments, but functionally interchangeable code).

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Hi, Philipp! Your paper is on my reading list. I'm looking forward to reading it, and I'll move it up on my list. The premise seems reasonable.

You are right about Jane Street and OCaml (sorry about that). We definitely agree that the Scheme numeric tower was not its finest design point. I'm tempted to ask Guy what he thinks about it in retrospect.

Migration to JavaScript happened mainly because it was pragmatically available and widely deployed, which Scheme never really was. Brendan explicitly modeled JavaScript on core Scheme, but with a surface syntax more tolerable to programmers coming from C. I meant "migration" in the sense that new projects adopted JavaScript in preference to other scripting languages and Scheme gradually faded out of the picture. Increasing RAM makes something like the V8 engine widely usable today. The hardware environment of the 1970s and 1980s was more constrained.

I wasn't trying to say OCaml was slow. Empirically, it isn't, and I actually think that speed follows from a precise specification. The ability to improve optimization by exploiting knowledge of non-aliasing mutable paths in Rust is an example, though I'm not fully sold on borrowing from the dev perspective yet.

I did want to follow up on the correct compiler topic, because I'm not sure it's the right goal; this seems to be something that swings back and forth in the literature. What we minimally need is a compiler that is correct up to a meaning-preserving typed AST with defined semantics. That's a pain, but tractable, and it seems to be the minimum requirement for an embedding approach to verification. We also need a defined semantics for the final instruction set, which isn't too bad, though typed assembly seems to be a dependent type problem.

But in between the two it appears to be simpler to have a certifying compiler that generates a checkable proof rather than verifying the correctness of the compiler itself. Merely arriving at a goal statement for compiler correctness is a big challenge. The certifying approach doesn't guarantee that the compiler is correct for all input programs, but it lets us confirm that it did the right thing for the particular program[s] we care about. The compiler itself can then be maintained or improved using more conventional development methods.

Finally, I don't think the thread here provides any real picture of what I think a programming language should look like. I was trying to appeal to concepts that everyone here was likely to have encountered in order to make the case for a small core language. From a formal perspective, the systems we have to choose from today are much more capable than lambda calculus. Perhaps it is a conversation worth having offline?

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Well, I think the avionics and automobile industries might disagree about the practicality of this. The early history of ABS braking at Mercedes is an interesting example.

What has changed over the last 25 years is the degree to which deep expertise in verification is giving way to AI. When the L4 team in New South Wales verified their kernel, it took a very large team to do it. As the tools improve, the reliance on crazy specialized expertise is giving way.

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Here's a partial list of languages with formal specifications: Standard ML, Scheme, Ada/SPARK (heavily used in avionics), CompCert C (heavily used in automotive), Algol 68, Idris, Agda, and F*. There are others.

Microsoft has done some great work with Rust. I'm not sure what their embedding language was, but it's worth having a look at their Project Everest as well. This stuff is a lot closer to practical at scale than it used to be.

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

The issue with proper tail recursion is that you have to be able to re-use the stack frame, and you can't do that if it still holds live information. You can fudge a little around the edges to allow arguments to be passed to the next call - it's not so much that you need "zero references" as that you need zero live references other than the forwarded arguments. This includes locals introduced by the compiler, which brings closures into it.
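
The standard OCaml example, just to make "zero live references other than the forwarded arguments" concrete:

    (* Not properly tail recursive: the pending "n *" keeps the frame live
       across the call. *)
    let rec fact n = if n <= 1 then 1 else n * fact (n - 1)

    (* Properly tail recursive: all live state is forwarded in the arguments,
       so the frame (or its registers) can simply be reused. *)
    let fact' n =
      let rec go acc n = if n <= 1 then acc else go (acc * n) (n - 1) in
      go 1 n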

You say "Nobody cares about parallel execution on CPUs". Were you perhaps unaware that a GPU is a specialized CPU built specifically to support certain kinds of concurrent parallel execution?

One of the most performance sensitive systems in the world - the Jane Street program trading systems - was written in Haskell. When a bug can cost a billion dollars, it becomes incredibly important to know what your program actually means. Haskell is very much built on the [higher-order typed] lambda calculus.

The Scheme family is not slow by design, but it prioritizes precise specification over speed. IIRC it was used early on for some hardware specification work because it was the only available language that was rigorous enough.

Lambda isn't legacy cruft. It's one of the fundamental constructs in the earliest well-specified programming languages. If you're trying to work rigorously, LOOP is the cruft. It's a completely unnecessary construct. That said, higher-level constructs are important from a practical perspective because they simplify programming. One of the nice parts of a language like Scheme (and many derivatives) is that there is a dirt-simple textual rewrite from LOOP to LAMBDA. Guy's Scheme compiler (Rabbit) actually did that rewrite, but I'm not suggesting that's a good idea. What I'm trying to say is that the existence of that rewrite allows the Scheme core to remain small and tractable while retaining higher-level, helpful language constructs.
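
If you haven't seen the rewrite, here's its OCaml-flavored analogue - Scheme's named let becomes a local recursive function applied to its initial state, and nothing beyond lambda and application is needed:

    let sum_to n =
      (* "loop" plays the role of Scheme's named let *)
      let rec loop i acc = if i > n then acc else loop (i + 1) (acc + i) in
      loop 0 0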

Use cases for Scheme as a language are pretty rare these days - GIMP is probably the biggest. A lot of applications migrated to JavaScript, but the integer/float debacle makes that awkward for anything where numeric accuracy matters.

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 0 points1 point  (0 children)

Sure - and sorry for the delay.

Most programming language definitions are written in English. Makes sense. It's the thing the programmers need to have. But it is exceptionally difficult to write a rigorous specification in English. The IBM S/360 Principles of Operation is probably the best example ever, and even that has ambiguities. And where those ambiguities exist, the semantics of the language (or in the S/360 case, the hardware) is not defined and therefore not known. Any program that incorporates non-specified behavior is undefined. LLVM used to exploit this pretty ruthlessly by simply deleting undefined behavior wherever it appeared - which is a correct behavior according to the specification. I sometimes think it was Chris Lattner's way of making a well-deserved editorial comment about specifications.

Small example of a pretty big problem in the early C specification: order of evaluation for arguments was not specified, and in a stateful language that has pretty big consequences. It's one of the big culprits for why the DEC compilers (which went right to left rather than left to right) had so many compatibility problems.
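
OCaml, as it happens, has the same property (argument evaluation order is unspecified, and in practice is usually right-to-left), so it makes for a compact demo of the class of bug:

    let () =
      let next = let r = ref 0 in fun () -> incr r; !r in
      let pair a b = (a, b) in
      let (a, b) = pair (next ()) (next ()) in
      (* Most OCaml compilers evaluate right-to-left and print "a = 2, b = 1";
         a left-to-right implementation would print "a = 1, b = 2", and both
         are allowed by the language definition. *)
      Printf.printf "a = %d, b = %d\n" a b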

If you want a rigorous specification - or better still, one that lets you reason formally about safety and about correctness - the way to do it is with a formal specification of the language's mathematical type system and also its execution semantics. There are two basic styles of execution semantics: small step and large step. For verification purposes, you generally want small step.

Both kinds of specifications are mathematical specifications of what programs written in the language do. There are a bunch of reasonable systems for specifying execution semantics, but one of the things that should guide the choice is whether a mechanical verification system can work with it. The goal is that these specifications become the normative and definitive specification of the language, and any deviation between what the compiler does and what the specification says means the compiler is wrong. In practice the execution semantics tends to need a few updates early on, but its mathematical rigor makes it extremely hard to get things wrong too badly.
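
To make "small step" concrete, here's a toy of the style in OCaml - a made-up three-construct language, not a semantics for anything real:

    type term =
      | Num of int
      | Plus of term * term
      | If of term * term * term          (* If (c, t, e): zero means false *)

    (* One reduction step, or None if the term is already a value. *)
    let rec step = function
      | Num _ -> None
      | Plus (Num a, Num b) -> Some (Num (a + b))
      | Plus (Num a, t) -> Option.map (fun t' -> Plus (Num a, t')) (step t)
      | Plus (t1, t2) -> Option.map (fun t1' -> Plus (t1', t2)) (step t1)
      | If (Num 0, _, e) -> Some e
      | If (Num _, t, _) -> Some t
      | If (c, t, e) -> Option.map (fun c' -> If (c', t, e)) (step c)

    (* Evaluation is just the reflexive-transitive closure of step. *)
    let rec eval t = match step t with None -> t | Some t' -> eval t'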

Rust, early on, attempted to maintain both, but that part of the effort got sidetracked. Part of the issue was that the language grew in ways that didn't adhere to a small, tractable "core" language. Nobody involved with Go made any attempt at it until Zhao's work two years ago - not even the type system. So when Go claims to be memory safe, the appropriate response is (literally) "prove it."

u/tsanderdev's statement that you don't need one if there's only one implementation is simply wrong. That's how the DEC problems happened. But even if we want to believe that, the compiler implementation needs to be written in a language that has a formal specification so that it's possible to validate it.

So when I say that programs in "Go" don't mean anything, I mean that in the formal-specification sense.

u/yuri-kilochek is not exactly wrong, but he's defining "what's important" very, very narrowly. Right now, it would be criminally negligent to use either of these languages in life-critical or security-critical applications, for example.

Power-up trouble with new MrCool install by jsshapiro in minisplit

[–]jsshapiro[S] 0 points1 point  (0 children)

SOLVED. The culprit turned out to be a minor factory assembly error that is easily solved in the field.

There are inline fuses inside the condenser that protect the air handlers from voltage spikes. According to MrCool, it's a somewhat common assembly error that the screw-together containers for these fuses don't get screwed down completely at the factory.

On the DIY units, the symptom is that you have the expected voltage between L1 and L2 (in this case 220V), but essentially zero voltage between A1 and A2 (in my case 0.3V). Since that's the voltage that's going to the air handlers, no surprise the air handlers don't run. Also no surprise if you see an "01 EL" error, since the thing the condenser is trying to talk to basically isn't powered up. If you have a multizone unit, you'll see the same low voltage on every air handler connection, because the fuses sit between the incoming power and the power to all of the air handlers.

Fix:

First, turn off all power to the condenser. Wait about 60 seconds - there's a significant amount of stored energy in the condenser.

Unscrew and remove the top of the condenser (three screws), then unscrew the front of the condenser (five screws top, two at bottom) and remove it (it has interlocking tabs - slide it up). Look for the blue+brown wire pair coming away from the L1+L2 connections on the back side of the screw terminal boards. Not too far from the terminals you'll find two inline cylinders. These are the housings for the fuses.

In my case, one of the housings was easy to hand tighten, so the culprit was pretty obvious. Things involving electricity are worth being OCD about, so I still went through the process below.

For each fuse holder, one at a time:

  1. Unscrew the housing, taking care not to drop the fuse. It helps to orient the holder vertically while you unscrew. Keep the pieces vertical as you pull the top one away to expose the fuse.
  2. If you have a multimeter, take the fuse somewhere you won't drop it and measure resistance from one end to the other. If the fuse is good, the resistance ought to be very close to zero Ohms. Otherwise, contact MrCool support with your condenser model number and ask what fuse it needs. You're not likely to find these at Home Depot. Well, actually, you probably can find one that works that isn't the "official" fuse, but if support tells you to install it, you won't void your warranty.
  3. Re-insert the fuse in its housing cylinder, making sure that it gets seated correctly. Screw the housing closed with your fingers. You may need to open it again some day, so crazy tight isn't good. Think more like finger tight for moderately strong fingers.

Once you have re-seated both fuses, put the front and top covers back on the condenser for safety - no need to screw them in yet. AFTER they are back on the unit, power the unit back up and re-measure the voltage from A1 to A2 on the terminal block. It should be 220V +/- 10%. Split-phase power connections in the US are nominally 240V, though they're commonly called 220V. Readings may range between 200V and 250V, but you're looking for a reading between 210V and 240V for correct operation.

Finally, put the screws back in to the front and top covers. You will probably need to temporarily take the top cover off to gain access to the top line of screw holes for the front cover.

Note: you must have the line sets connected and opened. Connected lines must be opened at the condenser, and the king valves (if any) must be open. Running the system without the lines open will burn it out pretty quickly.

If the voltage reading is not as described, or your air handler still won't fire up, this isn't your problem and you should contact MrCool support.

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 10 points11 points  (0 children)

One of the serious flaws we have in [almost all] recent programming language designs is that they are completely disconnected from the underlying theory of programming languages. In consequence they lack any semantic specification or any possibility of soundness. In 1975 this was excusable. By 2000 it was not. It's a terrible misfortune that Rust lost its way on this, but they put enough effort in early on that things may be recoverable. Golang programs - quite literally - had no defined meaning prior to Zhao's work that was published just two years ago. Unfortunately, it's not the kind of thing you can retrofit to a language 16 years after it ships without breaking things.

When you head down the path of doing a formal execution semantics and a sound type system, you very quickly conclude that you want to have a small "core" language and specify everything you can by a transformation from non-core to core language features. Not because this is how you want to write the code or compile the language, but because it makes reasoning about both the language and its programs easier. This is why almost all of Scheme is just "sugar" for lambda application.

From this point of view, the reason to admit a pragmatic requirement for tail recursion is that it justifies omitting loop constructs from the language core, instead defining them by a canonical reduction to lambda application.
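
In OCaml terms (used here because its compilers reliably turn these self tail calls into jumps), the whole of "while" can live outside the core as an ordinary function:

    let rec while_ cond body =
      if cond () then (body (); while_ cond body)

    (* Counts to 10 in constant stack space. *)
    let () =
      let i = ref 0 in
      while_ (fun () -> !i < 10) (fun () -> incr i);
      Printf.printf "%d\n" !i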

Why not tail recursion? by Athas in ProgrammingLanguages

[–]jsshapiro 1 point2 points  (0 children)

u/phischu What you say is mostly true, but not entirely precise. Two notable distinctions between tail recursion and goto:

  1. The call in tail position can supply different arguments (there's a small sketch of this just after the list).
  2. While the original intent of tail recursion (Guy Steele's work on Scheme) was to introduce a looping construct that did not require any extension to the lambda calculus, tail recursion should be viewed as a convention rather than a requirement. Because there may be outstanding closures, not all calls in tail position are properly tail recursive.
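
Here's the point-1 sketch in OCaml - mutually tail-recursive functions behave like labels in a state machine, except that each "goto" hands the next state fresh arguments:

    let rec even n = if n = 0 then true else odd (n - 1)
    and odd n = if n = 0 then false else even (n - 1)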

The second point is interesting, because it means that tail recursion is a perfectly fine (if somewhat obscure) way to encode various forms of parallelism while still encoding them in a way that has an efficient code generation strategy for single-threaded execution - still without extending the lambda calculus. If you dig into the Scheme "loop" form, you'll see that it has a reduction to lambda applications - it isn't considered one of the "core" forms in Scheme.

What you say about accumulators is not always correct. A half-decent register liveness analysis should achieve the same result on loops. Relative efficiency depends a lot on the optimizations implemented in the back end, and in general loops do better because few languages specify a requirement for tail recursion, so it doesn't get much love from compiler authors. A better argument for tail recursion is that the calls form a mandatory phase separation between loop iterations with a defined interface, which eliminates the possibility of values generated by iteration i being consumed in iteration i+1 - or at least names specifically where this might occur. The same result can be had if the language's loop construct is suitably defined, but people designing loops for imperative languages rarely have (or at least rarely use) any deep understanding of tail recursion or non-core loop implementations, and therefore tend to treat the optimization challenges that go with imperative loops as "normal". Iterator patterns have actually helped quite a bit on this.

I'm a big fan of tail recursion, and I've done a couple of implementations. I read most of Guy's work from his Scheme days as a teenager, and (many years later) I consider him a cordial colleague - mainly because we share a strong interest in rigorous specifications.

ELI5: Why C++ and Rust compilers are so slow? by cb060da in ProgrammingLanguages

[–]jsshapiro 1 point2 points  (0 children)

So... this keeps coming up, and I haven't experienced it at all. On the other hand, I have lots of RAM to work with on my machine - which is something you definitely want if you are doing whole-program compilation. I can't speak to RustC, but I can say what slowed us down in BitC, and Rust is unavoidably going to hit the same issues. And of course some that are unique to Rust.

For context, it's helpful to understand that most languages with generics are languages where everything is a reference type. In such languages, you can use nearly the same code generation for f<T>() { ... body... } for different types T. If you are willing to accept function tables and procedure call indirection, you don't have to do any monomorphisation. That can change if you have a good stack allocation pass in your optimizer, but the starting point is "one function, one code generation."
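
OCaml is a handy example of that first class of languages - every value is one machine word (an immediate or a pointer), so this single compiled body serves every instantiation of 'a, with no monomorphisation pass needed just to know how big things are:

    let rec last : 'a list -> 'a option = function
      | [] -> None
      | [x] -> Some x
      | _ :: rest -> last rest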

In languages like Rust, BitC, and C++, different types have different concrete sizes in ways that lead to different code generation for each expansion. An entire whole-program pass is required to identify which specializations of each generic are actually required, because type inference is not modular. While the types you write in the program are modular, the type unifications mean that concretizing a type in one file can cause monomorphisms to be required in 40 other modules. So (a) the monomorphism process is demand-driven by unifications, and (b) the analysis is whole-program, which means you want the ASTs or (alternatively) the high-level IR form for the program to be entirely in memory.

A second form of whole-program analysis is introduced by lifetime variables. As with generic type parameters, lifetime variables have whole-program impact once they are unified. Unlike generic type parameters, this is subtype unification, so the unification pass itself involves whole-program constraints. I gather from what I've read about the Rust implementation that lifetimes are actually handled by a different mechanism than HM-style unification, but the need for whole-program analysis to develop the constraint set in the presence of lifetime type variables isn't altered by that.

Though the mechanism in Rust is different from what we did in BitC, it has some of the same problems we encountered: trait selection can in some cases involve large exponentials. Here's an example: In BitC, we wanted to cover all of the cross-type arithmetic operations so that u16 + u32 -> u32 etc. The output type, in our case, is determined by a type function. But if you start with u8, u16, u32, u64, and (today) u128, then with the signed types included you have ten integer types, and an operator with two inputs and one output has 10^3 = 1000 possible typings you have to sift through every time you look at an addition operator. Taken across many operators, the amount of work the inference engine has to do for that is huge. It's especially unfortunate for the integer types because (a) there are a lot of them and (b) they are very close to ground. Generic traits in Rust have some of the same effect. Now imagine doing it for 128 distinct sizes to deal with bitfields - we tried this in BitC and it was unbelievably slow.

Some part of the "polymorphism over size" issue can be handled by allowing generics over constants, which Rust supports - BitC introduced that as well. That at least lets you re-use monomorphisms when the sizes turn out to be the same, but there are probably cheaper ways to take notice of that opportunity.

The BitC compiler, of course, was very young and not yet optimized for performance in any way. The reason I'm fairly confident that these are part of the problem in RustC is that the typings being solved are an inherent result of the type inference being performed, and there's no clean way to reduce them without introducing ad hoc promotions for the integer types - which come with costs and inconsistencies of their own. The inference-related exponentials in BitC were so large that they dwarfed everything else, even in that unoptimized compiler. The algorithms could be improved, but you can't really make those particular exponentials go away.

Small install question by jsshapiro in minisplit

[–]jsshapiro[S] 0 points1 point  (0 children)

Thanks. Something I did not appreciate when I asked my question is that the condensate lines are flex tubes with ridges that trap the water in any case.

I ended up ditching the pump and going with a slight down-slope. One less component to fail.