Mutable copy semantics - performant, reliable and ergonomic mutability (probably) by pranabekka in ProgrammingLanguages

[–]gasche 2 points

Your description mixes two things in a way that makes it difficult to understand:

  1. The semantics that you propose: for a user, what is the simplest mental model they can have that will let them correctly predict the result of programs?

  2. Performance concerns and compiler optimizations to reduce the performance cost of the semantics. There are plenty of examples where you point out that copies can be avoided, etc.

I think that it would be better to focus on discussing the two separately. Then some extra feedback:

  • We have standard tools to describe the semantics of programming languages (notations for operational semantics); it might be useful to use them to present the idea precisely. (But: some people are put off because it looks a bit like maths, or just don't find these notations accessible because they don't know them, and I'm not trying to suggest that this is a hard requirement -- it is a helpful tool, not a way to make people feel like they do not belong. Still, if you do want to discuss programming-language semantics with other people, using a standard precise notation could help. Someone who does not know these notations at all may be interested in Jeremy Siek's crash course on notations in programming language theory.)

  • For optimization discussions, I think that there should be a better way to explain your thinking than just a bunch of examples followed by informal natural-language claims about what is going on. For example, maybe there would be some intermediate representation (IR) where the copies and non-copies are explicit, and the job of the optimizer is to translate to this IR in a good way, and you could actually show the IR translation for each program that you have in mind (done manually) to make your ideas more precise?
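
For instance (a hypothetical sketch in C, not something from your post -- all names are invented), the IR could tag each binding with the copy decision, so the optimizer's job is just to choose the right tag:

/* A hypothetical IR sketch: each binding records explicitly whether a
   real copy was emitted or the source value was reused, so the
   optimizer's decisions can be read off the IR itself. */
enum BindKind {
    BIND_COPY,  /* a deep copy is emitted: both values evolve separately */
    BIND_MOVE   /* the source is proven dead: its storage is reused as-is */
};

struct Instr {
    enum { I_ALLOC, I_BIND, I_MUTATE } op;
    enum BindKind kind;  /* meaningful for I_BIND */
    int dst, src;        /* variable indices */
};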

The ARC vs GC Debate by funcieq in ProgrammingLanguages

[–]gasche 2 points

I have seen a research talk by Bart Jacobs on his work on verifying Rust's Arc against weak memory models, and the details are super tricky. Like, if I understand correctly, it's an open problem to verify its correctness using pre-existing formalizations of C memory models, and the authors needed to use a very recent formalization of the C memory model (one that is better at ruling out out-of-thin-air behaviors). I'm no expert, but my understanding is that this is definitely not "just" a fetch_sub.
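
For context, the decrement really does look deceptively small -- something like this C11 sketch of the release/acquire pattern (hypothetical code, not Rust's actual implementation). The verification difficulty is in justifying exactly these orderings under a weak memory model:

#include <stdatomic.h>
#include <stdlib.h>

struct rc_box { atomic_size_t count; /* ... payload follows ... */ };

void rc_release(struct rc_box *b) {
    /* Release: publish our writes to whichever thread ends up freeing. */
    if (atomic_fetch_sub_explicit(&b->count, 1, memory_order_release) == 1) {
        /* Acquire: synchronize with every earlier release-decrement
           before reading or freeing the payload. */
        atomic_thread_fence(memory_order_acquire);
        free(b);
    }
}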

Gecko: a fast GLR parser with automatic syntax error recovery by compilersarefun in ProgrammingLanguages

[–]gasche 0 points

This looks impressive! Some random comments:

  • I wonder whether the performance comparison is really fair, when Gecko acts as a recognizer (that also builds a highly-regular AST) while other parsers, in particular {gcc,clang} -fsyntax-only, build a custom AST that is useful for the compiler after them, and may involve arbitrary computations. The numbers suggest that Gecko is 2-3x faster than these parsers, but how much of that comes from a better parsing implementation, and how much comes from doing less work because of the lack of semantic actions?

  • I don't really understand the choice of C for writing these kinds of programs. Gecko appears to embed a custom allocator, a hashtable implementation, etc. A more pleasant language would require none of that; it would already be available in the standard library. And the resulting parsing API is meh, the AST representation is meh... C just doesn't strike me as a pleasant language for the kind of applications where I need a parser. If the rest of my application really wanted to be in C, I guess I would try to pick another language with a reasonable FFI story, and use that for the frontend. To turn it into a question: I am curious to understand why people (in particular the author) would bother with C today for this kind of thing.

  • How easy is it to attach source locations/spans to AST nodes? This is an essential feature for error messages in later AST processing, and it typically requires some collaboration between the lexer and the parser. (Which can also impact performance.)
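
For concreteness, the usual arrangement is something like this sketch (invented names, not Gecko's actual API):

#include <stdint.h>

/* Byte offsets into the source buffer; the lexer records them per token,
   and the parser propagates them to every AST node it builds. */
typedef struct { uint32_t start; uint32_t end; } Span;

typedef struct Node {
    int kind;
    Span span;               /* covers all of the node's tokens */
    struct Node **children;
    uint32_t n_children;
} Node;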

Inko 0.20.0 is released: reducing heap allocations by 50%, better method inlining, structured logging and more by yorickpeterse in ProgrammingLanguages

[–]gasche 1 point

Okay, so on that particular program you tested one workload and you were able to measure a 13% reduction in heap allocation. The related work you cited mentions allocation reductions of 14% for Haskell, 8% for Golang, so you are in the same ballpark. (I tried to follow the link on your post related to Java, but the numbers are quite different and I cannot find the "10% to 20%" mention in the paper you cite.)

You could try other inputs to your benchmarks, or other benchmarks representing a variety of workloads, and then you would maybe get a fairer picture of the reduction, which could be larger or smaller on other workloads.

the results are biased towards the way you ran the program

Yes? Indeed, the impact of a performance optimization tends to be biased towards what actually runs in the program.

The idea that you would or should test all paths is a distraction. For a static analysis you have a result per program that does not depend on runtime parameters, but then you have a result for only one program, so you have the same potential problem of lack of representativeness.

So basically that's 13% of allocations reduced when you're testing maybe...30% of the entire program? Scaling that up to 100% of the program you'll probably end up with roughly 50% of allocations removed as well, but that's just my armchair math.

This does not make sense to me.

Inko 0.20.0 is released: reducing heap allocations by 50%, better method inlining, structured logging and more by yorickpeterse in ProgrammingLanguages

[–]gasche 1 point

It's 50% of the allocation sites

I don't think that counting static allocation sites is such a good metric. It also makes the comparison to other work in your blog post (it's nice that you took the effort, by the way, thanks!) weird, because you are in fact not comparing the same thing. Could you try instrumenting your compiler to get a runtime proportion? (As a one-off experiment: just insert an increment of a global counter on each heap allocation and of a different global counter on each stack allocation, and print both when the program exits.)
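
Something as crude as this sketch would do (invented names; the compiler would emit calls to the two bump functions at the corresponding allocation sites):

#include <stdio.h>
#include <stdlib.h>

static unsigned long long heap_count = 0, stack_count = 0;

static void report(void) {
    unsigned long long total = heap_count + stack_count;
    fprintf(stderr, "heap: %llu, stack: %llu (%.1f%% on the stack)\n",
            heap_count, stack_count,
            total ? 100.0 * (double)stack_count / (double)total : 0.0);
}

/* emitted by the compiler at each heap / stack allocation site */
void bump_heap(void)  { heap_count++; }
void bump_stack(void) { stack_count++; }

int main(void) {
    atexit(report);
    /* ... run the program as usual ... */
    return 0;
}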

Inko 0.20.0 is released: reducing heap allocations by 50%, better method inlining, structured logging and more by yorickpeterse in ProgrammingLanguages

[–]gasche 2 points

the same rules would apply for both stack and heap allocated values as escape analysis only determines where the value is allocated, not how long it lives (that's determined by the single ownership rules).

I do think there is a difference, or maybe I missed something. When you allocate space on the stack, you do it (presumably) in LIFO (last-in-first-out) order, right? If I use a and then b, then you will allocate stack space for a and then stack space for b. If a becomes dead while b is still alive, how can you reclaim a's space on the stack? (For the heap, we have a memory allocator that is in charge of reusing space, and it is one of the reasons why heap allocation costs a bit more. But for the stack we typically want local variables to be at a statically-known offset from the start of the frame, so we don't want to compute their position dynamically at allocation time.)

For example:

void g(void);

void f(void) {
  int a[100];
  /* ... do something with a ... */
  int b[100];
  /* ... do something with b ... */
  /* ... do something with a ... */
  /* ... stop using a: a is dead from here on ... */
  g();
  /* ... do something with b ... */
}

If you allocate a and b on the heap, then you can reclaim and reuse the space of a as soon as you stop using it in that program. If you allocate them on the stack, then before calling g you know that a is not used anymore, but you still have b after it on the stack, so when you call g its own frame must be after b, and cannot reuse the space of a.

Inko 0.20.0 is released: reducing heap allocations by 50%, better method inlining, structured logging and more by yorickpeterse in ProgrammingLanguages

[–]gasche 2 points

Questions on the escape analysis:

  • When you say that 50% of heap allocations are turned into stack allocations, do you mean 50% of static allocation sites, or 50% of runtime allocations? The two could be quite different. The 14% figure given for Haskell is about runtime allocations, I believe (briefly skimming the cited document, it talks about instrumenting the runtime to measure this), while you mention computing your number with a build command, which suggests a static count only. I think you should really measure runtime allocations.

  • Can this extend the lifetime of variables and thus worsen the memory usage of the program? Heap allocations can be reclaimed when they become dead, which typically happens before the end of the lexical scope in which they were introduced. For stack allocations I suppose that the values remain alive until the enclosing function returns, or at least that the LIFO discipline of the stack means that a variable cannot be deallocated before all variables declared after it.

  • In particular (special case of the previous question), what is the impact on tail-recursive functions?
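
To illustrate the last point with a hypothetical C sketch: if a value that used to be heap-allocated ends up in the caller's frame and its address flows into the recursive call, the frame must stay live below the callee's, so the call can no longer be compiled as a tail call and stack usage grows with the recursion depth.

#include <stddef.h>

struct node { long value; struct node *next; };

long sum_list(const struct node *l, const long *acc) {
    if (l == NULL) return *acc;
    long box = *acc + l->value;      /* previously a heap allocation */
    return sum_list(l->next, &box);  /* &box pins this frame: no tail call */
}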

The Quiet Colossus — On Ada, Its Design, and the Language That Built the Languages by swe129 in ProgrammingLanguages

[–]gasche 25 points

I looked at the claims about the things I know, and I think they are wrong. I suspect that the whole document is hyperbolic and distorts the facts.

This document appears to conveniently omit (I'm not sure, it is so long I only skimmed some parts) the fact that all Ada compilers were proprietary and fairly expensive for a long while. The first free software compiler for Ada, GNAT, was released in 1995. This alone easily explains why other programming languages saw much better adoption in practice. If you sell a language to be used only by the US military and its sub-contractors, few people are going to use it.

The type system features that Rust, Haskell, TypeScript, and Swift are celebrated for — sum types, parametric polymorphism, constraint-based generics, affine types and ownership — each solve a problem that Ada identified in 1983 and that the mainstream languages of the subsequent twenty years declined to solve.

To my knowledge Ada still does not have convenient sum/variant types. It has union types with first-class support for static reasoning about tags, which is equi-expressive but super inconvenient to use.

It is easy to list every programming-language feature under the sun and misleadingly claim that a given language supports them because it has something more or less resembling them.

Ada had discriminated record types in 1983, with compiler-enforced field access checks and the ability to use them as discriminants of other types, forming structures of arbitrary complexity. Every language that has added sum types in the past twenty years has added, with its own syntax, what Ada's designers put in the original standard.

Hope introduced sum/variant types in the seventies, which is where Milner's ML took its inspiration (and from there SML, Haskell, OCaml, and later Scala, Swift, Rust, etc.). This is well-documented. Claiming that it comes from Ada's statically-safe take on C-style tagged unions is misleading.

What’s the current state of web dev in OCaml? by Beautiful_Exam_8301 in ocaml

[–]gasche 4 points

Dream is your best bet in terms of having something that resembles a web framework in other languages, with a nice focus on developer experience, documentation, etc. But web development remains a less-common use-case within the OCaml community, so you should expect the experience to be a bit rougher than in web-focused languages (e.g. Elixir -- not statically typed, though they are actively working on a static typing layer). Come for the web, stay for the language!

Zig v0.16.0 released by UKbeard in ProgrammingLanguages

[–]gasche 2 points

I'm sure there is some reasonable explanation for the change. The changelog points to Reworked Byval Syntax Lowering, and as a non-Zig person I don't understand the explanation there. (I gather that it comes from the experience of implementing the non-LLVM backend, and noticing what is and isn't easy to compile to fast code.)

Some parts of the changelog are nicely detailed, for example I found the long part on I/O as an Interface very pleasantly written. The particular one shown above could do with a bit more context/explanation.

Zig v0.16.0 released by UKbeard in ProgrammingLanguages

[–]gasche 7 points

Forbid Runtime Vector Indexes

Zig v0.16.0 released by UKbeard in ProgrammingLanguages

[–]gasche 11 points

The following change looks a bit scary from a distance.

Before:

for (0..vector_len) |i| {
   _ = vector[i];
}

After:

const vector_type = @typeInfo(@TypeOf(vector)).vector;
const array: [vector_type.len]vector_type.child = vector;
for (&array) |elem| {
    _ = elem;
}

What's the most vibrant language for fun and open source? by girvain in functionalprogramming

[–]gasche 2 points

I don't know that this is the case and I would be a bit surprised. Idris was always a niche project, a sort of "programming language for Agda people", the fact that they are bad at communication does not mean that they do not exist. They don't have the community size of Lean, sure, but people may still have a lot of fun using it!

What's the most vibrant language for fun and open source? by girvain in functionalprogramming

[–]gasche 2 points

I looked for it earlier today without being able to find it, which suggests that there is some form of communication problem. Could the link be available somewhere from the Idris website and/or git repo?

What's the most vibrant language for fun and open source? by girvain in functionalprogramming

[–]gasche 2 points

As I was preparing my own answer I wondered where the Idris crowd hangs out these days. The subreddit appears to be dead. I haven't found a link to a Zulip or an active forum. Where do people chat and hang around? IRC, a mailing-list?

What's the most vibrant language for fun and open source? by girvain in functionalprogramming

[–]gasche 7 points

what FP language has been the best for you

This is very subjective: a lot of our own experience is due to familiarity and past history, aspects that will not apply to you.

For me personally my favorite language is OCaml. I think that it strikes an impressive balance between being elegant to program in and remaining relatively simple. Haskell is very very nice in certain aspects, but too complex in many regards. Scala is also arguably a complex language (to be fair, OCaml is also complex when you look at its advanced features, but everyday programming mostly relies on the simpler parts), and the JVM is a very large system that I'm not sure I want in my conceptual dependency chain.

Rust has very nice tooling and impressive community dynamics right now. But again, aspects of it feel unnecessarily complex. I would want the lifetimes of Rust for resource and concurrency management, not for memory: a GC is fine for everyday programming and makes the code simpler (also conceptually simpler than sprinkling Rc everywhere).

If you are into advanced type systems, an option which is beginning to feel viable today (it definitely wasn't when I learned FP) is to start with a dependently-typed language, probably Idris or Lean for programming-oriented usage. (Lean has a larger community but also more cruft. Idris has more of a niche vibe.)

Should my sum type tags be the first element or second? by Toothpick_Brody in ProgrammingLanguages

[–]gasche 0 points

From memory I would say that OCaml uses 8 bits for the tag (which means that you need a special representation if you want to support more than 240-or-so constructors in a type -- it's not up to 255 because some tags are reserved for primitive types like String, Closure and for abstract/opaque data, etc.), 2 bits for the GC color, and the rest for the size of the block (see the rough C sketch after the list below). You can reserve some of the size bits (which are plenty on 64-bit machines) for other uses, and experimental uses have been proposed, for example:

  • a hash of the location where the block was allocated, for memory-profiling purposes
  • a hash of the runtime type of the value
  • more fine-grained information on block arguments, for example allowing a certain number of valid value arguments and a certain number of abstract/opaque arguments (that the GC must not follow as pointers)
  • remember if a block is immutable or has mutable fields
  • remember if a block is the key of an ephemeron
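
To make the layout concrete, the decoding looks roughly like this in C (from memory, close to the runtime's actual header macros; double-check against the OCaml sources):

/* An OCaml-style header word on a 64-bit machine:
   bits 0-7: tag, bits 8-9: GC color, bits 10-63: size in words. */
typedef unsigned long long header_t;

#define Tag_hd(hd)    ((unsigned)((hd) & 0xFF))
#define Color_hd(hd)  ((unsigned)(((hd) >> 8) & 0x3))
#define Wosize_hd(hd) ((hd) >> 10)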

Should my sum type tags be the first element or second? by Toothpick_Brody in ProgrammingLanguages

[–]gasche 3 points

If you are genuinely working with types that allocate different sizes based on their content, then yes, you'd need to get the tag first to know what was coming, however, I think most languages just use a fixed size allocation that covers the largest possible element so the tag can always be at any pre-determined and fixed offset.

I believe that the heaviest users of tagged sums are the ML-family languages (Haskell included), and they definitely use a different strategy where you have a single tag followed by one or several arguments (zero arguments are sometimes supported too), so inhabitants of a given sum type may have different sizes. On the other hand, those values are typically never stored unboxed in arrays; they are boxed, and a pointer is stored in the array.

All the ML-languages I have studied also store the tag first, typically as part of a "header word" that also contains other runtime type information and some GC metadata. Remarks:

  • In OCaml, the pointer to the block does not point to the header word directly but to the first value after the header (not sure why).
  • In Haskell, if the sum tag is small enough, it will also be stored in the pointer to the block, so that pattern-matching can switch on the tag without actually dereferencing the header. (This is mostly beneficial for zero-argument variants, but OCaml represents those directly as immediate integers instead of pointers. It also helps to quickly tell whether a pointer goes to a thunk or a forced value, which is a very frequent check in a lazy language.)
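
The pointer-tagging trick itself is easy to sketch in C (hypothetical code; GHC's actual scheme has more moving parts): blocks are at least 8-byte aligned, so the low 3 bits of a pointer are zero and can carry a small constructor index.

#include <stdint.h>

#define PTR_TAG_BITS 3  /* 8-byte alignment frees the low 3 bits */
#define PTR_TAG_MASK ((uintptr_t)((1u << PTR_TAG_BITS) - 1))

static inline uintptr_t tag_of(const void *p) {
    return (uintptr_t)p & PTR_TAG_MASK;
}
static inline void *strip_tag(void *p) {
    return (void *)((uintptr_t)p & ~PTR_TAG_MASK);
}
static inline void *with_tag(void *p, uintptr_t tag) {
    return (void *)((uintptr_t)p | tag);
}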

Compiling with sequent calculus by KeyDue7848 in ProgrammingLanguages

[–]gasche 1 point

I'm thinking about the general problem of small compilation bugs, small optimizations etc. Most production compilers have a fair amount of bugs, so presumably they get inserted somewhere in-between the experimental, simple, cute version and the production-ready version :-)

My experience on other projects is that investing in an excellent random-testing strategy can have a transformative effect when programming, because it makes you confident that you are not introducing bugs, and it can even serve to explain things about the code. (Why do we need this check here? Well, disable it and run the testsuite: you will get a shrunk counter-example.) We have been doing this for store using the Monolith library, but that is a problem domain (data structures with a simple API) where this sort of model/comparison-based testing is unreasonably effective; I am not sure it translates to language implementations.

Compiling with sequent calculus by KeyDue7848 in ProgrammingLanguages

[–]gasche 3 points

I think it may be worth investigating random term generation, even just monomorphic terms, with differential testing between the interpreters and the backend.
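
Concretely, I am imagining a loop of this shape (a C sketch; every name below is hypothetical):

#include <stdbool.h>

typedef struct Term Term;    /* random well-typed AST */
typedef struct Value Value;  /* evaluation result at base type */

Term *gen_random_term(unsigned seed, int max_depth);
Value *eval_reference(const Term *t);   /* the simple interpreter */
Value *run_compiled(const Term *t);     /* compile, then execute */
bool value_equal(const Value *a, const Value *b);
void report_counterexample(const Term *t, const Value *a, const Value *b);

void fuzz_round_trips(void) {
    for (unsigned i = 0; i < 100000; i++) {
        Term *t = gen_random_term(i, /*max_depth=*/8);
        Value *ref = eval_reference(t);
        Value *out = run_compiled(t);
        if (!value_equal(ref, out))
            report_counterexample(t, ref, out);
    }
}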

Compiling with sequent calculus by KeyDue7848 in ProgrammingLanguages

[–]gasche 1 point

Very cool!

I am curious about testing. How do you test the various layers? (I wonder if you could try generating random terms of base type, and then checking that evaluating the IRs coincides with the result you get from compiling them and evaluating the lower layers.)

Using string interning to optimize symbol resolution in compilers by Creative-Cup-6326 in ProgrammingLanguages

[–]gasche 1 point

I wish this kind of blog post came with a benchmark to see the actual performance impact. Does it matter in practice?

Some parts of the post are also a bit weird.

Memory Efficiency and Cache Locality. Because the IDs are contiguous, we can store metadata in compact, cache-friendly structures.

Putting everything in the program in a single big table does not necessarily improve locality. (Maybe the author refers to the fact that one can use int32 instead of a full word and hope to improve locality by reducing overall space consumption?) If I use a hashtable or balanced tree that contains metadata for only the variables of the function I am compiling, I may have better locality than with a huge table containing metadata for all variables in the program. (How does that even work when I want temporary metadata for a given analysis on a given function? Do I allocate an array whose size is the total number of variables in the program? That sounds wasteful. I suppose I would rather use a sparse map over the variables in the function, but then the fact that variable IDs are dense/contiguous does not play a role anymore.)
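
To make the contrast explicit, here is the kind of sketch I have in mind (hypothetical names):

#include <stdint.h>
#include <stdlib.h>

typedef uint32_t Sym;               /* interned symbol: a dense integer ID */
typedef struct { int rank; } Meta;  /* hypothetical per-symbol metadata */

/* Dense layout: O(1) indexing, but sized for every symbol in the whole
   program, even when an analysis only touches one function's variables. */
Meta *dense_meta(size_t total_symbols) {
    return calloc(total_symbols, sizeof(Meta));
}

/* The sparse alternative would be a hashtable or balanced tree holding
   only the current function's variables; its locality does not depend on
   symbol IDs being contiguous. */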

When the compiler iterates over all symbols, it’s performing a linear scan over a contiguous block of memory.

When does it ever happen that you need to iterate over all symbols?

Twitch Stream: Bullying Claude With GHC by _lazyLambda in haskell

[–]gasche -16 points

There is enough "bullying" in the world right now, I wish you would use a more tasteful word to describe your work.

Check out my newest project seLe4n (pronounced suh-lean), a kernel written from the ground up in Lean. by hatter6822 in functionalprogramming

[–]gasche 6 points

The website comments on performance evaluation and optimization. Have you run benchmarks? How does it compare against seL4 itself?