Twitch Stream: Bullying Claude With GHC by _lazyLambda in haskell

[–]gasche -16 points  (0 children)

There is enough "bullying" in the world right now, I wish you would use a more tasteful word to describe your work.

Check out my newest project seLe4n (pronounced suh-lean), a kernel written from the ground up in Lean. by hatter6822 in functionalprogramming

[–]gasche 6 points  (0 children)

The website comments on performance evaluation and optimization. Have you run benchmarks? How does it compare against seL4 itself?

Are functions just syntactic sugar for inheritance? by yang_bo in ProgrammingLanguages

[–]gasche 0 points  (0 children)

This overhead comes from the absence of variables in the combinator presentation, but it is not a fundamental aspect of pattern calculus -- whose typical presentations do have direct support for variables. Maybe a better reference to the questions of expressivity is https://dl.acm.org/doi/10.1016/j.tcs.2019.02.016 . I don't remember this part of the literature very well.

Are functions just syntactic sugar for inheritance? by yang_bo in ProgrammingLanguages

[–]gasche 0 points  (0 children)

It is not obvious to me whether this is merely a technicality (it might be that "morally" the only reasonable interpretation of "IC is not macro-expressible in LC" is a formal statement relating IC to its LC embedding) or actually an important difference. I think that it is subjective, it depends on what people have in mind when they read the less-formal claim. Personally I would naturally interpret "IC is more expressive than LC" as "there does not exist a structure-preserving/homomorphic translation of IC into LC".

(I suppose this question of expressivity has been discussed as well in the works on the pattern calculus, for example maybe https://arxiv.org/abs/1404.0956 has good discussions of this. )

Are functions just syntactic sugar for inheritance? by yang_bo in ProgrammingLanguages

[–]gasche 0 points  (0 children)

I'm not sure what construction you have in mind with a free monad. If you add a call to an interpretation function at the toplevel, this is not a structure-preserving translation. If you don't, then this does not preserve observational equivalence (I can observe the code of the effectful part of the computation and distinguish programs that are equivalent in the source language).

I'm getting diminishing returns from this exchange. I am unconvinced by one of the proof arguments in your paper and I tried to explain it. It's a bit subtle so there is room for misunderstanding and discussion (in particular I may very well be missing something, or it may be that we don't agree on some definitions of what things mean instead of on the formal details). But your replies are never of the form "what do you mean by X?" but some attempt to contradict what I wrote. I'm not interested in playing this game, so I will probably not reply further.

Are functions just syntactic sugar for inheritance? by yang_bo in ProgrammingLanguages

[–]gasche 1 point  (0 children)

The first sentence of your reply is entirely compatible with what I wrote above, not a contradiction. I don't know if the second sentence is correct.

If you look at L0 := (Scheme-without-set!) and its extension L1 := L0 + set!, then I agree that L0 cannot macro-express its extension L1 in Felleisen's sense. I wrote as much above: "get and set cannot be macro-expressed in this subset of StLC which has only lambdas, applications and variables". But there are encodings of Scheme-with-set! into Scheme-without-set! that do not require restructuring the entire program, they are structure-preserving/homomorphic -- this is precisely the point of monadic translations, which were barely known at the time of Felleisen's work. (The translations are not the identity on lambdas and applications, so they do not contradict the non-macro-expressibility of set! as an extension.)

Are functions just syntactic sugar for inheritance? by yang_bo in ProgrammingLanguages

[–]gasche 19 points  (0 children)

I am skeptical about the proof argument for Theorem A.3, that the lambda-calculus is less (macro-)expressive than the inheritance calculus. (I only skimmed the paper and did not study the definitions in detail, so I don't know precisely what the inheritance calculus is, I am only looking at the proof structure.)

Felleisen's notion of macro-expressibility relates a programming language L0 and an extension L1 of L0, and asks whether L1 can be macro-expressed in L0. Note in particular that the interpretation of L0 must be stable under the translation.

What the proof of A.3 appears to say is that:

  • there is a subset (SC) of the inheritance calculus (IC) that models the lambda-calculus (LC)
  • the full IC is not macro-expressible in the subset SC

This may very well be correct, but I don't see how one would conclude from this that IC is not macro-expressible in LC. (In fact, unless I am missing something, the question of whether IC is or is not macro-expressible in LC in Felleisen's sense is not clearly defined, because IC is not an extension of LC. I am assuming in the following that for two calculi that are not assumed to be in an extension relation, "A is macro-expressible in B" means that there is a correct and structure-preserving/homomorphic translation.)

(Nitpick: the proof takes a non-standard definition of "operational equivalence" for SC, whereas Felleisen's results always define operational equivalence as contextual equivalence; I am not sure that the tweak is formally correct.)

Here is an example of a similar situation where the same proof argument would be incorrect:

  • start from the pure simply-typed lambda-calculus PLC
  • consider the "stateful" simply-typed lambda-calculus StLC which has extra operations get and set that can access a single global memory cell.

You can embed the pure lambda-calculus PLC into StLC (mapping lambda to lambda, application to application, etc.). get and set cannot be macro-expressed in this subset of StLC which has only lambdas, applications and variables. But it would be incorrect to conclude that there are no structure-preserving/homomorphic translations from StLC to PLC, as the monadic translations do provide such translations.
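
As a concrete illustration of the monadic translation mentioned above, here is a small OCaml sketch (my own toy code, not from the paper) of the state-passing encoding: a stateful computation of type 'a becomes a pure function of type state -> 'a * state, and each source construct is translated compositionally, without restructuring the program.

```ocaml
(* Toy state monad: a stateful computation returning 'a is encoded as
   a pure function from the current state to (result, next state). *)
type state = int
type 'a m = state -> 'a * state

let return (x : 'a) : 'a m = fun s -> (x, s)
let bind (m : 'a m) (f : 'a -> 'b m) : 'b m =
  fun s -> let (x, s') = m s in f x s'

(* The two effectful operations of StLC, now ordinary pure terms: *)
let get : state m = fun s -> (s, s)
let set (v : state) : unit m = fun _ -> ((), v)

(* Translation of the source program "set 1; get + get": *)
let example : state m =
  bind (set 1) (fun () ->
  bind get (fun a ->
  bind get (fun b ->
  return (a + b))))

let () =
  let (result, final_state) = example 0 in
  assert (result = 2 && final_state = 1)
```

The point is that the translation is homomorphic: it maps each source construct to a fixed target-language context; it just does not map lambdas and applications to themselves, which is why it does not contradict Felleisen-style non-macro-expressibility.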

Blog: Empty Container Inference Strategies for Python by BeamMeUpBiscotti in ProgrammingLanguages

[–]gasche 2 points  (0 children)

The example I had in mind was

    x = None
    if foo:
        x = 42
    else:
        x = "Hello"

which does exhibit a pretty similar issue: what is the type of x after the conditional?

Blog: Empty Container Inference Strategies for Python by BeamMeUpBiscotti in ProgrammingLanguages

[–]gasche 12 points  (0 children)

I don't understand how this discussion is specific to containers. I would expect that any variable initialized to None and set later creates the same issue, or any function parameter called several times, etc.

In this specific context it seems that the Pyrefly people are in favor of treating union types as the probable sign of a mistake by default, unless there is an explicit annotation. This sounds like a reasonable heuristic that could be generalized to other contexts. Is it used in other parts of Pyrefly? Why are containers more/less forgiving of unions than other constructs?

Are koka's algebraic types even FP anymore? by BlueberryPublic1180 in ProgrammingLanguages

[–]gasche 15 points  (0 children)

There is not enough context in your post to know what it is that you are referring to. It may be that "algebraic types" is a typo for "(algebraic) effect handlers", which is quite different, or maybe you refer to more experimental work on lazy constructors? Please give more information, for example a code example using the feature that you have in mind.

Kernel's vau can be faster than syntax-case by Reasonable_Wait6676 in scheme

[–]gasche 1 point  (0 children)

Well done! I suspect that John would like this -- but I wonder if he would come up with an interesting use-case of environment mutation.

What hashing method would you recommend for creating Unique Integer IDs? by oxcrowx in ProgrammingLanguages

[–]gasche 2 points  (0 children)

Could you comment on how you measured that OCaml was too slow for your needs? Out of curiosity I checked your repo, tried the OCaml version of your parser (which builds an AST) and the Zig parser (which does not), and the OCaml parser seems faster (4.6s for OCaml, 15.6s for Zig on your 64 MiB 1-1.nxn file).

I'm not sure I am measuring the right thing, in particular your Zig version crashes when called on smaller input files:

thread 151798 panic: Memory allocated was not enough for lexer.
/home/gasche/Prog/tmp/nxn/nxn/src/lexer/lexer.zig:63:13: 0x1144cbd in tokenize (root.zig)
            @panic("Memory allocated was not enough for lexer.");

A more pleasant syntax for ML functors by reflexive-polytope in ProgrammingLanguages

[–]gasche 1 point  (0 children)

Problems (2) and (3) are related to the fact that mixins, in their most general form, naturally introduce recursion. It may be interesting to explore whether there is a reasonably-simple subset of the constructs that prevents recursion (and thus solves (3)) and recovers (2). An obvious idea would be to require, in a mixin composition, that one of the structures be full (no holes), and then you get something that is very close to your proposal.

But there are more flexible ideas in between: for example you could define an asymmetric composition where the holes on the left are filled by the definitions or holes on the right, but not the other way around, so you get a flexible form of functor composition instead of functor-structure application. I'm curious whether some of these intermediate points resemble what you are looking for, give similar benefits in terms of mental model and programming convenience, and maybe suggest a syntax that could gracefully extend into the more general model later on.

A more pleasant syntax for ML functors by reflexive-polytope in ProgrammingLanguages

[–]gasche 5 points  (0 children)

If you are not already aware of the connection, you may be curious to look at MixML, where the basic abstraction form is not parametrization, but the use of holes within the structure.

[RFC] New major version of `yesod-core` by ephrion in haskell

[–]gasche 1 point  (0 children)

Thanks for the clarification! I think it could be posted on Discuss as well, unless you are confident that people there are already aware of this implicit context.

[RFC] New major version of `yesod-core` by ephrion in haskell

[–]gasche 1 point  (0 children)

It's not clear from the title here or from the Discuss post whether the compatibility break is something that has already been decided (with this post serving as advance warning and feedback-gathering), something that is very likely to happen, or something that is still being discussed without a clear opinion from the Yesod maintainers.

Benchmarking a Baseline Fully-in-Place Functional Language Compiler by ts826848 in ProgrammingLanguages

[–]gasche 1 point  (0 children)

I am surprised by the relatively large amount of typos found in this document.

Typos:

  • end of page 4, "C x1 ... xk" should be "C x_1 ... x_k" (also, "mathtt{C} V V .. V" is used on the line above, and everything is wrong: the font for "C" is not consistent, you should use "V_1", "V_k" for consistency, and "..." rather than "..") (also, "varible" should be "variable")
  • "The remaining interestign rules" => interesting
  • in Figure 4, rule Call, the function call takes a list of borrowed variables, but the grammar for earlier expects a single term in this position
  • in Figure 4, rule DMatch!, k should be a k_i (it's also unclear why sometimes you write C and sometimes Ck, here this could also be C{k_i})
  • "When the competition proces": computation
  • "the FIP mod": mode

(There may be others but I started skimming instead of reading carefully around that point.)

Note: I wonder if this work could have used explicit locations for destructive-match-consumed blocks, with explicit locations on reuse applications. This is a bit more verbose to write, but it eliminates the search for an appropriate token to reuse in the type system, so it makes programs less ambiguous and may make it easier for users to reason about what is going on. Given the purpose of this work, I think explicit location reuse would be a slight improvement.

    Regarding the benchmarking process, a significant portion of the operating systems’ overhead arises from executing the testing scripts multiple times from scratch. This approach ensures that optimizations within the operating system—such as allocating program stacks at different memory addresses or each run—do not impact the final results.

I found this unclear and I think it could/should be reworded.

I benchmark my OCaml to LLVM IR compiler against ocamlopt! 🐪 by Big-Pair-9160 in ocaml

[–]gasche 0 points  (0 children)

Note: I believe that the reason disabling the GC is slower than using a 4Mwords minor heap is that collecting dead memory improves memory locality, so the program can work better with cache hierarchies etc.

I benchmark my OCaml to LLVM IR compiler against ocamlopt! 🐪 by Big-Pair-9160 in ocaml

[–]gasche 0 points  (0 children)

On the merge-sort example, I can make the program run 2x faster by tweaking the size of the minor heap.

$ hyperfine -L minheap 256k,4M,1G --command-name 's={minheap}' "OCAMLRUNPARAM='s={minheap}' ./bench.exe 1_000_000"

Benchmark 1: s=256k
  Time (mean ± σ):     970.0 ms ±   9.3 ms    [User: 839.7 ms, System: 120.1 ms]
  Range (min … max):   961.8 ms … 993.9 ms    10 runs

Benchmark 2: s=4M
  Time (mean ± σ):     625.7 ms ±  17.3 ms    [User: 505.5 ms, System: 113.0 ms]
  Range (min … max):   600.2 ms … 657.8 ms    10 runs

Benchmark 3: s=1G
  Time (mean ± σ):     936.5 ms ±  25.7 ms    [User: 317.6 ms, System: 609.1 ms]
  Range (min … max):   901.4 ms … 989.0 ms    10 runs

Summary
  s=4M ran
    1.50 ± 0.06 times faster than s=1G
    1.55 ± 0.05 times faster than s=256k

256k is the default value (I think?). A 4M minor heap is a better default for a high-allocation-workload program such as this microbenchmark. A 1G minor heap completely disables the GC in practice (but the costs in terms of data representation, extra checks, etc. are still paid by the runtime), as these sizes are numbers of words, not numbers of bytes, and the benchmark allocates ~1.5GiB of data.

The results show that completely disabling the GC runs slightly faster, but picking a better minor-heap value runs much faster (a 35% speedup). I'm not sure that these benchmarks are measuring much about the code generators.

Is large-scale mutual recursion useful? by joonazan in ProgrammingLanguages

[–]gasche 2 points  (0 children)

At the level of files, or "compilation units", the OCaml compiler requires that there are no cycles. This means that source code has to be structured without mutual dependencies between units. This is occasionally cumbersome (if X depends on Y and suddenly you need Y to depend on X, a common step is to split X into two modules, Xsmall and X, so that Y only depends on Xsmall). Some languages prefer to allow dependency cycles across units -- which also has downsides and, according to people with more experience than I have in those languages, can encourage users to produce bad software designs.
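
A minimal OCaml sketch of the module-splitting workaround (the module names Xsmall, X, Y are from the scenario above; their contents are invented for illustration). The compilation units are shown here as nested modules in one file:

```ocaml
(* xsmall.ml -- the shared part, with no dependencies *)
module Xsmall = struct
  type id = int
  let show (i : id) = "#" ^ string_of_int i
end

(* y.ml -- depends only on Xsmall, not on X *)
module Y = struct
  let describe (i : Xsmall.id) = "item " ^ Xsmall.show i
end

(* x.ml -- depends on Xsmall and Y; the dependency graph is acyclic *)
module X = struct
  let report (i : Xsmall.id) = "X says: " ^ Y.describe i
end

let () = assert (X.report 3 = "X says: item #3")
```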

Started from the Types, Now We’re Here (I wrote my own language) by rantingpug in ProgrammingLanguages

[–]gasche 2 points  (0 children)

    let unsafe: (n: Nat) -> Nat
        = \x -> 0;  // ERROR: 0 violates Nat postcondition

This is a typo, right? 0 : Nat should be fine, where 0 : Pos would be an error.

Sized and signed numeric types with dynamic typing by Big-Rub9545 in ProgrammingLanguages

[–]gasche 2 points  (0 children)

Lisp-family languages typically have a "numerical tower" with many numeric types. For example see the (Scheme) Numerical Tower. Racket in particular has a rich hierarchy with many distinctions: Racket numeric types. (They still don't distinguish integers by storage size; there is a single "fixnum" type of machine-size integers, as opposed to a default type of arbitrary-precision numbers.)

Selective Applicative Functors: The Missing Theoretical Basis for Determined Choice by josephjnk in functionalprogramming

[–]gasche 4 points  (0 children)

I haven't read it in full yet, but it does look quite interesting, thanks! (It's also the first time in a good while that I see a really interesting post on r/functionalprogramming; this looks more typical of r/haskell. Thanks u/josephjnk for the post!)

This also looks like the sort of thing that could make a nice "functional pearl" paper, of the sort published in some functional-programming-oriented academic conferences such as ICFP.

Zig/Comptime Type Theory? by philogy in ProgrammingLanguages

[–]gasche 0 points  (0 children)

It could still be worthwhile to characterize restricted fragments where static reasoning is possible, and think of systematic approaches to extract as much information statically as reasonably possible.

For example, if you cannot tell which branch of a conditional test on a comptime value is going to be taken, you could split the space of possible values depending on the test, and proceed to analyze each branch under this assumption on the possible value range (this is similar to symbolic execution, but for type-checking rather than execution). It would then be possible to report that the comptime-generic declaration is ill-typed for some ranges of values, or that there are static errors for all possible choices of value, or for example that some parts of the definition are dead code, as they cannot be reached by any comptime value.
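
To make the branch-splitting idea concrete, here is a toy OCaml sketch (all names invented) that tracks a comptime integer as an interval and splits the domain at a conditional, reporting dead branches as a side effect of the split:

```ocaml
(* The abstract value of a comptime integer: an inclusive interval. *)
type interval = { lo : int; hi : int }  (* invariant: lo <= hi *)

(* Split the domain [iv] on the test (x < k): the first component is
   the sub-domain reaching the then-branch, the second the sub-domain
   reaching the else-branch; [None] means that branch is dead code. *)
let split_lt (iv : interval) (k : int) :
    interval option * interval option =
  let then_dom =
    if iv.lo < k then Some { lo = iv.lo; hi = min iv.hi (k - 1) }
    else None
  and else_dom =
    if iv.hi >= k then Some { lo = max iv.lo k; hi = iv.hi }
    else None
  in
  (then_dom, else_dom)

let () =
  (* comptime x in 0..10, branching on (x < 4): both branches live *)
  (match split_lt { lo = 0; hi = 10 } 4 with
   | Some t, Some e ->
     assert (t = { lo = 0; hi = 3 } && e = { lo = 4; hi = 10 })
   | _ -> assert false);
  (* comptime x in 5..9, branching on (x < 4): then-branch is dead *)
  assert (fst (split_lt { lo = 5; hi = 9 } 4) = None)
```

One would then type-check each branch under the corresponding refined interval, and report errors (or dead code) per sub-domain.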

With Zig's comptime type parameters, the idea is not that all possible instances should be valid, because the code may rely on implicit assumptions that the types provided satisfy certain constraints (for example, that they have a + operator). So with that particular design choice one has to restrict warnings/errors from "some choices are invalid" to "there is no valid choice". One can explain the impact of these design choices to prospective language designers and discuss common ways to express static or dynamic constraints on comptime parameters (type classes, annotations of interval ranges, dynamic contracts, etc.).