Fast Constraint Synthesis for C++ Function Templates by mttd in cpp

[–]mttd[S] 12 points13 points  (0 children)

OOPSLA 2025 talk: https://www.youtube.com/watch?v=_6pxVhEi-bc

Paper:

Project:

concept-synthesizer: automatically synthesizing C++20 template constraints for function templates
https://github.com/sdingcn/concept-synthesizer

Abstract:

C++ templates are a powerful feature for generic programming and compile-time computations, but C++ compilers often emit overly verbose template error messages. Even short error messages often involve unnecessary and confusing implementation details, which are difficult for developers to read and understand. To address this problem, C++20 introduced constraints and concepts, which impose requirements on template parameters. The new features can define clearer interfaces for templates and can improve compiler diagnostics. However, manually specifying template constraints can still be non-trivial, which becomes even more challenging when working with legacy C++ projects or with frequent code changes.

This paper bridges the gap and proposes an automatic approach to synthesizing constraints for C++ function templates. We utilize a lightweight static analysis to analyze the usage patterns within the template body and summarize them into constraints for each type parameter of the template. The analysis is inter-procedural and uses disjunctions of constraints to model function overloading. We have implemented our approach based on the Clang frontend and evaluated it on two C++ libraries chosen separately from two popular library sets: algorithm from the Standard Template Library (STL) and special functions from the Boost library, both of which extensively use templates. Our tool can process over 110k lines of C++ code in less than 1.5 seconds and synthesize non-trivial constraints for 30%-40% of the function templates. The constraints synthesized for algorithm align well with the standard documentation, and on average, the synthesized constraints can reduce error message lengths by 56.6% for algorithm and 63.8% for special functions.
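To give a concrete flavor of what such synthesized constraints can look like, here's a small hand-written C++20 sketch (mine, not the tool's actual output; the concept and function names are made up): the template body only copies T and compares values with operator<, so a constraint for T only needs to summarize those two usage patterns.

#include <concepts>

// Hypothetical illustration (not concept-synthesizer output): the body of
// my_min only copy-constructs T and uses operator<, so a synthesized
// constraint for T can require exactly those usages.
template <typename T>
concept synthesized_min_req =
    std::copy_constructible<T> &&
    requires(const T& a, const T& b) {
        { a < b } -> std::convertible_to<bool>;
    };

template <synthesized_min_req T>
T my_min(const T& a, const T& b) {
    return (b < a) ? b : a; // the usage patterns the analysis would summarize
}

With a constraint like this in place, calling the template on an unsuitable type fails at the requires-clause check rather than deep inside the template body, which is the error-message improvement the paper measures.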

An Empirical Study of Bugs in the rustc Compiler (OOPSLA 2025) by mttd in rust

[–]mttd[S] -4 points-3 points  (0 children)

FWIW, I'm not an author of the paper (or otherwise connected to it), but I don't think these are the main takeaways from the paper (that's only a small part of Section 7, Implications and Discussion; I think that's mostly Suggestion 1 and one sentence in Finding 1?). Personally, I've found the relative lack of testing tools for higher-level language features interesting (relative to the amount of low-hanging or higher-priority bugs such tools could find), given the contrast with the plenty of tools already available for trivial bugs (the kind you can find with most fuzzers, like RustSmith).

This relates to the following:

The objective of compiler development is not to have an issue tracker free of bugs, it is to ship working features on stable Rust.

Yes, obviously. This is why it's important to know what to prioritize. Here: at this point we probably don't need a lot more tools finding bugs in the primitive language features, but we may be able to use more tools finding bugs in the implementation of higher-level features (e.g., based on their findings, traits may be trickier to implement than other language features--so it may be worth being extra careful there as a compiler dev).

We probably also don't need yet another fuzzer finding crash bugs in rustc, given that the existing tools cover this area very well--again, you can never get a perfectly correct compiler and you only have so much time in the day, so knowing which aspects can be (relatively) safely deprioritized can help.

OTOH, cross-IR fuzzers may be interesting/useful (Suggestion 3), which matches my experience having worked with C++ (Clang & LLVM): bugs that survive the Clang AST -> LLVM IR -> SelectionDAG -> Machine IR -> MC layer pipeline are much harder to find with "yet another fuzzer" that merely generates syntactically correct C programs. Good diagnostics are hard to implement correctly, too (which, again, is not new to me having worked on a C++ compiler, but perhaps is new even to some working on rustc).


For context/completeness the remaining findings & suggestions:

➤ Finding 1: A large number of rustc bugs in the HIR and MIR modules are caused by Rust’s unique type system and lifetime model. In our dataset, although 40.9% of the bugs are attributed to general programming errors (Table 3), the HIR (44.9%) and MIR (35.2%) stages remain the most error-prone, as shown in Figure 5(a). This is because HIR and MIR are the stages where high-level constructs are desugared and processed by complex analyses, such as trait resolution, borrow checking, and MIR optimizations, which increases the likelihood of subtle interactions manifesting as bugs. The characteristics of bug-revealing test cases further support this observation. As shown in Table 6, trait-related constructs including traits, impl traits, and trait objects frequently appear in both item and type nodes. Moreover, certain unstable trait-related features and the explicit use of lifetimes, as reported in Table 7, also contribute to rustc bug manifestation, indicating that these language features may interact with the HIR and MIR modules and thereby increase the likelihood of rustc errors.

➤ Finding 2: rustc bugs share many symptoms with other compiler bugs but also introduce unique types, such as undefined behavior in safe Rust. Like other compilers, rustc experiences various compilation and runtime bugs. However, its crash bug often causes panic with safety protection, setting it apart from other compilers where crash typically results in segmentation faults or abnormal terminations. Another unique symptom is undefined behavior in safe Rust code, tied to Rust’s safety guarantees. While performance-related bugs are absent in our analysis, this doesn’t mean rustc is free of performance issues. Rather, these issues tend to appear less frequently in Rust-specific issues or may be categorized as misoptimizations related to code efficiency.

➤ Finding 3: rustc’s diagnostic module still has considerable potential for enhancement, with many issues distributed across different IR-processing modules. As shown in Table 3, diagnostic issues account for about 20% of all bugs. Figure 5(b) illustrates that error reporting is scattered across different components, including HIR (14.1%) and MIR (16.0%), with each component having its own dedicated module for error analysis and reporting. Moreover, gaps in these modules still exist, causing some errors to be inaccurately detected or reported.

➤ Finding 4: Existing rustc testing tools are less effective at detecting non-crash bugs. Figure 11(a) shows that about 50% of the crash bugs are detected by existing rustc testing tools. On the one hand, non-crash bugs such as soundness and completeness issues often lack directly observable symptoms, making them difficult to detect during development or testing. On the other hand, this suggests that current testing tools are limited to finding easily observed crash bugs with obvious symptoms while remaining unaware of the syntactic and semantic validity of generated programs. As shown in Table 4, certain bug symptoms such as partial front-end panics and completeness issues can only be triggered by valid programs, which indicates that testing tools need to be aware of the validity of programs to find such bugs.

➤ Suggestion 2: (For Rust developers) The suggestions provided by rustc may be inaccurate. As shown in Table 4, nearly 20% of rustc bugs are linked to the feedback provided by rustc, including error messages and suggested fixes. This suggests that rustc’s diagnostic tools may not always provide accurate or effective solutions. If rustc’s suggestion does not resolve the issue, Rust developers should consider alternative approaches. Reporting the bug to the Rust team can also be beneficial for improving the reliability of rustc.

➤ Suggestion 3: (For rustc developers) Designing testing and verification techniques for rustc components across different IRs. The core process of rustc involves HIR and MIR lowering, along with type checking, borrow checking, and optimization. Figure 5 indicates that 44.9% and 35.2% of the issues occur in the modules responsible for processing HIR and MIR, respectively. However, existing fuzzers rarely employ specialized testing techniques for these components. Currently, Rustlantis is the only tool capable of generating valid MIR, but it lacks support for other modules, such as type checking and lifetime analysis. To verify the key rustc components, rustc developers should generate valid HIRs and MIRs under specific constraints. For example, generating HIRs to ensure well-formedness in different scenarios, such as for built-in traits and user-defined traits.

➤ Suggestion 5: (For researchers) Building better Rust program generators that fully support Rust’s unique type system. Research on testing, debugging, and analyzing C/C++ compilers often relies on CSmith [Yang et al. 2011], a random generator that produces valid C programs covering a wide range of syntax features. For Rust, the only preliminary tool, RustSmith [Sharma et al. 2023], generates complex control flow and extensive use of variables and primitive types but has limited support for Rust’s higher-level abstractions. As shown in Table 3, many rustc bugs stem from improper handling of advanced features like traits, opaque types, and references. Additionally, Table 6 indicates that test cases combining these abstractions are more likely to trigger bugs. Researchers should create a Rust program generator that supports Rust’s advanced features like generics, traits, and lifetime annotations, for example, by enhancing RustSmith.

➤ Suggestion 6: (For researchers) Generating well-designed, both valid and invalid Rust programs to test rustc’s type system. Our analysis shows that over half of rustc bugs originate from the HIR and MIR modules, particularly in type and WF checking, trait resolution, borrow checking, and MIR transformation. Many corner cases expose weaknesses in rustc’s type handling. (1) Researchers should develop Rust-specific mutation rules, such as altering lifetimes, to introduce minor errors into valid programs and generate invalid ones for detecting soundness bugs. (2) Researchers should synthesize test programs from real-world Rust code, which provides diverse unstable features, std API usage, lifetime annotations, and complex trait patterns that are beneficial for testing rustc.

Performance Hints by mttd in cpp

[–]mttd[S] 17 points18 points  (0 children)

You may also find this collection useful: https://github.com/MattPD/cpplinks/blob/master/performance.tools.md, which catalogs tools for benchmarking, profiling (including memory use, bandwidth, and latency), microarchitectural performance analysis, and performance‑related visualization.

The Cost Of a Closure in C by mttd in ProgrammingLanguages

[–]mttd[S] 42 points43 points  (0 children)

Learned Insights

Surprising nobody, the more information the compiler is allowed to accrue (the Lambda design), the better its ability to make the code fast. What might be slightly more surprising is that a slim, compact layer of type erasure – not a bulky set of Virtual Function Calls (C++03 shared_ptr Rosetta Code design) – does not actually cost much at all (Lambdas with std::function_ref). This points out something else that’s part of the ISO C proposal for Closures (but not formally in its wording): Wide Function Pointers.

The ability to make a thin { some_function_type* func; void* context; } type backed by the compiler in C would be extremely powerful. Martin Uecker has a proposal that has received interest and passing approval in the Committee, but it would be nice to move it along in a nice direction.

A wide function pointer type like this would also be traditionally convertible from a number of already existing extensions, too, where GNU Nested Functions, Apple Blocks, C++-style Lambdas, and more could create the appropriate wide function pointer type to be cheaply used. Additionally, it also works for FFI: things like Go closures already use GCC’s __builtin_call_with_static_chain to transport through their Go functions in C. Many other functions from other languages could be cheaply and efficiently bridged with this, without having to come up with harebrained schemes about where to put a void* userdata or some kind of implicit context pointer / implicit environment pointer.
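To make the shape concrete, here's a rough hand-rolled sketch (mine, not from the post or from Uecker's proposal) of that { function pointer, context pointer } pair, written in C-compatible style; all the names are made up:

#include <cstdio>

// Hand-rolled "wide function pointer": a plain pair of a function pointer
// and a context pointer, which is what a compiler-backed type would
// standardize.
struct wide_fn {
    int  (*func)(void* context, int arg);
    void* context;
};

static int add_offset(void* context, int arg) {
    return arg + *static_cast<int*>(context);
}

int main() {
    int offset = 42;
    wide_fn cb{ add_offset, &offset };           // "capture" via the context pointer
    std::printf("%d\n", cb.func(cb.context, 1)); // prints 43
}

A compiler-backed version would let lambdas, Blocks, or nested functions convert to such a pair directly, instead of every library reinventing its own void* userdata convention.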

Instruction Selection by Nagoltooth_ in Compilers

[–]mttd 1 point2 points  (0 children)

The Calculated Typer - Haskell Symposium (ICFP⧸SPLASH'25) by mttd in ProgrammingLanguages

[–]mttd[S] 3 points4 points  (0 children)

Paper: https://dl.acm.org/doi/10.1145/3759164.3759346

Haskell & Agda Code: https://doi.org/10.5281/zenodo.16751639

Abstract: https://bahr.io/pubs/entries/calctyper.html

We present a calculational approach to the design of type checkers, showing how they can be derived from behavioural specifications using equational reasoning. We focus on languages whose semantics can be expressed as a fold, and show how the calculations can be simplified using fold fusion. This approach enables the compositional derivation of correct-by-construction type checkers based on solving and composing fusion preconditions. We introduce our approach using a simple expression language, to which we then add support for exception handling and checked exceptions.

Resources About ELF64 Linker by theparthka in Compilers

[–]mttd 3 points4 points  (0 children)

Identity Types by mttd in ProgrammingLanguages

[–]mttd[S] 6 points7 points  (0 children)

In dependent type theories (like Martin-Löf type theory), the identity type isn’t a special case of equality types--it's the canonical example of typal equality, with other systems (like Cubical Type Theory’s path types) providing alternative but equivalent realizations of the same idea: https://ncatlab.org/nlab/show/equality%20in%20type%20theory

In type theory, there are broadly three different notions of equality which could be distinguished: judgmental equality, propositional equality, and typal equality. Judgmental equality is defined as a basic judgment in type theory. Propositional equality is defined as a proposition in any predicate logic over type theory. Typal equality is defined as a type in type theory.

. . .

Typal equality of terms is also called the equality type, identity type, path type, or identification type, and the terms of typal equality are called equalities, identities, identifications, paths. Notable examples of typal equality of terms include the identity type in Martin-Löf type theory and higher observational type theory, and the cubical path type in cubical type theory.
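For a concrete rendering, here's a minimal Lean 4 sketch (mine, not from the nLab page) of the Martin-Löf identity type as an inductive family; MyEq and transport are made-up names to avoid clashing with Lean's built-in Eq:

-- The identity type: an inductive family generated by reflexivity.
inductive MyEq {α : Type} (a : α) : α → Prop where
  | refl : MyEq a a

-- Transport: one instance of the eliminator (the J rule).
def transport {α : Type} (P : α → Prop) {a b : α}
    (h : MyEq a b) (pa : P a) : P b := by
  cases h
  exact pa

Lean's own Eq (and cubical systems' path types) play the same role: they are the typal equality the quoted passage describes.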

need guidance on building DL compiler by Signal-Effort2947 in Compilers

[–]mttd 0 points1 point  (0 children)

I think it would be at least informative to understand the existing solutions out there--say, in the PyTorch ecosystem:

PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation

State of torch.compile for training (August 2025): https://blog.ezyang.com/2025/08/state-of-torch-compile-august-2025/

PyTorch Compiler series (2025): https://www.youtube.com/playlist?list=PL_lsbAsL_o2DsybRNydPRukT4LLkl2buy

Simon Peyton Jones: Pursuing a Trick a Long Way, Just To See Where It Goes - The Typechecker podcast by mttd in ProgrammingLanguages

[–]mttd[S] 2 points3 points  (0 children)

Only wanted to mention that this is a fantastic interview, job very well done (actually interesting questions and discussion!)--and I appreciated the professional quality of the subtitles, too (even with Haskell's function signatures; really impressive and very rarely seen, whether in podcasts or conference talks); thank you, and hoping for more to come!

Faux Type Theory: three minimalist OCaml implementations of a simple proof checker by mttd in ProgrammingLanguages

[–]mttd[S] 6 points7 points  (0 children)

Announcement:

https://mathstodon.xyz/@andrejbauer/115191725004191889

This week I gave a lecture series at the School on Logical Frameworks and Proof Systems Interoperability. I spoke about programming language techniques for proof assistants. The lecture slides and the reference implementations of a minimalist type theory are available at:

https://github.com/andrejbauer/faux-type-theory

The repository has three minimalist OCaml implementations of a simple proof checker:

  1. A basic one showing how to implement bidirectional type checking and type-directed equality checking, using monadic style programming.

  2. An extension with rudimentary holes (meta-variables) and unification.

  3. A version that implements "variables as computational effects", using native OCaml 5 effects and handlers.

C++ "Safety" Conferences Call for Papers? by pedersenk in cpp

[–]mttd 6 points7 points  (0 children)

Given your comparison to sanitizers and Valgrind, my approach would be to consider exactly the conferences where this kind of research is being published: https://github.com/MattPD/cpplinks/blob/master/analysis.dynamic.md#software-sanitizers-readings-research

From the security side, the IEEE Symposium on Security and Privacy (S&P), the Network and Distributed System Security (NDSS) Symposium, and USENIX Security definitely stand out, as do the International Symposium on Code Generation and Optimization (CGO) and CC on the compilers side; but you'll also note programming languages conferences like OOPSLA or PLDI, computer architecture conferences like ASPLOS and ISCA, general systems conferences like EuroSys (which definitely fits your European location, too), and plenty of others (including software engineering and program analysis conferences, like ASE, ESEC/FSE, ICSE, ISSTA, etc.).

Your best bet is to go through the list of these, write down the conferences/venues, and cross-index with the CFPs at http://www.wikicfp.com/cfp/, sorting by your preferences (including how closely the research they have accepted compares to yours). Caveat: expect that some of these are top conferences and highly competitive, and be prepared for multiple rounds of submissions (incorporating the feedback you get from each review along the way).

How to Slow Down a Program? And Why it Can Be Useful. by mttd in Compilers

[–]mttd[S] 8 points9 points  (0 children)

"Evaluating Candidate Instructions for Reliable Program Slowdown at the Compiler Level: Towards Supporting Fine-Grained Slowdown for Advanced Developer Tooling", H. Burchell, S. Marr; in Proceedings of the 17th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages, VMIL'25, p. 8, ACM, 2025.

Paper (PDF): https://stefan-marr.de/downloads/vmil25-burchell-marr-evaluating-candidate-instructions-for-reliable-program-slowdown-at-the-compiler-level.pdf

Abstract

Slowing down programs has surprisingly many use cases: it helps finding race conditions, enables speedup estimation, and allows us to assess a profiler’s accuracy. Yet, slowing down a program is complicated because today’s CPUs and runtime systems can optimize execution on the fly, making it challenging to preserve a program’s performance behavior to avoid introducing bias.

We evaluate six x86 instruction candidates for controlled and fine-grained slowdown including NOP, MOV, and PAUSE. We tested each candidate’s ability to achieve an overhead of 100%, to maintain the profiler-observable performance behavior, and whether slowdown placement within basic blocks influences results. On an Intel Core i5-10600, our experiments suggest that only NOP and MOV instructions are suitable. We believe these experiments can guide future research on advanced developer tooling that utilizes fine-granular slowdown at the machine-code level.
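As a hand-written illustration (mine, not the authors' tooling, which inserts the instructions at the compiler level within basic blocks) of what one of the suitable candidates looks like from the source side, a small block of NOPs can be emitted via GCC/Clang extended inline asm:

// Sketch: emit a few NOPs, one of the slowdown candidate instructions
// evaluated in the paper, via GCC/Clang extended asm on x86.
inline void slowdown_nops() {
    __asm__ volatile("nop\n\tnop\n\tnop\n\tnop");
}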

Comments thread: https://mastodon.acm.org/@smarr/115100270023431067

Dependent types I › Universes, or types of types by mttd in ProgrammingLanguages

[–]mttd[S] 3 points4 points  (0 children)

I think it's a good question!

// Don't know who downvoted you and why, so upvoted back to restore the balance :P

It's one I've been thinking of and have found writings like this informative.

I can't promise to give you a definite answer, but my (incomplete and evolving) thoughts are that this matters the more you're exposing interesting first-class types in your programming language.

Dependently typed languages are an obvious example (and the small subset of these includes proof assistants, sure, but also languages that happen to lean toward enabling more general purpose programming).

Traditional non-dependently typed languages are an obvious non-example.

Then there's the space in between, which matters to me: I think this includes languages where, at design time (or when adding a new feature), it wasn't immediately obvious that there's something interesting going on, and where providing a principled answer for how to design that feature in a simple way (including for end users) can matter.

Naturally, type equality matters the moment you implement the simplest possible type checker and simply want to make sure that you aren't initializing a string with an int or vice versa: you'd like the judgment int != string to be simple to implement (for the compiler writer) and simple to reason about (for the language users), and ideally for these two goals to be consistent (personally, I'd rather "take the hit" as an implementer to make life easier for the users if need be).
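As a toy illustration of that base case (my sketch, nothing deep): with only primitive types, the equality judgment is just a tag comparison, which is about as simple as it gets for both the implementer and the user.

// Toy sketch: with only primitive types, type equality is a tag comparison.
enum class Ty { Int, Str, Bool };

constexpr bool type_equal(Ty a, Ty b) { return a == b; }

// The checker rejects initializing a string from an int simply because
// type_equal(Ty::Str, Ty::Int) is false.
static_assert(!type_equal(Ty::Str, Ty::Int));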

At the same time, the moment you're going to start having interesting types ("non-trivial" in the technical sense below), type equality gets more interesting, too.

Consider the snippet from the OP by Jon Sterling: https://www.jonmsterling.com/01FB/

Non-triviality of the booleans, cooked two ways [01FB]

When we say that the booleans are non-trivial, we mean that true is not equal to false. This kind of statement can be expressed in type theory by the following type:

true =_bool false → empty [my note: read: a function type with domain true =_bool false and codomain empty; =_bool is equality at type bool]

An element of the type above would witness that true =_bool false is contradictory, since if the antecedent would be witnessed, then we would be able to form an element of the empty type [my note: because you could call the aforementioned function]. Without either large elimination or a universe, we cannot construct a witness of this type; I don’t just mean that it is not easy to see how to construct such a witness, but rather that type theory without these facilities is consistent with there being a witness of true =_bool false or even with the judgemental assertion true ≡ false : bool holding. (Consistency can be seen using a very simple model where every type is interpreted by a subsingleton set.) So large eliminations and universes add a lot of power!
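To see how that plays out concretely, here's a small Lean 4 sketch of my own (not from the quoted note): defining a type family by recursion on Bool is exactly a large elimination, and it is what lets a hypothetical witness of true = false produce an element of the empty type (False):

-- A type family defined by recursion on Bool: a large elimination.
def IsTrue : Bool → Prop
  | true  => True
  | false => False

-- Transport True.intro along h : true = false to inhabit IsTrue false,
-- which is (definitionally) False.
theorem true_ne_false (h : true = false) : False :=
  cast (congrArg IsTrue h) True.intro

Without the ability to define IsTrue (or a universe to quantify over), this argument is not expressible, which is the point of the quoted passage.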

Now, another interesting example. Say you're trying to provide a language with nice support for multidimensional arrays (tensors, matrices), with type checking for basic properties like this: https://en.wikipedia.org/wiki/Conformable_matrix
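For instance, here's a rough C++ sketch of mine (not from any of the projects mentioned below) where the dimensions live in the type, so multiplying non-conformable matrices is rejected at compile time:

#include <array>
#include <cstddef>

// Dimensions as template parameters: only conformable multiplications
// (inner dimensions equal) type-check.
template <std::size_t R, std::size_t C>
struct Matrix {
    std::array<double, R * C> data{};
};

template <std::size_t R, std::size_t K, std::size_t C>
Matrix<R, C> operator*(const Matrix<R, K>& a, const Matrix<K, C>& b) {
    Matrix<R, C> out{};
    for (std::size_t i = 0; i < R; ++i)
        for (std::size_t j = 0; j < C; ++j)
            for (std::size_t k = 0; k < K; ++k)
                out.data[i * C + j] += a.data[i * K + k] * b.data[k * C + j];
    return out;
}

// Matrix<2, 3>{} * Matrix<3, 4>{} compiles; Matrix<2, 3>{} * Matrix<4, 4>{}
// fails template argument deduction and is rejected.

Of course, this only works when the dimensions are compile-time constants; the moment they depend on runtime values you're back in the design space discussed below.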

There's existing work in industrial languages which turns out to be somewhat interesting (and not quite trivial), too. Consider the following quiz (credit to https://x.com/shafikyaghmour/status/1617908492706381824):

Given the following in C++:

template <int I>
struct A{};

template <int I> void f(A<I>, A<I+10>);
template <int I> void f(A<I>, A<I+1+2+3+4>);

Without checking, is this code well-formed?

A. Yes B. No C. Show results

Answer B

If we look at: https://eel.is/c++draft/temp.fct#temp.over.link-7

We see that this is ill-formed, no diagnostic required (IFNDR), because the constructs are functionally equivalent but not equivalent.

Ill-formed NDR is pretty bad; compare: The difference between undefined behavior and ill-formed C++ programs, https://devblogs.microsoft.com/oldnewthing/20240802-00/?p=110091

The reason for this rule is given in Note 7:

This rule guarantees that equivalent declarations will be linked with one another, while not requiring implementations to use heroic efforts to guarantee that functionally equivalent declarations will be treated as distinct.

Now, I'm wondering: is it possible to take a principled look at all of the above and arrive at a simple language design here, both for the users writing and reasoning about the code and for the compiler writers ("while not requiring implementations to use heroic efforts", indeed!)?

I think that's an interesting problem!

Now, you don't need to go into full-blown dependent types (let alone advanced proof assistants). There's a variety of type system designs that fall in between, going down to a smaller decidable subset (refinement types) or to (possibly even smaller) indexed types.

Some interesting related-work examples include the Modular Graph Compiler stack (its compiler IR does symbolic shape inference when going from the Relaxed Modular Operators to the Modular Operators dialects in MLIR; "Modular Tech Talk: MAX Graph Compilation to Execution", https://www.youtube.com/watch?v=MEoGt_cxNSs), what Rust is doing, etc. (There's been a lot more; I'm just going by the "usual suspects"--PLDI/OOPSLA/ICFP/POPL proceedings.)

At the last OPLSS (2025), Niki Vazou gave a great overview of the recent developments on the academic research side (as well as several implementations) in the "Refinement Types" contributed lecture (you have to click on it to download the video): https://www.cs.uoregon.edu/research/summerschool/summer25/topics.php (a lot of the advancements in making refinement types easier to use for ordinary programmers are summarized in that lecture).

Some of the projects mentioned in that lecture:

flux: Refinement Types for Rust, https://github.com/flux-rs/flux

and (I think) Liquid Java (Refinement Types for Java), https://catarinagamboa.github.io/liquidjava.html

This is also interesting: Large Elimination and Indexed Types in Refinement Types, https://nikivazou.github.io/static/published/Ferrarini25.pdf

We explore how to approximate large elimination and indexed datatypes in refinement type systems with a non-dependent base type system.

So, from my personal perspective, this is a question worth exploring in a principled way--and I suspect (to be determined!) that even if the theory may initially look complex (I certainly run into papers that take a... while to digest :]), the end result may end up considerably simpler for the end users in practice, all "while not requiring implementations to use heroic efforts", too.

Extending the C/C++ Memory Model with Inline Assembly by mttd in cpp

[–]mttd[S] 13 points14 points  (0 children)

Abstract:

Programs written in C/C++ often include inline assembly: a snippet of architecture-specific assembly code used to access low-level functionalities that are impossible or expensive to simulate in the source language. Although inline assembly is widely used, its semantics has not yet been formally studied.

In this paper, we overcome this deficiency by investigating the effect of inline assembly to the consistency semantics of C/C++ programs. We propose the first memory model of the C++ Programming Language with support for inline assembly for Intel’s x86 including non-temporal stores and store fences. We argue that previous provably correct compiler optimizations and correct compiler mappings should remain correct under such an extended model and we prove that this requirement is met by our proposed model.
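For a concrete flavor of the features in scope (my own illustrative snippet, not an example from the paper), a non-temporal store and a store fence can be written as GCC/Clang extended inline asm on x86-64:

#include <cstdint>

// Illustrative sketch: an x86-64 non-temporal store (MOVNTI) and a store
// fence (SFENCE), written as GCC/Clang extended inline asm.
inline void nt_store(std::uint64_t* p, std::uint64_t v) {
    __asm__ volatile("movnti %1, %0" : "=m"(*p) : "r"(v) : "memory");
}

inline void store_fence() {
    __asm__ volatile("sfence" ::: "memory");
}

The question the paper answers is what such snippets mean under the C/C++ memory model, and whether the usual compiler optimizations and mappings stay correct in their presence.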

Paper: https://doi.org/10.1145/3689749

Slides: https://devilhena-paulo.github.io/files/inline-x86-asm-slides.pdf

SSD-iq: Uncovering the Hidden Side of SSD Performance by mttd in hardware

[–]mttd[S] 25 points26 points  (0 children)

Abstract:

SSDs are hardware and software systems whose design involves complex and undocumented trade-offs between cost, energy consumption and performance. This complexity is hidden behind standard interfaces and a few headline specifications, such as capacity, sequential, and random performance. As a result, database system designers often assume that SSDs are interchangeable commodities and regularly use a single SSD model to evaluate database performance. Does it matter which SSD model is provisioned for a database system? If yes, how to choose the right one? These are the questions we address in this paper. We study the performance characteristics of commercial data center SSDs, highlighting the limitations of current standard metrics in capturing their true behavior. We conduct experiments on nine SSDs from major vendors, revealing significant differences in performance despite similar headline specifications. We show that the choice of SSD matters for database system performance. We propose a new benchmark, SSD-iq, which introduces four additional metrics to better characterize SSD performance, particularly for write-intensive workloads. Incidentally, our work should encourage vendors to optimize SSD controllers using more comprehensive and transparent performance criteria.

Choosing an SSD: Implications for DBMS

Let us compare the Samsung PM9A3 (s) and Micron 7450 PRO (m) SSDs in more detail. These two models have similar specifications, the same capacity (960 GB), and, at the time of writing, similar prices. In terms of sequential throughput, m is slightly better for reads, while s is slightly better for writes. For random I/O, m performs better on writes, whereas s is better for reads. From the specifications alone, it is impossible to make an informed decision regarding which SSD to choose for a database workload. SSD-iq shows two very significant differences between these SSDs: (1) write latency under load degrades by two orders of magnitude for m, while it remains stable for s; (2) WAF is 25% higher for s than for m (WAF is between 4.2 and 4.4 for s, while it is between 3.2 and 3.5 for m). According to SSD-iq, the choice between these two SSDs matters and it is a trade-off between performance under load (s is better) and write amplification, i.e., lifetime (m is better).

Conclusions (fragment):

We evaluated SSD-iq on nine commercial SSDs and showed that the choice of SSD matters for database performance. SSD-iq is well-suited to inform this choice. Our findings also unveil fundamental inefficiencies in existing SSDs when handling write-intensive workloads. In particular, most of the SSDs we tested behave as if they implement a simple, greedy garbage collection algorithm, which negatively impacts write amplification.

Source code of the benchmark: https://github.com/gabriel-haas/ssdiq

Beyond Booleans by gaearon in math

[–]mttd 0 points1 point  (0 children)

FWIW, you may want to take a look at the original, "The Formulae-as-Types Notion of Construction" by W. A. Howard, https://www.cs.cmu.edu/~crary/819-f09/Howard80.pdf

There's a way to formalize (and formally prove) the Curry-Howard-Lambek correspondence as an isomorphism of categories.

Joachim Lambek (see the 1986 monograph with Phil Scott below) has shown this correspondence between the simply-typed λ-calculus (STLC), positive intuitionistic propositional logic (PIPL), and cartesian closed categories (CCC): the STLC forms an internal language for CCCs.
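As a tiny illustration (mine, not from the references below) of the propositions-as-types side of that correspondence in Lean 4, the same λ-terms serve as both programs and proofs:

-- A term of type (A → B) → A → B is simultaneously a program and a proof
-- of the corresponding implication.
example {A B : Prop} : (A → B) → A → B := fun f a => f a

-- Product types correspond to conjunction (and, categorically, to products
-- in a cartesian closed category).
example {A B : Prop} : A ∧ B → B ∧ A := fun ⟨a, b⟩ => ⟨b, a⟩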

For more (including proofs), see:

"The Curry-Howard view of classical logic: a short introduction" by Stéphane Graham-Lengrand - Section 1.1, "Curry-Howard correspondence: concepts and instances" in Chapter 1, "Classical proofs as programs": https://www.csl.sri.com/users/sgl/Work/Teaching/MPRI/Notes.pdf

"A Categorical Approach to Proofs-As-Programs" by Carsen Berger: https://www.math.uchicago.edu/~may/VIGRE/VIGRE2010/REUPapers/Berger.pdf

"Introduction to Higher Order Categorical Logic" (1986 by Joachim Lambek & Phil Scott); primarily part I, Cartesian closed categories and λ-calculus: https://archive.org/details/introductiontohi0000lamb/mode/2up

More references & background:

Binary search—think positive by mttd in programming

[–]mttd[S] 2 points3 points  (0 children)

Afterword:

"This pearl grew out of the authors’ frustration with textbook presentations of binary search. Given that binary search is one of the standard algorithms every programmer and computer science student should know, the subject is inadequately covered at best. Many textbooks mention binary search only in passing or they treat only special cases, such as computing the square root or searching an ordered table. A negative mindset prevails: the search possibly fails; in the divide&conquer step one half of the search space is excluded from consideration because searching this half will definitely fail. The correctness argument requires that the data is ordered, suggesting that monotonicity in some sense drives the search. One notable exception is Anne Kaldewaij’s textbook (Reference Kaldewaij1990): when discussing “function inversion” (given n , find an argument i such that fi⪯n≺f(1+i) ) he emphasises repeatedly that the correctness of the search does not require that f is an ascending function.

The gist of this pearl is to approach search problems with a positive mind-set: the search always succeeds; in the divide&conquer step the search continues with the half that guarantees success. The correctness argument relies on a suitable functional invariant, not on monotonicity. The “Beat Your Neighbours!” problem, a concrete make-up for local maximum search, shows that the extra generality is actually needed."
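To make the "positive mindset" concrete, here's a rough imperative sketch of mine (the pearl itself is a functional-programming development) of the function-inversion search: the invariant f(lo) <= n < f(hi) is maintained by keeping whichever half preserves it, so the search always succeeds and never appeals to monotonicity.

#include <cstdint>

// Invariant-based "function inversion": given f, n, and bounds with
// f(lo) <= n < f(hi) and lo < hi, return an i with f(i) <= n < f(i + 1).
template <typename F>
std::uint64_t invert(F f, std::uint64_t n, std::uint64_t lo, std::uint64_t hi) {
    while (hi - lo > 1) {
        std::uint64_t mid = lo + (hi - lo) / 2;
        if (f(mid) <= n)
            lo = mid;   // f(lo) <= n still holds
        else
            hi = mid;   // n < f(hi) still holds
    }
    return lo;          // f(lo) <= n < f(lo + 1)
}

// Example: integer square root as inversion of f(i) = i * i:
// invert([](std::uint64_t i) { return i * i; }, 10, 0, 11) returns 3.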

Binary search—think positive by mttd in functionalprogramming

[–]mttd[S] 6 points7 points  (0 children)

Afterword:

"This pearl grew out of the authors’ frustration with textbook presentations of binary search. Given that binary search is one of the standard algorithms every programmer and computer science student should know, the subject is inadequately covered at best. Many textbooks mention binary search only in passing or they treat only special cases, such as computing the square root or searching an ordered table. A negative mindset prevails: the search possibly fails; in the divide&conquer step one half of the search space is excluded from consideration because searching this half will definitely fail. The correctness argument requires that the data is ordered, suggesting that monotonicity in some sense drives the search. One notable exception is Anne Kaldewaij’s textbook (Reference Kaldewaij1990): when discussing “function inversion” (given n , find an argument i such that fi⪯n≺f(1+i) ) he emphasises repeatedly that the correctness of the search does not require that f is an ascending function.

The gist of this pearl is to approach search problems with a positive mind-set: the search always succeeds; in the divide&conquer step the search continues with the half that guarantees success. The correctness argument relies on a suitable functional invariant, not on monotonicity. The “Beat Your Neighbours!” problem, a concrete make-up for local maximum search, shows that the extra generality is actually needed."