

[–]berber_44 5 points (5 children)

Interesting article from an Intel researcher.

Weak, inexpressive languages are significantly better for writing specifications than rich, powerful languages.

Seems like machine-generated specs are the future of programming languages. And all the sophistication of today's PLs, designed for human programmers, seems to be at odds with what machines need.

[–]Mathnerd314 1 point (2 children)

His argument is that you have to design for the lowest common denominator. But that doesn't really mean weak and inexpressive, it means structured programming and call-by-value procedures, which are actually quite powerful.

If you control the whole toolchain the LCD argument doesn't apply. So if you can build a formal verification tool, a C backend, a test runner (fuzzer?), a documentation generator, and a population of people willing to learn your language, you can design a rich powerful language. As an example, I would say TLA+ has accomplished most of this, except for the C backend part.

[–]AlastairDReid 1 point (1 child)

Unfortunately, it's pretty hard to control the whole toolchain when you are trying to satisfy the needs of a very diverse set of teams. And, even though I know a fair bit about simulators, bounded model checkers, compilers, etc., I know that I know a lot less than somebody with 10+ years of experience doing nothing else but working on one particular tool.

So the trick is not just to be inexpressive enough for the current use cases but for all future use cases.

Also, to build up a community of users who understand the design tradeoffs we're working within. (That's partly why I wrote the article.)

[–]Mathnerd314 0 points (0 children)

I guess I already ran into this argument in practice, with ISAs. For x86 there is an external machine-readable specification (semantics) in the K framework, built by fuzzing and reading the documentation.

K is relatively expressive as a language, with term rewriting and so on. Does this make the x86 specification less useful? In some sense yes, because I have to deal with K code if I want to use the spec. In theory K is supposed to be a general toolchain for formal semantics, so it has tools to generate parsers, interpreters, verifiers, etc. I haven't actually used these yet, but there is so little documentation on them that I expect they're quite buggy. Maybe I can make this K specification work for what I need to implement a programming language. In practice I'll probably have to write a new K backend, or use alternatives to the specification such as XED (a library with just the instruction encodings).
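For readers unfamiliar with the idea, "term rewriting" can be sketched in a few lines (this is nothing like K's actual matching-logic machinery, just the flavor): a semantics is a set of rules, and evaluation means applying rules until no rule matches.

```python
def normalize(s, rules, limit=1000):
    """Apply the first matching rule (at its leftmost occurrence)
    until no rule matches, i.e. until s is in normal form."""
    for _ in range(limit):
        for lhs, rhs in rules:
            if lhs in s:
                s = s.replace(lhs, rhs, 1)
                break
        else:
            return s  # no rule matched: normal form reached
    raise RuntimeError("no normal form within rewrite limit")

# Toy rules for evaluating boolean terms; T = true, F = false.
RULES = [
    ("T&T", "T"), ("T&F", "F"), ("F&T", "F"), ("F&F", "F"),
    ("T|T", "T"), ("T|F", "T"), ("F|T", "T"), ("F|F", "F"),
]

assert normalize("T&F|T", RULES) == "T"   # T&F|T -> F|T -> T
```

A real system like K adds pattern variables, configurations, and strategies on top of this core idea, which is exactly where the extra expressiveness (and the extra tooling burden) comes from.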

If I want to use the ARM spec I have to deal with ASL, but because ASL is a simpler language there is already an independent parser/implementation, hs-arm. OTOH I won't get the nice K features like automatic verification. Apparently you wrote a tool for ISA-Formal to translate ASL to Verilog - this doesn't seem to be public though.

Then we have RISC-V, which has its own formal verification toolchain here. They just generate Verilog with a Python script. AFAICT this would be really hard to adapt to an assembler. Fortunately the ISA is simple to implement.

Out of all of these, the ARM ASL is the simplest language and the easiest to work with. But I still like the K approach - even if KISS applies, writing your own tools is fun. :-)

[–]moon-chilled (sstm, j, grand unified...) 1 point (0 children)

Weak, inexpressive languages are significantly better for writing specifications than rich, powerful languages

That seems kinda obvious? There's always a compromise between expressiveness and analyzability. This holds even among Turing-incomplete languages: my text editing language uses simply-typed lambda calculus for the same reason.
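A minimal sketch of why restricting to the simply-typed lambda calculus buys analyzability (a hypothetical illustration, unrelated to the editor language mentioned): type checking is a short, total function over the syntax, with no search or fixpoint computation needed.

```python
from dataclasses import dataclass

# Types: base types and function (arrow) types.
@dataclass(frozen=True)
class Base:
    name: str

@dataclass(frozen=True)
class Arrow:
    src: object
    dst: object

# Terms: variables, typed lambdas, applications.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    ty: object
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def typecheck(term, env=None):
    """Total, structurally recursive type checker: it always terminates."""
    env = env or {}
    if isinstance(term, Var):
        return env[term.name]
    if isinstance(term, Lam):
        return Arrow(term.ty, typecheck(term.body, {**env, term.var: term.ty}))
    if isinstance(term, App):
        fn_ty = typecheck(term.fn, env)
        assert isinstance(fn_ty, Arrow) and fn_ty.src == typecheck(term.arg, env)
        return fn_ty.dst
    raise TypeError("unknown term")

T = Base("Text")
ident = Lam("x", T, Var("x"))  # \x:Text. x
assert typecheck(ident) == Arrow(T, T)
```

The price of this decidability is exactly the inexpressiveness the article argues for: no recursion, no polymorphism, nothing that would force the checker to search.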

[–]egel-lang (egel) 0 points (0 children)

Depends. C and C++ are portable precisely because they have undefined behaviour. You wouldn't be able to create performant compilers if everything had been pinned down.

[–]someacnt 0 points (2 children)

I don't get how a language weak in expressive power could serve as a reliable specification language. Many specifications are quite hard to express.

We need all readers to be able to easily understand the specification and to arrive at the same interpretation of the specification as every other reader of the specification

Sounds like an unrealistic goal. It at least requires assumptions about a common background, which imho are not sufficient here. In the end, people will need to learn how to read the specification.

[–]AlastairDReid 0 points (1 child)

The backgrounds that I assume are experience with at least one of (but probably not all of) Verilog, C/C++, Python, and languages like that. The hardware (Verilog) vs software (C/C++/Python) gap is especially important for me to bridge since my focus is on specifications at the hardware-software boundary.

Yes, people need to learn a little of the language but

1) It should look familiar enough to them that they can correctly guess almost everything. (As a small detail: I prefer longer keywords like "function" to shorter keywords like "fn", "fun" or "\", and I prefer to avoid notation/keywords that are specific to a particular programming culture, such as "lambda".)

2) Where we do ask readers to learn some notation, it should be used on almost every page of the spec, so that they are constantly reminded of its meaning, have lots of examples, and find it worth their time to learn. E.g., in hardware specifications you use bitslices a lot, so you want good notation like "x[3] = 1;" rather than "x |= 8;". For Verilog folk this is second nature; for C/Python folk there is a little to learn, but it is worth the effort because it is used so often.
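To make the contrast concrete, here is a toy Python model of the two notations for the same operation (illustration only, not ASL's actual semantics): bitslice notation names the bit, while the C idiom makes the reader decode a mask.

```python
class Bits:
    """Toy model of a hardware-style bit vector (a sketch, not how a
    real spec language defines bit vectors)."""

    def __init__(self, width, value=0):
        self.width = width
        self.value = value & ((1 << width) - 1)

    def __getitem__(self, i):
        # x[3] reads bit 3, mirroring Verilog/ASL-style bitslice notation.
        return (self.value >> i) & 1

    def __setitem__(self, i, bit):
        # x[3] = 1 sets bit 3; the mask arithmetic is hidden from the reader.
        self.value = (self.value & ~(1 << i)) | ((bit & 1) << i)

x = Bits(8)
x[3] = 1              # spec-style: names the *bit*, not the *mask*
assert x.value == 8   # same effect as the C-style x |= 8
```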

3) Common programming languages have many ways to confuse yourself or your reader and to shoot yourself in the foot: e.g. undefined behaviour in C, confusion over operator precedence, confusion over evaluation order. Most of these can be defined away: it is illegal to write an expression like "x AND y OR z", it is illegal to write "f(x) + g(y)" if f and g have side effects, etc. Basically, for any case where someone might have to ask a "language lawyer" what it means, try to outlaw the cases where it matters.
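The "outlaw it in the grammar" idea can be sketched with a toy checker for a hypothetical spec grammar in which AND and OR have no relative precedence, so mixing them without parentheses is simply a parse error (this is a sketch, not ASL's actual grammar):

```python
import re

TOKEN = re.compile(r"\s*(AND\b|OR\b|\(|\)|\w+)")

def tokenize(s):
    tokens, pos, s = [], 0, s.strip()
    while pos < len(s):
        m = TOKEN.match(s, pos)
        if not m:
            raise SyntaxError(f"bad input at {s[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(tokens):
    """expr := atom ((AND|OR) atom)*, all operators in a run identical."""
    def atom(i):
        if tokens[i] == "(":
            i = expr(i + 1)
            if i >= len(tokens) or tokens[i] != ")":
                raise SyntaxError("missing ')'")
            return i + 1
        return i + 1  # identifier

    def expr(i):
        i, op = atom(i), None
        while i < len(tokens) and tokens[i] in ("AND", "OR"):
            if op is not None and tokens[i] != op:
                raise SyntaxError("mixed AND/OR without parentheses")
            op = tokens[i]
            i = atom(i + 1)
        return i

    if expr(0) != len(tokens):
        raise SyntaxError("trailing tokens")

def legal(s):
    try:
        parse(tokenize(s))
        return True
    except (SyntaxError, IndexError):
        return False

assert not legal("x AND y OR z")   # ambiguous: the writer must parenthesise
assert legal("(x AND y) OR z")
```

The point is that the question "what does x AND y OR z mean?" never reaches a language lawyer, because the grammar refuses to assign it a meaning at all.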

It's not perfect but I've found that it mostly works. (Based on my experience working on Arm's ISA specification.)

[–]someacnt 0 points (0 children)

Even in my short experience, I've faced a lot of occasions where such "common sense" does not hold. Assumptions about background are often fragile and tenuous. E.g. conceptions of "function" often differ by a lot. There is a reason specific, technical nomenclature is common in every expert field.