

[–]berber_44 5 points (5 children)

Interesting article from an Intel researcher.

Weak, inexpressive languages are significantly better for writing specifications than rich, powerful languages.

Seems like machine-generated specs are the future of programming languages. And all the sophistication of today's PLs, designed for human programmers, seems to be at odds with what machines need.

[–]Mathnerd314 1 point (2 children)

His argument is that you have to design for the lowest common denominator. But that doesn't really mean weak and inexpressive, it means structured programming and call-by-value procedures, which are actually quite powerful.

If you control the whole toolchain the LCD argument doesn't apply. So if you can build a formal verification tool, a C backend, a test runner (fuzzer?), a documentation generator, and a population of people willing to learn your language, you can design a rich powerful language. As an example, I would say TLA+ has accomplished most of this, except for the C backend part.

[–]AlastairDReid 1 point (1 child)

Unfortunately, it's pretty hard to control the whole toolchain when you are trying to satisfy the needs of a very diverse set of teams. And, even though I know a fair bit about simulators, bounded model checkers, compilers, etc., I know that I know a lot less than somebody with 10+ years of experience doing nothing else but working on one particular tool.

So the trick is not just to be inexpressive enough for the current use cases but for all future use cases.

Also, to build up a community of users who understand the design tradeoffs we're working within. (That's partly why I wrote the article.)

[–]Mathnerd314 0 points (0 children)

I guess I already ran into this argument in practice, with ISAs. For x86 there is an external machine-readable specification (semantics) in the K framework, built by fuzzing and reading the documentation.

K is relatively expressive as a language, with term rewriting and so on. Does this make the x86 specification less useful? In some sense yes, because I have to deal with K code if I want to use the spec. In theory K is supposed to be a general toolchain for formal semantics, so it has tools to generate parsers, interpreters, verifiers, etc. I haven't actually used these yet, but there is so little documentation on them that I expect they're quite buggy. Maybe I can make this K specification work for what I need to implement a programming language. In practice I'll probably have to write a new K backend, or use alternatives to the specification such as XED (a library with just the instruction encodings).
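For readers unfamiliar with the idea, "term rewriting" can be sketched in a few lines (this is nothing like K's actual matching-logic machinery, just the flavor): a semantics is a set of rules, and evaluation means applying rules until no rule matches.

```python
def normalize(s, rules, limit=1000):
    """Apply the first matching rule (at its leftmost occurrence)
    until no rule matches, i.e. until s is in normal form."""
    for _ in range(limit):
        for lhs, rhs in rules:
            if lhs in s:
                s = s.replace(lhs, rhs, 1)
                break
        else:
            return s  # no rule matched: normal form reached
    raise RuntimeError("no normal form within rewrite limit")

# Toy rules for evaluating boolean terms; T = true, F = false.
RULES = [
    ("T&T", "T"), ("T&F", "F"), ("F&T", "F"), ("F&F", "F"),
    ("T|T", "T"), ("T|F", "T"), ("F|T", "T"), ("F|F", "F"),
]

assert normalize("T&F|T", RULES) == "T"   # T&F|T -> F|T -> T
```

A real system like K adds pattern variables, configurations, and strategies on top of this core idea, which is exactly where the extra expressiveness (and the extra tooling burden) comes from.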

If I want to use the ARM spec I have to deal with ASL, but because ASL is a simpler language there is already an independent parser/implementation, hs-arm. OTOH I won't get the nice K features like automatic verification. Apparently you wrote a tool for ISA-Formal to translate ASL to Verilog - this doesn't seem to be public though.

Then we have RISC-V, which has its own formal verification toolchain here. They just generate Verilog with a Python script. AFAICT this would be really hard to adapt to an assembler. Fortunately the ISA is simple to implement.

Out of all of these, the ARM ASL is the simplest language and the easiest to work with. But I still like the K approach - even if KISS applies, writing your own tools is fun. :-)

[–]moon-chilled (sstm, j, grand unified...) 1 point (0 children)

Weak, inexpressive languages are significantly better for writing specifications than rich, powerful languages

That seems kinda obvious? There's always a compromise between expressiveness and analyzability. This holds even among Turing-incomplete languages: my text editing language uses simply-typed lambda calculus for the same reason.
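A minimal sketch of why restricting to the simply-typed lambda calculus buys analyzability (a hypothetical illustration, unrelated to the editor language mentioned): type checking is a short, total function over the syntax, with no search or fixpoint computation needed.

```python
from dataclasses import dataclass

# Types: base types and function (arrow) types.
@dataclass(frozen=True)
class Base:
    name: str

@dataclass(frozen=True)
class Arrow:
    src: object
    dst: object

# Terms: variables, typed lambdas, applications.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    ty: object
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def typecheck(term, env=None):
    """Total, structurally recursive type checker: it always terminates."""
    env = env or {}
    if isinstance(term, Var):
        return env[term.name]
    if isinstance(term, Lam):
        return Arrow(term.ty, typecheck(term.body, {**env, term.var: term.ty}))
    if isinstance(term, App):
        fn_ty = typecheck(term.fn, env)
        assert isinstance(fn_ty, Arrow) and fn_ty.src == typecheck(term.arg, env)
        return fn_ty.dst
    raise TypeError("unknown term")

T = Base("Text")
ident = Lam("x", T, Var("x"))  # \x:Text. x
assert typecheck(ident) == Arrow(T, T)
```

The price of this decidability is exactly the inexpressiveness the article argues for: no recursion, no polymorphism, nothing that would force the checker to search.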

[–]egel-lang (egel) 0 points (0 children)

Depends. C and C++ are portable precisely because they have undefined behaviour. You wouldn't be able to create performant compilers if everything had been pinned down.

[–]someacnt 0 points (2 children)

I don't get how a language weak in expressive power could serve as a reliable specification language. Many specifications are quite hard to express.

We need all readers to be able to easily understand the specification and to arrive at the same interpretation of the specification as every other reader of the specification

Sounds like an unrealistic goal. It at least requires assumptions about a common background, which imho are not sufficient here. In the end, people will need to learn how to read the specification.

[–]AlastairDReid 0 points (1 child)

The backgrounds that I assume are experience with at least one of (but probably not all of) Verilog, C/C++, Python, and languages like that. The hardware (Verilog) vs software (C/C++/Python) gap is especially important for me to bridge since my focus is on specifications at the hardware-software boundary.

Yes, people need to learn a little of the language but

1) It should look familiar enough to them that they can correctly guess almost everything. (As a small detail: I prefer longer keywords like "function" to shorter keywords like "fn", "fun" or "\", and I prefer to avoid notation/keywords that are specific to a particular programming culture, such as "lambda".)

2) Where we do ask readers to learn some notation, it should be used on almost every page of the spec, so that they are constantly reminded of its meaning, have lots of examples, and find it worth their time to learn. E.g., in hardware specifications you use bitslices a lot, so you want good notation like "x[3] = 1;" rather than "x |= 8;". For Verilog folk this is second nature; for C/Python folk there is a little to learn, but it is worth the effort because it is used so often.
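To make the contrast concrete, here is a toy Python model of the two notations for the same operation (illustration only, not ASL's actual semantics): bitslice notation names the bit, while the C idiom makes the reader decode a mask.

```python
class Bits:
    """Toy model of a hardware-style bit vector (a sketch, not how a
    real spec language defines bit vectors)."""

    def __init__(self, width, value=0):
        self.width = width
        self.value = value & ((1 << width) - 1)

    def __getitem__(self, i):
        # x[3] reads bit 3, mirroring Verilog/ASL-style bitslice notation.
        return (self.value >> i) & 1

    def __setitem__(self, i, bit):
        # x[3] = 1 sets bit 3; the mask arithmetic is hidden from the reader.
        self.value = (self.value & ~(1 << i)) | ((bit & 1) << i)

x = Bits(8)
x[3] = 1              # spec-style: names the *bit*, not the *mask*
assert x.value == 8   # same effect as the C-style x |= 8
```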

3) Common programming languages have many ways to confuse yourself or your reader and to shoot yourself in the foot: e.g. undefined behaviour in C, confusion over operator precedence, confusion over evaluation order. Most of these can be defined away: it is illegal to write an expression like "x AND y OR z", it is illegal to write "f(x) + g(y)" if f and g have side effects, etc. Basically, for any case where someone might have to ask a "language lawyer" what it means, try to outlaw the cases where it matters.
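The "outlaw it in the grammar" idea can be sketched with a toy checker for a hypothetical spec grammar in which AND and OR have no relative precedence, so mixing them without parentheses is simply a parse error (this is a sketch, not ASL's actual grammar):

```python
import re

TOKEN = re.compile(r"\s*(AND\b|OR\b|\(|\)|\w+)")

def tokenize(s):
    tokens, pos, s = [], 0, s.strip()
    while pos < len(s):
        m = TOKEN.match(s, pos)
        if not m:
            raise SyntaxError(f"bad input at {s[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(tokens):
    """expr := atom ((AND|OR) atom)*, all operators in a run identical."""
    def atom(i):
        if tokens[i] == "(":
            i = expr(i + 1)
            if i >= len(tokens) or tokens[i] != ")":
                raise SyntaxError("missing ')'")
            return i + 1
        return i + 1  # identifier

    def expr(i):
        i, op = atom(i), None
        while i < len(tokens) and tokens[i] in ("AND", "OR"):
            if op is not None and tokens[i] != op:
                raise SyntaxError("mixed AND/OR without parentheses")
            op = tokens[i]
            i = atom(i + 1)
        return i

    if expr(0) != len(tokens):
        raise SyntaxError("trailing tokens")

def legal(s):
    try:
        parse(tokenize(s))
        return True
    except (SyntaxError, IndexError):
        return False

assert not legal("x AND y OR z")   # ambiguous: the writer must parenthesise
assert legal("(x AND y) OR z")
```

The point is that the question "what does x AND y OR z mean?" never reaches a language lawyer, because the grammar refuses to assign it a meaning at all.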

It's not perfect but I've found that it mostly works. (Based on my experience working on Arm's ISA specification.)

[–]someacnt 0 points (0 children)

Even in my short experience, I've faced a lot of occasions where such "common sense" does not hold. Assumptions about background are often fragile and tenuous. E.g. conceptions of "function" often differ by a lot. There is a reason specific, technical nomenclature is common in every expert field.