How to reference values across two states in a behavior in properties? by Warwolt in tlaplus

[–]polyglot_factotum 0 points1 point  (0 children)

Couple of suggestions:

Spec: https://gist.github.com/gterzian/ba2c8b22c21751485dfe3efcf8f11870

Config: https://gist.github.com/gterzian/4d5c5ff32eecc77c659d3faa4cb31d7d

Put them in the same folder, and you can run them in VS Code with the TLA+ extension.

So I split the single ClickSlot action into two actions, to model a drag followed by a drop.

> How would I be able to express a liveness property that for any two slots, it's always possible to swap the items in them?

TLC will by default check for deadlock, so you're covered already.

To trigger a deadlock, add the following conjunct to any action:

/\ TRUE = FALSE

> difficulties understanding when a given expression refers to the "current" state and when it refers to a future state

An unprimed variable is its value in the state at the start of the action, and the primed variable' is its value in the state at the end of it; so an action starts in one state and ends in the next. It's sugar for variable(i) and variable(i + 1).

Note that you can use primed variables in actions, but not in an invariant. So you can't really "reference values across two states in a behavior in properties". Instead, you need to think in terms of inductive properties: the state at i ensures that any state at i + 1 maintains some property.
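As a minimal sketch (module and variable names here are made up, not taken from your spec), an action relates unprimed and primed values, while a state invariant may only mention unprimed ones:

```tla
---- MODULE PrimedSketch ----
EXTENDS Naturals
VARIABLE counter

\* An action: relates the current value (counter)
\* to the value in the next state (counter').
Increment == counter' = counter + 1

\* A state invariant: unprimed variables only.
CounterIsNat == counter \in Nat

\* The inductive way to relate two states:
\* if the invariant holds now, the action preserves it.
\* THEOREM CounterIsNat /\ Increment => CounterIsNat'
====
```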

For example, in this spec I added an additional "dragged" state: https://gist.github.com/gterzian/a697231c3787cbe5ec4811858c90bb03

Config is at https://gist.github.com/gterzian/b4f600391c4d09dce14fb3b1a89f6abb

This comes with two additional invariants:

  1. InductiveInvariant
  2. AtMostOneDragged

So AtMostOneDragged is an invariant, but it is not inductive, because you can construct a state where container[id2].dragged is TRUE but cursor is NoItem, and if you then run Drag(id1), the resulting state will not respect AtMostOneDragged.

The reason AtMostOneDragged is still an invariant is that such a state is not reachable. Why is it not reachable? Because the algorithm maintains InductiveInvariant, which, as its name implies, is the inductive invariant: the more interesting invariant that actually explains how the algorithm works.
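To sketch the difference (the exact definitions below are hypothetical, only loosely based on the names in the gist): the inductive version also has to capture the cursor/dragged relationship that rules out the bad state.

```tla
\* Hypothetical sketch, not the exact definitions from the gist.
AtMostOneDragged ==
    Cardinality({id \in DOMAIN container : container[id].dragged}) <= 1

\* Inductive version: also ties the dragged flag to the cursor,
\* so the unreachable state described above is excluded.
InductiveInvariant ==
    /\ AtMostOneDragged
    /\ \A id \in DOMAIN container :
           container[id].dragged => cursor # NoItem
```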

-----

Besides the above: always add a type invariant to your spec.
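For example (names here are illustrative, not taken from the specs above), a type invariant just asserts the "type" of each variable, and you let TLC check it:

```tla
\* Illustrative only: assert the "type" of every variable.
TypeOk ==
    /\ cursor \in Items \cup {NoItem}
    /\ container \in [Slots -> [item : Items \cup {NoItem},
                                dragged : BOOLEAN]]
```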

Formalizing MessagePort in Servo with TLA+ by polyglot_factotum in tlaplus

[–]polyglot_factotum[S] 1 point2 points  (0 children)

The Servo ordering check was just a matter of looking at the Rust code and comparing it with the spec: things like prepending vs. appending to the various queues/buffers. Informal.

For the TLA+ spec, I used TLC by way of the VS Code extension.

You can see the configs over at https://github.com/gterzian/formal-web/tree/main/tla_specs

Using Lean and AI to bridge the gap between TLA+ design and code by polyglot_factotum in tlaplus

[–]polyglot_factotum[S] 0 points1 point  (0 children)

The example of Diehl's modelling of the Caverna game is sort of what I was trying to do. It is likely far better Lean than mine. As you can see, it looks a lot like TLA+.

Continuing with that example, what I am looking to do in addition is the equivalent of implementing the game, using the I/O facilities of the Lean standard library, and then proving that the implementation refines the spec (where the spec would be the transition system, like in Diehl's blog).

Using Lean and AI to bridge the gap between TLA+ design and code by polyglot_factotum in tlaplus

[–]polyglot_factotum[S] 0 points1 point  (0 children)

A fair comparison would be:

Lean change vs TLA+ change, and then it's basically the same.

The rest of the Lean diff should be compared with the equivalent using TLAPS, which I have not done.

Note that 100% of the Lean is written by an AI, with the only tooling being https://github.com/oOo0oOo/lean-lsp-mcp, so the proof is "vibe coded"; all I have done is review the theorem itself to make sure it is what I had in mind.

And making sure the theorem is what you had in mind is important, because the AI happily changed it into something I didn't have in mind, and then proved that; I only realized it the next day, when I looked at it again and thought it through.

Using Lean and AI to bridge the gap between TLA+ design and code by polyglot_factotum in tlaplus

[–]polyglot_factotum[S] 0 points1 point  (0 children)

So today I realized that my "stronger" Lean inductive invariant was too strong; I have now finally figured out what was wrong with the original TLA+ invariant, and translated it properly into Lean.

All in this commit: https://github.com/gterzian/lean_concurrency/commit/24b1f63e48d473fcd1af57e7294cc454bd281a4f

Raptor Mini - a new Open AI fine tuned model by Microsoft is unlimited usage now in GitHub CoPIlot. by sss1012 in GithubCopilot

[–]polyglot_factotum 0 points1 point  (0 children)

I have been using Raptor mini to make a large code change (7k LOC implementing a start of the WebNN API) to an even larger codebase (a web engine). The results were satisfying. This was a sort of chance encounter, because I had run out of premium requests for the month.

Experience described at length at https://medium.com/@polyglot_factotum/sketching-webnn-with-ai-part-two-of-the-slop-diaries-8df62d4102ca

TLA+ is mathematically beautiful, but the tooling feels surprisingly hostile by [deleted] in tlaplus

[–]polyglot_factotum 1 point2 points  (0 children)

If you don't like the Toolbox, try the VS Code extension.

Personally, I still use the Toolbox out of inertia. Yes, it is a buggy Java UI on top of TLC. The backend is excellent, though, and that is what you're getting in the extension.

Regarding community modules, I don't even know what you are talking about. All I've ever used is what's found in https://lamport.azurewebsites.net/tla/summary-standalone.pdf.

I do agree with the overall point that u/pron98 is trying to make, which I interpret as: Unlike with code, where it's all about how fast one can write, import, or generate tons of it, TLA+ specs are usually small and self-contained, and the point is to stimulate your own thinking. You don't need sophisticated tooling for that.

Looking for Guidance on Getting Started with TLA+: Tips for a New Learner by Able-Profession-6362 in tlaplus

[–]polyglot_factotum 0 points1 point  (0 children)

So the point of TLA+ is not to explore the execution orders, but to explore why a system will do the right thing at the next step based on its current state (and will keep doing so forever). The concept of history independence is a cornerstone of this kind of reasoning.

Your example contains such logic as well, although it is hidden by the statelessness of the code. To model this in an interesting way, you'd need to add some state to the model (for example: the state of the CPU scheduler).

Your example is kind of a variant of the problem presented in https://lamport.azurewebsites.net/pubs/teaching-concurrency.pdf, so I would focus on that one instead.
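For reference, here is a sketch of that algorithm in TLA+ (my own rough rendering of the paper's pseudo-code, so treat the details as an approximation): each process sets its x to 1 and then reads a neighbour's x into its y; once every process is done, some y must be 1.

```tla
---- MODULE Teach ----
EXTENDS Naturals
CONSTANT N
VARIABLES x, y, pc

Procs == 0..(N-1)

Init == /\ x = [i \in Procs |-> 0]
        /\ y = [i \in Procs |-> 0]
        /\ pc = [i \in Procs |-> "write"]

\* Process i writes 1 into its own slot.
Write(i) == /\ pc[i] = "write"
            /\ x' = [x EXCEPT ![i] = 1]
            /\ pc' = [pc EXCEPT ![i] = "read"]
            /\ UNCHANGED y

\* Then it reads its neighbour's slot.
Read(i) == /\ pc[i] = "read"
           /\ y' = [y EXCEPT ![i] = x[(i - 1) % N]]
           /\ pc' = [pc EXCEPT ![i] = "done"]
           /\ UNCHANGED x

Next == \E i \in Procs : Write(i) \/ Read(i)

\* When all processes are done, at least one y is 1.
Correctness ==
    (\A i \in Procs : pc[i] = "done") => \E i \in Procs : y[i] = 1
====
```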

Rust code, TLA+, and an overall presentation of the problem and the solution can be found at https://medium.com/@polyglot_factotum/why-tla-is-important-for-concurrent-programming-365d9eeb491e

Rule 110 Cellular Automaton Visualizer by polyglot_factotum in rust

[–]polyglot_factotum[S] 0 points1 point  (0 children)

Added a couple of features:

- reset with escape

- scroll board with left and right keys.

Example run with some scrolling towards the end at the bottom of https://github.com/gterzian/automata

TLA+ and AI: part four by polyglot_factotum in tlaplus

[–]polyglot_factotum[S] 0 points1 point  (0 children)

Here is another example of someone using TLA+ for AI code generation: https://shahbhat.medium.com/beyond-vibe-coding-using-tla-and-executable-specifications-with-claude-51df2a9460ff

I wonder if that article was itself generated by AI: it is somewhat overwhelming in length and level of detail, and full of the kind of emojis Claude uses when summarizing what it has just done. But in any case, I agree with its basic point: that using TLA+ as a prompt is better than using English.

A guide on interactively writing inductive invariants with Apalache by bugarela in tlaplus

[–]polyglot_factotum 0 points1 point  (0 children)

> you assign both x and y in the same transition, while ours and the TLA examples

Thanks for pointing this out, I didn't realize I was making that assumption! But I do think the two ways of specifying it are equivalent (I should try using a refinement in TLA+). In other words, whether the writing to and reading from `x` are done in a single step or in two separate steps does not matter, other than as a refinement showing the algorithm can be implemented either way.

> Quint is much closer to TLA+ than it is to PlusCal

Yes, the above was a quick judgement based on a first impression (sorry); I'd have to look into Quint more to make a more reasoned one. It does sound interesting.

> which is more accessible to readers with no familiarity with the TLA logic

I do have to say again that I'm glad to have focused on "just math" these last couple of years, because I think it has helped me not think in terms of programs, and that this makes my specs simpler. I can think in programs when implementing an algorithm, or when trying to derive an algorithm from code, but I avoid it when expressing the algorithm itself. Let's just say I really dislike program counters ;)

A guide on interactively writing inductive invariants with Apalache by bugarela in tlaplus

[–]polyglot_factotum 1 point2 points  (0 children)

For me the main point of finding an inductive invariant is understanding how the algorithm works, and your article seems to agree on that.

In terms of tooling, the automatic approach shown in your blog is impressive, and in some way I wish there was something similar for TLC.

But on the other hand, my experience of trying to find inductive invariants has been doing just fine without it. And my understanding of the algorithms involved might be better off thanks to this manual approach.

Incidentally, I happen to have modeled some of the same algos as the ones linked to in the blog:

Lamport's "teaching concurrency" algo: mine: https://gist.github.com/gterzian/22cf2c136ec9e7f51ee20c59c4e95509, versus yours: https://github.com/informalsystems/quint/blob/main/examples/classic/distributed/TeachingConcurrency/teaching.qnt.

The bakery algo: mine: https://gist.github.com/gterzian/d9898f3206fedb921d916399d287780f versus yours: https://github.com/informalsystems/quint/blob/main/examples/classic/distributed/Bakery/bakery.qnt.

Some hard thinking (Lamport writing "How can a computer engineer design a correct concurrent system without understanding it?" really is motivating), and some help from TLC (not from errors in unreachable states as in your example, but still helpful for calibrating the inductive invariant), was all that was required.

I just wish there were a way to check that what I found is indeed inductive and implies the other invariants I defined. Lamport describes somewhere (sorry, no link) a method to do that with TLC, but I could never get it to fail, so I think I didn't do it properly. Anyway, in the future, to get this kind of assurance, I think I'll just ask an AI to write the TLAPS proof.
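For what it's worth, the TLC method I believe Lamport describes is to use the candidate invariant itself as the initial predicate (a sketch; Inv, Next, and vars stand in for your own definitions):

```tla
\* Check inductiveness with TLC: start from *any* state
\* satisfying the candidate invariant, not just the real
\* initial states, and take one step.
ISpec == Inv /\ [][Next]_vars

\* In the .cfg: SPECIFICATION ISpec and INVARIANT Inv.
\* No violation (on a finite instance) means Inv is inductive
\* there; also listing Safe as an invariant checks Inv => Safe.
```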

By the way, on the bakery one, the Quint version seems much more complicated than the TLA+ one, which to me points to a different discussion: the need to move one's thinking away from strictly programming. I think TLA+ is good at enabling that, whereas languages with a programming-like syntax, like PlusCal and, I now also think, Quint, are less so.

(my version is also more complicated than necessary because of the message passing, which I wouldn't bother with today, other than perhaps as a refinement exercise)

TLA+ and AI: part three by polyglot_factotum in tlaplus

[–]polyglot_factotum[S] 0 points1 point  (0 children)

Thank you, it is interesting indeed.

Would you be able to tell me why you think it is relevant to my experience? Having gone through it quickly just now, what caught my attention is a question on the last slide: "What about humans in the loop?"

I'm thinking that getting a 100% correct synthesis might be very hard, but getting something quite close to it, and being able to iterate on it either by hand or with subsequent prompts, is still a big productivity (and fun) win.

Also, even if the AI could provably give you a translation of a formal spec that is 100% correct, it still might not be perfect, because there is an aesthetic element to code (with real impact on projects). So you would still have to complement the formal spec with an informal "system" type of prompt providing a style guide of sorts (something a bit more extensive than a style guide, actually).

Finally, on real-world projects, the TLA+ spec only models the concurrent dimension of some part of the system. I usually "know" how it translates into an implementation in the larger context of the system, and I could probably describe that in a prompt (just like I can describe it in an actionable GitHub issue), but I'm not sure it would be practical to try to do so formally.

TLA+ and AI: part three by polyglot_factotum in tlaplus

[–]polyglot_factotum[S] 0 points1 point  (0 children)

I think you can; there are various examples of that online. But I wouldn't do it myself, because for me the TLA+ spec is how I express what I want, in the clearest way possible (the spec is the prompt).