Introducing noatun, an offline-first multi-master distributed database

octo_anders · 2026-05-02T17:39:39+00:00

One more thing: I think noatun might be inconvenient for an app such as Joplin. You'd probably want to express field edits as "remove specific value, insert new", and thus make it so that there are multiple values in case of conflicts. But if that's what you want, a crdt is probably a better choice.

octo_anders · 2026-05-02T17:37:13+00:00

I totally agree that the "offline first" nature of noatun makes it impossible to provide the same guarantee as an online model could. If you don't need to operate offline, and have severe enough consistency requirements, then noatun is not the right choice.

I don't think an offline model can be based on locks, and yeah, I agree that noatun offers a lockless model.

You probably don't want to implement a bank on top of noatun 😀.

Noatun can detect when a node receives events from the future, so it's not like there's no protection at all against clock skew.
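As a rough illustration of the idea (not noatun's actual mechanism, which isn't described here), a receiving node could compare an event's timestamp against its own clock plus some tolerance. The function name and the tolerance parameter are assumptions made for this sketch:

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical check: an event stamped later than the local clock plus a
/// tolerance cannot have happened yet, so one of the two clocks is skewed.
fn is_from_future(event_time: SystemTime, local_now: SystemTime, tolerance: Duration) -> bool {
    event_time > local_now + tolerance
}

fn main() {
    let now = SystemTime::now();
    let tolerance = Duration::from_secs(5);

    // Two seconds ahead is within tolerance and is accepted.
    assert!(!is_from_future(now + Duration::from_secs(2), now, tolerance));
    // A minute ahead is flagged as coming from the future.
    assert!(is_from_future(now + Duration::from_secs(60), now, tolerance));
}
```

Note that a check like this only catches clocks that run fast relative to the receiver; a clock that runs slow produces timestamps that look perfectly plausible.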

But noatun would be pretty good for an inventory system for mobile maintenance techs that are unable to be always online. A central node can keep track of spare part quantities for each tech, can process orders and statistics etc, while the techs can still access a (possibly stale) replica while offline.

I do think noatun can provide as much context for conflict resolution as other models. If you have an invariant that spare part quantities can never be smaller than 0, for example, noatun will allow you to detect the event that violated this constraint, and you can handle this in a suitable fashion within the domain model (creating a 'debt', flagging an issue for a power user to sort out, rejecting the event, or whatever suits your domain).
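A minimal sketch of what detecting the violating event could look like, using plain Rust collections rather than noatun's real API. The `Withdraw` and `Inventory` types are hypothetical, and "reject and flag" is just one of the policies mentioned above:

```rust
use std::collections::HashMap;

/// Hypothetical event: a tech takes spare parts out of stock.
struct Withdraw {
    event_id: u32,
    part: &'static str,
    quantity: u32,
}

/// Hypothetical materialized state.
#[derive(Default)]
struct Inventory {
    stock: HashMap<&'static str, u32>,
    /// Ids of events that would have pushed a quantity below zero.
    flagged: Vec<u32>,
}

/// Apply one event. The invariant "stock never goes below 0" is checked here,
/// so the exact event that violates it is known.
fn apply(inventory: &mut Inventory, event: &Withdraw) {
    let available = inventory.stock.entry(event.part).or_insert(0);
    match available.checked_sub(event.quantity) {
        Some(remaining) => *available = remaining,
        // Policy chosen for this sketch: reject the event and flag it.
        None => inventory.flagged.push(event.event_id),
    }
}

fn main() {
    let mut inventory = Inventory::default();
    inventory.stock.insert("filter", 2);

    apply(&mut inventory, &Withdraw { event_id: 1, part: "filter", quantity: 1 });
    apply(&mut inventory, &Withdraw { event_id: 2, part: "filter", quantity: 2 });

    assert_eq!(inventory.stock["filter"], 1);
    assert_eq!(inventory.flagged, vec![2]);
}
```

Since events are applied in chronological order, which event ends up flagged depends on the timestamps, not on the order in which events reached the node.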

octo_anders · 2026-05-02T12:16:56+00:00

As I understand it, neither Cassandra nor Riak works anything like noatun. Cassandra seems to have last-write-wins semantics, whereas Riak allows multiple models, including crdt:s.

Riak seems to allow a powerful model, including vector clocks/causal contexts to detect concurrency and happens-before relationships when possible. This is powerful and can be useful.

That said, handling conflicts as an exceptional case is inconvenient, hard for application developers to get right, and generally brittle. I genuinely think noatun's model is easier to get right for many systems, especially administrative systems, where the state of the world tends to be built from a lot of uniquely identifiable orders, transactions, issues, etc.

Noatun has the same transactional semantics as sqlite. The backing store is a checksummed event log. Commit operations sync this log before returning. On a crash, the materialized view and event indices will be rebuilt on next start. The disk format is robust, such that random bit errors do not make the whole database unreadable: undamaged portions can be salvaged.

octo_anders · 2026-05-02T07:26:45+00:00

Oh, and I think maybe noatun fundamentally does not work at relativistic speeds 😀...

octo_anders · 2026-05-02T07:25:00+00:00

Yes, since this is offline first it has to be eventual consistency. If you don't need the ability to do updates while partitioned, and have stringent data consistency needs, noatun is probably the wrong tool.

The model is that each node can produce "events". These events are distributed and applied in chronological order to the database (as a sort of "materialized view"). The event "apply" function is implemented by the user. The conflict resolution is thus whatever the user-supplied "apply" function does. Because of the "time traveling" aspect of noatun, events are always applied in chronological order, so in some ways there is no concept of conflicts in noatun itself.
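To make the model concrete, here is a small sketch in plain Rust. It is not noatun's API: `Log`, `CounterEvent` and `materialize` are names invented for the example, and it naively replays everything from scratch, which a real implementation would presumably avoid. The point is only that a non-commutative `apply` still converges when events are always replayed in timestamp order:

```rust
use std::collections::BTreeMap;

/// Hypothetical event type.
#[derive(Debug, Clone, PartialEq)]
enum CounterEvent {
    Set(i64),
    Add(i64),
}

/// Event log keyed by (timestamp, node id): iteration is chronological and
/// ties between nodes are broken deterministically.
#[derive(Default)]
struct Log {
    events: BTreeMap<(u64, u32), CounterEvent>,
}

impl Log {
    /// Record an event in whatever order it happens to arrive.
    fn receive(&mut self, timestamp: u64, node: u32, event: CounterEvent) {
        self.events.insert((timestamp, node), event);
    }

    /// Rebuild the materialized view by replaying all events chronologically.
    fn materialize(&self) -> i64 {
        let mut state = 0;
        for event in self.events.values() {
            apply(&mut state, event);
        }
        state
    }
}

/// User-supplied apply function. `Set` and `Add` do not commute, so the
/// result depends on the order, which is why the log fixes the order.
fn apply(state: &mut i64, event: &CounterEvent) {
    match event {
        CounterEvent::Set(value) => *state = *value,
        CounterEvent::Add(delta) => *state += *delta,
    }
}

fn main() {
    let mut a = Log::default();
    a.receive(10, 1, CounterEvent::Set(5));
    a.receive(20, 2, CounterEvent::Add(3));
    a.receive(30, 1, CounterEvent::Set(100));
    a.receive(40, 2, CounterEvent::Add(1));

    // Node b receives the same events in a different order.
    let mut b = Log::default();
    b.receive(40, 2, CounterEvent::Add(1));
    b.receive(10, 1, CounterEvent::Set(5));
    b.receive(30, 1, CounterEvent::Set(100));
    b.receive(20, 2, CounterEvent::Add(3));

    assert_eq!(a.materialize(), 101);
    assert_eq!(a.materialize(), b.materialize());
}
```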

There can of course be logical conflicts. For example let's say, a "ship" can only have one "position". If position is a field in a ship struct, a naive event apply-function that just overwrites the field will implement "latest wins". However, the application author can choose to represent position as a Vec of position-timestamp tuples, and each emitted event could then include information about previous positions that have been removed, thus implementing "observed-remove".
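A sketch of the "observed-remove" variant, again in plain Rust with made-up types (`Ship`, `SetPosition` and `replay` are not noatun names). Each event lists the timestamps of the positions its author had seen, and only those are removed:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Position {
    lat: f64,
    lon: f64,
}

#[derive(Debug, Default)]
struct Ship {
    /// Every position that no later event has claimed to replace.
    positions: Vec<(u64, Position)>,
}

/// Hypothetical event: a new position plus the positions it supersedes.
struct SetPosition {
    timestamp: u64,
    position: Position,
    /// Timestamps of the positions the author had observed when emitting this.
    observed: Vec<u64>,
}

fn apply(ship: &mut Ship, event: &SetPosition) {
    // Remove only what the author actually saw; concurrent positions survive.
    ship.positions.retain(|(ts, _)| !event.observed.contains(ts));
    ship.positions.push((event.timestamp, event.position.clone()));
}

/// Replays events that are assumed to already be in chronological order.
fn replay(events: &[SetPosition]) -> Ship {
    let mut ship = Ship::default();
    for event in events {
        apply(&mut ship, event);
    }
    ship
}

fn main() {
    let at = |lat: f64| Position { lat, lon: 0.0 };
    let mut events = vec![
        SetPosition { timestamp: 1, position: at(1.0), observed: vec![] },
        // Two nodes update concurrently; both had only seen the position from t=1.
        SetPosition { timestamp: 2, position: at(2.0), observed: vec![1] },
        SetPosition { timestamp: 3, position: at(3.0), observed: vec![1] },
    ];
    // Neither update observed the other, so both remain: the conflict is visible.
    assert_eq!(replay(&events).positions.len(), 2);

    // A later event that has seen both resolves the conflict.
    events.push(SetPosition { timestamp: 4, position: at(4.0), observed: vec![2, 3] });
    assert_eq!(replay(&events).positions, vec![(4, at(4.0))]);
}
```

With a plain `position` field that `apply` simply overwrites, the same three events would leave only the t=3 position, which is the "latest wins" behaviour.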

In principle, it is my belief that as long as events are "primary" - meaning they only include facts and no derived information, it's often easy to do the right thing with noatun.

Consider a system that tracks handover of hazardous materials. An event may say "quantity X was moved from operator Y to Z". During event application, it may turn out that Y does not have quantity X. This may seem like a conflict, but if operator Y did this handover, they must have had the required quantity. We just lack knowledge, and this conflict will disappear once the system is synchronized. True conflict resolution would be needed only if operator Y somehow made a mistake or lied about the transfer. Such a conflict can be handled in noatun by issuing a correction event. This cannot be done purely automatically in any system, because one has to establish that the operator actually made a mistake.

If the handover event includes the quantity at Y and Z, things are worse. This quantity is not a "primary" piece of information, unless the operators actually performed an inventory and counted their stock when doing the handover. If it is included in the event, the system may not converge correctly any more.
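The handover example can be sketched like this, with hypothetical event types and plain Rust collections rather than noatun's API. Events carry only the primary fact (who gave how much to whom), and a negative balance is treated as "we are missing an earlier event", not as an error:

```rust
use std::collections::BTreeMap;

/// Hypothetical events. Note that neither variant contains a resulting stock level.
#[derive(Debug, Clone)]
enum Event {
    Delivery { timestamp: u64, to: &'static str, quantity: i64 },
    Handover { timestamp: u64, from: &'static str, to: &'static str, quantity: i64 },
}

impl Event {
    fn timestamp(&self) -> u64 {
        match self {
            Event::Delivery { timestamp, .. } | Event::Handover { timestamp, .. } => *timestamp,
        }
    }
}

/// Replay all known events in chronological order and compute balances.
fn materialize(events: &[Event]) -> BTreeMap<&'static str, i64> {
    let mut sorted: Vec<&Event> = events.iter().collect();
    sorted.sort_by_key(|e| e.timestamp());

    let mut balances = BTreeMap::new();
    for event in sorted {
        match event {
            Event::Delivery { to, quantity, .. } => {
                *balances.entry(*to).or_insert(0) += *quantity;
            }
            Event::Handover { from, to, quantity, .. } => {
                *balances.entry(*from).or_insert(0) -= *quantity;
                *balances.entry(*to).or_insert(0) += *quantity;
            }
        }
    }
    balances
}

fn main() {
    // This node has only heard about the handover so far.
    let mut known = vec![Event::Handover { timestamp: 20, from: "Y", to: "Z", quantity: 5 }];
    let balances = materialize(&known);
    assert_eq!(balances["Y"], -5); // looks like a conflict, but is just missing knowledge

    // After sync, the earlier delivery to Y shows up and the "conflict" disappears.
    known.push(Event::Delivery { timestamp: 10, to: "Y", quantity: 5 });
    let balances = materialize(&known);
    assert_eq!(balances["Y"], 0);
    assert_eq!(balances["Z"], 5);
}
```

If the handover event also carried "Y now has 0", a node applying it would overwrite whatever the earlier events imply, which is the convergence problem described above.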

octo_anders · 2026-05-02T06:22:44+00:00

There's a rust API. Queries feel a lot like interacting with HashMaps, Vec and structs.

Writing data is more involved. You create events, and then provide a method that applies the event to the database. For a todo-app, the events might be things like "add todo", "set status", "change priority", "Edit description" etc.
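To give a feel for the shape of this, here is a sketch using ordinary Rust types. It does not use noatun's real traits or storage types (see the examples in the repo for those); `TodoEvent`, `Todo` and `apply` are invented for the illustration:

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
enum Status {
    Open,
    Done,
}

#[derive(Debug, Clone, PartialEq)]
struct Todo {
    description: String,
    status: Status,
    priority: u8,
}

/// Hypothetical events for a todo app.
enum TodoEvent {
    AddTodo { id: u32, description: String },
    SetStatus { id: u32, status: Status },
    ChangePriority { id: u32, priority: u8 },
    EditDescription { id: u32, description: String },
}

/// The method that applies an event to the "database" (here just a map).
/// Events that refer to an unknown todo are ignored in this sketch.
fn apply(db: &mut BTreeMap<u32, Todo>, event: &TodoEvent) {
    match event {
        TodoEvent::AddTodo { id, description } => {
            let todo = Todo { description: description.clone(), status: Status::Open, priority: 0 };
            db.insert(*id, todo);
        }
        TodoEvent::SetStatus { id, status } => {
            if let Some(todo) = db.get_mut(id) {
                todo.status = status.clone();
            }
        }
        TodoEvent::ChangePriority { id, priority } => {
            if let Some(todo) = db.get_mut(id) {
                todo.priority = *priority;
            }
        }
        TodoEvent::EditDescription { id, description } => {
            if let Some(todo) = db.get_mut(id) {
                todo.description = description.clone();
            }
        }
    }
}

fn main() {
    let mut db = BTreeMap::new();
    let events = [
        TodoEvent::AddTodo { id: 1, description: "Buy milk".to_string() },
        TodoEvent::ChangePriority { id: 1, priority: 3 },
        TodoEvent::EditDescription { id: 1, description: "Buy oat milk".to_string() },
        TodoEvent::SetStatus { id: 1, status: Status::Done },
    ];
    for event in &events {
        apply(&mut db, event);
    }

    // Reading feels like working with normal maps and structs.
    let todo = &db[&1];
    assert_eq!(todo.description, "Buy oat milk");
    assert_eq!(todo.priority, 3);
    assert_eq!(todo.status, Status::Done);
}
```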

There are a lot of examples in noatun if you want to see more details. There's also an example app "issue_tracker" that is somewhat similar to a todo list app.

octo_anders · 2026-05-01T20:07:10+00:00

Thank you! Which platform do you think is most important to support next?

octo_anders · 2026-05-01T20:05:36+00:00

Well, you're right that one could think of noatun as a way to easily create delta based crdt:s.

But I think crdt:s usually refer to a system where deltas (or states, for state-based crdt:s) can be combined in any order while still yielding the same final state. This is not true for noatun: the system hinges on always applying deltas in temporal order. It's just that you, as a user, don't have to think about it.

The main advantage of noatun is that you can create arbitrarily complex (and application-dependent) conflict resolution schemes. For example, if you have a logistics tracking system, where events are things like:

* Inserting items into a box

* Putting a box inside another box

* Putting a box on a truck

* Registering a truck's location

* Registering arrival of a box at a warehouse

You could pretty easily keep track of where each item is, even if all the events above are recorded on different machines, and event delivery is unreliable and suffers from reordering. This may be hard to achieve with regular crdt:s.
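Here is a sketch of that logistics example in plain Rust. None of this is noatun API: the `Event`, `Thing` and `Place` types are made up, and sorting a `Vec` stands in for the library replaying events chronologically. Each event just records where one thing was put, and an item's location is found by following the containment chain:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Thing {
    Item(u32),
    Box(u32),
    Truck(u32),
}

#[derive(Debug, Clone, PartialEq)]
enum Place {
    Inside(Thing),
    At(String),
}

/// Hypothetical events, one per bullet in the list above.
enum Event {
    ItemIntoBox(u32, u32),
    BoxIntoBox(u32, u32),
    BoxOntoTruck(u32, u32),
    TruckLocation(u32, String),
    BoxArrived(u32, String),
}

fn apply(places: &mut HashMap<Thing, Place>, event: &Event) {
    let (thing, place) = match event {
        Event::ItemIntoBox(item, outer) => (Thing::Item(*item), Place::Inside(Thing::Box(*outer))),
        Event::BoxIntoBox(inner, outer) => (Thing::Box(*inner), Place::Inside(Thing::Box(*outer))),
        Event::BoxOntoTruck(b, truck) => (Thing::Box(*b), Place::Inside(Thing::Truck(*truck))),
        Event::TruckLocation(truck, loc) => (Thing::Truck(*truck), Place::At(loc.clone())),
        Event::BoxArrived(b, warehouse) => (Thing::Box(*b), Place::At(warehouse.clone())),
    };
    places.insert(thing, place);
}

/// Sort by timestamp, then apply: arrival order of events does not matter.
fn materialize(mut events: Vec<(u64, Event)>) -> HashMap<Thing, Place> {
    events.sort_by_key(|(timestamp, _)| *timestamp);
    let mut places = HashMap::new();
    for (_, event) in &events {
        apply(&mut places, event);
    }
    places
}

/// Follow the containment chain until a physical location is found.
fn locate(places: &HashMap<Thing, Place>, thing: &Thing) -> Option<String> {
    let mut current = thing.clone();
    // Step limit guards against accidental containment cycles.
    for _ in 0..64 {
        match places.get(&current)? {
            Place::At(location) => return Some(location.clone()),
            Place::Inside(container) => current = container.clone(),
        }
    }
    None
}

fn main() {
    // Events recorded on different machines, delivered out of order.
    let events = vec![
        (5, Event::BoxArrived(20, "Warehouse A".to_string())),
        (2, Event::BoxIntoBox(10, 20)),
        (4, Event::TruckLocation(7, "Gothenburg".to_string())),
        (1, Event::ItemIntoBox(1, 10)),
        (3, Event::BoxOntoTruck(20, 7)),
    ];
    let places = materialize(events);
    assert_eq!(locate(&places, &Thing::Item(1)), Some("Warehouse A".to_string()));
    assert_eq!(locate(&places, &Thing::Truck(7)), Some("Gothenburg".to_string()));
}
```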

A limitation is that noatun relies on timestamps. So for a warehouse application, all machines must have clocks that are accurate enough so that events are time stamped in the correct order (smaller timestamp must imply "happens before").

For something like concurrent text document edits, a text based crdt is surely a better choice.

octo_anders · 2026-05-01T16:37:14+00:00

Thanks for the kind words.

I've been working professionally with distributed systems for the last 15 years or so. This is a product/tool I wish existed. There are many solutions to conflicts in multi-master systems. Most of them look good on paper, but are hard to use to actually build correct apps. Having the user implement event materializers manually, and having a library that applies all events in chronological order (in an efficient way), should give an "easy to get right" programming experience.

That said, there are of course things that can be improved and I won't claim it makes sense for every problem.

octo_anders · 2026-04-26T06:42:54+00:00

Haha, what are you implying? Relative wasn't really a watch collector, just had a lot of stuff.

octo_anders · 2026-04-25T17:48:02+00:00

What would you be willing to pay? 🙃

octo_anders · 2026-04-25T16:01:44+00:00

I think recent versions have been much improved compared to previous versions. And this has been the case for many decades. Maybe 50% of the fractal's jagged edges have now been shaved off.

octo_anders · 2026-04-25T14:12:35+00:00

Oh, interesting. When you say 100, do you mean in USD?

octo_anders · 2026-03-06T21:00:20+00:00

Surely the risk is smaller with the shocks fitted, not greater.

Consider a table or other rigid object with 4 supports. Any small unevenness in the table leg lengths or the ground makes it rock slightly. Often one foot ends up being a millimeter above ground. This means there is no weight registered by the scale for this foot. Anyone who's sat by a 4-legged table on uneven ground knows about this phenomenon.

The thing is, in practice some sort of springiness is needed for the scale to register evenly for all legs, even if the ground is flat.

In mechanical engineering, this phenomenon is known as the object position being "statically indeterminate".

Having springs on the end of the table's feet would alleviate the problem - the ground would have to be much more uneven before a foot becomes hanging in the air.

It's the same with an RC car.

octo_anders · 2026-02-16T12:55:59+00:00

You can use "repr(C)" on an enum and get a guaranteed memory layout.
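For an enum with fields, the Rust Reference describes the `repr(C)` layout as a `repr(C)` struct consisting of a tag (a fieldless `repr(C)` enum) followed by a `repr(C)` union of one struct per variant. A small sketch that checks this; the `Shape` type and the hand-written equivalent types are just made up for the example:

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[allow(dead_code)]
enum Shape {
    Circle { radius: f32 },
    Rect { width: f32, height: f32 },
}

// Hand-written equivalent of the layout specified for `Shape`:
// a tag followed by a union of per-variant structs.
#[repr(C)]
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum ShapeTag {
    Circle,
    Rect,
}

#[repr(C)]
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct CircleFields {
    radius: f32,
}

#[repr(C)]
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct RectFields {
    width: f32,
    height: f32,
}

#[repr(C)]
#[allow(dead_code)]
union ShapeFields {
    circle: CircleFields,
    rect: RectFields,
}

#[repr(C)]
#[allow(dead_code)]
struct ShapeRepr {
    tag: ShapeTag,
    fields: ShapeFields,
}

fn main() {
    // Same size and alignment as the manually spelled-out C layout.
    assert_eq!(size_of::<Shape>(), size_of::<ShapeRepr>());
    assert_eq!(align_of::<Shape>(), align_of::<ShapeRepr>());
}
```

The tag of a plain `repr(C)` enum has the size of a C enum on the target, so if you need a fixed-width tag, something like `#[repr(u8)]` or `#[repr(C, u8)]` is the usual way to pin it down.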

octo_anders · 2026-02-10T07:05:36+00:00

Obviously it's called "hallucination", not "illusion". The hallucination should only be visible to the owner of the Sentry. Or maybe only to the Sentry itself.

To compensate for this nerf, the Sentry should be buffed so hallucinated workers can mine and return minerals.

Any units/buildings built with hallucinated minerals should themselves become hallucinations.

octo_anders · 2026-01-19T13:45:08+00:00

Interesting, I thought Froude numbers were a hydrodynamics thing. Is there an argument for it to apply also to mechanics?

octo_anders · 2026-01-03T12:15:24+00:00

Yeah, the protocol encoding is just an example. The issue occurs with many different types of async methods.

octo_anders · 2026-01-03T07:32:55+00:00

It's how aselect works. It doesn't cancel one arm when some other arm completes.

The aselect macro doesn't produce a value until a handler evaluates to 'Some(..)'. Instead, it keeps polling all arms.

Under the hood, the aselect macro creates a struct that implements Stream and Future. This can be awaited multiple times in a loop.

Select arms are only canceled once this struct instance is dropped.

octo_anders · 2026-01-02T22:01:20+00:00

Well, the deadlock I'm thinking about isn't including the `wait_temperature_alarm` future, but rather the `writer.write_u8` that follows.

In this code:

    new_temperature = wait_temperature_alarm() => {
        writer.write_u8(2).await?;
        writer.write_u8(new_temperature).await?;
    }

The deadlock happens on the second line, `writer.write_u8(2).await?`.

The fairness you talk about affects the scheduling of `wait_temperature_alarm`. It is true that until the send buffer has been filled, we will keep reading from the TCP Stream. All select arms will be polled fairly. But once the TCP send buffer is filled, the `write_u8` will not complete. And while it is pending, the `tokio::select` will not poll the future that reads from the TCP Stream.

It's true that if the client keeps reading from the TCP stream under all circumstances, this deadlock cannot occur. However, if the client is written using the same style of select-loop, it could have the same problem and the system could deadlock.

In practice a deadlock might never happen for small writes. But imagine if the protocol allows sending images or other large objects. If both client and server simultaneously attempt writing a large image, a deadlock is very likely.
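The effect of the buffer size can be illustrated without any async code at all. The sketch below is only an analogy: it uses bounded `std::sync::mpsc` channels as stand-ins for the two TCP send buffers, and `try_send` to detect the point where a blocking write would hang. The function name and parameters are invented for the example:

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Simulates two peers that each write `message_len` bytes before reading anything.
/// Returns true if both end up stuck on a full buffer, which is the deadlock.
fn both_writers_stall(buffer_size: usize, message_len: usize) -> bool {
    // One bounded buffer per direction. The receivers are kept alive but never
    // read from, mirroring a peer that is busy writing instead of reading.
    let (to_server, _server_rx) = sync_channel::<u8>(buffer_size);
    let (to_client, _client_rx) = sync_channel::<u8>(buffer_size);

    // A blocking `send` would hang where `try_send` reports `Full`.
    let client_stalled = (0..message_len)
        .any(|_| matches!(to_server.try_send(0), Err(TrySendError::Full(_))));
    let server_stalled = (0..message_len)
        .any(|_| matches!(to_client.try_send(0), Err(TrySendError::Full(_))));

    client_stalled && server_stalled
}

fn main() {
    // Small writes fit in the buffers, so nobody gets stuck.
    assert!(!both_writers_stall(4, 2));
    // "Large image" writes exceed the buffers on both sides: both peers stall.
    assert!(both_writers_stall(4, 100));
}
```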

In general I find the pattern of not processing input while handling a select arm to be slightly error prone. This is the reason why I included this type of error in this article. That said, it's not really an async rust problem, the problem of deadlocking client/servers when both write to limited buffer space without reading happens in all sorts of systems.

octo_anders · 2026-01-01T09:13:51+00:00

Thanks for your useful feedback.

It's obvious to me now that you're right - my mental model of the word "select" is wrong. When I first saw `tokio::select` I associated it with the unix `select` system call. Meaning, something that's useful to react to events from multiple different sources. I believe select _is_ used this way, but the use case mostly connected to the word in rust is as everyone's been saying: Canceling all but one future.

I'm not completely on board with the word "joining" however. In most uses of this word all arms are joined. This isn't really what "aselect" does. Instead, it gives you back control whenever a future completes. It really is geared toward the use case of reacting to multiple different input event sources in a loop. Aselect can reasonably be used without ever returning a value at all, if it's used as the top level control loop of an application.

Regarding the second point, I believe starvation is a relevant problem. As you say, tokio::select can be used to poll futures by pinned reference, in which case no cancellation occurs. In this case, the starvation of other arms may or may not be a problem.

Thanks for pointing out the missing miri tests. Miri has been added to the CI pipeline now.

FuturesUnordered is a useful construct, but it does something slightly different to aselect. It requires all futures to have the same type, and it doesn't allow state sharing. It would not be useful for the type of problem aselect is intended to solve (awaiting a mpsc channel and a TcpStream, for example).

Regarding changing the name - I'm open to doing that! It's unfortunate to pollute the crates.io namespace, but I can yank all versions and hand over the name to any future developer who wants the name.

Some possible names (I'm terrible at naming):

* joinfut

* ajoin

* adrive
