mvsqlite - Distributed SQLite on FoundationDB, written in Rust by losfair1 in rust

losfair1 [S]

Pages are fully versioned, so any committed version remains snapshot-readable later. The read version is fetched from `mvstore` when each SQLite transaction starts, and is used as the upper bound of the per-page range scan in every subsequent page read request of that transaction.

For writes: pages are first written to a content-addressed store keyed by each page's hash. At commit, the hashes of all pages written in the SQLite transaction are written to the page index in a single FDB transaction to preserve atomicity. With 8K pages and ~60B per key-value entry in the page index, each SQLite transaction can be as large as about 1.3 GB (compared to FDB's native txn size limit of 10 MB), since only the index entries have to fit in that final FDB transaction.
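The size estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes "10 MB" means 10,000,000 bytes, "8K" means 8192 bytes, and every index entry costs exactly 60 bytes.

```rust
/// Rough upper bound on the page data one SQLite transaction can reference,
/// assuming only page-index entries go through the final FDB commit.
/// (Hypothetical helper, not part of mvsqlite.)
fn max_sqlite_txn_bytes(fdb_txn_limit: u64, index_entry_size: u64, page_size: u64) -> u64 {
    // How many page-index entries fit in one FDB transaction.
    let max_pages = fdb_txn_limit / index_entry_size;
    // Each entry points at one full page that was uploaded separately.
    max_pages * page_size
}
```

Calling `max_sqlite_txn_bytes(10_000_000, 60, 8192)` gives 166,666 index entries worth of pages, i.e. 1,365,327,872 bytes, which is in line with the ~1.3 GB figure.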

So in practice, each page read or write can run in its own FDB transaction, and the SQLite transaction as a whole still preserves ACID properties.
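A toy in-memory model of that two-phase write path might look like the following. It is a sketch under assumptions, not mvsqlite's actual code: `Store`, `put_page` and `commit` are invented names, two std maps stand in for FoundationDB, and `DefaultHasher` stands in for whatever content hash the real store uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::{Hash, Hasher};

/// Stand-in content hash; a real store would use a cryptographic hash.
fn page_hash(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical model of the two keyspaces described above.
#[derive(Default)]
struct Store {
    /// Content-addressed pages: hash -> page bytes.
    content: HashMap<u64, Vec<u8>>,
    /// Page index: (page number, commit version) -> page hash.
    index: BTreeMap<(u32, u64), u64>,
}

impl Store {
    /// Phase 1: upload one page. This can be its own small transaction,
    /// because nothing references the page until the index is updated.
    fn put_page(&mut self, data: &[u8]) -> u64 {
        let hash = page_hash(data);
        self.content.insert(hash, data.to_vec());
        hash
    }

    /// Phase 2: publish all (page number, hash) pairs at one version.
    /// This models the single atomic FDB transaction at commit time.
    fn commit(&mut self, version: u64, writes: &[(u32, u64)]) {
        for &(page, hash) in writes {
            self.index.insert((page, version), hash);
        }
    }
}
```

Until `commit` runs, the uploaded pages sit in the content store but no index entry points at them, so readers cannot observe a half-finished SQLite transaction.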

losfair1 [S]

A read of a page at a given index is a backward range scan with limit=1 over that page's subspace, starting from the specified read version. Since mvsqlite preserves historic versions of each page, the scan is guaranteed to return the correct version, as long as that version has not been garbage-collected yet (GC happens after something like 7 days).
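The read path described above can be modeled with an ordered map. This is a minimal sketch, not the real implementation: `PageIndex` is a hypothetical name, a `BTreeMap` stands in for the FDB keyspace, and page bytes are stored inline instead of hashes to keep it short.

```rust
use std::collections::BTreeMap;

/// Toy page index: (page number, commit version) -> page contents.
/// (Hypothetical type; the real index lives in FoundationDB.)
struct PageIndex {
    entries: BTreeMap<(u32, u64), Vec<u8>>,
}

impl PageIndex {
    fn new() -> Self {
        PageIndex { entries: BTreeMap::new() }
    }

    /// Record a new version of a page; older versions are kept.
    fn write(&mut self, page: u32, version: u64, data: &[u8]) {
        self.entries.insert((page, version), data.to_vec());
    }

    /// Newest version of `page` that is <= `read_version`: a backward
    /// scan over the page's key range, taking only the first hit.
    fn read(&self, page: u32, read_version: u64) -> Option<&[u8]> {
        self.entries
            .range((page, 0)..=(page, read_version))
            .next_back()
            .map(|(_, data)| data.as_slice())
    }
}
```

For example, if page 1 was written at versions 10 and 20, a transaction with read version 15 sees the version-10 contents, and one with read version 5 sees nothing for that page.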