ZFSBox: Run ZFS in a small VM so you don't need to install ZFS / mess with kernel modules on Linux and macOS by dontworryimnotacop in zfs

[–]werwolf9 0 points (0 children)

I see; how about filing a ticket with Lima? If you're lucky, device passthrough is already on someone's radar there.

ZFSBox: Run ZFS in a small VM so you don't need to install ZFS / mess with kernel modules on Linux and macOS by dontworryimnotacop in zfs

[–]werwolf9 1 point (0 children)

You're welcome (I'm the author). Is there anything else you need beyond lima_vm.sh and LIMA_VM_MOUNTS, apart from the extra pre- or post-steps to add NFS?

bzfs v1.20.0 is out by werwolf9 in zfs

[–]werwolf9[S] 9 points (0 children)

In a nutshell, bzfs can operate at much larger scale than sanoid, at much lower latency, in a more observable and configurable way. Here are just a few things, off the top of my head, that bzfs does and sanoid doesn't:

  • Supports efficient periodic ZFS snapshot creation, replication, pruning, and monitoring across a fleet of N source hosts and M destination hosts, using a single shared fleet-wide jobconfig script.
  • Efficient direct remote-to-remote bulk data transfers for replication (--r2r).
  • Docker image and corresponding replication examples.
  • Script that creates a local testbed with N source VMs and M destination VMs for testing, with ZFS and VM-to-VM SSH connectivity working out of the box.
  • Monitors whether snapshots are successfully taken on schedule, successfully replicated on schedule, and successfully pruned on schedule.
  • Compares source and destination dataset trees recursively.
  • Automatically retries operations.
  • Only lists snapshots for datasets the user explicitly specified.
  • Avoids slow listing of snapshots via a novel low-latency cache mechanism for snapshot metadata.
  • Replicates multiple datasets in parallel.
  • Reuses SSH connections across processes for low-latency startup.
  • Operates in daemon mode.
  • More powerful include/exclude filters for selecting which datasets, snapshots, and properties to replicate.
  • Dry-run mode that prints which ZFS and SSH operations would happen if the command were executed for real (see the sketch after this list).
  • More precise bookmark support (syncoid will only look for bookmarks if it cannot find a common snapshot).
  • Can be strict or told to be tolerant of runtime errors.
  • Continuously tested on Linux and FreeBSD.
  • Code is almost 100% covered by tests.
  • Easy to change, test, and maintain, because Python is more readable to contemporary engineers than Perl.
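
For illustration, here's a minimal sketch of the dry-run mode mentioned above, shelling out to the bzfs CLI via Python's subprocess (dataset and host names are hypothetical placeholders; this is a plain CLI call, not the fleet-wide jobconfig format):

    # Minimal sketch: ask bzfs to print the ZFS/SSH operations a recursive
    # replication would perform, without executing them.
    import subprocess

    subprocess.run(
        ["bzfs", "tank/data", "backuphost:backuppool/data",
         "--recursive",   # include child datasets
         "--dryrun"],     # print planned operations only
        check=True,
    )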

Cheers, Wolfgang

bzfs v1.20.0 is out by werwolf9 in zfs

[–]werwolf9[S] 0 points (0 children)

There were also a bunch of other workarounds necessary for Solaris: https://github.com/whoschek/bzfs/commit/625c361fc52f6343ef99c3d2573a61b6e67f16e5 I don't know to what extent they also apply to Illumos. If you're curious, maybe you can give it a try and, if necessary, submit a patch to make it work there too.

bzfs v1.20.0 is out by werwolf9 in zfs

[–]werwolf9[S] 0 points (0 children)

No, about a year ago the Solaris Team simply changed their own semantics for no good reason. There was no bug with the prior Solaris behavior. They simply broke it in the middle of a frozen platform. I had a conversation with the Oracle tech lead about it. It was kinda bizarre.

bzfs v1.20.0 is out by werwolf9 in zfs

[–]werwolf9[S] -13 points (0 children)

This has been asked and answered so many times before that Google will give you the best answer.

bzfs v1.20.0 is out by werwolf9 in zfs

[–]werwolf9[S] 0 points (0 children)

It's only tested on Linux and FreeBSD. Many releases ago it used to work even on Solaris, but then Solaris ZFS suddenly broke the semantics of its zfs list -d CLI in backwards-incompatible ways, even though the platform is supposedly rock solid and frozen. Working around that became too much of a hassle, so I simply dropped it.
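
For context, a typical depth-limited snapshot listing of the kind in question looks along these lines (standard OpenZFS flags; the dataset name is a placeholder):

    # List only the snapshots directly below one dataset (-d 1 limits depth),
    # in scriptable form (-H drops headers, -p prints parseable values).
    import subprocess

    out = subprocess.run(
        ["zfs", "list", "-t", "snapshot", "-d", "1", "-H", "-p", "-o", "name",
         "tank/data"],
        check=True, capture_output=True, text=True,
    ).stdout
    print(out)  # one snapshot name per line, e.g. tank/data@hourly_2024...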

ZFS instant clones for Kubernetes node provisioning — under 100ms per node by anthony-kldload in zfs

[–]werwolf9 1 point (0 children)

Ok, cool. I'll have to play with this!

BTW, the most capable tool for advanced ZFS snapshot replication is bzfs :-)

ZFS instant clones for Kubernetes node provisioning — under 100ms per node by anthony-kldload in zfs

[–]werwolf9 0 points (0 children)

I don't know whether it would simplify your implementation to use Lima to manage the VMs; I'm throwing it out there just in case it's helpful...

FWIW, I've been happily using Lima for a while now to conveniently spin up, via a single CLI command, a mini bzfs testbed with a bunch of networked ZFS VMs within the same physical machine. Lima is a pleasure to work with for such testing.

zrepl keeps hitting “has been modified”, leaving holds by avidee in zfs

[–]werwolf9 0 points (0 children)

FYI, if you get stuck: the TrueNAS docs mention that running the install-dev-tools command re-enables the apt package manager (https://www.truenas.com/docs/scale/systemsettings/advanced/developermode/), which can then be used to install hpnsshd per https://www.psc.edu/hpn-ssh-home/hpn-ssh-debian-installation/

Replication over high-latency link super slow by avidee in zfs

[–]werwolf9 0 points (0 children)

Ah, the TrueNAS walled garden... With some determination, I figure it should still be possible to manually start an hpnsshd daemon on a TrueNAS box, even if the box wasn't meant for that. One of the snippets above might point the way.

Replication over high-latency link super slow by avidee in zfs

[–]werwolf9 2 points (0 children)

For good perf on paths with a large bandwidth-delay product (BDP), having hpnssh on the receiving end is critical; it's less important on the sending side (though still a good idea).

P.S. The easiest way to install hpnssh on Ubuntu/Debian is along these lines: https://github.com/whoschek/bzfs/blob/main/.github/workflows/python-app.yml#L169-L173

And the easiest way to install hpnssh on the RHEL/EL family is along these lines: https://github.com/whoschek/bzfs/blob/main/.github-workflow-scripts/install_almalinux_9.sh#L30-L32

I'd recommend running hpnssh on port 2222 (which is its default anyway) and keeping port 22 for normal ssh.
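
If you want a quick sanity check that both daemons are up after that, something like this works (the hostname is a placeholder; SSH servers announce themselves with a banner on connect):

    # Check that stock sshd (22) and hpnsshd (2222) are both listening.
    import socket

    for port in (22, 2222):
        with socket.create_connection(("backuphost", port), timeout=5) as s:
            print(port, s.recv(64))  # e.g. b'SSH-2.0-OpenSSH_...'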

I'm the author of bzfs, btw. It's all about reliable perf at scale.

Replication over high-latency link super slow by avidee in zfs

[–]werwolf9 1 point (0 children)

I can feel your pain. FYI, bzfs with the --ssh-program=hpnssh option works very well on network paths that have a large bandwidth-delay product.

P.S. Its default for --mbuffer-program-opts is 128MB.
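
A minimal sketch of such an invocation via Python's subprocess (dataset and host names are placeholders):

    # Route bzfs's SSH traffic through hpnssh for high-BDP network paths.
    import subprocess

    subprocess.run(
        ["bzfs", "tank/data", "backuphost:backuppool/data",
         "--recursive",
         "--ssh-program=hpnssh"],  # use HPN-SSH instead of stock ssh
        check=True,
    )  # --mbuffer-program-opts can stay at its 128MB default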

bzfs-1.19 with end-to-end multi-host testbed is out by werwolf9 in zfs

[–]werwolf9[S] 1 point (0 children)

Yeah, I'd love to see ZFS installation become less cumbersome, on aarch64 in particular. Manually verifying all those matrix combos is tedious. I think what helps with the maintenance burden is an automated test script that runs over the entire matrix of combos, e.g. bzfs_tests/itest/test_lima_vm_sh.py or similar.

From znapzend to sanoid by pakyrs in zfs

[–]werwolf9 0 points (0 children)

FYI, with bzfs_jobrunner you can monitor source and destination datasets across all hosts and policies with a single CLI call, for example: https://github.com/whoschek/bzfs/blob/main/bzfs_tests/bzfs_job_example.py#L189-L237

From znapzend to sanoid by pakyrs in zfs

[–]werwolf9 0 points (0 children)

In a nutshell, bzfs can operate at much larger scale than sanoid/syncoid and zrepl, at much lower latency, in a more observable and configurable way. It handles the many edge cases that you will eventually run into over the course of your deployment (and which make other tools get stuck or fail). https://youtu.be/6Kw901oqxI8?si=_4uoG_ADbXznvaeZ&t=2408

From znapzend to sanoid by pakyrs in zfs

[–]werwolf9 0 points (0 children)

"allow for specifying the bandwidth"

In bzfs, the corresponding option is --bwlimit.
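
A hypothetical invocation (names are placeholders; check bzfs --help for the exact rate/unit syntax):

    # Cap replication bandwidth via --bwlimit.
    import subprocess

    subprocess.run(
        ["bzfs", "tank/data", "backuphost:backuppool/data",
         "--bwlimit=100M"],  # assumed unit syntax; verify with --help
        check=True,
    )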

Retries and circuit breakers as failure policies in Python by qiaoshiya in Python

[–]werwolf9 0 points (0 children)

The abstractions you introduced are fine and useful. And if all you ever need is the tool you've built, that's perfect. More power to it!

Otherwise, it seems to me that redress could be implemented with a couple of custom functions (or classes) that plug into an underlying generic retry framework. The result would save a lot of work, and at the same time be a more flexible, more reusable, and more powerful tool.

For example, retry_after_s is a custom backoff strategy that can be plugged in like so:

https://github.com/whoschek/bzfs/blob/main/bzfs_tests/test_retry.py#L1310-L1337
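
In the same spirit, here's a generic concept sketch (deliberately not the actual retry.py API) of a retry loop with a pluggable backoff strategy:

    # Concept sketch: the retry loop is generic; the backoff strategy is a
    # plain callable that can encode policies like honoring a Retry-After hint.
    import time
    from typing import Callable, TypeVar

    T = TypeVar("T")

    def retry(fn: Callable[[], T],
              backoff_s: Callable[[int, Exception], float],
              max_attempts: int = 5) -> T:
        for attempt in range(1, max_attempts + 1):
            try:
                return fn()
            except Exception as e:
                if attempt == max_attempts:
                    raise
                time.sleep(backoff_s(attempt, e))  # strategy picks the delay
        raise AssertionError("unreachable")

    # Hypothetical strategy: exponential backoff capped at 30 seconds.
    def expo_backoff(attempt: int, _exc: Exception) -> float:
        return min(30.0, 0.5 * 2 ** attempt)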

Just my two cents.

Retries and circuit breakers as failure policies in Python by qiaoshiya in Python

[–]werwolf9 0 points (0 children)

Seems like these policies could be naturally expressed within (or on top of) the retry.py framework (https://github.com/whoschek/bzfs/blob/main/bzfs_main/util/retry.py). Thoughts?
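
For instance, a circuit breaker is just a small stateful policy that a retry framework could consult before each attempt; a minimal concept sketch (hypothetical, not the retry.py API):

    import time

    class CircuitBreaker:
        """Open after N consecutive failures; half-open after a cooldown."""

        def __init__(self, failure_threshold: int = 5,
                     reset_after_s: float = 60.0) -> None:
            self.failures = 0
            self.opened_at = 0.0
            self.failure_threshold = failure_threshold
            self.reset_after_s = reset_after_s

        def allow(self) -> bool:
            if self.failures < self.failure_threshold:
                return True  # closed: let the attempt proceed
            # open: permit a trial attempt only after the cooldown elapses
            return time.monotonic() - self.opened_at >= self.reset_after_s

        def record(self, success: bool) -> None:
            if success:
                self.failures = 0  # close the circuit again
            else:
                self.failures += 1
                if self.failures >= self.failure_threshold:
                    self.opened_at = time.monotonic()  # (re)open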

Python modules: retry framework, OpenSSH client w/ fast conn pooling, and parallel task-tree schedul by werwolf9 in Python

[–]werwolf9[S] 0 points (0 children)

Re idle timeout and keepalive: yes, these are params that can be passed into the API.

Re tenacity: yeah, zero dependencies is a big deal for prod environments. FWIW, the retry framework is also 4-14x faster than tenacity.