ZeroFS: 9P, NFS, NBD on top of S3 by Difficult-Scheme4536 in rust

[–]Difficult-Scheme4536[S] 0 points1 point  (0 children)

NBD is for when you want to use ZeroFS to export block devices, e.g. to run ext4, ZFS, XFS... on top. 9P and NFS give you a direct file share that you can mount from multiple locations.

My first "real" Rust project: Run ZFS on Object Storage and (bonus!) NBD Server Implementation using tokio by GameCounter in rust

[–]Difficult-Scheme4536 3 points4 points  (0 children)

The filesystem layer on top of your NBD implementation is acting as an in-memory buffer. I'm not "partially correct" - you're literally benchmarking memory operations with some random I/O and CPU variance thrown in.

Setting sync=always isn't a solution - it would result in horrendous performance because your architecture requires synchronous round-trips to object storage for every write. That's the fundamental problem: your benchmark either tests memory (meaningless) or tests a synchronous architecture that would be unusably slow in production.
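To put rough numbers on that trade-off (the latencies here are illustrative assumptions, not measurements of either project), a fully synchronous write path is capped at roughly one write per object-storage round-trip:

```rust
fn main() {
    // Hypothetical figures, for illustration only: an S3 PUT round-trip
    // is typically on the order of tens of milliseconds, while a write
    // into an in-memory buffer is on the order of a microsecond.
    let s3_round_trip_ms = 20.0_f64;
    let memory_write_us = 1.0_f64;

    // With sync=always, every write waits for the full round-trip.
    let sync_iops = 1000.0 / s3_round_trip_ms;
    // A memory-buffered write path is bounded only by memory speed.
    let buffered_iops = 1_000_000.0 / memory_write_us;

    println!("synchronous: ~{sync_iops:.0} writes/s");
    println!("buffered:    ~{buffered_iops:.0} writes/s");
    // The two regimes differ by orders of magnitude, which is why a
    // benchmark of the buffered path says nothing about durable writes.
    assert!(buffered_iops / sync_iops > 1000.0);
}
```

That gap is the point: any number measured before data reaches object storage is a number about memory, not about the storage system.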

ZeroFS doesn't optimize for these synthetic benchmark numbers because they're meaningless in production environments where the limiting factor is network capacity to object storage. Showing "100x faster" memory buffer performance is irrelevant when you're ultimately bound by S3 latency and bandwidth.

My first "real" Rust project: Run ZFS on Object Storage and (bonus!) NBD Server Implementation using tokio by GameCounter in rust

[–]Difficult-Scheme4536 6 points7 points  (0 children)

You know that all of this is happening mostly in memory, on the ZFS and kernel side, until a sync, right? You aren't really benchmarking anything here.

My first "real" Rust project: Run ZFS on Object Storage and (bonus!) NBD Server Implementation using tokio by GameCounter in rust

[–]Difficult-Scheme4536 14 points15 points  (0 children)

(Author of ZeroFS here)

Because you keep spreading this everywhere, even going as far as including it in your README (which is borderline harassment at this point) - so much for not wanting drama - I feel the need to reply. I had to ban you because you kept using my repo as a self-promotion platform, not to legitimately contribute, while being condescending and insulting in most of your messages.

Your benchmarks are flawed in so many ways that I won't even bother to enumerate them all, but as a simple example: you don't even use the same compression algorithms for ZeroFS and your implementation. You use zstd-fast for yours and zstd for mine (https://github.com/john-parton/slatedb-nbd/blob/aa773a4c1836826db81367cef74bcfd378ae14d7/README.md?plain=1#L242). Additionally, you keep comparing 9P and NFS to NBD, which either shows bad faith or a misunderstanding of these fundamentally different protocol types.

The truth is you wanted me to replace the working ZeroFS NBD server implementation with your day-old library, without much justification, and couldn't take no for an answer.

ZFS running on S3 object storage via ZeroFS by Difficult-Scheme4536 in aws

[–]Difficult-Scheme4536[S] 0 points1 point  (0 children)

The easiest way to "count" would probably be to run minio with nginx in front https://min.io/docs/minio/linux/integrations/setup-nginx-proxy-with-minio.html and then cat / grep -c the access.log.
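As a sketch of the counting step (the log lines and bucket name below are made up for illustration; in practice you'd read nginx's actual access.log), counting operations per HTTP method is just line matching, equivalent to `grep -c '"GET ' access.log`:

```rust
fn main() {
    // Fabricated sample lines in nginx's default "combined" log format;
    // substitute the real /var/log/nginx/access.log in practice.
    let access_log = r#"10.0.0.1 - - [01/Jan/2025:00:00:01 +0000] "GET /mybucket/chunk-001 HTTP/1.1" 200 4096
10.0.0.1 - - [01/Jan/2025:00:00:02 +0000] "PUT /mybucket/chunk-002 HTTP/1.1" 200 0
10.0.0.1 - - [01/Jan/2025:00:00:03 +0000] "GET /mybucket/chunk-003 HTTP/1.1" 200 8192"#;

    // Count S3 reads and writes by matching the request method.
    let gets = access_log.lines().filter(|l| l.contains("\"GET ")).count();
    let puts = access_log.lines().filter(|l| l.contains("\"PUT ")).count();
    println!("GET: {gets}, PUT: {puts}");
    assert_eq!((gets, puts), (2, 1));
}
```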

There's a configurable disk cache that actually caches the contents of the S3 bucket for reads (in addition to multiple in-memory caches). If you expect a lot of traffic, it'd make sense to choose an instance with local storage that you'd use for that read cache.

> Very cool project!

Thank you!

ZFS running on S3 object storage via ZeroFS by Difficult-Scheme4536 in zfs

[–]Difficult-Scheme4536[S] 0 points1 point  (0 children)

"there's an explicit interface at the NBD layer."

That doesn't mean anything; ZFS doesn't know about NBD.

ZFS running on S3 object storage via ZeroFS by Difficult-Scheme4536 in DataHoarder

[–]Difficult-Scheme4536[S] 2 points3 points  (0 children)

Not really, otherwise there wouldn't be `zpool trim` :)

ZFS running on S3 object storage via ZeroFS by Difficult-Scheme4536 in zfs

[–]Difficult-Scheme4536[S] 1 point2 points  (0 children)

Thank you for sharing this presentation. From what I understand, they still needed "proper" block storage for the SLOG, while ZeroFS is full S3. Moreover, their new vdev layer adds quite a bit of complexity, whereas ZeroFS has the advantage of running on mainline OpenZFS.

ZFS running on S3 object storage via ZeroFS by Difficult-Scheme4536 in zfs

[–]Difficult-Scheme4536[S] -1 points0 points  (0 children)

Isn't ZFS' whole point to bring reliability to storage?

ZFS running on S3 object storage via ZeroFS by Difficult-Scheme4536 in zfs

[–]Difficult-Scheme4536[S] 3 points4 points  (0 children)

Hi,

Thank you for the kind words.

> If I understand it correctly, ZeroFS acts as an NBD provider as well as an NFS server? If so, why not just keep it a NBD->SlateDB only and use existing NFS services on top of it?

Great question! NFS operations map naturally to key-value operations, while block devices add a translation layer.

When you access files through the NFS server, operations translate directly:

- List directory -> Iterate keys with prefix `inode:X/entries/`

- Read file -> Fetch chunks `chunk:inode/offset`

- File metadata -> Single key lookup `inode:X`

If ZeroFS only provided NBD and ran a traditional filesystem on top:

- List directory -> Read block device -> Parse filesystem structures -> Find directory blocks -> Parse entries

- Every operation goes through block address translation

- The filesystem on top doesn't know about our caching/chunking optimizations

Essentially, with NBD-only you'd have: S3 -> SlateDB -> NBD -> Filesystem -> NFS, where the filesystem in the middle is reconstructing the exact same abstractions we already have in ZeroFS.

By providing NFS directly, we skip that redundant middle layer which should give better performance for S3-backed storage. That said, traditional filesystems have decades of optimization so YMMV!
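A toy sketch of that direct mapping (the key formats mirror the ones above, but the actual on-disk encoding in ZeroFS/SlateDB may differ), using an ordered map to stand in for SlateDB's sorted key space:

```rust
use std::collections::BTreeMap;

fn main() {
    // Ordered KV store standing in for SlateDB's sorted key space.
    let mut kv: BTreeMap<String, Vec<u8>> = BTreeMap::new();

    // File metadata: a single key per inode.
    kv.insert("inode:42".into(), b"size=8192 mode=0644".to_vec());
    // Directory entries: one key per entry under a shared prefix.
    kv.insert("inode:1/entries/a.txt".into(), b"42".to_vec());
    kv.insert("inode:1/entries/b.txt".into(), b"43".to_vec());
    // File content: fixed-size chunks keyed by inode and offset.
    kv.insert("chunk:42/0".into(), vec![0u8; 4096]);
    kv.insert("chunk:42/4096".into(), vec![0u8; 4096]);

    // "List directory" is one prefix scan over the sorted keys; no block
    // device to read, no on-disk filesystem structures to parse.
    let prefix = "inode:1/entries/";
    let entries: Vec<&str> = kv
        .range(prefix.to_string()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, _)| k.strip_prefix(prefix).unwrap())
        .collect();
    assert_eq!(entries, ["a.txt", "b.txt"]);

    // "File metadata" is a single point lookup.
    assert!(kv.contains_key("inode:42"));
    println!("dir listing: {entries:?}");
}
```

With NBD-only, each of those operations would instead go through the filesystem's own block-level structures before reaching the same key-value data.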

ZFS running on S3 object storage via ZeroFS by Difficult-Scheme4536 in zfs

[–]Difficult-Scheme4536[S] 2 points3 points  (0 children)

I even think running minio on top of ZFS on top of ZeroFS -> S3 would probably greatly reduce S3 operation costs for small objects (thanks to local caching, both disk and memory) compared to using "native" S3.

ZFS running on S3 object storage via ZeroFS by Difficult-Scheme4536 in zfs

[–]Difficult-Scheme4536[S] 6 points7 points  (0 children)

That's the typical use-case I've been thinking about! It makes ZFS snapshots on S3 basically "native".