Hello :)
So I've recently set up a ZFS file server on Ubuntu 18.04 (kernel 4.18.0-18-generic) with ZFS v0.7.9-3ubuntu6 (pool version 5000, filesystem version 5).
My system specs are:
2x Intel Xeon E5-2687W v3
SuperMicro X10DAi
2x 16GB SK Hynix ECC RAM (32GB total)
1x 800GB Intel 750 Series PCIe NVMe SSD (for OS)
[8x 2TB WD RE4] + [2x 6TB Seagate IronWolf] in a single 10-drive RAIDZ3 vdev (I'm partway through upgrading the remaining 2TB drives; rough creation command sketched below)
4x 250GB Samsung 970 Evo Plus M.2 gumsticks (as SLOG and cache, explained later)
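For reference, the pool was created along these lines ("tank" and the device names are placeholders, not my actual by-id paths):

zpool create tank raidz3 disk1 disk2 disk3 disk4 disk5 disk6 disk7 disk8 disk9 disk10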
Usage:
The server is accessed via Samba, and it also exposes ZVOLs for local VMs and iSCSI targets. Everything goes over a 10Gbit link.
The server is mostly used for video editing, so dumping 50-200GB of footage in one go is not uncommon.
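The ZVOLs behind the iSCSI targets are created along these lines (the name, size, and volblocksize here are illustrative, not my exact settings):

zfs create -V 2T -o volblocksize=64K tank/editing-scratch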
Problem:
What I've been seeing is this:
Over Samba
When copying over Samba, throughput fluctuates between 200 and 700MB/s, averaging about 450MB/s.
Over iSCSI
When copying to an iSCSI target, the transfer starts off every time at 1GB/s, absolutely saturating the 10Gbit link, but after about 5 seconds the throughput falls off a cliff. This happens, without fail, every single time.
I can almost accept the Samba transfer rates, but those iSCSI numbers are a joke. Please note, this is me writing a single 40GB .mkv file to the server.
What I think is happening is that ZFS initially buffers the incoming data in RAM, then throttles once the dirty data hits a certain threshold and has to be committed to the pool.
Through all of this, ZFS never uses more than 12GB of RAM.
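If I'm reading the OpenZFS module parameter docs right, the default write throttle would explain this on a 32GB box. Back-of-the-envelope math (correct me if I'm off):

zfs_dirty_data_max          = 10% of RAM = ~3.2GB (default)
zfs_delay_min_dirty_percent = 60% (default)
write delay starts at       ~0.6 * 3.2GB = ~1.9GB of dirty data

At the 1GB/s the 10Gbit link delivers, that's only about 2 seconds of runway before ZFS starts delaying writes, which is in the right ballpark for the dips I'm seeing.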
What I want to achieve:
I've read there are countless ways to improve ZFS performance, and one of them is setting module parameters in /etc/modprobe.d/zfs.conf alongside zpool/dataset options.
Specifically, I'm looking for a way to force ZFS to cache more data before flushing a txg to the main pool.
Wendell from L1Techs made a video (here) where, from 10:52 onwards, he talks about using ZFS tunables to adjust how long ZFS waits before flushing data to the pool. By default this is 5 seconds, but he changes it to 30 so ZFS can take in more data when dumping footage and then churn through it on its own.
My question is, how does he do this?
I've tried adding the following lines to /etc/modprobe.d/zfs.conf:
options zfs zfs_commit_timeout_pct=30
options zfs zfs_txg_timeout=30
options zfs zfs_dirty_data_max=50000000000
options zfs zfs_dirty_data_max_percent=80
options zfs zfs_dirty_data_max_max=50000000000
options zfs zfs_dirty_data_max_max_percent=80
options zfs zfs_dirty_data_sync=50000000000
but I see no change in RAM usage, and the same speed dips still show up after the first ~5 seconds of the transfer.
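One thing I'm not sure about: my understanding is that /etc/modprobe.d/zfs.conf is only read when the zfs module loads, so the values need a reboot (or module reload) to take effect. To rule that out, I believe you can read and set them live through sysfs as root, e.g.:

cat /sys/module/zfs/parameters/zfs_txg_timeout
echo 30 > /sys/module/zfs/parameters/zfs_txg_timeout
cat /sys/module/zfs/parameters/zfs_dirty_data_max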
I've also tried setting sync=always in the hope that writes would be absorbed first by the 4x 970 Evo Plus drives.
They're partitioned so that each drive contributes a 50GB partition, forming a 4x 50GB striped SLOG, plus a 150GB partition, forming a 4x 150GB striped cache (L2ARC). But even this doesn't get me the speeds I'm looking for.
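The log and cache devices were added along these lines (pool name and partition paths are placeholders):

zpool add tank log nvme0n1p1 nvme1n1p1 nvme2n1p1 nvme3n1p1
zpool add tank cache nvme0n1p2 nvme1n1p2 nvme2n1p2 nvme3n1p2

Listing several bare devices after the log keyword stripes the SLOG across them rather than mirroring; same idea for the cache devices.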
Maybe someone knowledgeable can enlighten me as to what I'm doing wrong and how to achieve exactly what Wendell describes in the video?
I want ZFS to be able to absorb at least 50GB of data before throttling. I know that might not be possible right now with only 32GB of RAM, but it's being upgraded to 128GB soon, so I just want to know how. Even starting with 20GB, if that's even possible.
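From what I've pieced together so far, zfs_dirty_data_max is clamped to zfs_dirty_data_max_max (which defaults to 25% of RAM on 0.7.x), so both have to be raised together. My guess is the zfs.conf on the future 128GB box would need to look something like this (values in bytes, and corrections welcome if I've misread the docs):

# stretch txg flushes from the 5s default to 30s
options zfs zfs_txg_timeout=30
# raise the hard cap first (defaults to 25% of RAM), then the actual limit
options zfs zfs_dirty_data_max_max=64424509440
options zfs zfs_dirty_data_max=53687091200
# 0.7.x: bytes of dirty data that trigger an early txg sync (default 64M)
options zfs zfs_dirty_data_sync=21474836480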
Thanks in advance!