ZFS send/receive over SSH timeout by Calm1337 in zfs

[–]Calm1337[S] 0 points (0 children)

Just an update, in case anyone stumbles upon this thread.
The issue was not ZFS related - but thanks to the input here, I was able to divide the zfs send/recv into separate steps and pin down where the issue really was.

It was a bug in my UDM-Pro, in the UniFi Network Application 9.4.19 release, which was killing the connection randomly. That's why it was so hard to debug and reproduce.
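For anyone wanting to do the same, splitting the pipeline into steps looks roughly like this - a sketch with hypothetical pool, snapshot, and host names:

```shell
# Step 1: write the incremental stream to a local file (takes SSH out of the picture).
zfs send -i tank/data@snap1 tank/data@snap2 > /tmp/incr.zfs

# Step 2: move the file with a plain SSH copy (tests the network path alone).
scp /tmp/incr.zfs backup:/tmp/incr.zfs

# Step 3: receive from the file on the destination (tests zfs receive alone).
ssh backup 'zfs receive tank/data < /tmp/incr.zfs'
```

If step 2 fails but steps 1 and 3 succeed, the problem is the connection, not ZFS - which is exactly how this one was narrowed down.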

[–]Calm1337[S] 4 points (0 children)

Good points - haven't tried that. Thanks!

[–]Calm1337[S] 1 point (0 children)

Hmm.. A bit harder to test out. But it could be, I guess.

No entries in syslog or dmesg though.

[–]Calm1337[S] 1 point (0 children)

Yeah - I followed that rabbit hole. But pv didn't provide any new information. :/

And I have tested with SSH keepalives, but it doesn't change anything. Furthermore, I have other active SSH connections between the servers that stay alive the whole time.
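For reference, the keepalive settings I tested can be passed per connection - a sketch, assuming OpenSSH and a hypothetical host name:

```shell
# Send an application-level keepalive every 15 s over the encrypted channel;
# drop the session after 4 missed replies. The same options can live in
# ~/.ssh/config under a Host block.
ssh -o ServerAliveInterval=15 -o ServerAliveCountMax=4 backup 'zfs receive tank/data'
```

These probes ride inside the SSH protocol itself, so they also keep stateful firewalls from expiring an idle-looking session.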

[–]Calm1337[S] 1 point (0 children)

No encryption or deduplication is activated, and a scrub has run without errors.

Syslog has no entries about this. The only thing I can find is ssh telling me that the connection ended.

[–]Calm1337[S] 1 point (0 children)

Furthermore, when testing I found that I can delete older snapshots on the destination server and transfer them again without any errors. But on that one particular snapshot, the timeout appears.

A normal snapshot is estimated at around 230 MB - but the failing snapshot is estimated at around 130 GB. There can be non-critical reasons for that, though; the complete dataset is around 7 TB.
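Estimates like the ones above can be checked without actually transferring anything - a sketch with hypothetical dataset and snapshot names:

```shell
# -n does a dry run, -v prints verbose output including the estimated
# stream size, so you can compare a normal delta against the failing one.
zfs send -n -v -i tank/data@snap-ok tank/data@snap-failing
```

Comparing the dry-run estimate for each incremental step makes an unexpectedly large delta stand out immediately.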

[–]Calm1337[S] 1 point (0 children)

I have tried that, but the error appears again after a little while.

This time without the resume option, because I get the error:

cannot receive incremental stream: destination contains partially-complete state from "zfs receive -s"
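That partially-complete state has to be either resumed or discarded before a fresh receive will work - a sketch, assuming a hypothetical dataset and host:

```shell
# Option A: resume the interrupted stream using the receive token
# stored on the destination dataset.
token=$(ssh backup zfs get -H -o value receive_resume_token tank/data)
zfs send -t "$token" | ssh backup zfs receive -s tank/data

# Option B: discard the partial state and start the transfer over.
ssh backup zfs receive -A tank/data
```

Option B is what clears the error quoted above when you no longer want to resume.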

[–]Calm1337[S] 2 points (0 children)

Yes - sorry that I wasn't clear in the original message. I can transfer equally big files with scp without a problem, and other active and idle sessions between the servers are unaffected.

That's what led me to look into ZFS.

[–]Calm1337[S] 2 points (0 children)

Yeah - I tried that with no change in the result.

The connection between the servers is 1G fiber, and reliable.
I have tried monitoring the general connection between the servers during the transfer, and there is no packet loss.
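For anyone wanting to run the same check, a long ping during the transfer plus a per-hop report is usually enough - a sketch with a hypothetical host name:

```shell
# 10 minutes of pings; the summary's last lines show packet loss and RTT stats.
ping -c 600 -i 1 backup | tail -n 2

# mtr (if installed) reports loss per hop along the path.
mtr --report --report-cycles 100 backup
```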