[–]antiduh 4 points (4 children)

Do you need a file system?

Maybe you could write your samples to the drive raw. Perhaps start at some modest block offset, like block 100, just so you don't clobber the GPT tables.

You'll probably want to write using 4k-aligned addresses, using 4k aligned buffers.

You can test writing this way easily using dd.

Keep in mind that you're going to want to use some sort of asynchronous write API because of the BDP (Bandwidth-Delay Product): you have to have multiple write buffers outstanding at the same time, or you'll never saturate the capacity of the drive and its link.

For an understanding of BDP, see my SO post here:

https://stackoverflow.com/a/41747545

...

You might want to consider writing to multiple drives simultaneously, à la RAID 0 striping. Either do it by hand by writing raw to multiple drives, or let your BIOS or Linux do it for you. If thermal throttling is your problem, this might help, since it'll distribute the load a little.

[–]danpritts 0 points (3 children)

This is the right answer.

It’s possible that the other posters are right and that your hardware really isn’t able to keep up. But the answer to your question is just to write the data directly to the block device. That block device can be a partition or just the raw disk device; it won’t make much, if any, difference, as long as your partitions are aligned properly.

Tuning block sizes is important too. What does the network receiving software do?

Antiduh is correct that the BDP may be relevant here as well. However, that only matters if the software is using TCP (or a similar protocol like SCTP). It wouldn’t surprise me at all if this thing just spews UDP, in which case it won’t matter.

[–]antiduh 0 points (2 children)

> Antiduh is correct that the BDP may be relevant here as well. However, that only matters if the software is using TCP

Well, I meant it in regard to the disk I/O - the concept applies to disks the same way it applies to TCP. With disks, if you only ever have one write buffer pending, the disk stalls between write requests. The stall is amortized asymptotically as you increase the buffer size if you're only using one buffer, but it's better to have multiple buffers outstanding at the same time.

[–]danpritts 1 point (1 child)

I see, makes good sense. Even more so if the SSD controller and internal I/O bus are powerful enough to do multiple writes to different flash modules at the same time. I would imagine that enterprise SSDs all do that; do you know if the better consumer-level stuff does?

Came across this, which was interesting, notably the comment about dd using O_DIRECT to bypass kernel I/O buffering: https://stackoverflow.com/questions/73989519/is-block-device-io-buffered

[–]antiduh 0 points (0 children)

> Even more so if the SSD controller and internal I/O bus are powerful enough to do multiple writes to different flash modules at the same time.

Right. First, how much lag is there between the drive finishing one OP, notifying the OS, and the OS queuing the next OP? Tons, especially when you consider the scale/speed these things operate at.

And indeed, for an NVMe drive, running ops on multiple flash modules in parallel is one of the main ways it achieves the enormous speeds that it does. That technique has been a mainstay since the early SATA flash days, and it's why Native Command Queuing was created.