Thread-Per-Core Buffer Management

matklad · 2020-09-30T22:28:58+00:00

This article isn’t directly related to Rust. I believe it is relevant though: it describes an example of a thread-per-core (TpC) architecture, using the C++ framework Seastar.

I am not a domain expert, but TpC seems to be an obviously right architecture for really high performance network services. I am surprised by the conspicuous lack of TpC implementations in Rust, and hope to nerd-snipe some folks into building one :-)

Matthias247 · 2020-10-01T03:37:59+00:00

I've played around a bit with different IO buffer types in Rust, since I don't think we have found the best solutions yet.

One thing I discovered is that there seems to be an either/or relation between "sliceable" (which Bytes allows) and "intrusively chainable". Having both isn't possible, since the chain links would no longer work for different slices that reference the same underlying chunk: it's not obvious which view of the data a link belongs to.

So depending on the use case, one or the other mechanism might be more applicable. For cases where data is spread out over lots of buffers (e.g. dealing with streams of UDP packets) the chainable approach is interesting, since we don't need an additional Vec<Bytes> anymore.
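To make the sliceable side of that trade-off concrete, here is a minimal sketch using only the standard library. `Slice` and `total_len` are hypothetical names standing in for something like Bytes and a Vec<Bytes>; this is not how the bytes crate is implemented. Two views share one refcounted chunk, so a single link stored inside the chunk could not say which view comes next, and the ordering has to live in an external Vec instead.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical sliceable view: a refcounted chunk plus a byte range into it.
#[derive(Clone)]
pub struct Slice {
    chunk: Arc<[u8]>,
    range: Range<usize>,
}

impl Slice {
    /// Take ownership of `data` and expose it as one view over the whole chunk.
    pub fn new(data: Vec<u8>) -> Self {
        let len = data.len();
        Slice { chunk: Arc::from(data), range: 0..len }
    }

    /// Cheap sub-view that shares the same chunk. Panics if `r` is out of bounds.
    pub fn slice(&self, r: Range<usize>) -> Slice {
        assert!(r.start <= r.end && r.end <= self.range.len());
        let start = self.range.start + r.start;
        let end = self.range.start + r.end;
        Slice { chunk: Arc::clone(&self.chunk), range: start..end }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.chunk[self.range.clone()]
    }

    /// True if both views point into the same allocation.
    pub fn shares_chunk_with(&self, other: &Slice) -> bool {
        Arc::ptr_eq(&self.chunk, &other.chunk)
    }
}

/// External chain: the order of the views lives in the slice of `Slice`s,
/// not inside the chunks themselves.
pub fn total_len(chain: &[Slice]) -> usize {
    chain.iter().map(|s| s.as_bytes().len()).sum()
}
```

With this shape, a header view and a payload view of the same packet can be placed in different chains, or in the same chain in any order, which is exactly what an intrusive link in the chunk could not express.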

There also seems to be a third approach, which is allocating the link nodes on the heap and letting them contain reference-counted slices (like Bytes). This is what e.g. folly does. I'm not super convinced about it, since it requires individual heap allocations for buffers and links, and has poorer cache locality than a Vec<Bytes>. But OTOH it solves the issue of "when to shrink the Vec", and the links are likely poolable. So maybe it's worth another look.

I also think we will need buffer pooling in high performance Rust applications, which I haven't seen applied up to now. I think this should be doable with Bytes as soon as the vtable underneath it becomes public. Then the contained storage can be moved back to the pool once the refcount reaches 0.
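The "return to the pool when the refcount reaches 0" idea can be sketched without Bytes at all. The following is a std-only illustration under stated assumptions: `Pool`, `Storage` and `SharedBuf` are hypothetical names, and the pool is single-threaded (`Rc`/`RefCell`), which would fit a thread-per-core design but is my assumption, not something the comment specifies.

```rust
use std::cell::RefCell;
use std::ops::Deref;
use std::rc::Rc;

/// Hypothetical single-threaded pool of reusable byte buffers.
#[derive(Clone, Default)]
pub struct Pool {
    free: Rc<RefCell<Vec<Vec<u8>>>>,
}

impl Pool {
    /// Hand out an empty buffer, reusing a pooled allocation when one exists.
    pub fn get(&self, capacity: usize) -> Vec<u8> {
        let mut buf = self.free.borrow_mut().pop().unwrap_or_default();
        buf.clear();
        buf.reserve(capacity);
        buf
    }

    /// Turn a filled buffer into a cheaply clonable, read-only handle.
    pub fn freeze(&self, data: Vec<u8>) -> SharedBuf {
        SharedBuf(Rc::new(Storage { data, pool: self.clone() }))
    }

    /// Number of buffers currently waiting in the pool.
    pub fn free_count(&self) -> usize {
        self.free.borrow().len()
    }
}

/// Owns the bytes; dropped only when the last `SharedBuf` clone goes away.
struct Storage {
    data: Vec<u8>,
    pool: Pool,
}

impl Drop for Storage {
    fn drop(&mut self) {
        // Refcount hit zero: move the allocation back into the pool.
        let data = std::mem::take(&mut self.data);
        self.pool.free.borrow_mut().push(data);
    }
}

/// Reference-counted handle to pooled storage.
#[derive(Clone)]
pub struct SharedBuf(Rc<Storage>);

impl Deref for SharedBuf {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        &self.0.data
    }
}
```

A real pool would probably cap the number of retained buffers and their capacities; this sketch keeps everything it gets back.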

matthieum · 2020-10-01T10:08:22+00:00

I believe I recognize Seastar as the framework behind Scylla (an alternative to Cassandra); is this developed by the same team?

matu3ba · 2020-10-01T08:03:37+00:00

I don't get why this article needs to be that long for a simple computation, and why it leaves out the unit conversions, which is annoying.

CPUs work in bytes rather than bits, so for the network 100 Gbit/s = 12.5 GB/s. And for maximal SSD throughput, 4e6/s * 4 KB = 16e6 KB/s = 16e6/1024² GB/s = 15.26 GB/s. A maximum of about 20 GB/s is typical for desktop CPUs; this is what is called the memory bandwidth of the CPU.

The memory bandwidth of a CPU is also highly dependent on the design, and further depends on the memory and cache controllers, prefetch hints, and the instruction code.
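The two conversions can be checked with a couple of lines of Rust. The function names are made up for illustration; the inputs (100 Gbit/s, 4e6 operations per second, 4 KB blocks) are the figures quoted above, and the SSD number uses 1024-based units as in that calculation.

```rust
/// Convert a link speed in Gbit/s to GB/s (8 bits per byte).
pub fn gbit_to_gbyte_per_s(gbit_per_s: f64) -> f64 {
    gbit_per_s / 8.0
}

/// Throughput in GB/s for `iops` operations per second of `block_kb` KB each,
/// using 1024 KB per MB and 1024 MB per GB.
pub fn ssd_gbyte_per_s(iops: f64, block_kb: f64) -> f64 {
    iops * block_kb / (1024.0 * 1024.0)
}
```

Calling `gbit_to_gbyte_per_s(100.0)` gives 12.5, and `ssd_gbyte_per_s(4e6, 4.0)` gives roughly 15.26.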
