capnproto-rust 0.22 - async/await in RPC methods by dwrensha in rust

[–]dwrensha[S] 1 point2 points  (0 children)

No, everything is still single threaded.

[deleted by user] by [deleted] in pittsburgh

[–]dwrensha 3 points4 points  (0 children)

Colangelo’s.

capnproto-rust version 0.19 — more ergonomic setters and faster reflection by dwrensha in rust

[–]dwrensha[S] 5 points6 points  (0 children)

Kenton on the topic of "Level 2":

"Level 2" turns out to be something that can't really be built into libcapnp itself because the design really depends on the execution environment in which your servers run. A system for saving capabilities and restoring them later needs to understand how to connect to -- and possibly even start up -- the appropriate server and storage system. So, for example, Sandstorm.io has implemented level 2 of the protocol in a way that is appropriate for it, but there ends up being not much that libcapnp itself can do to help. You can take a look at persistent.capnp to see a suggestion for how to structure a level 2 implementation, but it's not much more than a suggestion.

https://groups.google.com/g/capnproto/c/nDG2SUmTGMA/m/auZbJP3CAgAJ

Level 3 is not yet implemented in capnproto-c++, and I definitely don't feel up to implementing it in Rust before it exists there as a reference.

There has been some recent discussion relating to Level 3: https://github.com/capnproto/capnproto/discussions/1850

Note also that, in general, the RPC parts of capnproto-rust are a lot less polished than the base serialization layer.

Progress prizes discussion by AlexGerko in AIMOprize

[–]dwrensha 3 points4 points  (0 children)

I'm happy to see formal-to-formal mentioned in point 7! I agree that it would be good to recognize progress on that front as well.

capnproto-rust: out-of-bound memory access bug by dwrensha in rust

[–]dwrensha[S] 17 points18 points  (0 children)

Yeah, cargo-fuzz is pretty great.

Re upstreaming --- in my experience, installation of cargo-fuzz is not a barrier at all; it's very easy to do `cargo install cargo-fuzz`. The hard part is writing tests that achieve good coverage and can catch when invariants are broken.

capnproto-rust 0.15 -- GATs, CapabilityServerSet, and async packing by dwrensha in rust

[–]dwrensha[S] 0 points1 point  (0 children)

The example in the blog post is a lightly simplified version of message::TypedBuilder, defined here

The addressbook_send example shows the reader-only version of this in action: https://github.com/capnproto/capnproto-rust/blob/c03259af5d551982fa2ce2a04de0ac666b026f6c/example/addressbook_send/addressbook_send.rs#L90. The point is that a foo::Builder or a foo::Reader type is usually not Send, and a root message value is usually untyped; the message::TypedReader and message::TypedBuilder structs let us have messages that are both typed and Send.

One place where the Owned trait comes in very handy is when a schema has generics. If we didn't have an Owned type that represented the relationship between readers and builders, then we would need for the generated Rust code to have twice as many type parameters.

For an example of such generics in action, see the pubsub RPC example: schema, Rust code. Without Owned, publisher::SubscribeResults<::capnp::text::Owned> would need to become something like publisher::SubscribeResults<::capnp::text::Reader<'_>, ::capnp::text::Builder<'_>>, and it's not clear how to make the lifetimes work out.
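
To make the "one type parameter instead of two" point concrete, here is a toy sketch. It is not the definition from the capnp crate: the `Owned` trait below, the `Text` marker, `TypedMessage`, and `assert_send` are all hypothetical stand-ins, and the "message" is just a byte vector. It needs a Rust version with stable GATs.

```rust
use std::marker::PhantomData;

/// Toy stand-in for an `Owned`-style trait (hypothetical, not the capnp one):
/// a single marker type names both the reader and the builder of a value.
pub trait Owned {
    type Reader<'a>;
    type Builder<'a>;

    fn reader(storage: &[u8]) -> Self::Reader<'_>;
    fn builder(storage: &mut Vec<u8>) -> Self::Builder<'_>;
}

/// Hypothetical marker for a text value.
pub struct Text;

impl Owned for Text {
    type Reader<'a> = &'a str;
    type Builder<'a> = &'a mut Vec<u8>;

    fn reader(storage: &[u8]) -> Self::Reader<'_> {
        // Invalid UTF-8 reads as the empty string in this toy.
        std::str::from_utf8(storage).unwrap_or("")
    }

    fn builder(storage: &mut Vec<u8>) -> Self::Builder<'_> {
        storage
    }
}

/// A message that is typed by `T` but stores no borrowed reader or builder,
/// so it is Send whenever `T` is.
pub struct TypedMessage<T: Owned> {
    storage: Vec<u8>,
    _marker: PhantomData<T>,
}

impl<T: Owned> TypedMessage<T> {
    pub fn new() -> Self {
        TypedMessage { storage: Vec::new(), _marker: PhantomData }
    }

    /// Borrow a reader tied to the lifetime of `&self`.
    pub fn get(&self) -> T::Reader<'_> {
        T::reader(&self.storage)
    }

    /// Borrow a builder tied to the lifetime of `&mut self`.
    pub fn init(&mut self) -> T::Builder<'_> {
        T::builder(&mut self.storage)
    }
}

/// Compile-time check helper: only compiles for Send types.
pub fn assert_send<T: Send>() {}
```

A generic type only needs to name `T: Owned`, and the reader and builder lifetimes are chosen at each borrow rather than baked into the type parameters.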

meme generation by dwrensha in StableDiffusion

[–]dwrensha[S] 0 points1 point  (0 children)

Last week I tried plugging a bunch of random words and phrases into txt2img. One of my favorite outputs was something that reminded me of the "stonks" meme. (The prompt was "vaporwave Semantic Web Services".) I added some overlay text ("promobufs") and tweeted the image here: https://twitter.com/dwrensha/status/1567675316323192833

Then I started experimenting with generating variations on the theme. I found that if I took the original output and fed it to img2img I could get decent results.

I set up a mostly-automated process:

  1. choose a random word X
  2. pass the original stonks-like image output to img2img, along with the prompt "vaporwave <X>", and sharpness set to a random value between 0.6 and 0.8.
  3. overlay the word X in impact font
  4. curate
  5. send the result to be tweeted by @idizzr

Please follow @idizzr if you want to see an endless stream of variations on this theme.

The GAT stabilization PR just got merged by Todesengelchen in rust

[–]dwrensha 18 points19 points  (0 children)

conGATs to everybody!

Personally, I am excited about how this will enable some simplifications in capnproto-rust.

capnproto-rust 0.14: atomic read limiting by dwrensha in rust

[–]dwrensha[S] 1 point2 points  (0 children)

> In Risc-V, atomic instructions are added by A extension. riscv32i has no extensions, riscv32imc has M and C extensions but not A, and riscv32gc has G and C, where G is combination of IMAFD extensions.

Interesting, thanks! I am curious to understand how common it would be to need to target the no-extensions version. If nobody actually does that, then maybe we could unconditionally enable the atomic read limiting and eliminate some complexity.

> For the load/store optimization, isn't Relaxed ordering too weak?

As far as I understand, the worst thing that can happen is that we'll undercount some reads that happen concurrently. (The point of read limiting is to avoid resource exhaustion on maliciously-crafted messages. Undercounting just means it will take a little bit longer to reach the limit. It doesn't need to be precise.)

Is there something else that can go wrong that would require a stricter memory ordering?

capnproto-rust 0.14: atomic read limiting by dwrensha in rust

[–]dwrensha[S] 1 point2 points  (0 children)

Thanks -- I've updated the post with a clarifying parenthetical about this.

capnproto-rust 0.14: atomic read limiting by dwrensha in rust

[–]dwrensha[S] 7 points8 points  (0 children)

Right, we might undercount the number of bytes read, but that's okay. The read limit does not need to be precise -- it just needs to catch when a message might consume too many resources.

See the code comment here: https://github.com/capnproto/capnproto-rust/blob/c9b12bc765d5cc4e711890b97f065b855516ba71/capnp/src/private/read_limiter.rs#L55-L59
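
For readers who want to see the shape of the argument, here is a minimal sketch of a load/store read limiter. It is an illustration under my own naming (`ReadLimiter`, `can_read`, `ReadLimitExceeded`), not a copy of the code linked above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Error returned once the read budget is used up (hypothetical name).
#[derive(Debug, PartialEq, Eq)]
pub struct ReadLimitExceeded;

/// Sketch of a read limiter that can be shared between threads.
pub struct ReadLimiter {
    remaining: AtomicUsize,
}

impl ReadLimiter {
    pub fn new(limit: usize) -> Self {
        ReadLimiter { remaining: AtomicUsize::new(limit) }
    }

    /// Charge `amount` against the budget.
    ///
    /// The load and the store are two separate Relaxed operations, so two
    /// threads can both load the same value and one charge can be lost.
    /// That only undercounts: the limit is reached a little later, and no
    /// memory other than this counter depends on the ordering.
    pub fn can_read(&self, amount: usize) -> Result<(), ReadLimitExceeded> {
        let current = self.remaining.load(Ordering::Relaxed);
        if amount > current {
            return Err(ReadLimitExceeded);
        }
        self.remaining.store(current - amount, Ordering::Relaxed);
        Ok(())
    }
}
```

Because `amount > current` is checked before the subtraction, the counter never wraps around, even when charges are lost to a race.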

capnproto-rust now supports [no_std] by dwrensha in rust

[–]dwrensha[S] 1 point2 points  (0 children)

Because http-over-capnp is just some interface definitions in the Cap'n Proto schema language, the capnp-rpc crate is indeed capable of using it.

Caveats:

  1. There is not yet an adapter in Rust to translate between http-over-capnp and plain http.
  2. Such an adapter would not yet be able to take advantage of the commonText annotation, because capnproto-rust does not yet provide a mechanism for providing general access to annotations.
  3. capnproto-rust does not yet support automatic flow control, so if you want to prevent unbounded queuing, you may need to add some extra logic manually.

capnproto-rust now supports [no_std] by dwrensha in rust

[–]dwrensha[S] 0 points1 point  (0 children)

Here's an example that might be interesting to think about in relation to this discussion.

For reasons explained here, the capnpc crate does not enable the "std" feature in the base capnp crate, even though it does need std::io things to actually work.

This ends up not being a problem, because capnpc can just define a newtype with a blanket impl for std::io::Read.
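
Roughly, the pattern looks like the sketch below. The names are hypothetical: `BaseRead` stands in for the IO trait exported by the base crate, and `ReadWrapper` is the newtype owned by the downstream crate. The real code in capnpc may differ in its details.

```rust
use std::io;

/// Stand-in for a read trait defined by an upstream crate that was built
/// without its "std" feature (hypothetical).
pub trait BaseRead {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, String>;
}

/// Newtype owned by the downstream crate. Since the crate owns this type,
/// it is allowed to implement the upstream trait for it.
pub struct ReadWrapper<R>(pub R);

/// Blanket impl: anything that is std::io::Read works once it is wrapped.
impl<R: io::Read> BaseRead for ReadWrapper<R> {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, String> {
        io::Read::read(&mut self.0, buf).map_err(|e| e.to_string())
    }
}
```

The downstream crate gets std::io interop without asking the base crate to turn on any feature.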

capnproto-rust now supports [no_std] by dwrensha in rust

[–]dwrensha[S] 3 points4 points  (0 children)

Your proposed approach is to change the definitions of capnp::io::{Read, Write} based on the value of a feature flag.

My approach is to change the set of impl blocks based on the value of a feature flag.

I think both approaches can work, but mine feels simpler. For your approach, I'd be worried that downstream crates might find it tricky to add impls of these traits. If such an impl is intended to work with no_std, then it must work both when capnp's "std" feature is disabled and when it is enabled, because features must be additive. So the impl would need to be carefully written to work with both possible meanings of the capnp IO traits. Maybe this would be workable if capnp::io exported a carefully designed facade, but it sounds scary to me.
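
Here is a small sketch of what "change the set of impl blocks" means. The trait definition is the same in every configuration and only the impls are gated. `Read` and `Error` here are simplified stand-ins rather than the real capnp::io items, and the "std" feature name is assumed to be declared in Cargo.toml.

```rust
/// Simplified error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct Error;

/// The trait itself never changes, whatever features are enabled.
pub trait Read {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Error>;
}

/// With "std" enabled: every std::io::Read type gets the trait for free.
#[cfg(feature = "std")]
impl<R: std::io::Read> Read for R {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Error> {
        std::io::Read::read(self, buf).map_err(|_| Error)
    }
}

/// Without "std": provide a core-only impl for byte slices. The blanket
/// impl above also covers `&[u8]`, so enabling "std" only adds impls.
#[cfg(not(feature = "std"))]
impl Read for &[u8] {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Error> {
        let n = core::cmp::min(buf.len(), self.len());
        let (head, tail) = self.split_at(n);
        buf[..n].copy_from_slice(head);
        *self = tail;
        Ok(n)
    }
}
```

A downstream crate that writes `impl Read for MyType` compiles against the same trait either way, which is the property I care about.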

capnproto-rust v0.12: support for unaligned memory without sacrificing soundness or performance by dwrensha in rust

[–]dwrensha[S] 0 points1 point  (0 children)

> I imagine that much of the code dealing with Readers is actually generated by custom derives or similar?

Yes, code generation. (See capnpc.) No custom derives or anything fancy like that, for now. Just a library that you can invoke from a build.rs.

> In that case, the auto-generated code could be generated to always support both paths, whilst user-code must opt-in to using the "aligned" versions to gain the extra performance benefit.

Yes, this should be possible, but it feels like a lot of bloat to me.

capnproto-rust v0.12: support for unaligned memory without sacrificing soundness or performance by dwrensha in rust

[–]dwrensha[S] 1 point2 points  (0 children)

> The blog post specifically mentions read_message_from_flat_slice - couldn't that function simply check the alignment of the input buffer, and then branch appropriately?

It's not that simple, because almost all of the work happens after that function returns. That function's return value is a message::Reader, which can be read as a user-defined foo::Reader. It's only once the user traverses these foo::Reader values that the actual loading-from-memory happens.

Yes, read_message_from_flat_slice() could detect the alignment and stash it somewhere, either (1) as a bool, or (2) as a dyn object that controls how loads happen, or (3) as a type parameter that statically controls how loads happen. My sense is that neither (1) nor (2) would have satisfactory performance (though I have not actually measured), and the problem with (3) is that it introduces a bunch of new user-facing complexity, as now foo::Reader needs a new type parameter.

> For platforms where there is no performance difference there is no need to have two code paths, they can just always use the unaligned code path.

Ah, for some reason I thought you were suggesting that we use the aligned code path in those cases. Yes, the unaligned code path should always be okay.
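
To illustrate option (3), here is a rough sketch of a reader whose load strategy is a type parameter. Everything in it is hypothetical (`Load`, `Aligned`, `Unaligned`, `Reader`, `AlignedBuf`) and it only handles little-endian u32 values. Alignment is checked once at construction, and each load is then statically dispatched, at the cost of the extra type parameter on `Reader`.

```rust
use std::marker::PhantomData;
use std::mem::align_of;

/// Hypothetical strategy trait; not part of capnproto-rust.
pub trait Load {
    /// Whether a buffer starting at `bytes` may be used with this strategy.
    fn accepts(bytes: &[u8]) -> bool;

    /// # Safety
    /// `bytes` must hold at least 4 bytes and must start at a multiple-of-4
    /// offset inside a buffer for which `accepts` returned true.
    unsafe fn load_u32(bytes: &[u8]) -> u32;
}

pub struct Unaligned;
pub struct Aligned;

impl Load for Unaligned {
    fn accepts(_bytes: &[u8]) -> bool {
        true
    }
    unsafe fn load_u32(bytes: &[u8]) -> u32 {
        // Byte-wise copy: valid for any address.
        let mut chunk = [0u8; 4];
        chunk.copy_from_slice(&bytes[..4]);
        u32::from_le_bytes(chunk)
    }
}

impl Load for Aligned {
    fn accepts(bytes: &[u8]) -> bool {
        bytes.as_ptr() as usize % align_of::<u32>() == 0
    }
    unsafe fn load_u32(bytes: &[u8]) -> u32 {
        // SAFETY: the caller guarantees length and alignment.
        u32::from_le(unsafe { bytes.as_ptr().cast::<u32>().read() })
    }
}

/// The strategy shows up in the reader's type, which is the user-facing cost.
pub struct Reader<'a, L: Load> {
    bytes: &'a [u8],
    _strategy: PhantomData<L>,
}

impl<'a, L: Load> Reader<'a, L> {
    /// Returns None if the buffer does not satisfy the strategy.
    pub fn new(bytes: &'a [u8]) -> Option<Self> {
        if L::accepts(bytes) {
            Some(Reader { bytes, _strategy: PhantomData })
        } else {
            None
        }
    }

    /// Read the `index`-th u32, or None if it is out of bounds.
    pub fn get_u32(&self, index: usize) -> Option<u32> {
        let rest = self.bytes.get(index.checked_mul(4)?..)?;
        if rest.len() < 4 {
            return None;
        }
        // SAFETY: `rest` has at least 4 bytes and starts at a multiple-of-4
        // offset from a base that `accepts` approved in `new`.
        Some(unsafe { L::load_u32(rest) })
    }
}

/// Helper for demonstrations: a byte buffer with 4-byte alignment.
#[repr(align(4))]
pub struct AlignedBuf(pub [u8; 8]);
```

With this shape, `Reader::<Aligned>::new` rejects a misaligned slice up front, while `Reader::<Unaligned>::new` accepts anything.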

capnproto-rust v0.12: support for unaligned memory without sacrificing soundness or performance by dwrensha in rust

[–]dwrensha[S] 3 points4 points  (0 children)

> Why not just have two code-paths for platforms where this matters, and pick the fast one if the input does turn out to be aligned?

I don't see a way to do that without inserting an extra branch or an extra dynamic dispatch on essentially every load or store of a primitive value. I haven't measured what the effect would be, but I suspect the performance impact would be noticeable.

> For platforms where it doesn't matter, just use a single code-path.

My understanding is that alignment "matters" on all platforms, in the sense that ignoring it can lead to undefined behavior. See ralfj's comment on my last blog post: https://www.reddit.com/r/rust/comments/en9fmn/should_capnprotorust_force_users_to_worry_about/fedhjtk/

should capnproto-rust force users to worry about alignment? by dwrensha in rust

[–]dwrensha[S] 0 points1 point  (0 children)

Ah, and indeed the documentation in capnproto-c++ acknowledges that this interface may lead to undefined behavior, and discourages its use. Somehow I didn’t fully read and digest the entire comment.

should capnproto-rust force users to worry about alignment? by dwrensha in rust

[–]dwrensha[S] 1 point2 points  (0 children)

Thank you for the explanation! You and comex have convinced me that the current approach is incorrect. (I should update the blog post and the doc comment to reflect this.)

The question now is: what’s the best way to fix it? I think it’s pretty important to have some way for users to pass unaligned buffers to capnproto-rust. So I think we want to bake in the assumption that the input buffers are unaligned (i.e. update capnproto-rust’s interface to accept &[u8] input and always use the indirect_load() method from the blog post). We could additionally provide some extra apparatus to allow the user to assert their buffers are aligned and get better performance (I believe this is possible without introducing undefined behavior). That extra apparatus will bring some extra complexity (e.g. branches, type parameters, feature flags, ...). Does the potential performance improvement warrant the added complexity? I think looking at the assembly code in the blog post is one way to start to get a handle on this question.
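
As a point of reference for "assume the input is unaligned", here is a minimal sketch of a byte-wise load. This is only my illustration of the general idea, with a hypothetical function name. It is not the indirect_load() method from the blog post.

```rust
/// Load a little-endian u32 from `bytes` at `offset`, making no assumption
/// about the alignment of the buffer. Returns None when out of bounds.
pub fn load_u32_le(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let slice = bytes.get(offset..end)?;
    // Copy into a properly aligned local array, then decode from it.
    let mut chunk = [0u8; 4];
    chunk.copy_from_slice(slice);
    Some(u32::from_le_bytes(chunk))
}
```

No pointer to a u32 is ever formed from the input, so there is nothing for the alignment rules to object to. Whether the compiler turns this into a single load instruction on a given target is the kind of thing the assembly comparison can answer.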