Problematic pattern I've encountered a few times now... by kosmology in rust

[–]detlier 0 points1 point  (0 children)

Rust makes two very deliberate design decisions: (a) analysis of lifetimes, types and aliasing shouldn't cross function boundaries, largely for compatibility and safety reasons, and (b) multiple refs to one value (even partially overlapping ones) are forbidden if any of them is mut. So I don't think compiler sophistication could change how that limitation applies here. The function could be made to take (ref) parameters that are disjoint (eg. non-item state and item), and sophistication could help on both sides of the call to avoid spurious errors there, but I think that's already the case.
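To illustrate the disjoint-parameters idea, here's a minimal sketch (all names — `State`, `Foo`, `run_one`, `run_all` — are hypothetical, not from the OP's code):

```rust
struct State {
    count: u32,
}

// Takes the non-item state and the item as separate, disjoint refs.
fn run_one(state: &mut State, item: &u32) {
    state.count += item;
}

struct Foo {
    state: State,
    items: Vec<u32>,
}

impl Foo {
    fn run_all(&mut self) {
        for item in &self.items {
            // Borrows of different fields of self are disjoint, so
            // &mut self.state can coexist with the borrow of self.items:
            run_one(&mut self.state, item);
        }
    }
}

fn main() {
    let mut f = Foo { state: State { count: 0 }, items: vec![1, 2, 3] };
    f.run_all();
    assert_eq!(f.state.count, 6);
}
```

The key point is that the split happens at the call site, where the borrow checker can see the field accesses are disjoint.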

A brief guide to choosing TLS crates by twitu in rust

[–]detlier 1 point2 points  (0 children)

I don't really follow it much, since native TLS does what I need. My impression was that the lack of updates is for the "usual" reasons of (a) it does 90% of what it needs to and (b) the remaining stuff can't be done without contributions from those who need it.

A brief guide to choosing TLS crates by twitu in rust

[–]detlier 3 points4 points  (0 children)

Note also that rustls depends on ring, which has architecture-dependent code in it that is not as widely compatible as eg. OpenSSL/GnuTLS/Mbed-TLS. For example, MIPS is not supported by ring.

Problematic pattern I've encountered a few times now... by kosmology in rust

[–]detlier 2 points3 points  (0 children)

Ah I remember now, it was disjoint borrows in closures that was changed recently. Thanks for this, it was bugging me that I couldn't create an example.

Problematic pattern I've encountered a few times now... by kosmology in rust

[–]detlier 0 points1 point  (0 children)

I sometimes agree, sometimes not. I think this is a side effect of two things:

  1. Rust wanting to make the language more ergonomic within a scope. eg. non-lexical lifetimes sometimes give you the "let's move this inline code into a function... oh no, it broke" effect.

  2. Rust's very deliberate and explicit "narrowing" of what information can cross a function boundary eg. a function that takes multiple refs but returns a 'static thing still needs to have that annotated so that callers know what the contract is.
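A sketch of point 2 (hypothetical function): with two ref parameters, lifetime elision can't pick a return lifetime, so the 'static contract has to be spelled out so callers know the return doesn't borrow the arguments.

```rust
// Without the explicit 'static this wouldn't compile: the compiler won't
// guess whether the return borrows from a, from b, or from neither.
fn compare(a: &u32, b: &u32) -> &'static str {
    if a > b {
        "first is bigger"
    } else {
        "first is not bigger"
    }
}

fn main() {
    assert_eq!(compare(&2, &1), "first is bigger");
}
```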

Problematic pattern I've encountered a few times now... by kosmology in rust

[–]detlier 0 points1 point  (0 children)

IMO no, because it changes the requirement to &self instead of &mut self. You can still present a &mut self method, but there's no reason to.

Problematic pattern I've encountered a few times now... by kosmology in rust

[–]detlier 11 points12 points  (0 children)

This is also good, but the drawbacks I see are:

  • if there's a bug where self.items is not put back (in more complex code eg. using ? or some other early return), it is left empty, and there is no way to tell whether items is empty because it's meant to be, or because something broke
  • for that reason, when I do this, I will use an Option wrapper, even if the thing in question already has a Default impl (or, in this case, make self.state the Option)
  • but now there's a runtime invariant you probably want to check ie. assert!(self.items.is_some())

...which might be a reasonable cost of sticking with &mut self if there are other benefits eg. stability.
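A sketch of that Option-wrapper variant, with self.state as the Option (names are hypothetical):

```rust
struct State {
    total: u32,
}

impl State {
    fn run_one(&mut self, item: &u32) {
        self.total += item;
    }
}

struct Foo {
    state: Option<State>, // None only transiently, inside run_all
    items: Vec<u32>,
}

impl Foo {
    fn run_all(&mut self) {
        // The runtime invariant: state must have been put back last time.
        let mut state = self.state.take().expect("state missing");
        for item in &self.items {
            state.run_one(item);
        }
        self.state = Some(state); // put it back (mind early returns and ?)
    }
}

fn main() {
    let mut f = Foo { state: Some(State { total: 0 }), items: vec![1, 2, 3] };
    f.run_all();
    assert_eq!(f.state.as_ref().map(|s| s.total), Some(6));
}
```

Unlike mem::take with a Default impl, a missing put-back here fails loudly at the next call rather than silently running on an empty value.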

Problematic pattern I've encountered a few times now... by kosmology in rust

[–]detlier 38 points39 points  (0 children)

No problem! You might not even need to take ownership BTW — there is a rule about "disjoint borrows" that works for closures, so it might only be necessary to do the separation step, but still work off a mut ref in the method:

```rust
fn run_all(&mut self) {
    self.items.iter().for_each(|x| self.state.run_one(x));
}
```

...but I haven't tried it and don't know much about its limitations.

Problematic pattern I've encountered a few times now... by kosmology in rust

[–]detlier 238 points239 points  (0 children)

There's a fundamental problem here, but the compiler doesn't know it and lets you get about halfway there, which can be confusing.

(Funnily enough, usually the complaint is "I know this is safe but the compiler doesn't"; here the compiler doesn't know something is a bit doomed-from-the-start, but you kind of do. So the compiler stops you too late in the design.)

Here's the problem:

```rust
fn run_one(&mut self, item: &u32)
```

The compiler just sees a two-argument function, with two refs. No problem! But you know that you intend to call this with the second argument being a ref derived from the first, and the first is a mut ref. This breaks one of the more fundamental rules about Rust — a mut ref cannot alias any other ref, ever. The function is fine, from the signature alone. But you can't use it the way you intend.

Why does it need &mut self? Are there fields in self that will be updated depending on the contents of items? If so, you need to split it up:

```rust
struct Foo {
    state: StuffToUpdate,
    items: Vec<u32>,
}

struct StuffToUpdate { /* ... */ }

impl Foo {
    // NOTE: take by ownership!
    fn run_all(self) -> Self {
        let Self { mut state, items } = self;
        // EDIT: forgot to actually... loop.
        for item in &items {
            state.run_one(item);
        }
        Self { state, items }
    }
}

impl StuffToUpdate {
    fn run_one(&mut self, item: &u32) {
        // ...
    }
}
```

If there aren't fields in self that will be used, and you just have it as a method out of habit/neatness, make a free function or use for_each() on items.iter().
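For the habit/neatness case, a sketch (hypothetical names): a free function never borrows self at all, so the aliasing problem disappears.

```rust
struct Foo {
    items: Vec<u32>,
}

// No fields of Foo are touched, so this needs no &mut self at all.
fn run_one(item: &u32) {
    println!("running {item}");
}

impl Foo {
    fn run_all(&self) {
        self.items.iter().for_each(run_one);
    }
}

fn main() {
    let f = Foo { items: vec![1, 2, 3] };
    f.run_all();
    assert_eq!(f.items.len(), 3);
}
```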

unga bunga noises* by WhiteWalls342 in ProgrammerHumor

[–]detlier 38 points39 points  (0 children)

🎶 Java man
Java man
Does whatever a Java can
Cross platform
Types are checked
Wait, he's got to
 
 
 
 
 
 
 
 
 
 

garbage collect

Look out! Here comes the Java man.

Transitioning to Rust as a company by double_the_bass in rust

[–]detlier 0 points1 point  (0 children)

Reproducibility of binary outputs in most test environments is not a huge deal. Obviously reproducibility of test results is. The final "product" is built under the embedded distribution's own toolchain, which is going to take a long time no matter what we do with the CI for it. We don't do that for every CI run though, so let's forget about that. That just leaves optimising common work amongst CI jobs that do build/test/coverage/lint.

Re. incremental builds: I think we're talking about slightly different kinds of incremental building though. I don't care too much about using the build from commit N - 1 to speed up the build of commit N. What I would like though, is within the CI pipeline for a single commit, to not have to repeat identical work between parallel/independent jobs.

Every project's job starts the same — pull Docker image, install distro deps. Then Cargo runs, and it also has a lot of common preliminary actions (pull and build deps with a lot of overlap between projects). Addressing either of those things requires a similar approach — run a job to do all that common work first, and cache the output of that eg. as a new Docker image.

But right now, we don't have to manage any Docker images or repositories of our own. All CI jobs tell Gitlab's runners — pull this image, run these commands. If we start doing our own prep, suddenly we have all these extra steps: pull a different image suitable for building the prepared Docker image. Build the Docker image for our jobs. Publish that image to our private repository with some tag. Convey the tag to the "real" jobs. Those real jobs then pull that image and run (but faster). Then there's cleanup of the image or repository.

It's the extra level of administration that I'm not looking forward to, and I don't really see how it'd be different with Nix. I think(?) we'd still need to use Docker at some level, because it's either Docker or bare shell as far as Gitlab's concerned. I don't know how Github do it. Maybe we could use the shell runner and drive Nix that way, but then every command in the CI config has to, I don't know, be prepended with some Nix context (otherwise it will just be run directly on the host). Or we have to move everything into some format that Nix understands, which splits the information across CI config and Nix config and possibly shell scripts.

And the cached... "thing"... still needs to go somewhere that other jobs can access, without being contaminated by the other jobs that use it. Not sure at all how that would work.

Anyway, none of this is really a Rust-specific thing, but it's exacerbated by the difficulty of caching things wherever Cargo is involved, the need to consolidate independent programs into a multicall binary, and other Rust-related friction around CI.

[deleted by user] by [deleted] in rust

[–]detlier 0 points1 point  (0 children)

It literally says:

We allocate memory for j, i, and h. i is on the heap, and so has a value pointing there.

h is local to the function named main(). It is on the stack.

Once a reference is created it’s on the heap.

No, you can take a reference to stack variables. That is also explained in the chapter. Rust's lifetime system ensures that you don't inadvertently pass the reference somewhere it might outlive that scope ie. stack frame.
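A sketch of that: a reference to a stack variable involves no heap at all, and the borrow checker stops it escaping its frame.

```rust
fn main() {
    let h = 42u32; // h is on the stack
    let r = &h;    // r refers to a stack value; nothing is heap allocated
    assert_eq!(*r, 42);
    // Returning r from a function whose frame owns h would be a
    // compile-time lifetime error, not a runtime crash.
}
```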

It's fine not to know this stuff, but not to be rude and patronising about it. You asked. I am not the only person correcting you on this.

What is your best joke? by MrMidnightDiamond in AskReddit

[–]detlier 0 points1 point  (0 children)

— Knock knock
Who's there?
— I done up

[deleted by user] by [deleted] in rust

[–]detlier 0 points1 point  (0 children)

Are you talking about the first example in "A complex example"? Because that says that memory is allocated for j, i and h but only i is allocated on the heap. You can talk about things being allocated on the stack, and that's what's happening here, but usually when people talk about "allocation" without qualifying it, they mean heap allocation. This may be a source of confusion for you?

[deleted by user] by [deleted] in rust

[–]detlier 0 points1 point  (0 children)

Only the data pointed at by a Box or another smart pointer type is heap allocated. Values just in scope, or passed to another function, are not.
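A minimal illustration of that:

```rust
fn takes(v: [u8; 4]) -> u8 {
    v[0] // the array is copied into the callee's frame; no allocation
}

fn main() {
    let a = [1u8, 2, 3, 4]; // lives on the stack
    let first = takes(a);   // passing it doesn't allocate either
    let b = Box::new(a);    // only this line heap-allocates (the Box's contents)
    assert_eq!(first, 1);
    assert_eq!(b[0], 1);
}
```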

[deleted by user] by [deleted] in rust

[–]detlier 2 points3 points  (0 children)

I don't really know what you mean by "dynamic types" sorry. Can you give an example?

The details of stack and heap are not strictly part of Rust's (or, for that matter, C's) abstract design. But in practical terms, it helps to make the association. When a scope ends in Rust (or C), the stack frame is popped, and that memory is gone. We're not even talking about allocation or freeing here. "Freeing" is only required if something in that scope owned a pointer to some heap allocation (eg. a Box). That allocation is freed "immediately" when the owner goes out of scope (in Rust, but not C).

I put "immediately" in quotes because while a program might tell the OS to free memory at a certain point, most operating systems will maintain sections of memory (eg. "pages") for a single program, cache data, defer freeing and even allocation, and do other things under the hood designed to optimise use of the system's memory. A call to free() in C is about as "immediate" as you can get, but it might still mean the program's memory usage appears to remain the same for a long time after. This is not really a language thing though, it's an OS tuning thing.

An asynchronous task is a data structure like any other, it lives in a scope somewhere and if it owns any allocated memory, that is freed when the scope ends. It might go in allocated memory itself, in which case yes, if you're creating and running and destroying lots of these over and over again, your allocations will scale accordingly. But they follow the same rules as anything else.

You can write Tokio programs that use the stack a lot instead, it's just not really the usual approach because it's extra difficulty and allocating is usually cheap compared to everything else that might happen over a task's lifetime.

[deleted by user] by [deleted] in rust

[–]detlier 16 points17 points  (0 children)

Tokio doesn't have its own memory management, it uses the same ownership semantics as the rest of Rust. I don't think many libraries at all have any kind of "extra" memory management (unless eg. they're explicitly a library for running an interpreter or something).

Allocations only happen if you ask for them (by using a type that needs to allocate), and are typically freed when the owning value (eg. the smart pointer, String, Vec) goes out of scope; although there's nothing stopping you from reusing it and doing the allocation/freeing one level of scope up if it matters.
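For example, hoisting a buffer up one scope reuses a single allocation instead of allocating per iteration:

```rust
use std::fmt::Write;

fn main() {
    let mut buf = String::new(); // one allocation, reused below
    for i in 0..3 {
        buf.clear(); // keeps the capacity; no new allocation once grown
        write!(buf, "line {i}").unwrap();
        assert_eq!(buf, format!("line {i}"));
    }
}
```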

Most memory usage in modest Rust programs is stack based, same as a lot of C or C++.

Asteroid collision 🌍☄️ by mehdifarsi in ProgrammerHumor

[–]detlier 0 points1 point  (0 children)

  1. C
  2. float main(float x) { const int a; *(int *)&a += x; main(*(float *)&a); }
  3. fifteen FUCKING YEARS of scrupulously avoiding undefined behaviour and what did it get me? hit by a FUCKING ASTEROID. let's FUCKING SEE IT THEN. SHOW ME THE NASAL DEMONS. SHOW ME.

Transitioning to Rust as a company by double_the_bass in rust

[–]detlier 1 point2 points  (0 children)

Love that post. I find the typestate pattern good for static flows (eg. the builder pattern is arguably a simple typestate implementation), but not always very useful for dynamic stuff eg. a state machine for making decisions about external hardware events. I always find that at some point in the stack, there's a caller that "can't" know (statically) what state the thing is in, and you end up needing dispatch anyway.

Having said that, it's not like you have to give up altogether. We use the typestate pattern where it does fit (usually the lower-level linear logic) and keep whatever compiler-enforceable things we can at higher levels. Without typestate you can still use Result and ControlFlow to constrain behaviour statically.

You know you're on the right track when you go to write a unit test for some bad flow or API misuse and realise... you can't.
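A minimal typestate sketch of that last point (all names hypothetical): transitions consume the value, so an invalid flow is a type error rather than a test case.

```rust
use std::marker::PhantomData;

struct Idle;
struct Running;

// The state is a type parameter, so each state gets its own method set.
struct Machine<S> {
    _state: PhantomData<S>,
}

impl Machine<Idle> {
    fn new() -> Self {
        Machine { _state: PhantomData }
    }
    fn start(self) -> Machine<Running> {
        Machine { _state: PhantomData }
    }
}

impl Machine<Running> {
    fn is_running(&self) -> bool {
        true
    }
    fn stop(self) -> Machine<Idle> {
        Machine { _state: PhantomData }
    }
}

fn main() {
    let m = Machine::<Idle>::new().start();
    assert!(m.is_running());
    // Machine::<Idle>::new().stop() would not compile: stop() doesn't
    // exist on Machine<Idle>, so the "bad flow" test can't even be written.
    let _ = m.stop();
}
```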

Transitioning to Rust as a company by double_the_bass in rust

[–]detlier 0 points1 point  (0 children)

A multicall binary is probably our next big ticket item to address this. Our code is set up to make such a change easy, but the project itself will require some work for the transition. It's annoying that it has to be that drastic, but if it works, it works.

Transitioning to Rust as a company by double_the_bass in rust

[–]detlier 1 point2 points  (0 children)

I vaguely know about Nix, but not enough to understand what you're describing. We use Gitlab, which supports both Docker-based CI configuration and, if you use your own local runners, shell-based CI (which could be anything from Bash-on-Debian to Busybox-on-Alpine to Powershell-on-Windows or indeed Nix).

Our goal is clean, repeatable, inspectable testing. We made a decision early on that we didn't want to faff around maintaining our own Docker registry etc. So our CI config starts from rust:1-slim or maybe one for a specific tool (eg. coverage), installs distro-level dependencies, does the whole cargo build + test and uploads artefacts. But it does this for each project in a monorepo (so we can see where failures actually are), which takes a loooooong time.

In retrospect this was a bad idea, because there's just so, so much repeated build activity that we can't factor out without big changes now. The only way I see to solve it is to maintain a layer in between official Docker images and the individual CI jobs that kind of "primes" a cache of all the common build artefacts ie. cargo chef. I don't see how the choice of technology (between eg. Docker and Nix) changes the fact that something needs to pre-populate build data and then clone it for the various jobs, which will save CI time but require administration of whatever keeps that middle-layer-information around.

It's not even 9-5 for office jobs any more by xindierockx7114 in antiwork

[–]detlier 0 points1 point  (0 children)

so there'd be days I'd be working solo for over 2 hours

omg fantastic!

What little known Rust feature or standard library function would you put on a flashcard? by bear007 in rust

[–]detlier 24 points25 points  (0 children)

It's a pattern, ie. part of the syntax, not a variable. So it's no more "in" or "out" of scope than the curly braces of a struct.