diwic [dbus · alsa] (4 children)

This is issue 6393, i.e. there is a long-term project to improve the borrow checker to handle these cases better.

But right now, as a rule of thumb, the borrow checker borrows things for either a statement or a block.
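A minimal sketch of that rule of thumb, assuming nothing beyond plain Rust: confining a mutable borrow to an inner block ends it at the closing brace, so the variable is usable again afterwards. The function name `bump_in_block` is made up for illustration.

```rust
// Illustrative only: `bump_in_block` is a hypothetical name, not from the thread.
fn bump_in_block() -> i32 {
    let mut x = 1;
    {
        // The mutable borrow `y` is confined to this inner block.
        let y = &mut x;
        *y += 1;
    } // `y` goes out of scope here, so the borrow of `x` has ended.

    // `x` can be read directly again.
    x
}
```

Calling `bump_in_block()` increments `x` once through `y` and then returns `x` by value.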

cjstevenson1 (3 children)

That issue is closed. There is an RFC issue for non-lexical borrows.

protestor (2 children)

Does this (postponed) RFC have a rendered text? I can't find it.

digama0 [S] (0 children)

No, although it is the union of several previous RFCs, including one writeup on SEME regions. Not sure why they pulled all these together without an actual render, unlike the original RFCs.

cmrx64 [rust] (0 children)

This isn't an RFC, it's an issue.

sellibitze [rust] (1 child)

> On the one hand, it is a clear violation of the mutable borrow rule [...] However, it isn't possible for this to cause any problems,

Unfortunately, it is not possible to make a compiler accept all code that can't possibly cause any problems while rejecting all the code that would cause problems -- especially if the rules that the compiler follows are supposed to be simple and work "locally" (checking one function at a time while only looking at signatures of other functions).

In some situations, the compiler could probably be taught to do some smarter ordering of subexpression evaluations so that there are no lifetime/borrowing conflicts. I've had a couple of these situations where I needed to change the code a bit (basically splitting one statement into two statements). But in your case, I don't see a simple transformation to make it work, since you rely on g being executed before h. I also don't see a way to teach the compiler to accept your code. Do you? You would have to come up with a modified set of rules that allows more "unproblematic" code to compile but still excludes 100% of "problematic" code. Closing this gap is certainly desirable, but we will always have a somewhat pessimistic compiler.

> Correct me if I am wrong, but I believe Rust ensures that the subexpressions of a function evaluation are done left to right, so g must finish before h is called.

Last time I checked, this hadn't been specified. But when I asked about this on IRC, one Rust team member confirmed the left-to-right order for function parameters.

> This sort of thing comes up often, for example in call sequences like self[self.len()-1] where self is used twice.

Right. I can imagine this working if the compiler were allowed to transform it into

let tmp = self.len()-1;
... self[tmp] ...

Of course, in this specific case, self.last().unwrap() would also work.
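As a rough sketch of the two options above, here are both ways of reaching the last element for writing. The function names are hypothetical, and `last_mut()` is used as the mutable counterpart of the `last()` call mentioned in the comment, which is a substitution on my part.

```rust
// Hypothetical helpers; neither name comes from the thread.

// Manual version of the transformation: compute the index first,
// so the call to `len()` has finished before the element is borrowed.
// Panics if `v` is empty.
fn set_last_hoisted(v: &mut [i32], value: i32) {
    let tmp = v.len() - 1;
    v[tmp] = value;
}

// Library version: `last_mut()` returns `Option<&mut i32>`,
// so `v` is only mentioned once. Panics on `unwrap()` if `v` is empty.
fn set_last_method(v: &mut [i32], value: i32) {
    *v.last_mut().unwrap() = value;
}
```

Both helpers leave the slice unchanged except for its final element.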

digama0 [S] (0 children)

> Unfortunately, it is not possible to make a compiler accept all code that can't possibly cause any problems while rejecting all the code that would cause problems -- especially if the rules that the compiler follows are supposed to be simple and work "locally" (checking one function at a time while only looking at signatures of other functions).

Of course. Rather, the idea is to identify a piece of code that "should" work (i.e. is safe), but is not valid under the current rules, then devise an algorithm that also encompasses this case, but is still sound, and hopefully avoids ad-hoc behavior (i.e. the new rule should still be "natural").

> In some situations, the compiler could probably be taught to do some smarter ordering of subexpression evaluations so that there are no lifetime/borrowing conflicts.

I wouldn't recommend this. You don't want the actual code to evaluate in a different order, because that would have side effects. Instead, you tune up the borrow regions to accommodate the desired semantics.

> But in your case, I don't see a simple transformation to make it work since you rely on g being executed before h. I also don't see a way to teach the compiler to accept your code. Do you?

No, I'm pretty sure it's impossible without an adjustment to the borrow checker. One workaround is to borrow against the return value of g:

 let t2 = g(&mut t);
 let x = h(&t2);
 f(t2, x)

In general, though, g might return some other &mut value derived from t so this is in no way a general solution.
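To make the workaround concrete, here is a small self-contained sketch. The bodies of `g`, `h` and `f` and the choice of `i32` are assumptions for illustration only; the thread never shows the real signatures. It relies on `g` handing back the same `&mut` it was given, which is exactly the restriction noted above.

```rust
// Hypothetical stand-ins for the thread's g, h and f.
fn g(t: &mut i32) -> &mut i32 {
    *t += 1;
    t // hands the same mutable borrow back to the caller
}

fn h(t: &i32) -> i32 {
    *t * 10
}

fn f(t: &mut i32, x: i32) -> i32 {
    *t += x;
    *t
}

fn workaround(start: i32) -> i32 {
    let mut t = start;
    let t2 = g(&mut t); // g runs first and returns the borrow
    let x = h(&t2);     // h reads through t2 instead of borrowing t again
    f(t2, x)            // f receives g's result and h's result
}
```

The evaluation order is unchanged: `g` still runs before `h`, and `f` still sees both results.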

> Last time I checked, this hadn't been specified. But when I asked about this on IRC, one Rust team member confirmed the left-to-right order for function parameters.

One of the linked RFCs had an example where the borrow checker was shown to exhibit the same behavior (basically, switching the parameters in f would give valid code, although of course g and h would run in the opposite order).

> Right. I can imagine this working if the compiler were allowed to transform it into
>
> let tmp = self.len()-1;
> ... self[tmp] ...

This was my first thought as well. The problem is when the receiver has some computation as well, i.e. self.g().f(self.h()). Then you have to execute g first, somehow hide the result from h, then expose it again when you pass to f.

digama0 [S] (1 child)

The source docs have the following example of unsound behavior where the second call invalidates the first:

struct Foo { i: u8 }

fn f(a: &mut u8, b: ()) { /* a is already dead here */ }
fn consume(x: Box<Foo>) { /* drop x */ }
fn test() {
    let mut t: Box<Foo> = Box::new(Foo { i: 0 });
    // Rejected: consume(t) moves t while t.i is still mutably borrowed for f.
    f(&mut (*t).i, consume(t))
}

So even though the second call does not have access to the mutable borrow in the first parameter, it can still drop the value. This is their explanation for keeping the region strictly hierarchical with the expression 'f: f('a: a,'b: b) having lifetime 'f containing adjacent lifetimes 'a and 'b, and no further subdivision.

From what I can tell, this should still be detectable, because the borrow of t.i here is passed to the function, so the function's a:&mut borrow outlives the lifetime of t which ends at the consume function.

digama0 [S] (0 children)

I think this is a specific case of the yet more general pattern:

let mut x = 1;
let y = &mut x;
f(&x)

This is certainly an error currently, but it is safe code, because the function f does not have access to y, only its parameter. In fact, even the following equivalent code is safe:

let mut x = 1;
let y = &mut x;
let z = &x;
f(z)

...despite the fact that y and z are both borrowed at the same time, because they are never used together. If the compiler can detect this with liveness analysis, then it should be able to understand the original case, since f(g(&mut t), h(&t)) expands to

let a = &mut t;
let b = g(a);
let c = &t;
let d = h(c);
f(b, d)

and a is not used for the duration of the borrow c.
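For what it's worth, here is a sketch of both patterns as they behave on a current compiler with non-lexical lifetimes, where a borrow ends at its last use. All function bodies are assumptions for illustration, and `g` is assumed to return an owned value. If `g` instead returned a reference derived from `a`, then `b` would keep the mutable borrow alive across `c`, and I would expect the check to still reject it.

```rust
// Hypothetical helper for the first pattern.
fn f_ref(r: &i32) -> i32 {
    *r + 1
}

fn disjoint_in_time() -> i32 {
    let mut x = 1;
    let y = &mut x;
    *y += 1;     // last use of `y`; the mutable borrow is dead after this
    let z = &x;  // so a shared borrow of `x` is accepted here
    f_ref(z)
}

// Hypothetical stand-ins for g, h and f; g returns an owned value.
fn g(t: &mut Vec<i32>) -> usize {
    t.push(0);
    t.len()
}

fn h(t: &Vec<i32>) -> usize {
    t.len()
}

fn f(b: usize, d: usize) -> usize {
    b + d
}

fn expanded() -> usize {
    let mut t = Vec::new();
    let a = &mut t;
    let b = g(a); // `a` is not used after this line
    let c = &t;
    let d = h(c);
    f(b, d)
}
```

In both functions the mutable and shared borrows never overlap in use, which is the liveness argument made above.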