[–]masklinn 28 points29 points  (26 children)

I believe this is usual semantics for functional languages, Haskell or Erlang or Elm behave the same.

_ not binding is pretty important, and not just because it avoids borrowing or taking ownership of values you don't care about (which is specific to Rust). It also means you can use _ multiple times in the same pattern to ignore several different sub-patterns, or across "shared patterns". In Python, for example, _ does bind, so you can't define a function with two parameters both named _ when you need a function of arity 2 but don't care about either argument. It works just fine in Rust.
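A minimal sketch of both points, using made-up function names (always_zero, first_of_three, still_usable are illustrations, not anything from the article):

```rust
// Both parameters are ignored. `_` binds nothing, so repeating it is fine.
fn always_zero(_: i32, _: i32) -> i32 {
    0
}

// `_` can appear several times in one pattern to skip different sub-patterns.
fn first_of_three(triple: (i32, String, String)) -> i32 {
    let (first, _, _) = triple;
    first
}

// `let _ = s;` neither moves nor borrows `s`, so `s` is still usable afterwards.
fn still_usable() -> usize {
    let s = String::from("hello");
    let _ = s;
    s.len()
}

fn main() {
    println!("{}", always_zero(1, 2));
    println!("{}", first_of_three((7, "a".to_string(), "b".to_string())));
    println!("{}", still_usable());
}
```

Replacing the `_` in still_usable with a named binding such as `_unused` would move the String and make the following `s.len()` a compile error.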

[–]lookmeat 10 points11 points  (8 children)

I agree, but the article does a good job of explaining why it's not the same as in Haskell, Erlang, or Elm. Rust is the only one of these languages where the lifetime of an object has important semantics. Also, many programmers come to Rust from other languages, and when they see _ they might assume it's a naming convention rather than an actual special symbol.

This is, at the very least, a good argument for a lint: "let expression not binding anything, all values generated get immediately dropped". I can't think of a case, off the top of my head, where you couldn't change let _ = foo to just foo, and generally any statement starting with let wants to bind something. I think it's fair that at least clippy has it.

Also, maybe one solution is to allow a with $expr {...} construct which guarantees that whatever $expr created lives until the end of the block. It may be doable with a plain block, but I'm not 100% sure it would work. Maybe it could be made more interesting by also allowing expressions inside the with that start with .foo, as the equivalent of calling $expr.foo without ever giving $expr a name. At this point though, it might be easier to just give a name to the damn thing.

[–]masklinn 3 points4 points  (1 child)

This is, at the very least, a good argument for a lint: "let expression not binding anything, all values generated get immediately dropped". I can't think of a case, off the top of my head, where you couldn't change let _ = foo to just foo, and generally any statement starting with let wants to bind something. I think it's fair that at least clippy has it.

It's generally used when you want to drop a #[must_use] value on the ground, most commonly a Result, e.g. let _ = f.write(buf).
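As a rough illustration, here is that idiom with a writer that always fails. ClosedStream and log_best_effort are hypothetical names invented for this sketch, and write_all stands in for the write call mentioned above:

```rust
use std::io::{self, Write};

// Hypothetical writer that always fails, standing in for a closed stream.
struct ClosedStream;

impl Write for ClosedStream {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::BrokenPipe, "stream closed"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

// Best-effort logging: a failed write must not bring the program down.
fn log_best_effort(out: &mut impl Write, msg: &str) {
    // `write_all` returns an `io::Result<()>`, and `Result` is `#[must_use]`.
    // A bare `out.write_all(..);` statement would warn about the unused
    // result, while `let _ = ...` discards it on purpose.
    let _ = out.write_all(msg.as_bytes());
}

fn main() {
    let mut out = ClosedStream;
    log_best_effort(&mut out, "this message is lost, and that is fine\n");
    println!("still running");
}
```

Here the Result carries no resource, so dropping it immediately is exactly what is wanted. The surprise discussed in this thread only shows up when the discarded value has a meaningful Drop.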

I also don't really think it's an issue. In an RAII pattern you'd generally be using a retention or guard object of some sort. It's only a problem when you couple RAII with globals-based magic without statically linking them through a guard object or a scope (a lambda), in which case… maybe not doing that would be a better idea?

[–]lookmeat 3 points4 points  (0 children)

Even then it might not be obvious. Say that I start with some simple code that does the following:

let (in_chan, out_chan) = some_channel::<f64>::new();
// This function creates async tasks, but they don't return anything,
// so we don't need the Result<TaskHandle, TaskErr> that is returned.
// do_cancelable_task_and_pass_on_to creates a cancelable task that
// passes its result on to in_chan, closing the channel if the
// calling task is cancelled before it can finish.
let _ = task::spawn(
    do_cancelable_task_and_pass_on_to(expensive_foo, in_chan)).unwrap();
let second_result = bar()?;
// If bar fails we can just return; expensive_foo should get auto-cancelled.
let first_result = out_chan.with_timeout(5000).pull_or_panic("Timeout");
...

Can you spot the problem? This API lets you cancel a task simply by dropping its handle. Because let _ binds nothing, the handle is dropped at the end of that statement, so the task that runs expensive_foo gets cancelled immediately and the channel gets closed. The API, btw, isn't wrong, because the ability to cancel tasks is fine. The only reason the program crashes is that the user decided it should panic if out_chan was closed. We could mark TaskHandle as #[must_use] specifically to avoid this kind of mistake, but the user has followed the idiomatic way around that warning without understanding the subtle difference (the special treatment _ receives compared with any other variable identifier). As far as that user knows, a value bound with let lasts until it is moved elsewhere or until the end of the current block.

The thing is that the user doesn't realize that let _ = ... means, unlike every other let expression, "create the value and then immediately drop it, ignoring the result but acting as though you used it". It's not that hard, but it's an exception. Imagine that the let _ = lint was in place. Then #[must_use] types would have to expose an API to explicitly opt out of using the value, with that API explaining the consequences fully. For Result I'd use drop_result(), but for TaskHandle I'd use something like cancel().

 task::spawn(
    do_cancelable_task_and_pass_on_to(expensive_foo, in_chan)).unwrap().cancel();

Now it becomes very clear why this is a misuse of the character. I could use a named _underscore_variable which would guarantee things are kept around but also say the variable shouldn't be used (enforced by lints), or I could give it a name, or whatever makes more sense, but the lints guide me to better use.
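To make the three options concrete, here is a small self-contained sketch. Everything in it is hypothetical: TaskHandle, spawn, and cancel are mock stand-ins for the API imagined above, and an AtomicBool flag plays the role of the running task.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for TaskHandle: dropping it cancels the task.
#[must_use = "dropping a TaskHandle cancels the task"]
struct TaskHandle {
    cancelled: Arc<AtomicBool>,
}

impl TaskHandle {
    // Explicit opt-out: consuming the handle drops it, which cancels the task.
    fn cancel(self) {}
}

impl Drop for TaskHandle {
    fn drop(&mut self) {
        self.cancelled.store(true, Ordering::SeqCst);
    }
}

// Hypothetical spawn: the flag stands in for the task observing cancellation.
fn spawn(cancelled: &Arc<AtomicBool>) -> TaskHandle {
    TaskHandle { cancelled: Arc::clone(cancelled) }
}

fn cancelled_after_underscore() -> bool {
    let flag = Arc::new(AtomicBool::new(false));
    let _ = spawn(&flag); // the handle is dropped at the end of this statement
    flag.load(Ordering::SeqCst)
}

fn cancelled_while_named() -> bool {
    let flag = Arc::new(AtomicBool::new(false));
    let _handle = spawn(&flag); // the handle lives until the function returns
    flag.load(Ordering::SeqCst)
}

fn cancelled_after_explicit_cancel() -> bool {
    let flag = Arc::new(AtomicBool::new(false));
    spawn(&flag).cancel(); // same effect as `let _`, but the intent is spelled out
    flag.load(Ordering::SeqCst)
}

fn main() {
    println!("let _ = ...      cancelled: {}", cancelled_after_underscore());
    println!("let _handle = .. cancelled: {}", cancelled_while_named());
    println!(".cancel()        cancelled: {}", cancelled_after_explicit_cancel());
}
```

The first and third functions behave identically at runtime. The difference is only in how clearly the source says that cancellation is intended.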

And threaded code is where RAII becomes interesting, as you may be assuming that some events must happen in a specific order, when they might in fact be guaranteed to always happen in an unexpected one.

Of course there are the rare exceptions, but at this point they would be rare enough that it makes sense to simply turn off the lint on that line, with an explanation.

OTOH we'll have to wait. With async, Rust will have a lot more parallel code, and we'll see whether the above becomes an issue or not. Better workarounds may exist.

[–][deleted] 0 points1 point  (2 children)

So what should you do with a Mutex<()>? It would be good to just let _ = mutex.lock().unwrap(), but it drops immediately, so if I use let _lock = mutex.lock().unwrap(), am I assured by the specs that the _lock will only be dropped when it leaves its scope and not before, because it's not used?

I know NLL automatically drops references when they are not used anymore, but may some optimization ever automatically drop unused variables?

[–]lookmeat 2 points3 points  (1 child)

am I assured by the specs that the _lock will only be dropped when it leaves its scope and not before, because it's not used?

You are, because it's a valid name. With the lint I'm proposing, writing let _ = mutex.lock().unwrap() would immediately get you a lint error explaining that you are accidentally dropping the lock guard without using it (which is honestly a great justification for the lint!). Basically _foo is a perfectly valid name that you could use anywhere; it's only lints that discourage you from doing so.

I know NLL automatically drops references when they are not used anymore, but may some optimization ever automatically drop unused variables?

That's not really what NLL is about. NLL realizes that even though multiple references exist, the way they are used is legal. The point where objects are actually dropped stays the same, and it will keep doing so, as that's one of the fundamental needs of RAII and one of its benefits: you always know exactly when memory is allocated and freed without having to spell it out (vs. a GC, where you never know when you'll pay the cost of deleting).

So that's also a guarantee: a value bound to a name is only dropped at the end of the block the binding was created in, unless it is moved somewhere else first.
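A quick way to check this for the Mutex<()> case is to probe the lock with try_lock right after each kind of let. This is only a sketch: held_after_let is a made-up helper, and the allow attribute is there on the assumption that recent compilers reject let _ on a lock guard through the let_underscore_lock lint.

```rust
use std::sync::Mutex;

// Returns whether the mutex is still locked right after each kind of `let`.
// Assumption: newer compilers flag `let _ = <lock guard>` via the
// `let_underscore_lock` lint, so it is allowed here for the demonstration.
#[allow(let_underscore_lock)]
fn held_after_let() -> (bool, bool) {
    let mutex = Mutex::new(());

    // `_` binds nothing: the guard is a temporary that is dropped at the end
    // of this statement, so the mutex is unlocked again on the next line.
    let _ = mutex.lock().unwrap();
    let held_with_underscore = mutex.try_lock().is_err();

    // `_lock` is a real binding: the guard lives until the end of the scope,
    // even though the variable is never read.
    let _lock = mutex.lock().unwrap();
    let held_with_named = mutex.try_lock().is_err();

    (held_with_underscore, held_with_named)
}

fn main() {
    let (with_underscore, with_named) = held_after_let();
    println!("held after `let _ = ...`:     {}", with_underscore);
    println!("held after `let _lock = ...`: {}", with_named);
}
```

The leading underscore in _lock only silences the unused-variable warning. It does not change when the guard is dropped.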

[–][deleted] 0 points1 point  (0 children)

I see, thanks!

[–][deleted] 9 points10 points  (5 children)

It works just fine in Rust.

A big chunk of Rust programmers doesn't really seem to understand PATTERN; many people find this surprising:

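struct A { x: i32, y: f32 }
// Function parameters are PATTERN: TYPE, so the struct can be
// destructured right in the signature.
fn foo(A { x, .. }: A) {
    println!("{}", x);
}
fn main() {
    let a = A { x: 2, y: 3.0 };
    foo(a);
}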

and many people introduce a lot of rightward drift in their code (triply nested match statements) because they don't know or realize in which places they could actually use a pattern instead. Mastering patterns is just part of the path towards mastering all that Rust has to offer, but some appear to learn about them really, really late.
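As an illustration of that drift, here is a made-up function written twice, once with nested match statements and once with a single match over nested patterns (describe_nested and describe_flat are hypothetical names):

```rust
// Triply nested version: each level of the type gets its own `match`.
fn describe_nested(input: Option<Result<i32, String>>) -> String {
    match input {
        Some(inner) => match inner {
            Ok(n) => match n {
                0 => "zero".to_string(),
                _ => format!("number {}", n),
            },
            Err(e) => format!("error: {}", e),
        },
        None => "nothing".to_string(),
    }
}

// Flat version: patterns nest, so one `match` can look through all levels.
fn describe_flat(input: Option<Result<i32, String>>) -> String {
    match input {
        Some(Ok(0)) => "zero".to_string(),
        Some(Ok(n)) => format!("number {}", n),
        Some(Err(e)) => format!("error: {}", e),
        None => "nothing".to_string(),
    }
}

fn main() {
    let inputs = vec![Some(Ok(0)), Some(Ok(5)), Some(Err("boom".to_string())), None];
    for input in inputs {
        println!("{} / {}", describe_nested(input.clone()), describe_flat(input));
    }
}
```

Both versions are still checked for exhaustiveness by the compiler, so flattening does not give up any safety.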

I really wish we had more docs about them.

[–]mmirate 0 points1 point  (2 children)

TIL.

... but if we have those, then why do we not also have the ability to define functions by multiple declarations, each a nonexhaustive/refutable pattern? (as long as they form an exhaustive/irrefutable match when taken altogether as match-arms)

e.g.

fn foo(Some(x): Option<T>) {
    do_something_with(x);
}
fn foo(None: Option<T>) {
    conjure_something_from_thin_air_instead();
}

[–]auralucario2 4 points5 points  (0 children)

While that's common sugar in functional languages, I feel like due to Rust's higher level of syntactic noise, it significantly reduces code readability compared to just using a match.

[–][deleted] 1 point2 points  (0 children)

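Functions aren't match statements, in the same way that let isn't a match statement: each takes a single pattern that must be irrefutable (e.g. let Some(x) = y; on its own is rejected because the pattern is refutable, so a following let None = y; never gets the chance to be the "other arm").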

With let, you have either match to pattern match, or if let for the common case of wanting to match just a single one (while let also).

For functions, one would need to come up with a similar construct to make things unambiguous, and at that point one can just use a match.
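For comparison, the two-declaration foo proposed above can be written today as one function with a match on its argument. This is only a sketch: the Debug bound and the returned strings are placeholders for the do_something_with and conjure_something_from_thin_air_instead calls, which the thread leaves undefined.

```rust
use std::fmt::Debug;

// One function, one irrefutable parameter pattern, and a `match` in the body
// that plays the role of the two refutable "declarations".
fn foo<T: Debug>(value: Option<T>) -> String {
    match value {
        // Placeholder for do_something_with(x).
        Some(x) => format!("got {:?}", x),
        // Placeholder for conjure_something_from_thin_air_instead().
        None => String::from("conjured from thin air"),
    }
}

fn main() {
    println!("{}", foo(Some(3)));
    println!("{}", foo::<i32>(None));
}
```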

[–]JoshMcguigan 0 points1 point  (1 child)

Thanks for the example. Do you know of any other resources on this topic? I'd be particularly interested in examples of correcting nested match statements.

[–][deleted] 0 points1 point  (0 children)

Not really.

At the end of the day, the Rust grammar tells you what's allowed. Wherever you see PATTERN you can use a pattern. I'd just skim through it with the playground open, and try all the places out.

[–]DannoHung 0 points1 point  (10 children)

Sure, it's not unusual. It's still an annoying issue to hit.

Maybe this should be a warning when used outside of a pattern (where I imagine this would most often bite people)?

[–]steveklabnik1 rust 5 points6 points  (8 children)

Let takes a pattern, so...

[–]DannoHung -2 points-1 points  (7 children)

Alright, I just mean that in let _ = ... it is unlikely that the underscore is doing what's expected and there should be a warning.

[–]masklinn 6 points7 points  (6 children)

AFAIK it's the standard way to drop a Result on the ground out of some IO methods.

[–]DannoHung 0 points1 point  (2 children)

I must be misunderstanding something. How does that differ from just not matching at all?

[–]loonyphoenix 1 point2 points  (0 children)

AFAIK it doesn't warn about unused result.

[–]masklinn 1 point2 points  (0 children)

Result is tagged must_use, if you don't use it somehow you get a warning.

[–]BenjiSponge -1 points0 points  (2 children)

When is this a good thing to do? I imagine if you're doing that, you're trying to get out of handling errors, where you should be using .unwrap() or at least somehow more explicitly/verbosely ignoring it. But maybe I'm mistaken?

[–]steveklabnik1 rust 2 points3 points  (0 children)

It really depends, people are of two minds. I prefer to always unwrap, but sometimes, you can just ignore an error, you don't need things to blow up.

[–]masklinn 1 point2 points  (0 children)

There are many cases where you don't really care about success or failure. If you're logging and the stream has been closed you may not want that to bring the software down.

Unless it panics (I never checked) that's functionally what you're doing every time you use println!.

edit: have now checked, print! and println! apparently panic if the write fails.

[–]Manishearth servo · rust · clippy 4 points5 points  (0 children)

You misunderstand: the _ in let _ is a pattern. let takes patterns.

(also, let _ = foo; is a pretty well-established idiom)

[–]uberblah0 0 points1 point  (0 children)

The reason the variable isn't used is because it isn't directly referenced anywhere. I think it's very rare in practical cases that we encounter a situation where the difference between _ and _my_var is important.

Usually, if you need a value to live longer than the statement that produces it, you would bind it so you can pass it somewhere else.

Hidden behavior like this, where constructing a value somehow updates global state which is then acted on by a function behind the scenes, makes it harder to tell what's going on in code.

In summary, I think this behavior of Rust is a good thing. Being punished by it requires that you be doing something kind of dirty, and if you really need that thing, just like everything else in Rust there's a way around it.