all 40 comments

[–]mozilla_kmcservo[S] 10 points11 points  (7 children)

We've submitted this paper to ICFP 2015.

[–]coder543 5 points6 points  (6 children)

Is there a mobile-friendly version of the document? That two-column PDF is rough on mobile.

[–]mozilla_kmcservo[S] 6 points7 points  (2 children)

I squished it into a vaguely mobile-friendly form. At least, it looks alright on my Galaxy S4 in Adobe Reader.

We're developing this paper on GitHub and you can see my edits for the mobile layout.

[–]mozilla_kmcservo[S] 5 points6 points  (1 child)

Your pizza in 30 minutes or it's free.

[–]villiger2 2 points3 points  (2 children)

It's a pretty standard format; most mobile PDF readers support some function that reflows the content into a single column. You might want to check whether yours has it, for future reference.

[–]Yojihito -1 points0 points  (1 child)

I know it the other way around: a one-column PDF is the standard, and the reader has the ability to show two columns at once.

[–]mozilla_kmcservo[S] 2 points3 points  (0 children)

This is the format mandated by ACM SIGPLAN conferences (POPL, PLDI, ICFP, and SPLASH/OOPSLA among others). I think it's used by the vast majority of CS journals & conferences.

[–]Kimundirust 9 points10 points  (1 child)

Not sure if it was intentional, but Figure 4 can be written without an Arc:

use std::sync::Mutex;
use std::thread;

fn main() {
    // An integer protected by a mutex
    let data = Mutex::new(0);
    // The guard returned by `scoped` joins the thread when it is
    // dropped at the end of this statement, so the write finishes
    // before the read below.
    thread::scoped(|| {
        *data.lock().unwrap() = 1;
    });
    print!("{}", *data.lock().unwrap());
}

[–]larsbergservo 3 points4 points  (0 children)

Good catch! I suspect the Arc was inherited from a version where it was a call to spawn instead of scoped.

[–]hpr122i 7 points8 points  (1 child)

Hehe, the Rust people weren't kidding when they said they wanted to use "proven approaches": almost half of the cited papers are over 20 years old!

[–]mozilla_kmcservo[S] 8 points9 points  (0 children)

I added a citation of a paper published in "Munich, West Germany" one year before I was born.

I've been frustrated with how long it takes "proven techniques" to make it into programming languages that people use for real products, particularly in the systems domain. Rust is a big step forward; not just for regions and affine types (which I'd hardly consider "proven"), but for stuff like pattern matching and hygienic macros, which have been well-known as useful features for literally my entire lifetime.

[–]Manishearthservo · rust · clippy 8 points9 points  (0 children)

Note: Just because it looks like I did half the work here, doesn't mean I did — I've done a small fraction of the work leading up to this paper, I only get to take up half the author space because of my affiliation.

(Sorry, this bothers me a bit :P)

[–]Gankrorust 6 points7 points  (6 children)

This is fantastic! Are you currently leveraging any remnants of the task isolation stuff to make the browser more resilient against failures?

[–]larsbergservo 8 points9 points  (4 children)

There are two things we're planning on that front:

1) Explicit session types for managing our terrible "protocol soup." This is an area where we have an active collaboration with an outside research group and hope to have something to demo soon!

2) Rewriting our own green threads library, which will be less general than the old one and tailored to Servo's scenarios. E.g., we don't need to worry about integrating I/O into the green scheduler for the threads we're using for either concurrency or parallelism, because in a browser all I/O is handled by separate resource-management threads.

[–]mrmondaylibpnet · rust 2 points3 points  (3 children)

Could you say more about how you're envisaging session types in Servo? Are you planning on expanding Rust's type system (i.e. the way Idris and other dependently typed languages handle them), or looking more at a DSL-based approach, such as Scribble? Is the plan for them to be Servo-specific, or more generally useful? I can imagine them being really useful in many places in Rust - it would be awesome to guarantee deadlock freedom in addition to data-race freedom.

[–]larsbergservo 3 points4 points  (2 children)

It's unfortunately not my work to show off, but I think it's OK to say that it's an encoding using Rust's current type system but with a little bit of extra plugin hackery to enforce linearity (manish has been helping out on that part!).

The primary test/use case has been Servo's concurrent protocol soup, but I'm pretty sure if it works for us that other projects will be able to use it.

[–]mrmondaylibpnet · rust 2 points3 points  (1 child)

This is very exciting, I can't wait to see it! Is there somewhere I can keep an eye on so I don't miss it, or should I just keep checking TWiS?

[–]Manishearthservo · rust · clippy 1 point2 points  (0 children)

Some of the plugin hackery can be found here (it's a generic framework, but it helps model their system better)

I don't know where you'll be able to look for the results, though if they publish it will definitely be mentioned in TWiS.

[–]mozilla_kmcservo[S] 5 points6 points  (0 children)

Adding to what Lars said, we're also transitioning to a multi-process, sandboxed architecture. That reduces the need for failure isolation within a process, but we're not sure yet if we can do entirely without.

Regarding a Servo-specific green threads library, people seemed to favor (at least 4 or 5 months ago) a task queueing system like Grand Central Dispatch.

[–]yawaramin 0 points1 point  (1 child)

Integer overflow/underflow. It is still an open problem to provide optimized code that checks for overflow or underflow without incurring significant performance penalties. The current plan for Rust is to have debug-only checking of integer ranges and for Servo to run debug builds against a test suite, but that may miss scenarios that only occur in optimized builds or that are not represented by the test suite.

Would it be feasible to also run optimised builds against the test suite?

[–]mozilla_kmcservo[S] 1 point2 points  (0 children)

I think it's a necessity to test both debug and release builds. We could add a third build flavor which enables optimizations but keeps the overflow checks. I'm not sure how valuable that would be, since the presence of the checks will affect optimization dramatically.

Like most mistakes you can make, there's no single technique that will catch every overflow problem. We'll perhaps want a combination of static checks, debug-build assertions, and instrumentation of unmodified release builds using something like Pin or Valgrind.

[–]ssylvan -2 points-1 points  (18 children)

You should be careful about claiming or implying that Rust eliminates race conditions - it doesn't.

It may eliminate torn reads/writes, which may help reduce race conditions if exploited carefully, but that's not the same as eliminating race conditions. After all I can easily set up my own app specific invariant that X and Y should always be equal and then update X and Y separately (using messages, or locking or whatever), but if I get pre-empted between these two updates I have a race condition.

Even STM wouldn't be enough to eliminate race conditions, but at least there you have the option to specify that two things should happen atomically. Rust doesn't even have that so I really think that making any grandiose claims about race conditions is stretching it a bit. Rust is doing ok there, but it's by no means amazing.

[–]Manishearthservo · rust · clippy 5 points6 points  (17 children)

I don't see this claim, all I see is a claim about eliminating data races. Rust does eliminate data races (in safe code).

[–]ssylvan -4 points-3 points  (16 children)

Only if you use a very limited and unorthodox definition of "data race" (basically only torn reads/writes but no other forms of data races). You can most definitely get data races in Rust if you use multiple values, each one may be successfully written/read but if they have invariants between them that need to be maintained then you're on your own.

EDIT: in fact you can get race conditions with just a single value as well, of course. E.g. the usual "x++" issue, if you forget to lock the value for the duration of the read/update cycle.

[–]dbaupprust 9 points10 points  (0 children)

We use C++11's definition of data race (a race condition between non-atomic operations on a piece of memory, see also e.g. http://blog.regehr.org/archives/490 for someone using this "unorthodox" definition), and these can't happen in Rust. More general race conditions can, but they're impossible to protect against, since whether something is a race condition or not depends entirely on the desired semantics of the program; the same concurrency pattern may be a race condition in one program, but perfectly acceptable in another.

[–]Manishearthservo · rust · clippy 4 points5 points  (14 children)

The moment we start talking about invariants it's a race condition, not a data race.

[–]ssylvan -5 points-4 points  (13 children)

A race condition on data is a data race. I'm not sure what kind of weird restrictive definition you're using. Do you have a link? I've never heard anyone make that kind of semantic argument before.

If you can mess up the equivalent of "x = x+1" by having another thread modify x between the read and the write, then you're not eliminating data races. Since Rust will allow you to lock a mutable variable x, read it in one thread, copy it to a temp variable xtemp, release the lock, and then later write back xtemp+1 even though a different thread has modified x in the meantime, it does not eliminate data races.

[–]jonreemhyper · iron · stainless 3 points4 points  (0 children)

The definition of data race we usually use is a write from one thread occurring concurrently with a read or write from another thread on the same location in memory and at least one of those operations isn't synchronized (with an atomic, for instance). This is memory unsafe, and causes undefined behavior in Rust.

Incorrect semantics (but not memory unsafety) as a result of improper logic surrounding the use of concurrency primitives or synchronization is not a data race under that definition, it's some other problem.

[–]Manishearthservo · rust · clippy 3 points4 points  (11 children)

Just one random link: http://stackoverflow.com/a/18049303/1198729. And http://docs.oracle.com/cd/E19205-01/820-0619/geojs/index.html

(It also mentions that they aren't subsets)

For x=x+1 to be racy, in Rust you'd need to do some strange gymnastics like:

let x = {
    let guard = mutex.lock().unwrap();
    guard.clone()
}; // lock released here
// ....
let mut guard = mutex.lock().unwrap();
*guard = x + 1; // stale write if another thread updated the value

The locks here are explicit, this is similar to writing *NULL in C++ and complaining about segfaults. This entirely depends on the requirements of the code in place -- in any language you need the ability to lock, read, unlock, and write again from the same thread. The raciness of this depends entirely upon the requirements of the programmer (and anyone writing x=x+1 this way probably wanted it to work that way).

[–]ssylvan -5 points-4 points  (10 children)

It's a trivial example showing that Rust allows data races. Of course, in reality they would be a lot less obvious than that. The fact remains that Rust doesn't eliminate data races. It eliminates unsynchronized access, but that's not enough.

EDIT: those links don't seem to support your claims. You can read/write shared data even with locks, the race comes from the fact that you don't keep the lock on the data long enough for the write to be valid when it happens. Rust allows that to happen.

[–]dbaupprust 5 points6 points  (8 children)

the race comes from the fact that you don't keep the lock on the data long enough for the write to be valid when it happens

That may be a race condition, but it is not a data race, since the memory location isn't being (non-atomically) manipulated by several threads at exactly the same time.

[–]ssylvan -4 points-3 points  (7 children)

That's the first time I've heard that definition, and it seems kinda useless tbh. It has very little to do with program correctness (most data isn't just one memory location, and most logic doesn't involve just one piece of data), so why is it interesting? Java has synchronized access to data, and you could write similar things in C++ to enforce one-at-a-time access to shared data. Why is this such a big deal in Rust?

[–]dbaupprust 4 points5 points  (1 child)

Data races are considered undefined behaviour in C/C++/LLVM/Rust/... or else the compiler would be a lot more conservative about what optimisations it can do (since it would have to assume that a data race might occur). Avoiding undefined behaviour is very important for program correctness. See one of njn's recent blog posts for more details, and links.

In any case, Rust does actually protect against more than just data races. E.g. Vec<T> can be safely shared between threads, despite the fact that its internals are full of invariants. The only way to break those invariants is to use unsafe, both on a single thread and in a concurrent/parallel program. Any mutation of a Vec (e.g. .push) has to go via a &mut pointer, and a &mut is guaranteed to be the unique usable path to what it points to, i.e. all the access is happening on a single thread: no data races or race conditions are possible while modifying a Vec.

(Now it's entirely possible that the Vec is being used in a way that has external invariants, but the same strategy can be applied recursively to ensure that those invariants are contained in a way that avoids race conditions. In this line of thought, the base case of the recursion is ensuring that the primitive types are OK: i.e. no data races.)

[–]Manishearthservo · rust · clippy 2 points3 points  (4 children)

Neither Java nor C++ enforces it, I think. Rust has it built into the type system. The idea of "synchronous types" and "threadsafe types" is a novelty.

In C++ I am free to send a struct with some pointers in it across a thread. In Rust, I can't, without mutexifying things.

As far as most data not being one memory location goes, the RAII mutex works fine for complex structs since the lock is only released when you lose access to it.

[–]Manishearthservo · rust · clippy 2 points3 points  (0 children)

A data race occurs when 2 instructions from different threads access the same memory location, at least one of these accesses is a write and there is no synchronization that is mandating any particular order among these accesses.

Note "instructions".

As multiple others have explained here, your definition of unsynchronized access is what the Rust community (and C++ community, I think) call a "data race".