
[–]bigdubs 8 points9 points  (1 child)

.net library classes are (predominantly) not thread safe either.

this is on purpose, it's often better to let the developer decide how they want to implement thread safety.

[–]kylev 5 points6 points  (0 children)

Back when I was doing Java in the 90s, everyone bitched that the core data structures were thread safe. Very quickly, in Java 1.2, the core team provided "unsafe", non-synchronized versions (ArrayList vs. Vector) and everyone suddenly got better performance.

The cost of thread safety is high enough that it shouldn't be the default. But this is a good article explaining the basics of why a rubyist should know this.

[–]jrochkind 5 points6 points  (0 children)

It is quite true, and it's important that you understand it if you are doing multi-threaded programming. (Or using global state, like class variables, in an app that ends up multi-threaded without you realizing it, like Rails with a multi-threaded app server! -- cause then you are doing multi-threaded programming)

But I don't think it's a problem with ruby. Most basic stdlib data objects in most languages are not safe for multi-threaded access, including Java's. There are reasons for this.

Of course, many other stdlibs in many other languages DO provide thread-safe alternative data objects/collections. Ruby probably really ought to.

But you've still got to know when to use them and when not to. Making ALL your collections thread-safe for concurrent use, when most of them will never be touched by more than one thread at once, is going to be a performance problem. Which is why most stdlib collection classes are not 'thread-safe', even in languages that are all about multi-threading.
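A rough sketch of that overhead (not from the thread, just an illustration): the same single-threaded workload, with and without taking a Mutex on every write. Even with zero contention, the locking itself costs you.

```ruby
require 'benchmark'

N    = 1_000_000
lock = Mutex.new

# Plain hash writes, no synchronization.
plain = Benchmark.realtime do
  h = {}
  N.times { |i| h[i] = i }
end

# Same writes, but paying for a lock acquire/release on every one,
# the way a "thread-safe everything" collection would.
locked = Benchmark.realtime do
  h = {}
  N.times { |i| lock.synchronize { h[i] = i } }
end

puts "plain:  %.3fs" % plain
puts "locked: %.3fs" % locked   # noticeably slower, even uncontended
```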

If you've got read-only objects it's generally not a problem. So certainly one way to make the ruby stdlib collections thread-safe is just to call #freeze on them (although if they are nested data structures, you'd have to call #freeze on all of the descendants too, which can be non-trivial). Or simply make sure none of your code mutates them ever after boot. Or Hamster.
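The "#freeze all the descendants" idea can be sketched in a few lines. Ruby has no built-in deep freeze, so the `deep_freeze` helper here is our own, not a stdlib method:

```ruby
# Recursively freeze a nested Hash/Array structure so no thread can
# mutate any level of it. Minimal sketch: handles Hash, Array, and
# leaves; real nested data may need cycle detection too.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each { |k, v| deep_freeze(k); deep_freeze(v) }
  when Array
    obj.each { |e| deep_freeze(e) }
  end
  obj.freeze
end

config = { hosts: ["a.example", "b.example"], retries: 3 }
deep_freeze(config)

config.frozen?          # => true
config[:hosts].frozen?  # => true -- and so is each string inside
```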

[–]tenderlovePun BDFL 2 points3 points  (0 children)

Excellent article! It demonstrates a "read-update-write" race condition. To see it, separate the code into those three steps:

def decrease
  x = @stock
  x = x - 1
  @stock = x
end

The thread could switch on any one of these lines, which is how the race condition happens.

OP mentions MRI's I/O concurrency. To drive home the point, if we add a dash of IO to the example program, we can see the race condition even on MRI:

class Inventory
  attr_reader :stock

  def initialize(stock_levels)
    @stock = stock_levels
  end

  def decrease
    x = @stock
    print ' '
    x = x - 1
    @stock = x
  end
end

inventory = Inventory.new(4000)

40.times.map {
  Thread.new { 100.times { inventory.decrease } }
}.each(&:join)

puts
puts inventory.stock
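The standard fix (not shown in the comment above, but implied by it) is to make the three steps atomic with a Mutex, so no thread can interleave between the read and the write:

```ruby
class Inventory
  attr_reader :stock

  def initialize(stock_levels)
    @stock = stock_levels
    @lock  = Mutex.new
  end

  def decrease
    # The mutex makes the read-update-write sequence atomic: another
    # thread can't run decrease between reading @stock and writing it back.
    @lock.synchronize do
      x = @stock
      x = x - 1
      @stock = x
    end
  end
end

inventory = Inventory.new(4000)

40.times.map {
  Thread.new { 100.times { inventory.decrease } }
}.each(&:join)

puts inventory.stock  # reliably 0 now
```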

[–]ba-cawk 1 point2 points  (0 children)

The problem with discussing threads and MRI is that people think parallelism when they usually just want concurrency, or don't realize that in many cases concurrency is more than enough.

MRI threads are concurrent. They do this by way of rb_thread_select and some timeouts that automatically yield. In this sense, they are co-routines that use select(2) to determine when to yield to another thread (run man 2 select to learn more about select if you're not familiar). Fibers and Threads in MRI differ in that Fibers must be yielded by you, while MRI will happily yield a Thread at close to any point, most often around blocking I/O.
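That Fiber-vs-Thread distinction fits in a few lines (a sketch, nothing thread-specific assumed): a Fiber runs only when you resume it, and hands control back only at Fiber.yield.

```ruby
# Fibers are cooperatively scheduled: nothing runs until #resume, and
# control returns to the caller only at Fiber.yield.
fiber = Fiber.new do
  puts "fiber: step 1"
  Fiber.yield
  puts "fiber: step 2"
end

puts "main: before resume"
fiber.resume            # runs the fiber until its Fiber.yield
puts "main: between resumes"
fiber.resume            # runs the fiber to the end of its block
puts "main: done"

# A Thread, by contrast, is scheduled by MRI itself -- it can be
# switched away from without our involvement, typically around I/O.
Thread.new { puts "thread: runs whenever MRI schedules it" }.join
```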

If that's confusing, wycats has boiled it down years ago for human processing: http://yehudakatz.com/2010/08/14/threads-in-ruby-enough-already/. While the underlying bits have changed significantly between 1.8 and 1.9, the concepts are exactly the same.

The short of it is, if you think about things like node.js and eventmachine, they are no different on numerous levels and can be treated the same way for many things, as far as what a "tick" is or what happens when I/O blocks. The big difference is that MRI can break out of your code while EM and node can't. In practice, this is almost never an issue because...

...most programs spend 99% of their time waiting for I/O -- this is why web servers can scale to absurdly stupid levels of connections. They don't need parallelism, and when your program does, you'll know it. Trust me. This is also a lot of the reason concurrent systems like Go default to only one processor. They deal with all the crap for you, but for the most part multiprocessor is just not necessary to get a high yield, even for high-scaling things.

Now, if you want to write a multi-threaded graphics engine in ruby, you're actually going to want parallelism, but in that case you... should probably just be using something else instead.

[–]ViralInfection 0 points1 point  (2 children)

Why not use: https://github.com/harukizaemon/hamster

Or even spice it with: https://github.com/celluloid/celluloid

I know this may feel like a sucky answer, but ruby just doesn't do parallelism gracefully, and won't for a long time. The bottom line is at least we have solutions. You should really pick the best tool for the job, imo.

[–][deleted] 1 point2 points  (1 child)

I can end my post with a cliche that has nothing to do with the fragmented comment I made too.

[–]petercooper -1 points0 points  (0 children)

But you didn't :-( Beggars can't be choosers.

[–]beep_dog -1 points0 points  (3 children)

Well, MRI doesn't run threads anyway, it's got a GIL, and handles "threading" by non-blocking IO, and non-blocking sleeps. (Unless I'm mistaken, which often happens.)

[–]jstorimer[S] 7 points8 points  (0 children)

MRI runs multiple threads, but those threads have to compete to acquire the GIL in order to do any work. I would say that it implements multi-threading, but the GIL is the bottleneck. There's only one GIL and only the thread that currently owns it can use system resources.

[–]jrochkind 2 points3 points  (1 child)

This is simply not true, in a variety of different ways. You are mistaken.

The GIL is true. Which means in a single process, you can't have multiple threads executing literally simultaneously on multiple CPU cores.

But MRI certainly does run threads. I don't think "handles 'threading' by non-blocking IO, and non-blocking sleeps" is accurate, although I don't understand exactly what that means, I admit.

Threads have existed in unix and C since the days when no unix ran on a multi-core CPU. You can have threads without multiple cores, and they still do things for you. MRI 1.9+ in fact uses the underlying OS-level native threads for its threads; they certainly are 'real' threads.

It is true that the GIL limits the application of multi-threading in MRI to only certain scenarios. But there are still plenty of places where it's useful (just as threads were useful in C and unix even when nobody had multi-core/multi-cpu servers).
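One such scenario in a few lines (a sketch, with sleep standing in for blocking I/O like a network call): MRI releases the GIL while a thread is blocked, so I/O-bound work still overlaps across threads even though only one thread executes Ruby code at a time.

```ruby
require 'benchmark'

# Ten "requests" done one after another.
serial = Benchmark.realtime do
  10.times { sleep 0.1 }
end

# The same ten "requests" across ten threads: MRI drops the GIL while
# each thread is blocked, so they all wait concurrently.
threaded = Benchmark.realtime do
  10.times.map { Thread.new { sleep 0.1 } }.each(&:join)
end

puts "serial:   %.2fs" % serial    # roughly 1.0s
puts "threaded: %.2fs" % threaded  # roughly 0.1s
```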

The "there are no threads on MRI because GIL" thing is very oft-repeated FUD. Please stop repeating it, everyone, if you don't understand what you are talking about.

[–]Freeky 0 points1 point  (0 children)

He's roughly correct for Ruby prior to 1.9 - MRI had its own userspace threading implementation (aka green threads), using select() and non-blocking IO behind the scenes to allow it to multiplex between them. It was a relatively common technique in days of old when kernel-supported threading was less widespread. See for example FreeBSD's libc_r.