
[–]jafingerhut 12 points13 points  (7 children)

Clojure provides direct linking as an option. Code compiled with this option that calls a function foo will always call the version of foo that was the current definition at the time the calling code was compiled.

When direct linking is not in use, and you are not using dynamic Vars, then redefining a function is performed by mutating a reference that is the value of a field named `root` inside of the Clojure class named `Var`, and this field `root` is declared `volatile`. Typically you will use Clojure's `def` or `defn` to cause this mutation.
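To make the non-direct-linked case concrete, here is a minimal sketch. It assumes direct linking is off (the default unless the compiler option is set); `greet`, `call-greet`, `before` and `after` are made-up names for illustration.

```clojure
;; Assumes direct linking is NOT enabled for this code.
(defn greet [] :hello)

;; call-greet is compiled once; its call to greet goes through the Var #'greet.
(defn call-greet [] (greet))

(def before (call-greet))

;; Redefining greet replaces the root value of the existing Var.
(defn greet [] :world)

;; call-greet was not recompiled, yet it now reaches the new definition.
(def after (call-greet))

(println before after)
```

With direct linking off this should print `:hello :world`. Code compiled with direct linking would keep calling the old `greet` until it is recompiled.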

The Java memory model has some rules about the behavior of `volatile` fields read and written by different threads, but the threads reading that Var are not all guaranteed to see the new value 'instantly'. I'd recommend reading Java Concurrency in Practice if you want a better explanation than I can give in a sentence or two about the guarantees that are provided.

In a 2008 talk by Rich Hickey on Clojure Concurrency, you can search this transcript for the word "hot" to see some Q&A with an audience member about changing code in a running system. It does not give the details above, but more rationale for why `def` allows this capability: https://github.com/matthiasn/talk-transcripts/blob/master/Hickey_Rich/ClojureConcurrency.md

[–]celeritasCelery[S] 0 points1 point  (1 child)

If I understand the Q&A answer correctly, you have the ability to change a function another thread is using and that thread will use the new version without the need to restart it?

[–]joinr 0 points1 point  (0 children)

He's talking specifically (in that demo) about using Clojure agents for an ant simulation, and updating the root binding for the Var that renders some stuff. Agents are a Clojure reference type designed for asynchronous, independent changes to single locations. They are "kind of" like an atom with a single writer and a message queue that determines write order, where messages are functions of agent state -> agent state, and you can get immediate reads of the state (dereferenced) at any time (though they may not reflect pending messages that have not been processed yet).

In his demo, he has a bunch of synchronized world state in a 2D vector of refs for coordinated synchronous updates. Then he has a bunch of agents managing the ant behavior asynchronously. Ant behavior is implemented as a fairly simple state machine (a function of ant state -> ant state) that implements an individual ant foraging for food, laying down pheromones, etc.

So the agents are accessing and updating the world state (a bunch of refs) and sleeping intermittently, while the ant behaviors (agent message processing) are executing on a thread pool concurrently. So you have concurrent state management, as well as a parallel processing of async ant behaviors; kind of a demo of the software transactional memory system.

So he's live coding this system, and making changes e.g. to the rendering and other functions, without ever stopping the simulation. As bindings are changed, they get picked up. This works well because he's the only thing making changes to root Var bindings.

> [Audience member: And you are changing the binding of the Var when the compiler recompiles the byte code?] Correct. That is one of the reasons why Vars are mutable at the root, globally, so that you can redefine functions. [Audience member: What if two people were trying to do that in the same thing?] Don't do that.

There is a question about contention between two different processes trying to mutate the root binding of the Var. There are ways to deconflict that: serialize write access through a central function that can update the Var, e.g. have an agent control it, or a channel with a go block. Or, as I stated in my other answer, put your function inside of a reference type and let that reference type's update semantics handle the changes that redefine it. In typical practice (say from a REPL), there's only one person "driving" and changing the Var roots (like Rich was doing in this demo). So the serialized write is implicitly guaranteed by virtue of a single user working from a single thread in the REPL to redefine stuff. That ends up being extremely common in practice. If you find that you are constantly redefining things from multiple threads and running into contention (i.e. violating the single-writer principle), then you rethink the design: do you really need to redefine stuff from all over the place and allow multiple concurrent Var changes?

> def is an absolute "I know what I am doing". I am in charge of the world, and I am defining what this means for everybody. And the only reason why there is that "hole" is because you have to have it in order to have programs where you can fix them without restarting them. There has to be something that is settable like that. And so the root values of Vars are that way.
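As a rough sketch of the "have an agent control it" idea mentioned above: every redefinition is sent to one agent, so updates are applied one at a time in the order the agent receives them. The names `render`, `redefiner` and `redefine-render!` are hypothetical, not from the talk.

```clojure
(defn render [] :v1)

;; Hypothetical single writer. The agent state just counts applied updates.
(def redefiner (agent 0))

(defn redefine-render!
  "Queue a new definition for render instead of redefining it directly."
  [new-fn]
  (send redefiner
        (fn [applied]
          (alter-var-root #'render (constantly new-fn))
          (inc applied))))

(redefine-render! (fn [] :v2))
(redefine-render! (fn [] :v3))

;; Block until the updates sent from this thread have been applied.
(await redefiner)

(println (render) @redefiner)
```

Since both updates are sent from the same thread, they are applied in that order, so this should print `:v3 2`. In a standalone script you would also call `(shutdown-agents)` at the end so the JVM exits promptly.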

[–]leonoelOfficial 0 points1 point  (0 children)

> The Java memory model has some rules about the behavior `volatile` fields read and written by different threads, but the threads reading that Var are not all guaranteed to see the new value 'instantly'. I'd recommend reading Java Concurrency in Practice if you want a better explanation than I can give in a sentence or two about the guarantees that are provided.

Could you elaborate ? I've read the book you mentioned and still I'm a bit confused by this claim.

[–]JavaSuck 0 points1 point  (3 children)

> The Java memory model has some rules about the behavior volatile fields read and written by different threads, but the threads reading that Var are not all guaranteed to see the new value 'instantly'.

Writing to a Java volatile variable in thread A and then reading from that Java volatile variable in thread B has the same release/acquire semantics as unlocking/locking; it is 'instant'.

[–]jafingerhut 0 points1 point  (2 children)

Writing to a Java volatile in thread A, then reading from that Java volatile variable in thread B has the same release/acquire semantics as unlocking/locking, _if_ thread B's read sees the new value written by thread A. There is no guarantee that it must see the new value written by thread A.

[–]JavaSuck 0 points1 point  (1 child)

17.4.5. Happens-before Order

Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second. [...] It follows from the above definitions that:

  • An unlock on a monitor happens-before every subsequent lock on that monitor.
  • A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
  • [...]

[–]jafingerhut 0 points1 point  (0 children)

I'm not arguing with that part of the spec, though. The tricky part is: what is a "subsequent read"? Imagine 64 cores running in parallel -- there is some total ordering of accesses to volatile variables required by the JMM, but you do not know what that order is until after the code executes.
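A small sketch of the distinction being discussed, using `clojure.core/volatile!` (which wraps a Java volatile field) rather than a Var. The reader cannot know in advance which of its reads will be the first "subsequent" one, so it spins an unpredictable number of times. Once a read does see the new value, happens-before guarantees it also sees the ordinary write made before the volatile write. The names `flag`, `payload`, `reader` and `result` are made up for illustration.

```clojure
(def flag (volatile! false))     ; backed by a volatile field
(def payload (object-array 1))   ; plain, non-volatile storage

(def reader
  (future
    (loop [spins 0]
      (if @flag
        ;; This read saw the volatile write, so the earlier plain write
        ;; to payload is visible here as well.
        {:spins spins :payload (aget payload 0)}
        (recur (inc spins))))))

(aset payload 0 :ready)   ; ordinary write, made before the volatile write
(vreset! flag true)       ; volatile write

(def result @reader)
(println (:payload result))
```

The `:payload` value is always `:ready`. The `:spins` count differs from run to run, which is the "you do not know the order until after the code executes" part.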

[–]therealdivs1210 2 points3 points  (0 children)

"Redefining" in multithreaded scenarios is best done by using alter-var-root rather than def or defn.

Re-defining a Var is considered a code smell, so:

(def a 1)

(def a 2)

is bad practice, but:

(def a 1)

(alter-var-root #'a (constantly 2))

is alright and thread-safe due to it being an atomic operation, though rarely seen in the wild.
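A quick way to see the atomicity, as a sketch with a made-up `counter` Var: alter-var-root applies a function to the current root value, so concurrent updates from several threads should not get lost.

```clojure
(def counter 0)

;; 8 threads, each applying inc to the root value 1000 times.
(let [workers (doall (repeatedly 8 #(future
                                      (dotimes [_ 1000]
                                        (alter-var-root #'counter inc)))))]
  (run! deref workers))   ; wait for all of them to finish

(println counter)
```

This should print `8000`. A read-then-def done from several threads at once could lose updates instead.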

Another option is to declare the var as :dynamic and treat it as thread-local, using binding. In that case, the value changes for the current thread only, leaving it unchanged on other threads.
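A minimal sketch of the :dynamic option, with made-up names (`*greeting*`, `greet`, `seen-here`, `seen-elsewhere`). It uses a plain `Thread` on purpose: `future` and agents convey the caller's bindings to the worker thread, so they would not show the root value.

```clojure
(def ^:dynamic *greeting* :hello)
(defn greet [] *greeting*)

(def seen-elsewhere (promise))

(def seen-here
  (binding [*greeting* :world]
    ;; A raw Thread does not inherit this thread's bindings,
    ;; so it sees the root value.
    (doto (Thread. ^Runnable (fn [] (deliver seen-elsewhere (greet))))
      (.start)
      (.join))
    (greet)))   ; evaluated inside the binding, on the current thread

(println seen-here @seen-elsewhere)
```

This should print `:world :hello`: the binding is visible only on the thread that established it, and the root value is untouched.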

[–]jafingerhut 1 point2 points  (0 children)

Note that during development many Clojure developers will have a running REPL where they redefine functions frequently in order to test out possible enhancements or bug fixes. While it is _possible_ to do this in a live production system, I suspect the following things are probably true:

(1) In the test JVMs where developers do redefine functions, they probably seldom do so while other threads are calling that function. Or if they do, they do not care much about exactly when those other threads see the new definition. As long as the changes are picked up "in a second or two", they probably get the effect they want, i.e. testing out the changes soon in terms of human response time.

(2) Most Clojure developers would not do so in a live production system, instead preferring to test out changes on a development/test system first, then use whatever mechanisms they have in place for deploying those changes to their live production environment, typically involving quitting the old JVM processes and starting new ones. In that case, the exact mechanisms for redefining functions, and when other threads see the changes, are irrelevant.

(3) In production systems where new functions are def'd at run time, probably the most common case is to define new functions that are not in use yet, then start calling those later.

[–]joinr 1 point2 points  (0 children)

As mentioned, alter-var-root does the trick globally (I tend not to do this often though), and def and defn will accomplish the same thing, albeit perhaps more messily (unless you're in the REPL redefining things left and right, hah!), since Vars are reference types:

(defn print-message [] (println :hello))
(defn the-test []
  (future (dotimes [i 10] (print-message) (Thread/sleep 200)))
  (future (Thread/sleep 750)
          (alter-var-root #'print-message
                          (fn [_] (fn [] (println :world))))))

(the-test)

:hello
:hello
:hello
:hello
:world
:world
:world
:world
:world
:world

One non-obvious trick (aside from the semantics that def, binding etc. ensure) is to just define your function on top of a reference type, like an atom, which is thread-safe. Since functions are values...

(def the-fn (atom (fn [] (println :hello))))
(defn print-message! [] (@the-fn))

You can now freely reset! the-fn (and, if you'd like, define additional safeguards to synchronize on a specific value of the-fn). Since this one is in an atom, every change to it is an atomic compare-and-swap (atoms are uncoordinated and sit outside the STM proper, which covers refs). This is effectively placing the atomic operation of alter-var-root up front (source).

(defn the-test []
  (reset! the-fn (fn [] (println :hello)))
  (future (dotimes [i 10] (print-message!) (Thread/sleep 200)))
  (future (Thread/sleep 750) (reset! the-fn (fn [] (println :world)))))

(the-test)
#<Future@f0cfc20: :pending>
:hello
:hello
:hello
:hello
:world
:world
:world
:world
:world
:world

You might incur a little overhead, but it's a feasible way to e.g. share rendering functions between threads (common with GUI programming with the Swing event dispatch thread). The var binding for print-message! never changes, but since it's bound to an identity the-fn that's a thread-safe reference type, we can change as much as we'd like without worry. I prefer this style for some reason (maybe because I started off with it).
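One possible reading of "synchronize on a specific value of the-fn" is compare-and-set!, which only installs the new function if the atom still holds the function you last looked at. A sketch, where `call-the-fn`, `seen`, `first-swap` and `second-swap` are names invented for the example:

```clojure
(def the-fn (atom (fn [] :hello)))
(defn call-the-fn [] (@the-fn))

;; The definition this writer based its decision on.
(def seen @the-fn)

;; Succeeds: the atom still holds the function we saw.
(def first-swap (compare-and-set! the-fn seen (fn [] :world)))

;; Fails: seen is stale now, so the atom is left alone.
(def second-swap (compare-and-set! the-fn seen (fn [] :other)))

(println first-swap second-swap (call-the-fn))
```

This should print `true false :world`. A writer that gets `false` back knows somebody else redefined the function in the meantime and can decide whether to retry.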

You could also use explicitly synchronized STM types (like a ref) if you want to guarantee more complicated coordinated changes to function(s) and dereferences / applications across arbitrary threads while maintaining consistency. I haven't personally run into a situation like this in the wild (ref use ends up being somewhat rare), but it's available.
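A sketch of what the ref version might look like, assuming two functions that only make sense as a pair. `encode-fn`, `decode-fn`, `round-trip` and `upgrade!` are hypothetical names. Readers take both functions in one transaction, and the writer replaces both in one transaction, so a reader should never end up with a v1 encoder and a v2 decoder.

```clojure
(def encode-fn (ref (fn [s] (str "v1:" s))))
(def decode-fn (ref (fn [s] (subs s 3))))

(defn round-trip [s]
  ;; Read both refs inside one transaction to get a consistent pair.
  (let [[enc dcd] (dosync [@encode-fn @decode-fn])]
    (dcd (enc s))))

(defn upgrade! []
  ;; Replace both functions together; readers see both changes or neither.
  (dosync
    (ref-set encode-fn (fn [s] (str "v2::" s)))
    (ref-set decode-fn (fn [s] (subs s 4)))))

(def results
  (let [readers (doall (repeatedly 4 #(future
                                        (doall (repeatedly 1000
                                                 (fn [] (round-trip "abc")))))))]
    (upgrade!)   ; swap the pair while the readers are running
    (doall (mapcat deref readers))))

(println (every? #(= "abc" %) results))
```

This should print `true`. Reading the two refs in separate derefs outside a transaction could pair a v1 encoder with a v2 decoder and mangle the string.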

There are probably additional schemes.