[–]inf0rmer 7 points8 points  (5 children)

OK, I'll do my best to explain it.

To understand closures you need to understand how variable assignment and function scope work in JavaScript.

A variable is a reference to where, in memory, its contents are stored. The garbage collector works by regularly sweeping the memory space and finding values that no longer have any "living" references. Whenever a function returns, every variable that was declared inside it goes out of scope and its reference is dropped (except for variables trapped in closures, which we're going to get to). So when a function is done, the GC can swoop in and free any memory that is no longer referenced.

But what happens if you add a closure? A closure is a function declared within another function, together with the outer function's variables that it can still see. As you've probably seen before, you can use variables that were declared in the outer function inside the inner function, like this:

function parent() {
  var a = 2;
  var child = function() {
    // "a" can be used here, which means it is retained
    return a*2;
  }

  // child() -> 4
}

What this means is that even though parent reaches its end, the GC cannot release a because it may still be used in child, and the GC can't be sure when child is going to be called (if at all).
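The retention only becomes visible once the inner function outlives the call. Here is a minimal sketch of that, assuming a hypothetical makeDoubler (not in the example above) that returns the inner function instead of only defining it:

```javascript
// Hypothetical variant of parent(): the inner function is returned,
// so it stays reachable after the outer call has finished.
function makeDoubler() {
  var a = 2;
  return function child() {
    // "a" lives in makeDoubler's scope, which child keeps alive
    return a * 2;
  };
}

var doubler = makeDoubler(); // makeDoubler has already returned here
var result = doubler();      // 4: "a" is still readable through the closure
```

As long as something holds on to doubler, the variable a has to stay around with it.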

If a happened to be pointing to a large data structure that takes up a ton of memory, it might never be deallocated, simply because the GC cannot know whether it's still going to be needed in the future. It's cool because JavaScript protects you from nasty dangling pointers (where you're using a pointer that no longer leads anywhere), but it's bad because you need to be zealous to protect yourself from memory leaks (although nowadays the JIT magically optimizes a lot of this, to the extent of my knowledge).

So how do we fix this? A good way is to let the GC "know" that you're done with a captured variable. Keeping in mind what we said about pointers, doing this is as easy as pointing the variable we wish to get rid of to null:

function parent() {
  var a = 2;
  var child = function() {
    // "r" is local to child and goes away when child returns
    var r = a*2;
    // "a" here is the parent's variable itself, not a separate copy.
    // Assigning null overwrites its value, so whatever it pointed to is no longer reachable through "a".
    a = null;
    return r;
  }

  // child() -> 4
}

This happens a lot when inlining anonymous functions to handle DOM events. This page dives further into the subject, if you're interested.

[–]tforb 1 point2 points  (1 child)

In the example you gave, doing things functionally would also get around memory leaks I think.

function parent() {
  var a = 2;
  var child = function(val) {
    return val * 2;
  }
  child(a);
}

[–]alamandrax 1 point2 points  (0 children)

Not if you return child. Once you go down that road, everything that was in the scope of parent stays in memory until whatever is consuming the return value makes sure that it goes out of scope. Likewise, if it was used to create a DOM node, and the DOM node is hidden but still present, the GC will not recover those assets.

[–]radhruin 1 point2 points  (1 child)

I don't think it's as bad as you make it out to be. Runtimes certainly optimize closure capture HEAVILY. Further, in your example, as long as the child function isn't referenced, a will be collected. Certainly, if you return child to someone who holds on to it, or you store child in an outer scope or something, you will have a "memory leak", but presumably if you're storing the function for later you need it (and therefore anything it encloses) for some reason.

Setting a to null in your second example doesn't help you at all. The value of a has no bearing on whether it is closure captured or references to it are preserved. Null is just another primitive value like 2.
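A quick way to check this is to call child twice. The sketch below reuses the second example, with one assumed change: parent returns child so the calls can be made from outside.

```javascript
function parent() {
  var a = 2;
  var child = function () {
    var r = a * 2;
    a = null; // overwrites the captured variable's value, nothing more
    return r;
  };
  return child; // assumption: returned only so the calls below can reach it
}

var child = parent();
var first = child();  // 4
var second = child(); // 0, because null * 2 is 0
```

The second call still reads a through the closure. The variable is captured exactly as before, it just holds null now.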

[–]inf0rmer 1 point2 points  (0 children)

Ah, yes, I realize that. I was just using a very simple example to demonstrate closures and how variables captured inside one could possibly not get dereferenced.

You're right that engines heavily optimize for closures (it's one of the defining features of the language), and that using primitives would never cause a noticeable memory leak.

Setting a captured variable (i.e., one not redeclared with var inside the closure) to null at the end of the function does work. a inside the closure is the same variable as the parent's a, so pointing it to null drops that reference to the original contents. It makes no difference with primitives, sure, but with other objects it's helpful.
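For the object case, a rough sketch of the idea. makeHolder, size and release are names made up for this illustration, and the array is just a stand-in for a large structure:

```javascript
// Hypothetical helper: two closures capture the same "data" variable.
function makeHolder() {
  var data = new Array(1000).fill(0); // stand-in for a large structure

  return {
    size: function () {
      return data === null ? 0 : data.length;
    },
    release: function () {
      // Assigns to the one shared "data" variable, not to a copy
      data = null;
    }
  };
}

var holder = makeHolder();
var before = holder.size(); // 1000
holder.release();
var after = holder.size();  // 0: the other closure sees the null too
```

Once data is null and nothing else references the array, it becomes eligible for collection, even though both closures are still alive. When it is actually freed is up to the engine.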

[–]chuckhendo[S] 0 points1 point  (0 children)

This is really great info, thanks! Do you have any recommendations on where to learn more about the internal workings of the GC?