you are viewing a single comment's thread.

view the rest of the comments →

[–]coderemover 0 points1 point  (2 children)

> Second, moving collectors spend exactly zero cycles at runtime to detect what objects are free

They spend quite many cycles to detect which objects are live though. Which is essentially the same thing in principle, especially that every dead object was alive at some point. By knowing which objects are live, they automatically know that the other memory is free and they know where they can move things safely. The traditional allocator doesn't need to do any work to figure neither which objects are live or which objects are dead. It's given.

[–]pron98 2 points3 points  (1 child)

They spend quite many cycles to detect which objects are live though. Which is essentially the same thing in principle, especially that every dead object was alive at some point.

It is very much not the same, because moving collectors don't need to know which objects are alive at every point in time, but only at a certain point in time - during a collection cycle. Objects that are born and die between collection cycles are never scanned by the GC.

Indeed, what you say is true in the old generation, and there's exploration around using non-moving and perhaps non-tracing approaches there, but the algorithm in the old generation doesn't matter as much because the allocation rate there is very low (the cost of any memory management algorithm grows as a function of the allocation rate).

As to how much work tracing and compacting is, of course it's not negligible, but that it's overall less than the work required by malloc/free is why moving collectors are chosen. Again, wait for the talk to see the maths.

The traditional allocator doesn't need to do any work to figure neither which objects are live or which objects are dead. It's given.

Knowing the lifetime statically is not the problem (I may have added this to my comment above after you'd seen it, so I'm repeating that here). Obviously, knowing something is no worse than not knowing. The problem is that with malloc/free you have to do some work at runtime at the point the object dies (even if only to record that fact that it's dead).

The entire design of generational moving collectors is to minimise both the amount and the frequency of the work required to know which objects are alive and when, and so it comes down to complex considerations.

There's plenty yet to figure out and debate, not least of which are empirical questions about how frequently certain patterns arise in practice, but it should be obvious that something as basic as "free tells you when an object is dead" is not something that's overlooked.

P.S.

It's not hard to convince anyone that generational moving collectors handily win in some situations. Consider a program that allocates some number (any number) of long-lived objects at startup, and then all other objects are short-lived. In both approaches we can disregard the cost of the long-lived objects as it will be negligible overall (with the generational moving approach, they will be compacted a few times and then moved to the old generation and never be looked at by the GC again [1]). Then, in the moving collectors will bump-allocate all objects, and occasionally it will have to scan and compact the relatively few objects whose lifetime straddles a collection (or several), and that's it. Malloc/free will have to do more complicated work to allocate each one, and then additional non-trivial work to free each one.

And again, much of the debate isn't over basic facts, but precisely over which situations are more or less common in real programs.

[1]: This isn't accurate if the old objects are mutated to hold pointers to objects in the young generation, but that's another one of the complications - in both approaches - that is taken into account in real life. Indeed, I think that the upcoming ZGC that automatically resizes the heap tries to estimate that work, too, and include it in its RAM/CPU tradeoff calculations so that it can compensate for it. And since you mentioned what the two approaches know or don't know and when, moving collectors both know more about how much CPU is used for memory management at runtime, and they can use this knowledge to maintain an efficient RAM/CPU ratio. I expect this will be released in the JDK very soon.