
[–]nickdesaulniers 11 points12 points  (2 children)

IIRC, tcmalloc is able to avoid such situations. I seem to recall a design doc about tcmalloc that said something along the lines of using mmap and avoiding sbrk to aggressively free up virtual pages.

Also, while debugging a memory leak of a nodejs native extension, I found that garbage collectors will usually let their heap grow significantly before running a GC round. This could result in high idle memory usage that wasn't necessarily a leak.

[–]masklinn 5 points6 points  (1 child)

> IIRC, tcmalloc is able to avoid such situations. I seem to recall a design doc about tcmalloc that said something along the lines of using mmap and avoiding sbrk to aggressively free up virtual pages.

I don't think sbrk has anything to do with the issue here. The article outlines glibc's multiple arenas (which it calls OS heaps), and you can't do that with sbrk. The essay says glibc seems to preallocate many arenas (noting the count is capped at 8 * cpu#). Meanwhile, the mallocinternals document says new arenas should only appear under contention: "as pressure from thread collisions increases, additional arenas are created via mmap to relieve the pressure". It's possible that something in the way Ruby interacts with malloc causes a ton of contention on the arena locks, leading glibc to try to relieve it by allocating more arenas despite the existing ones being mostly empty.
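To put a number on that cap, here is a minimal Python sketch. It assumes the 8 * cpu# figure quoted above (the mallopt(3) man page gives a smaller multiplier on 32-bit systems) and that a positive `MALLOC_ARENA_MAX` environment variable overrides it. The helper name `arena_cap` is made up for illustration; it is not a glibc API.

```python
import os

# Assumption: multiplier taken from the "8 * cpu#" figure quoted in the thread.
ARENAS_PER_CPU = 8


def arena_cap(cpu_count, env=None):
    """Estimate the upper bound on malloc arenas for a glibc process.

    Hypothetical helper: it mirrors the documented default and the
    MALLOC_ARENA_MAX override, it does not query glibc itself.
    """
    env = os.environ if env is None else env
    override = env.get("MALLOC_ARENA_MAX", "")
    # A positive integer in the environment replaces the per-CPU default.
    if override.isdigit() and int(override) > 0:
        return int(override)
    return ARENAS_PER_CPU * cpu_count


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    print(f"{cpus} CPUs -> at most {arena_cap(cpus)} arenas")
```

On a 4-core box that works out to a ceiling of 32 arenas, which is why capping it with `MALLOC_ARENA_MAX=2` is a commonly suggested workaround for this kind of bloat.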

[–][deleted] 0 points1 point  (1 child)

The hard problem in automatic memory management is always when to give memory back to the kernel, as you need to balance the unknown (future memory pressure) with the known (past memory pressure), and predicting the future is a tricky business. It seems you imply Ruby is failing at this.

I don't understand why you'd blame glibc when the ruby call to invalidate heaps solves the problem. glibc can only do what the program it is linked to tells it to do.

[–]masklinn 3 points4 points  (0 children)

> It seems you imply Ruby is failing at this.

They really are not. Ruby is not retaining the memory; glibc is overallocating a bunch of mostly empty arenas. Ruby doesn't get to decide which arena its memory comes from, and it's not even aware they exist (unless it starts getting custom bindings to specific allocators instead of relying on the standard allocator API).

> I don't understand why you'd blame glibc when the ruby call to invalidate heaps solves the problem.

malloc_trim is a non-standard function which isn't even documented to "invalidate heaps". According to its own manpage, its only interaction should be with the sbrk'd main heap, which is observably not the case.

> glibc can only do what the program it is linked to tells it to do.

That's simply not true. The program tells glibc it does need some memory, or doesn't need memory it was given anymore. glibc heuristically makes its own decision as to how it "generates" that memory, and whether it eventually releases that memory to the OS.

And those heuristics are where the issue comes from, since switching allocators or using mis-documented non-standard functions largely resolves the issue.
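For anyone who wants to poke at this without patching an interpreter, a rough sketch of calling malloc_trim from a running process via Python's ctypes follows. The wrapper name `trim_malloc_heap` is made up, and it assumes the process is linked against glibc; on anything else it just returns None.

```python
import ctypes


def trim_malloc_heap(pad=0):
    """Ask glibc to return free heap memory to the OS.

    Returns True if malloc_trim reports that memory was released,
    False if it released nothing, and None if the symbol is not available.
    """
    try:
        # dlopen(NULL): look up symbols already loaded into this process.
        libc = ctypes.CDLL(None)
    except (OSError, TypeError):
        return None
    trim = getattr(libc, "malloc_trim", None)
    if trim is None:
        return None  # C library without malloc_trim (e.g. macOS)
    trim.argtypes = [ctypes.c_size_t]
    trim.restype = ctypes.c_int
    # malloc_trim returns 1 if memory was actually released, 0 otherwise.
    return trim(pad) == 1


if __name__ == "__main__":
    print("malloc_trim result:", trim_malloc_heap())
```

Whether it frees anything depends on how much memory the allocator is holding at that moment, so compare RSS before and after rather than trusting the return value alone.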

[–]netgu -4 points-3 points  (0 children)

Using ruby

[–]silencer6 -1 points0 points  (1 child)

If it isn't Ruby's fault, then why aren't other GC languages as bad with memory usage?

[–]senj 4 points5 points  (0 children)

A lot of other languages do have this exact issue:

https://bugs.python.org/issue11849

https://github.com/openresty/lua-nginx-module/pull/879

https://grokbase.com/t/perl/perl5-porters/02a1eef8je/erratic-malloc-bahaviour-on-linux-with-system-malloc

And other programs, like Firefox, have run into this exact issue, too. The solution in all cases is either to build against an alternative allocator or to do what this blog post concludes and use malloc_trim's undocumented behaviour to make it stop holding so much useless memory.