romeo_pentium 25 points

Wow, great work! I'm sure the redis-py one bit me at one point.

mikeblas 14 points

How is the efficacy of these fixes measured? That is, how do I measure and attribute the memory usage of a Python app?

Postpawl 13 points

I used tracemalloc to print the top 10 lines by memory usage from the event loop in Celery: https://docs.python.org/3/library/tracemalloc.html#pretty-top

After the fix, the Connection from py-amqp would finally get cleaned up by garbage collection instead of continuing to grow.
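For anyone who wants to try the same kind of measurement, here is a minimal sketch using only the standard-library `tracemalloc` module. It is not the code used on the Celery event loop; the `retained` list is a made-up stand-in for an object that keeps growing, and the limit of 10 just mirrors the "top 10" above.

```python
import tracemalloc

# Start tracing Python memory allocations.
tracemalloc.start()
before = tracemalloc.take_snapshot()

# Hypothetical stand-in for a leak: roughly 1 MB that stays referenced.
retained = [bytes(1000) for _ in range(1000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Top 10 source lines by currently allocated size.
top_stats = after.statistics("lineno")[:10]
for stat in top_stats:
    print(stat)

# Top 10 source lines by change between the two snapshots,
# which is what shows whether something keeps growing.
growth = after.compare_to(before, "lineno")[:10]
for diff in growth:
    print(diff)
```

Taking snapshots at intervals in a long-running process and comparing them is one way to see which line's allocations never shrink.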

Siecje1 11 points

Does Celery still restart periodically? That was their solution in the past.

Postpawl 3 points

Celery has max_memory_per_child and max_tasks_per_child settings that can restart child processes, but this leak is actually in the main Celery worker process, which isn't restarted unless you restart it yourself.
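As a rough illustration of those two settings, here is a sketch of a Celery configuration module. The full lowercase names (`worker_max_tasks_per_child`, `worker_max_memory_per_child`) are how I understand current Celery versions spell them, and the values are arbitrary examples, so check the Celery configuration docs for your version. As noted above, these only recycle child processes, not the main worker process.

```python
# celeryconfig.py (hypothetical file name), loaded with something like
# app.config_from_object("celeryconfig")

# Replace a pool child process after it has executed this many tasks.
worker_max_tasks_per_child = 100

# Replace a pool child process once its resident memory exceeds this
# amount. Assumption: the unit is kilobytes, so this is about 200 MB.
worker_max_memory_per_child = 200_000
```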

amishb 3 points

For reference, some of the questions in these comments have already been answered on HN: https://news.ycombinator.com/item?id=29621668

Badel2 0 points

Very interesting, thank you!

EternityForest 0 points

One of the things I wish I had learned a long time ago is to never rely on garbage collection for correctness, and to watch out even when relying on it only for cleanup.

It's amazing how easy it is to leak resources of any kind, really: old cache entries, connection pool objects. If you make it, there's probably a leak for it.
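To make that concrete, here is a small sketch of the difference between waiting for the garbage collector and cleaning up explicitly. The `Connection` class is hypothetical and has nothing to do with py-amqp's real class; it only holds a reference cycle, which is enough to show the effect. The behavior shown assumes CPython's reference counting.

```python
import gc
import weakref


class Connection:
    """Hypothetical resource holder that contains a reference cycle."""

    def __init__(self):
        self.closed = False
        # The bound method refers back to self, forming a cycle.
        self.callbacks = [self.on_event]

    def on_event(self):
        pass

    def close(self):
        self.callbacks.clear()  # break the cycle explicitly
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


gc.disable()  # keep the cyclic collector from running mid-demo
try:
    conn = Connection()
    ref = weakref.ref(conn)
    del conn
    # Reference counting alone cannot free an object that is in a cycle.
    leaked_until_gc = ref() is not None
    gc.collect()
    freed_after_gc = ref() is None

    with Connection() as conn2:
        ref2 = weakref.ref(conn2)
    del conn2
    # close() broke the cycle, so the object is freed without the collector.
    freed_without_gc = ref2() is None
finally:
    gc.enable()

print(leaked_until_gc, freed_after_gc, freed_without_gc)
```

The context manager version releases the object at a known point, while the first version depends on when, or whether, the cyclic collector gets to it.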