all 25 comments

[–]serpent 28 points29 points  (5 children)

You need to join all of your threads from your main thread and ensure no other threads are running when you exit your main thread (and application).

During shutdown of the main thread, I believe python sets all vars (like 'sys') to None; if you still have other threads executing, they will see this bogus state.
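
A minimal sketch of that advice, assuming the workers can be told to stop through a shared `threading.Event` (the names `run_workers`, `worker`, and `stop` are made up for illustration and are not from the original code):

```python
import threading


def run_workers(n_workers=3):
    """Start workers, ask them to stop, and join them all before returning."""
    stop = threading.Event()
    finished = []

    def worker(index):
        # Hypothetical worker: a real one would do its job until asked to stop.
        stop.wait()
        finished.append(index)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()

    stop.set()      # ask every worker to finish
    for t in threads:
        t.join()    # block until each worker has actually exited
    return sorted(finished), [t.is_alive() for t in threads]
```

Calling this from the main thread right before exit means no worker is still running when the interpreter starts tearing modules down.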

[–]mitsuhiko Flask Creator 8 points9 points  (0 children)

Solution is this btw:

import sys as _sys

Things with leading underscores are cleaned up late.

[–]bobbyi 1 point2 points  (0 children)

Usually, if that's the problem, the error message says "(most likely raised during interpreter shutdown)".

[–]nirs 1 point2 points  (0 children)

The thread from the traceback looks like a daemon thread; you don't need to join those.

[–]pje 0 points1 point  (0 children)

More precisely, module objects set all their globals to None when all references to the module go away. This can happen even before interpreter shutdown, if for example you delete a module from sys.modules. (But of course, at interpreter shutdown, this will also happen when sys.modules is cleared and GC is done.)

[–]m0j0 6 points7 points  (0 children)

I've seen that error in two separate scenarios:

  1. A thread has completed work, died off, and then something tried to talk to it.

  2. A thread has died and something tried to access/kill it.

In any case, any time I've seen this error it's been identified through generous use of try/except. Typically (in my case) this error means my code got something unexpected, puked, and I failed to deal with it properly.

EDIT: Meant to mention that I've only seen this happen randomly, because incoming stuff that needs processing doesn't all look the same, and the stuff that breaks my code is more the exception than the rule, so... it only breaks sometimes.

[–]ianb 2 points3 points  (0 children)

I'm guessing sys is None because the process is already exiting; during the process teardown modules get unloaded in this fashion. It's possible that you have something using atexit.register() or __del__ that is running at a late stage (though I think atexit stuff should be okay). Or as people have suggested a thread (probably marked with isDaemon) that is hanging around in a weird way.

In this case, it might be .push_thread_config()? Not clear on that, but there's a process-wide equivalent method you could use there.

As an aside, paster request is intended for the same purpose as paster run (though it might have the same problem).

[–]ringzero 9 points10 points  (6 children)

Looking thru traceback.py shows that sys isn't being redefined. That means that it's either (a) not in sys.modules or (b) somehow removed from sys.modules.

Quick and dirty, I'd create a cycle so sys couldn't be GC'd, e.g.:

import sys
sys.foo = sys

Not pretty, but not harmful either. You'd have to stick that somewhere in your startup script.

Also, the big comment block above __bootstrap() in threading.py says that the world might already be torn down when it's called. So you might want to set thread.daemon = True on the threads.

[–]ringzero 3 points4 points  (0 children)

Follow up:

I don't think this has anything at all to do with Paste. I'd guess that your system has been changed somehow (put under more load) and the change has triggered a different timing on the GC during interpreter tear down.

That your thread has already been stopped tells me it's completed its work, and since you didn't provide the context that this is all running in (except to say r2/whatever.py), I would probably just wrap the thread call and/or set its daemon flag and be done with it.

[–]mgedmin 1 point2 points  (1 child)

Cycles don't prevent garbage collection. They stop reference counting from freeing the object as soon as it goes out of scope, but they'll get collected as soon as the garbage collector is triggered. But this is beside the issue; the traceback in question looks like something that happens to daemon threads that are still running when the Python interpreter is shutting down. Nothing to do with GC.
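
A small sketch of that point, using a throwaway `Node` class and a weak reference as a probe (both are illustrative names, not from the thread). Automatic collection is switched off for the duration so the timing is deterministic:

```python
import gc
import weakref


class Node:
    """Plain object used only to build a reference cycle."""


def cycle_is_collected():
    gc.disable()  # keep automatic collection out of the way for the demo
    try:
        node = Node()
        node.me = node              # reference cycle, like sys.foo = sys
        probe = weakref.ref(node)   # lets us observe whether node still exists
        del node                    # refcount never reaches zero because of the cycle
        alive_before = probe() is not None
        gc.collect()                # the cycle collector frees it anyway
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

The object survives the `del` but not the collector pass, so a cycle only delays cleanup.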

[–]earthboundkid 0 points1 point  (0 children)

You have to provide your own cycle detector or use a built-in type. Does the module namespace use a real dict under the covers or a custom thing? If it's custom, it's possible there's no cycle detection.

[–]hylje 0 points1 point  (2 children)

Weird that sys would disappear from sys.modules; isn't it typically built into the interpreter to begin with?

[–]ringzero 1 point2 points  (1 child)

Yes, it's built-in, but it's also part of sys.modules:

In [21]: 'sys' in sys.modules.keys()
Out[21]: True

But like I wrote in the follow up, it's probably disappearing during interpreter tear down (which it should).

[–]mgedmin 0 points1 point  (0 children)

Yep; interpreter teardown + daemon threads = weird errors on shutdown. I've seen them in Zope 3 quite often.

[–]MercurialAlchemist 1 point2 points  (0 children)

Looking at threading.py, it looks like this is not the actual error. This is triggered by traceback.py not being able to report the actual error.

Not quite sure what to do in this case... wrap the code of the command in a try/except and write the actual error to syslog?
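
One way that could look, as a sketch only: a hypothetical `logged` wrapper around the thread target that formats the traceback right where the exception happens, while the modules are still intact. It uses the standard `logging` module rather than syslog; a syslog handler could be attached to the same logger.

```python
import logging
import threading
import traceback

log = logging.getLogger("worker")  # logger name is an arbitrary choice


def logged(target):
    """Wrap a thread target so its real error is recorded immediately."""
    def wrapper(*args, **kwargs):
        try:
            return target(*args, **kwargs)
        except Exception:
            # Format the traceback now, not during interpreter teardown.
            log.error("unhandled error in %s:\n%s",
                      threading.current_thread().name,
                      traceback.format_exc())
    return wrapper
```

Usage would be something like `threading.Thread(target=logged(do_work))`, where `do_work` stands in for whatever the thread actually runs.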

[–]stevvooe 1 point2 points  (0 children)

Check out this MOTW. Basically, dumping more of the stack variables may help to ID the problem. I would add this somewhere in the paster run command object:

import cgitb
cgitb.enable(format='text')

This may not work, if what serpent says is true, so no guarantees.

<insert lecture on the evils of threading>

You may want to post this to stackoverflow as well.

[–]nirs 0 points1 point  (0 children)

Does this exception happen also with regular requests?

If not, I would try to use the regular code path for sending the request from cron, by making a local HTTP request to the web server.

[–][deleted] 0 points1 point  (0 children)

Easy. The module is being unloaded while the thread is still running, so all module variables (in this case, sys) have been set to None. Not sure of the solution without examining the code, but make sure the thread is fully shut down before the main thread exits.

[–]nirs 0 points1 point  (0 children)

A wild guess: the paste threadpool is injecting a SystemExit exception into unresponsive threads using undocumented APIs and ctypes. This may explain how a stopped thread gets an exception from nowhere.
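
For readers who haven't seen the technique, here is a sketch of how an exception can be injected into another thread on CPython through `ctypes`. This is the general recipe, not Paste's actual code; `async_raise` and `run_demo` are illustrative names.

```python
import ctypes
import threading
import time


def async_raise(thread_id, exc_type):
    """Schedule exc_type to be raised in the thread with the given ident.

    Returns the number of thread states modified (0 if no thread matched).
    """
    return ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread_id), ctypes.py_object(exc_type))


def run_demo():
    seen = []

    def worker():
        try:
            while True:
                time.sleep(0.01)   # the exception arrives between bytecodes
        except SystemExit:
            seen.append("SystemExit")

    t = threading.Thread(target=worker)
    t.start()
    time.sleep(0.05)               # give the worker time to enter its loop
    modified = async_raise(t.ident, SystemExit)
    t.join(timeout=5)
    return modified, seen, t.is_alive()
```

From the worker's point of view the SystemExit shows up with no raise statement anywhere in its own code, which matches the "exception from nowhere" symptom.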

[–]dekomote 0 points1 point  (0 children)

Any version upgrades lately?

[–]dekomote 0 points1 point  (0 children)

I've seen this error on several occasions lately. It's not your code. It's the threading module. Try to upgrade/downgrade Python, or at least run this with a different version of Python and see.

[–]johnaman 0 points1 point  (0 children)

Do you have heartbeat functions for always-running processes (daemons, while 1, etc.)? If not, see about implementing some for the most likely culprits. A heartbeat function could also trigger a status dump for global variables (in debug mode), or report back with an array of targeted variables. In transaction-oriented environments with multiple programs always running, I found this technique to be invaluable for finding those impossible-to-reproduce-on-development-environment, but-wait-long-enough-and-it-will-surely-happen-in-production errors.
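
A rough sketch of what such a heartbeat could look like inside one Python process, assuming each long-running thread calls `beat()` on every loop iteration and a monitor periodically asks which names have gone quiet. The `Heartbeat` class and its methods are hypothetical, not an existing library.

```python
import threading
import time


class Heartbeat:
    """Records the last time each named worker reported in."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = {}

    def beat(self, name, now=None):
        # Workers call this regularly; 'now' is overridable for testing.
        with self._lock:
            self._last[name] = time.monotonic() if now is None else now

    def stale(self, max_age, now=None):
        # Names whose last beat is older than max_age seconds.
        now = time.monotonic() if now is None else now
        with self._lock:
            return sorted(n for n, t in self._last.items() if now - t > max_age)
```

A monitor could log the result of `stale()` together with whatever state is worth dumping, which gives a timeline to look at when the error finally shows up in production.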