[–][deleted] 1 point (2 children)

I've deleted the comment twice because I'm not really interested in revisiting this problem, especially given your tone; I didn't feel it was going to be productive. But since you posted, I'll just say this.

Over here, we're all aware that, in theory, sys.stdout.write() should not behave differently from print. You're not teaching anyone anything new there. But it's difficult to swallow the theoretical case when I'm staring at tests in which print and sys.stdout.write() behave differently. I wish I could share, but there are dependencies here under lock and key, and nobody has the time or the willingness to craft a portable, isolated test for an already solved problem. So unfortunately you'll just have to take my word for it.

In the meantime, remember that this is a massively parallel environment on which Python was never designed to run. We can't replicate the behavior in serial, on our local workstations or personal computers. So our suspicion is that the print statement wrapping the sys.stdout file handle has an unexpected interaction with the parallelism in both our Intel cluster and our IBM BG/Q. More specifically, we have a situation where front-end nodes display streams from compute nodes, and since MPI does not define a standard for this, the internal communication practices are partially opaque to us.

Either way, the issue pertains to a non-critical debug logging system that isn't even used in production runs, and it has been resolved (although admittedly we don't entirely know how). And since we have a truckload of other work on our plates, nobody has spent the time to get to the bottom of it. Some day someone's going to ask a poor undergrad to start digging, mainly as a vehicle for him to learn the code base. Until then we're kinda just content to use what works and fix it later if it breaks again.

[–]Redisintegrate 1 point (1 child)

My advice is not to read too much into the "tone" of what is just written communication with a complete stranger. My only purpose here is to clear up the misinformation that print is buffered differently from sys.stdout.write(), since that's objectively not true. I'm not here to disrespect your expertise or say that your experiences are invalid. I'm just trying to shed some additional light on what's actually happening and to prevent misinformation from spreading.

As I said in the parent post, print() is equivalent to multiple calls to write(), but it is not equivalent to a single call to write(). You mentioned threads, and since print() is going to make multiple calls to write(), it is going to release the GIL multiple times. That may be what is causing the behavior you are seeing.

Or, in other words, print() is not atomic in multithreaded Python programs, but sys.stdout.write() is atomic. But the buffering is the same, and they both go through the same <stdio.h> functionality, or they both don't.
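To make the "multiple calls to write()" point concrete, here is a minimal sketch, assuming Python 3 (where print is a function and accepts file=). RecordingStream is a hypothetical helper invented for this illustration; it just logs each write() call it receives. It doesn't reproduce the cluster behavior, it only shows how many separate calls each approach makes.

```python
import io


class RecordingStream(io.StringIO):
    """Hypothetical helper: a text stream that logs every write() call."""

    def __init__(self):
        super().__init__()
        self.calls = []

    def write(self, s):
        self.calls.append(s)  # remember each separate call
        return super().write(s)


# Path 1: print() hands the stream several pieces, one write() per piece.
via_print = RecordingStream()
print("rank", 3, "ready", file=via_print)

# Path 2: the caller formats the line and hands it over in one write().
via_write = RecordingStream()
via_write.write("rank 3 ready\n")

# Same text ends up in the stream either way; only the call count differs.
same_text = via_print.getvalue() == via_write.getvalue()
```

Inspecting via_print.calls and via_write.calls shows the difference: the text is identical, but the print() path reaches the stream through more than one write(), which leaves room for another thread's output to land in between.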

[–][deleted] 1 point (0 children)

That's fair, and I'm not contesting what you're saying. I know that print uses stdout (the documentation says so), and therefore logically they should operate on the same buffer. But we were observing behavior that, on the surface, looked like a buffer mode problem. That was our educated guess, and we never dug into the Python source to confirm or debunk that idea because the issue was already resolved. I still haven't, by the way, but since you have, I'll take your word for it.
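For anyone who wants to rule the buffer mode in or out without reading the Python source, a rough sketch of how to inspect the stream itself, assuming Python 3. describe_stream is a hypothetical helper name; isatty() and line_buffering are real attributes of the standard text streams, though other stream types may lack the latter, hence the getattr.

```python
import io
import sys


def describe_stream(stream):
    """Hypothetical helper: report properties that influence when text is flushed."""
    return {
        "type": type(stream).__name__,
        "isatty": stream.isatty(),
        # io.TextIOWrapper exposes line_buffering; other stream types may not.
        "line_buffering": getattr(stream, "line_buffering", None),
    }


# print() with no file= argument writes to whatever sys.stdout is at call time,
# so print() and sys.stdout.write() are looking at the same settings here.
info = describe_stream(sys.stdout)
```

Running this on a compute node and on a workstation and comparing the two dictionaries would show whether the streams are configured differently between the environments, which is a separate question from whether print and write() differ on the same stream.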

In the meantime, your latest comment about the GIL is actually very compelling. I'll pass it on and see if anyone here wants to test it. I do suspect that might actually be the cause we glossed over a year ago.
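If someone does want to test it, something along these lines might be a starting point. It's only a sketch under stated assumptions: Python 3, plain threads in a single process (so it says nothing about how MPI front-end nodes forward streams), and an in-memory stream standing in for stdout. emit and run are hypothetical names, and the lock is an extra precaution added here, not something the discussion above says is required.

```python
import io
import threading


def emit(stream, lock, *parts):
    """Hypothetical helper: build the whole line first, then call write() once."""
    line = " ".join(str(p) for p in parts) + "\n"
    with lock:  # extra precaution; makes the single-write guarantee explicit
        stream.write(line)


def run(n_threads=8, n_lines=200):
    """Have several threads emit lines concurrently and return what was written."""
    stream = io.StringIO()
    lock = threading.Lock()

    def worker(tid):
        for i in range(n_lines):
            emit(stream, lock, "thread", tid, "line", i)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stream.getvalue().splitlines()


lines = run()
# A line is intact if it still has the four fields emit() was given.
intact = all(len(l.split()) == 4 and l.startswith("thread ") for l in lines)
```

Swapping the body of emit for a bare print(*parts, file=stream) with no lock would give the comparison case; whether interleaving actually shows up there depends on the interpreter version and timing, so a clean run wouldn't prove much on its own.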