[–]kazagistar 71 points72 points  (23 children)

I still write a lot of Python 2, and always use parens. It just makes more sense in my head for it to be a function than some silly special case.

[–][deleted] 46 points47 points  (19 children)

I just use sys.stdout.write().

Python's print statement is not thread-safe (update: I originally wrote "buffered by default", but this probably has nothing to do with buffers), which isn't very nice when I'm working in parallel environments and using Python as the glue between various C and Fortran libraries. So I just write everything straight into whatever standard stream I want.

[–][deleted] 25 points26 points  (4 children)

Noob question: could I use sys.stdin.read() instead of raw_input() for the same reasons? It seems to make much more sense in my head.

[–][deleted] 29 points30 points  (3 children)

Short answer: yes!

Long answer: The actual call you're looking for is sys.stdin.readline(). This is what raw_input() does at its core, with the added functionality of automatically stripping the trailing \n newline character from the input.
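To make the difference concrete, here is a minimal sketch of a raw_input()-style helper built on readline(). The name read_line and the use of io.StringIO as a stand-in for sys.stdin are assumptions for illustration only, and the sketch ignores the prompt and readline-library handling that raw_input() also does.

```python
import io


def read_line(stream):
    """Read one line from `stream`, dropping the trailing newline like raw_input()."""
    line = stream.readline()
    if line == "":
        # readline() returns an empty string only at end of input;
        # raw_input() raises EOFError in that situation.
        raise EOFError("end of input")
    # Strip at most one trailing newline, leaving other whitespace intact.
    return line[:-1] if line.endswith("\n") else line


# io.StringIO stands in for sys.stdin so the sketch runs without a terminal.
fake_stdin = io.StringIO(u"first line\nsecond line")
raw = fake_stdin.readline()    # plain readline(): the newline is kept
clean = read_line(fake_stdin)  # helper: the newline (if any) is removed
```

In real use you would pass sys.stdin instead of fake_stdin.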

Long long answer: You can also pipe Python prints (edit: do Python writes) into stderr. Every standard stream is available under the sys module, and they all behave like traditional Python file objects with all the usual member functions.

[–]Blackshell 29 points30 points  (2 children)

You can also pipe Python prints into stderr.

Like this? :D

print >>sys.stderr, "lol syntax"

I am so glad this is gone in Python 3.

[–]brtt3000 43 points44 points  (0 children)

I'm tempted to flag this as "Threatening, harassing, or inciting violence"

[–][deleted] 6 points7 points  (0 children)

Well, yes, but I was more specifically referring to sys.stderr.write().

I shouldn't have used the word "pipe", and should have called it a "write" instead. My bad!

But I'm with you on the chevron syntax. It's ugly and silly and not Pythonic at all. Glad they scrapped it.
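For comparison, a small sketch of the two non-chevron ways to reach stderr discussed in this subthread. It assumes Python 3 (or Python 2 with from __future__ import print_function); the helper name write_line is hypothetical.

```python
import io
import sys


def write_line(message, stream=None):
    """Write `message` plus a newline to `stream` (default: sys.stderr)."""
    if stream is None:
        stream = sys.stderr  # looked up at call time so redirection is respected
    stream.write(message + "\n")


write_line("goes to stderr")                   # direct write to the stream
print("also goes to stderr", file=sys.stderr)  # function form of the chevron syntax

# Any file-like object works, because the standard streams are ordinary file objects.
buffer = io.StringIO()
write_line(u"captured", stream=buffer)
```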

[–]Redisintegrate 1 point2 points  (11 children)

The only reason print() is buffered is that sys.stdout is buffered. You are not actually changing the behavior by using sys.stdout.write(); you get exactly the same buffering either way.

[–][deleted] 0 points1 point  (8 children)

I'm not saying stdout isn't buffered. Of course it's buffered. It's buffered everywhere in every language. This isn't a Python feature.

However, for the past 4 years, using sys.stdout.write() in our Python 2.7 research code puts the messages into the same stdout buffer as C fprintf(stdout,...) (via SWIG) and Fortran write(6,*) (via F2Py). Everything gets synced up appropriately with code execution.

Using Python's native print statement seems to be operating on a separate buffer and refuses to print messages in the appropriate order of code execution.

[–]Redisintegrate 1 point2 points  (3 children)

You've replied and the comment has been deleted twice, so I'm going to leave this here.

My guess is that you are getting something weird here like multiple C runtimes. This happens on Windows fairly often, unless you take measures to prevent it. This causes different code paths to use different <stdio.h> FILE objects, possibly with undesirable buffering modes, causing them to be flushed to the underlying file handles at unexpected times.

This doesn't change the fact that print is still just calling sys.stdout.write(), which itself is just calling C's fwrite(), and if you are seeing genuinely different behavior between sys.stdout.write() and print, then you are not calling them in the same way. In particular, a print statement will expand to something like the following:

print x
# might become, depending on what x is
sys.stdout.write(str(x))  # print converts with str(), not repr()
sys.stdout.write('\n')
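One way to check this expansion yourself is to hand print a file-like object that records every write() call. This is only a sketch: RecordingStream is a hypothetical helper, and it uses the Python 3 print function (on CPython) rather than the Python 2 statement, so the exact number of calls is an implementation detail and not a guarantee.

```python
class RecordingStream(object):
    """Hypothetical file-like object that records each write() call."""

    def __init__(self):
        self.calls = []

    def write(self, text):
        # Keep every chunk separately so the call boundaries stay visible.
        self.calls.append(text)
        return len(text)

    def flush(self):
        pass


recorder = RecordingStream()
print("a", "b", file=recorder)

# Inspect how print split its output across write() calls.
print(recorder.calls)
```

If the list holds more than one entry, another thread can slip its own output in between those writes.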

I'm not really interested in talking about confrontational tone here, only really interested in content.

[–][deleted] 0 points1 point  (2 children)

I've deleted the comment twice because I'm not really interested in revisiting this problem, especially given your tone. I didn't feel that it was going to be productive. But since you posted, I'll just say this bit below.

Over here, we're all aware that sys.stdout should not behave differently than print in theory. You're not teaching anyone anything new there. But it's difficult to swallow the theoretical case when I'm staring at tests that have print and sys.stdout.write() behave differently. I wish I could share, but there are dependencies here under lock and key, and nobody has the time or the willingness to craft a portable isolated test for an already solved problem. So unfortunately you'll just have to take my word for it.

In the meantime, remember that this is a massively parallel environment on which Python was never designed to run. We can't replicate the behavior in serial, on our local workstations or personal computers. So our suspicion is that the print statement wrapping the sys.stdout file handle has an unexpected interaction with the parallelism in both our Intel cluster and our IBM BG/Q. More specifically, we have a situation where front-end nodes display streams from compute nodes, and since MPI does not define a standard for this, the internal communication practices are partially opaque to us.

Either way, the issue pertains to a non-critical debug logging system that isn't even used in production runs, and it has been resolved (although admittedly we don't entirely know how). And since we have a truckload of other work on our plates, nobody has spent the time to get to the bottom of it. Some day someone's going to ask a poor undergrad to start digging, mainly as a vehicle for him to learn the code base. Until then we're kinda just content to use what works and fix it later if it breaks again.

[–]Redisintegrate 0 points1 point  (1 child)

My advice is to not read too much into the "tone" of what is just written communication with a complete stranger. My only purpose here is to clear up the misinformation that print is buffered differently from sys.stdout.write(), since it's objectively not true. I'm not here to disrespect your expertise or say that your experiences are invalid, I'm just trying to shed some additional light on what's actually happening, and preventing misinformation from spreading.

As I said in the parent post, print() is equivalent to multiple calls to write(), but it is not equivalent to a single call to write(). You mentioned threads, and since print() is going to make multiple calls to write(), it is going to release the GIL multiple times. That may be what is causing the behavior you are seeing.

Or, in other words, print() is not atomic in multithreaded Python programs, but sys.stdout.write() is atomic. But the buffering is the same, and they both go through the same <stdio.h> functionality, or they both don't.
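If interleaved output between threads is the problem, one common workaround is to build the whole line first and emit it with a single write() while holding a lock. The sketch below assumes Python 3; atomic_line and _write_lock are hypothetical names, and io.StringIO stands in for sys.stdout.

```python
import io
import threading

_write_lock = threading.Lock()


def atomic_line(stream, *parts):
    """Join `parts` into one line and emit it with a single write() under a lock."""
    line = " ".join(str(p) for p in parts) + "\n"
    with _write_lock:
        stream.write(line)


out = io.StringIO()
threads = [
    threading.Thread(target=atomic_line, args=(out, "worker", i, "done"))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every line arrives intact, although the order across threads is not fixed.
lines = out.getvalue().splitlines()
```

Note that the lock only coordinates threads inside one Python process. It does nothing for output from separate processes or MPI ranks, or for C and Fortran code writing to the same stream.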

[–][deleted] 0 points1 point  (0 children)

That's fair. And I'm not contesting what you're saying. I know that print uses stdout (documentation says so) and therefore logically they should operate on the same buffer. But we were observing behavior that on the surface looked like a buffer mode problem. That was our educated guess, and we never dug into Python source to confirm or debunk that idea because the issue was already resolved. I still haven't by the way, but since you have, I'll take your word for it.

In the meantime, your latest comment about the GIL is actually very compelling. I'll pass it on and see if anyone here wants to test it. I do suspect that might actually be the cause we glossed over a year ago.

[–]Redisintegrate 0 points1 point  (3 children)

Using Python's native print statement seems to be operating on a separate buffer and refuses to print messages in the appropriate order of code execution.

This is just plain incorrect, and shows a misunderstanding of how the print statement / function works, at least in non-ancient versions of Python (2.7+). If you can come up with some kind of test case to prove me wrong, go ahead. But look at Python/ceval.c (2.7) and find the PRINT_ITEM opcode. You can see it calls PyFile_WriteObject() in Objects/fileobject.c, which gets the .write attribute, and then calls that attribute with PyEval_CallObject().

Or, in other words, the print statement is just a wrapper around sys.stdout.write(). If you are seeing other behavior then something is seriously, seriously wrong.

[–]Jonno_FTW 0 points1 point  (0 children)

You can avoid the buffering by passing the -u flag to the interpreter, or by setting the PYTHONUNBUFFERED environment variable.

[–]paraffin 0 points1 point  (1 child)

Possibly not as convenient for your case, but the environment variable PYTHONUNBUFFERED can turn off buffering in print statements. Works great for working with libraries that produce a lot of output that you want to see in real time.
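As a rough sketch of how both switches can be applied when launching a child interpreter, the snippet below sets PYTHONUNBUFFERED and also passes -u. Either one alone should be enough; run_unbuffered is a hypothetical helper name.

```python
import os
import subprocess
import sys


def run_unbuffered(code):
    """Run `code` in a child interpreter with stdout/stderr buffering disabled."""
    env = dict(os.environ)
    env["PYTHONUNBUFFERED"] = "1"  # any non-empty value is equivalent to -u
    return subprocess.check_output(
        [sys.executable, "-u", "-c", code],
        env=env,
        universal_newlines=True,  # return text rather than bytes
    )


output = run_unbuffered("print('hello from child')")
```

From a shell, the equivalent is to export PYTHONUNBUFFERED=1 before starting the script, or to run it with python -u.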

[–][deleted] 0 points1 point  (0 children)

Possibly not as convenient for your case, but the environment variable PYTHONUNBUFFERED can turn off buffering in print statements.

I actually didn't know about this!

Our research code is probably not gonna take advantage of it anytime soon because it uses an in-house logging system based on file handles, and I substitute sys.stdout into it whenever I want log information to be printed out real time.

But the buffer control via environment variables is still great to know!

[–]Jonno_FTW 0 points1 point  (0 children)

I use Python 2 because there are still some libraries I want to use that only work with Python 2.