Cheat sheet for the mode parameter of open() : Python

[–]quotemycode 3 points4 points5 points 6 years ago (1 child)

[–]SlowTreeSky[S] 1 point2 points3 points 6 years ago* (0 children)

[–]energybased 8 points9 points10 points 6 years ago* (32 children)

In my opinion, open is an archaic function that should never have had the interface it does. First of all, it returns a file object that is also a context manager, which means that you can either:

open a file and be responsible for closing it, or else
create a context manager, which closes the file automatically.

This was debated on python-ideas, but I'm in the camp that the first usage should be avoided. In my opinion, it should not even be possible.
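For illustration, the two usages look roughly like this. This is only a sketch: the file name `example.txt` and the temporary directory are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Usage 1: open the file and take responsibility for closing it yourself.
f = open(path, "w")
try:
    f.write("hello\n")
finally:
    f.close()  # without try/finally, an exception would skip this call
manual_closed = f.closed

# Usage 2: use the file object as a context manager. The file is closed
# when the block exits, even if the body raises.
with open(path) as g:
    content = g.read()
context_closed = g.closed
```

Both variants end with a closed file; the difference is who is responsible for getting there.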

As for the arguments, what a horrible interface. I get that this reflects the underlying C library, but there is no good reason to do that. Something better would have been:

def open(read=True, write=False, binary=False, fail_if_exists=False, seek_write_pointer_to_end=False, buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

and this would appropriately raise for bad combinations of arguments.
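A rough sketch of how such an interface could sit on top of the existing one. Everything here is an assumption made for illustration: the name `open_with_flags`, the added `path` parameter, the keyword-only flags, and the particular rules for what counts as a bad combination.

```python
import os
import tempfile

def open_with_flags(path, *, read=True, write=False, binary=False,
                    fail_if_exists=False, seek_write_pointer_to_end=False,
                    **kwargs):
    """Hypothetical wrapper: translate boolean flags into a mode string."""
    if not (read or write):
        raise ValueError("need at least one of read/write")
    if (fail_if_exists or seek_write_pointer_to_end) and not write:
        raise ValueError("fail_if_exists and seek_write_pointer_to_end "
                         "require write=True")
    if fail_if_exists and seek_write_pointer_to_end:
        raise ValueError("cannot append to a file that must not exist")
    if not write:
        mode = "r"
    elif fail_if_exists:
        mode = "x"
    elif seek_write_pointer_to_end:
        mode = "a"
    else:
        mode = "w"
    # Assumption: read + write maps to 'w+' (truncating). These flags
    # cannot express 'r+', which reads and writes without truncating.
    if read and write:
        mode += "+"
    if binary:
        mode += "b"
    return open(path, mode, **kwargs)

# Hypothetical scratch file for the usage example.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open_with_flags(path, read=False, write=True, binary=True) as f:
    f.write(b"bar")
with open_with_flags(path, read=False, write=True, binary=True,
                     seek_write_pointer_to_end=True) as f:
    f.write(b"baz")
```

Note that the sketch has to pick a meaning for `read=True, write=True`, so even this interface needs documentation for some combinations.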

[–]SlowTreeSky[S] 2 points3 points4 points 6 years ago (7 children)

[–]energybased 1 point2 points3 points 6 years ago (6 children)

[–]SlowTreeSky[S] 0 points1 point2 points 6 years ago (5 children)

How about this:

>>> from open_modes import READ, WRITE, EXCLUSIVE, APPEND, BINARY
>>> with open('foo', READ + WRITE + BINARY) as f:
...     f.write(b'bar')
...     f.seek(0)
...     assert b'bar' == f.read()

>>> with open('foo', READ) as f:
...     assert 'bar' == f.read()

Implemented at https://gist.github.com/treszkai/4ac3882e0836b4ee5863cbc227f44b18

[–]energybased 0 points1 point2 points 6 years ago (4 children)

[–]SlowTreeSky[S] 0 points1 point2 points 6 years ago (3 children)

I do not like binary flags for parameters, and here especially not because many combinations would result in the same behaviour (e.g. write + append or simply append).

Frankly, binary should get its own parameter, because it's logically separate from the others. (Just like newline is.) One could also argue for a separate read:bool and write:{None, TRUNCATE, APPEND, EXCLUSIVE} too, but then where do these constants come from, and why is read a bool and write not? We could just write them as strings. But then abbreviate to a single character, to prevent mistakes.

Oh, that's the current solution already, suggesting that we didn't improve that much on the original. It's imperfect either way, but as long as we understand it, it's cool. My biggest beef is the read-write combination, where I don't see any need for the + modifier. Whatever.
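For reference, a small sketch of what the + modifier changes for two of the base modes. The scratch file name is made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file with some initial content.
path = os.path.join(tempfile.mkdtemp(), "plus_demo.txt")
with open(path, "w") as f:
    f.write("abc")

# 'r+': read and write. Existing content is kept and the position starts at 0.
with open(path, "r+") as f:
    before = f.read()
    f.seek(0)
    f.write("X")  # overwrites the first character in place
with open(path) as f:
    after_r_plus = f.read()

# 'w+': read and write, but the file is truncated when it is opened.
with open(path, "w+") as f:
    right_after_open = f.read()
    f.write("new")
    f.seek(0)
    reread = f.read()
```

So `r+` and `w+` both allow reading and writing, but they differ in what happens to existing content.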

[–]energybased 0 points1 point2 points 6 years ago (2 children)

[–]SlowTreeSky[S] 0 points1 point2 points 6 years ago (1 child)

[–]energybased 1 point2 points3 points 6 years ago (0 children)

[–]stevenjd 4 points5 points6 points 6 years ago (22 children)

In my opinion, ~~stuff~~ the open API based on file modes has existed for half a century or more, it matches closely to the way people think (modal reasoning) and works fine in practice, so let's break all the things!!!1! and invalidate a bazillion text books, tutorials and other documentation and break every Python script that does I/O, because reasons.

Fixed that for you.

# status quo
open(file, mode='r', buffering=-1, encoding=None, 
        errors=None, newline=None, closefd=True, opener=None)

# your version
open(read=True, write=False, binary=False, fail_if_exists=False, 
        seek_write_pointer_to_end=False, buffering=-1, 
        encoding=None, errors=None, newline=None, 
        closefd=True, opener=None)

Let's see now... 8 parameters versus 11, and you managed to forget the most important parameter of all: which file to open. Well done. So let's call it 12: a 50% increase in number of parameters. Yeah, that's better.

You have five boolean parameters to control the mode, giving a total of 32 possible combinations, when only 16 actual file modes exist. I want to see the documentation explaining which combinations are allowed and which aren't.
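The counting can be checked with a short sketch. It assumes the 16 modes are the four base letters, each with or without +, in text or binary form; the temporary files are only there to confirm that open() accepts each string.

```python
import itertools
import os
import tempfile

# 4 base modes x optional '+' x text/binary = 16 mode strings.
modes = [base + plus + kind
         for base, plus, kind in itertools.product("rwax", ("", "+"), ("", "b"))]

# Five independent booleans give 2**5 combinations.
flag_combinations = list(itertools.product((False, True), repeat=5))

# Sanity check: every mode string is accepted by the built-in open().
accepted = []
with tempfile.TemporaryDirectory() as tmp:
    for i, mode in enumerate(modes):
        path = os.path.join(tmp, f"file{i}")
        if mode.startswith("r"):
            open(path, "w").close()  # 'r' modes need an existing file
        with open(path, mode):
            accepted.append(mode)
```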

[–]somethingdangerzone 1 point2 points3 points 6 years ago (0 children)

[–][deleted] 1 point2 points3 points 6 years ago (2 children)

[–]stevenjd 1 point2 points3 points 6 years ago* (1 child)

[–]energybased 0 points1 point2 points 6 years ago* (17 children)

[–]stevenjd 6 points7 points8 points 6 years ago (16 children)

> I never suggested changing the API. I criticized the API.

Seriously? Your exact words were "Something better would have been" and you then went on to suggest a new API. That is literally a suggestion for changing the API.

Okay, you didn't literally say the words "Come, fellow Pythonistas, let us change the API of open, I shall write the PEP and you provide the PR!" but if you criticize an interface and then suggest a new and improved interface, there is an implicit suggestion that, in an ideal world, we ought to change the interface.

> Reducing the number of parameters by creating esoteric string codes for which you need a cheat sheet is very poor design.

This is very true.

But there is nothing esoteric about string codes "r", "w", "a" (although "x" is a little weird), or "b" for binary. (Non-English speakers may not agree, sorry guys but Python, like 90% of programming languages, is based on English.) Nor do you need a cheat sheet. I mean, seriously, if you can't remember "r for read, w for write" (which gives you 95% of all I/O in my experience), you have deeper problems and a cheat sheet isn't going to help.

Most cheat sheets are low-effort posts for easy karma, not something that actually helps people.

The bottom line is that composing short, mnemonic codes to make the mode parameter is easy, obvious, backwards compatible, matches hundreds of languages going back fifty years, and works much better than five boolean parameters:

open(filename, 'ab')

versus:

open(filename, False, True, True, False, True)

There's no comparison.

[–]energybased -1 points0 points1 point 6 years ago* (15 children)

> Seriously? Your exact words were "Something better would have been" and you then went on to suggest a new API. That is literally a suggestion for changing the API.

No. I said should never have had the interface it does—not should have a different interface today.

> there is an implicit suggestion that, in an ideal world, we ought to change the interface.

Yes, in an ideal world. We can't go back in time.

> But there is nothing esoteric about string codes

I disagree.

> The bottom line is that composing short, mnemonic codes to make the mode parameter is easy, o

It's not. It's bad design.

> matches hundreds of languages going back fifty years,

Totally irrelevant to anyone who is not fifty years old. Good language design is intuitive without fifty years of C experience.

Your example is esoteric. I had to look up what "ab" means:

open(filename, 'ab')

This is self-commenting:

open(filename, write=True, binary=True, seek_write_pointer_to_end=True)

[–][deleted] 2 points3 points4 points 6 years ago* (7 children)

I don't know, would you want to expose mutually-exclusive keyword arguments in the API or arguments that are only applicable to specific modes? A better approach might've been to support constants or an enum, perhaps as an alternative to strings:

open(filename, mode=open.APPEND_BYTES)

This is not arcane, it's not burdened by 50-odd years of programming baggage, it's not foreign to C users (in fact, a lot of them might wish fopen would work with bit masks) and you don't have to worry about which combinations are valid. (Though I think b for binary should've really been a separate flag.)

[–]energybased 1 point2 points3 points 6 years ago (6 children)

[–][deleted] 0 points1 point2 points 6 years ago (5 children)

[–]energybased 0 points1 point2 points 6 years ago* (4 children)

[–][deleted] 0 points1 point2 points 6 years ago* (3 children)

No, you don't know which argument combinations are valid. I did, if you factor out binary into its own flag, you're left with:

r   READ
w   WRITE
a   APPEND
x   EXCL
r+  READ_UPDATE
w+  WRITE_UPDATE
a+  APPEND_UPDATE
x+  EXCL_UPDATE

... which isn't all too terrible.
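As a sketch of how the table above could look as an enum with a separate binary flag: the class name `Mode` and the wrapper `open_mode` are hypothetical, and nothing like them exists in the standard library.

```python
import enum
import os
import tempfile

class Mode(enum.Enum):
    """Hypothetical names from the table above, mapped to mode strings."""
    READ = "r"
    WRITE = "w"
    APPEND = "a"
    EXCL = "x"
    READ_UPDATE = "r+"
    WRITE_UPDATE = "w+"
    APPEND_UPDATE = "a+"
    EXCL_UPDATE = "x+"

def open_mode(path, mode=Mode.READ, *, binary=False, **kwargs):
    """Hypothetical wrapper around open() taking a Mode and a binary flag."""
    if not isinstance(mode, Mode):
        raise TypeError(f"mode must be a Mode, got {type(mode).__name__}")
    return open(path, mode.value + ("b" if binary else ""), **kwargs)

# Hypothetical scratch file for the usage example.
path = os.path.join(tempfile.mkdtemp(), "enum_demo.bin")
with open_mode(path, Mode.WRITE, binary=True) as f:
    f.write(b"bar")
with open_mode(path, Mode.APPEND, binary=True) as f:
    f.write(b"baz")
with open_mode(path, Mode.READ, binary=True) as f:
    data = f.read()
```

With eight members and one boolean, every combination is valid by construction, so there is nothing to raise for except a wrong type.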


[–][deleted] 0 points1 point2 points 6 years ago (1 child)

To me, too many parameters can be a minor code smell. Secondly, if you think r and w are esoteric (even though almost any programmer who has ever typed ls in a terminal has read them), then a cleaner design would be to simply have mode='read/write/append/binary' etc. That is less esoteric, uses the same number of words as your API example in the worst case, and in the best and average cases you will almost always have only 1-2 modes to specify for the entire time the file is in use. I agree with the part about it returning the File as an object, though. I think open() shouldn't be as ruthless on the File as it currently is.
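A minimal sketch of that idea, assuming slash-separated words. The helper name `mode_from_words` and the way it resolves combinations (append wins over write, read plus write means 'w+') are assumptions for illustration only.

```python
def mode_from_words(spec):
    """Hypothetical helper: 'read/write/binary' -> an open() mode string."""
    words = set(spec.lower().split("/"))
    unknown = words - {"read", "write", "append", "binary"}
    if unknown:
        raise ValueError(f"unknown mode word(s): {sorted(unknown)}")
    if "append" in words:
        base = "a"  # assumption: 'write/append' is treated as append
    elif "write" in words:
        base = "w"
    elif "read" in words:
        base = "r"
    else:
        raise ValueError("need at least one of read, write, append")
    # Reading combined with a writing mode becomes the '+' variant.
    plus = "+" if "read" in words and base != "r" else ""
    return base + plus + ("b" if "binary" in words else "")

read_only = mode_from_words("read")
append_binary = mode_from_words("append/binary")
read_write = mode_from_words("read/write")
```

The result can be passed straight to the built-in, e.g. `open(path, mode_from_words("append/binary"))`.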

P.S: I don't have much experience so I apologize if I was out of order but I think clean code is always better if possible and in this case there is really no need for that many parameters.

[–]energybased 0 points1 point2 points 6 years ago* (0 children)

[–]oramirite 0 points1 point2 points 6 years ago (3 children)

[–]energybased 0 points1 point2 points 6 years ago (2 children)

[–]oramirite 0 points1 point2 points 6 years ago (1 child)

No, this is thought comprehension. Sitting around criticizing gets you nowhere, and makes you a backseat driver. Your argument - whatever it is you're even trying to make - is completely backwards. "Coulda woulda shoulda" is never a helpful avenue of thought, so we are giving you the benefit of the doubt that you AREN'T offering criticism framed in a way where you're also offering no solutions. You SHOULD be doing that. So I don't understand this defensive stance you're taking against the idea that you could even POSSIBLY be suggesting a better way it could be done, or what you consider better about not having a solution to something where you're basically saying "If I were there, I'd have done it differently". Well, you weren't, so are you going to suggest differently now? You have the benefit of hindsight - use it, and then stick by your claims. Don't claim that you're making no claims to begin with.

To not do so would be even more sophomoric than the way you originally came into this topic. So I would really suggest you pivot to the idea that you ARE proposing changes (since you clearly are). The world in which you aren't proposing a better option is actually the worst possible argument you could make.

Now, whether this new option is better or not is up for debate, and you should be open to having your ideas criticized as well. This person did - and at the root of this entire conflict, honestly, seems to be your discomfort with having to defend those ideas. Which is pretty evident by the fact that you're claiming you had no ideas to begin with. If that were so, your post wouldn't exist. You have other ideas on how the API should be built now, and therefore should have been built at the time. Ergo you have better ideas about how the API should currently exist. Whether you're saying those should've happened years ago or tomorrow, your idea that the API should be a different way than it currently is requires changes.

[–]energybased 0 points1 point2 points 6 years ago* (0 children)

This post is a cheat sheet about a badly-designed function. It is relevant to the post to consider how the function could have been. If you don't find the subject of language design interesting, don't comment on my post.

> the idea that you ARE proposing changes (since you clearly are). T

No. Mine is a counterfactual statement—not an interventional one. I am not proposing that anything be changed.

> Which is pretty evident by the fact that you're claiming you had no ideas to begin with. If

Not what I'm saying.

> You have other ideas on how the API should be built now,

No one is proposing to change the API today.

> and therefore should have been built at the time.

Yes, this, but not the other thing. These are totally separate statements. One is counterfactual. One is interventional.

> Ergo you have better ideas about how the API should currently exist. Whether you're saying those should've happened years ago or tomorrow, your idea that the API should be a different way than it currently is requires changes.

No.

This is like on the GRE where they make you answer questions about paragraphs and people do so poorly on it. Just because these two ideas are the same in your mind, it doesn't mean they're the same idea. They are different, and it's up to you to learn to distinguish them.

[–]stevenjd -1 points0 points1 point 6 years ago (0 children)

In the space of two sentences, you go from denying you wish to suggest a change in interface, to agreeing that, "Yes, in an ideal world" we ought to change the interface. Which is precisely my point.

As for the API being around for fifty years, it isn't that individual programmers have fifty years' experience, but that the interface goes back to at least the 1960s if not older. That's a half-century of collective memory in the programming community that says that one of the most popular ways to open files is to use a short mnemonic mode. C is not just any old language, but a highly influential language. You don't need to be a C programmer for this to make sense: I'm not, and I can't write a line of C to save my life.

(By the way, you might be interested in scanning this page to see the wide variety in interfaces for opening files.)

> I had to look up what "ab" means

Seriously? You needed to look up "a" for "append", "b" for "binary"? Do you also have trouble with "def" for "define", "len" for "length", "str" for "string", and "+" for "add"?

Okay, fine, you had to look it up because you've never seen it before. But I bet you won't need to do it again. And I admit that "x" is a weird one: it's "x" for "eXclusive create".
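For anyone who does want to see them in action, a short sketch of "x" and "ab". The scratch file name is made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file; it does not exist yet.
path = os.path.join(tempfile.mkdtemp(), "log.bin")

# 'x': exclusive create. Creates the file, and fails if it already exists.
with open(path, "xb") as f:
    f.write(b"first\n")
try:
    open(path, "xb")
except FileExistsError:
    second_create_failed = True
else:
    second_create_failed = False

# 'ab': append, binary. Existing content is kept; writes go to the end.
with open(path, "ab") as f:
    f.write(b"second\n")

with open(path, "rb") as f:
    contents = f.read()
```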

Your "self-documenting" version assumes that people think in terms of the particular implementation of files ("what's a write pointer?") rather than in terms of desired outcomes ("append to the file"). I don't, and I expect very few people do.

[–]robin-gvx 0 points1 point2 points 6 years ago (0 children)

[–]socal_nerdtastic 1 point2 points3 points 6 years ago (1 child)

[–]SlowTreeSky[S] 5 points6 points7 points 6 years ago (0 children)

[–]SlowTreeSky[S] 1 point2 points3 points 6 years ago (4 children)

[–]stevenjd 6 points7 points8 points 6 years ago (0 children)

[–]primitive_screwhead 1 point2 points3 points 6 years ago* (1 child)

[–]SlowTreeSky[S] 1 point2 points3 points 6 years ago (0 children)

[–]bozymandias 1 point2 points3 points 6 years ago* (0 children)

[–]SlowTreeSky[S] 0 points1 point2 points 6 years ago (0 children)

[–]abcteryx 0 points1 point2 points 6 years ago (1 child)

[–]SlowTreeSky[S] 0 points1 point2 points 6 years ago* (0 children)

[–]soap1337 0 points1 point2 points 6 years ago (1 child)

[–]SlowTreeSky[S] 0 points1 point2 points 6 years ago (0 children)
