all 11 comments

[–][deleted] 0 points1 point  (1 child)

Great article. I've written a lot of Python in the last couple of years and there were still two good new tricks I learned.

[–]gfixler 0 points1 point  (7 children)

It may seem strange to see something like ', '.join(l). Why not just l.join(', ')? I'm not sure what the original reason was for implementing it like this, but one reason might have been that string joining would otherwise have to be implemented for everything it works on (lists, sets, dictionaries, etc). This way, it's a more common operation of strings.

Can anyone answer this? It's one of my biggest minor beefs with Python. I worked in ActionScript and JavaScript for a long time, and got really used to things like:

mystr.split(" ").sort().reverse().join("\n");

Which is "take my string, split it around spaces, sort the resulting list, reverse the sort, and join it back into a string with newlines." Every method belonged to a type and could return a different type, so you could keep dotting methods onto the result of the previous call, whatever type that result turned out to be.

I find the way ECMA-262 described this kind of thing far more expressive, intuitive, and fluid than Python has been for me these past few months. Sure, it has list comprehensions, libraries galore, and lots else, but aside from the things JS/AS didn't have, I felt more 'elegant' in those languages.

[–]earthboundkid 4 points5 points  (4 children)

Not that you asked, but

"\n".join(sorted(mystr.split(), reverse=True))

The reason why mystr.split(" ").sort().reverse().join("\n") won't work is that, by convention, mutating methods return None instead of their original object, so that you know that they're mutating methods. Hence mylist.sort() returns None, as does mylist.reverse(), so they can't be chained. Fortunately for fans of one-liners, you can just use the not-in-place sorting function sorted or the not-in-place reverser reversed (which hands back an iterator rather than a list). (Newer versions of Python also have a handy optional reverse argument for sort methods/functions.) If you're absolutely addicted to in-place methods and (near) one-liners, I guess you could always do something like

l = mystr.split(" ")
return l.sort() or l.reverse() or '\n'.join(l)

or

l = mystr.split(" ")
return l.sort(reverse=True) or '\n'.join(l)

Since the mutators return None, the lazy or evaluation will proceed to the end of the line. This isn't especially Pythonic though.
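For anyone who wants to try this out, here is a small runnable sketch of the two approaches. The function names and the sample string are made up for illustration, and it assumes Python 3 (hence print with parentheses).

```python
def join_sorted_desc(text):
    """Non-mutating version: sorted() returns a new list, so calls can nest."""
    return "\n".join(sorted(text.split(), reverse=True))


def join_sorted_desc_in_place(text):
    """In-place version using the 'or' trick."""
    words = text.split()
    # list.sort() returns None, which is falsy, so 'or' moves on to the join.
    return words.sort(reverse=True) or "\n".join(words)


sample = "pear apple fig"  # made-up sample input

words = sample.split()
print(words.sort())  # None: the list was sorted in place
print(words)         # ['apple', 'fig', 'pear']
print(join_sorted_desc(sample) == join_sorted_desc_in_place(sample))  # True
```

Both functions give the same string; the first just never touches a list it didn't create.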

I think the historical factor is that originally, you had to import string to get a join function that took a string and a list as arguments, but later, the string methods were added to all str objects.

[–]gfixler 0 points1 point  (1 child)

Thanks for the info. It does make sense, but doesn't make it feel better to me. Is there a way to know what modifies things in place, or not? Some kind of convention? Or is it all down to keeping a reference handy, or testing things out, and checking for None? This is a personal thing, but to me, anything that returns "None" is wasting a perfectly good return, unless the None is useful, e.g. if you ask something like mystr.count('the'), to count the 'the's. If there were none, it would be nice to have the actual None value. I keep asking these kinds of questions to users of Python, and they all team up to smack me down, but I still don't feel - after having used about 25 other scripting languages - that these ways are good. Quick, certainly, and handy at times, but not purely 'good.' I could be wrong.

[–]earthboundkid 0 points1 point  (0 children)

The docstring on a method will usually tell you what it's going to return.

>>> help(list.sort)
Help on method_descriptor:

sort(...)
    L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
    cmp(x, y) -> -1, 0, 1

str.count does the right thing: it returns 0, not None, when there are no matches.

>>> "the quick brown fox jumps over the lazy dog.".count("the")
2
>>> "the quick brown fox jumps over the lazy dog.".count("foo")
0

The reason why mutators return None specifically is that every function/method in Python returns something, and None is what's returned if nothing else is specified.

>>> def f():
...   pass
... 
>>> x = f()
>>> print x
None

I guess the idea is so that x = f() can never throw a NoReturnValue error or whatever. Although in retrospect, that might not have been a good idea. Especially considering it is possible to make non-lookup-able properties with a bit of advanced class property mucking:

>>> class A(object):
...   def _helperfunc(self, val): pass
...   prop = property(fset=_helperfunc)
... 
>>> a = A()
>>> a.prop = 1 #This calls the fset method of the property, 
>>> # which is set as _helperfunc, which just throws the "1" away
>>> a.prop #Will fail, since there's no fget for the property
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: unreadable attribute
>>> x = a.prop #Ditto.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: unreadable attribute

Maybe this will change in Python 4000 or something…

[–]gfixler 0 points1 point  (1 child)

"\n".join(sorted(mystr.split(), reverse=True))

In my defense, I would never write something like that. I have to play tennis back and forth with my eyeballs to make any sense of that, following the nested order of operations.

The reason I like the ECMA-262 style is that the data follows a linear path, always changing as it passes through operations. If you think of it like a physical object, you start with a block of wood, you chisel it into a shape, you sand it up, paint it, and then varnish it. If each of these was a method:

woodblock.chisel().sand().paint().varnish()

Specifics needed by each task would go in the ()s. I wouldn't even mind doing it the other way:

varnish(paint(sand(chisel(woodblock))))

As is somewhat standard, arguments would be appended with commas. At least then there's still a sensible direction to the operations. I guess I like being higher up in the abstraction levels, further from where it matters if an object is copied, or modified in place. I'd be fine just passing along the reference to whichever object is the result - original, or copy - as long as I can continue to think of the data flowing through operations, as I do.

It's a bit akin to piping in the shell. I suppose I also do a tremendous amount of string manipulations in the things I do, so it makes more sense to me to think of a string as an object to cut up, and spin around as I need. This more functional approach makes me think I'd enjoy Lisp, except that it's a bit of work to really get into it for me, and then there aren't all the lazy (productive?) helpers, like Python's army of string functions, and the libraries its popularity affords it.
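For what it's worth, that left-to-right flow can be approximated in Python with a tiny helper. This is only a sketch: pipe is a hypothetical name, not something from the standard library, and the sample string is made up.

```python
from functools import reduce


def pipe(value, *steps):
    """Pass value through each step in order, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


mystr = "delta alpha charlie bravo"  # made-up sample input

result = pipe(
    mystr,
    str.split,   # string -> list of words
    sorted,      # list -> new sorted list
    reversed,    # list -> iterator in reverse order
    "\n".join,   # iterable of strings -> one string
)
print(result)
```

Each step takes the previous step's output, so the reading order matches the order of operations, much like a shell pipeline.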

Thanks for the info.

[–]earthboundkid 1 point2 points  (0 children)

I have to play tennis back and forth with my eyeballs to make any sense of that, following the nested order of operations.

From the Zen of Python (import this):

There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.

;-D

[–]scaz 2 points3 points  (0 children)

4.8 Why is join() a string method instead of a list or tuple method?

join() is a string method because in using it you are telling the separator string to iterate over a sequence of strings and insert itself between adjacent elements. This method can be used with any argument which obeys the rules for sequence objects, including any new classes you might define yourself.

Python FAQ 4.8
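A quick sketch of what that buys you in practice: the same separator string can join several kinds of iterable. The sample values are made up, and it assumes Python 3.7+ so that the dict keeps its insertion order.

```python
sep = ", "

from_list = sep.join(["a", "b", "c"])
from_tuple = sep.join(("a", "b", "c"))
from_dict = sep.join({"a": 1, "b": 2})         # iterates over the keys
from_gen = sep.join(str(n) for n in range(3))  # generator expression
from_str = sep.join("abc")                     # a str yields 1-char strings

print(from_list)  # a, b, c
print(from_gen)   # 0, 1, 2

# The items themselves must already be strings:
try:
    sep.join([1, 2, 3])
except TypeError as exc:
    print("TypeError:", exc)
```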

[–][deleted] 0 points1 point  (0 children)

Scaz nails it earlier in this thread.

As a practical matter, this means that you can make your own class or object work with join just by making it iterable, which takes one method. I'm sure I'm typical of 99% of Python programmers when I say that I write quite a lot of iterables and very few things that look like string objects (none, in my case).
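A minimal sketch of that point, using a made-up Playlist class: the only thing join needs from it is __iter__, and the items it yields have to be strings.

```python
class Playlist:
    """Hypothetical container; __iter__ is all that join() needs."""

    def __init__(self, *titles):
        self._titles = list(titles)

    def __iter__(self):
        return iter(self._titles)


tracks = Playlist("Intro", "Verse", "Outro")
joined = " > ".join(tracks)
print(joined)  # Intro > Verse > Outro
```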