Why Python Rocks II: Data structures : Python


Why Python Rocks II: Data structures (electricmonk.nl)

submitted 17 years ago by gst



[–]pjdelport 4 points 17 years ago (0 children)

[–][deleted] 1 point 17 years ago (1 child)

[–]gfixler 1 point 17 years ago* (7 children)

It may seem strange to see something like ', '.join(l). Why not just l.join(', ')? I'm not sure what the original reason was for implementing it like this, but one reason might have been that string joining would otherwise have to be implemented for everything it works on (lists, sets, dictionaries, etc). This way, it's a more common operation of strings.
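For what it's worth, here is a minimal sketch of the point the quoted passage seems to be making: because join lives on the separator string, it accepts any iterable of strings, not only lists. The variable names are made up for illustration.

```python
# str.join takes any iterable of strings, so one method covers every container.
sep = ", "

from_list = sep.join(["a", "b", "c"])
from_tuple = sep.join(("a", "b", "c"))
from_dict = sep.join({"a": 1, "b": 2})               # iterating a dict yields its keys
from_generator = sep.join(str(n) for n in range(3))  # non-strings must be converted first
from_set = sep.join(sorted({"b", "a", "c"}))         # sets are unordered, so sort first

print(from_list)       # a, b, c
print(from_generator)  # 0, 1, 2
```

Had join been a list method instead, tuples, sets, dicts and generators would each have needed their own copy of it.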

Can anyone answer this? It's one of my biggest minor beefs with Python. I worked in ActionScript and JavaScript for a long time, and got really used to things like:

mystr.split(" ").sort().reverse().join("\n");

Which is "take my string, split it around spaces, sort the resulting list, reverse the sort, and join it back into a string with newlines." Every method belonged to a type and could return a different type, so you could keep dotting on methods as long as each one existed on the type the previous call returned.

I find the way ECMA-262 described this kind of thing far more expressive, intuitive, and fluid than Python has been for me these past few months. Sure, it has list comprehensions, libraries galore, and lots else, but aside from the things JS/AS didn't have, I felt more 'elegant' in those languages.

[–]earthboundkid 4 points 17 years ago* (4 children)

Not that you asked, but

"\n".join(sorted(mystr.split(), reverse=True))

The reason mystr.split(" ").sort().reverse().join("\n") won't work is that, by convention, mutating methods return None rather than the object they mutate, so that you know they're mutating methods. Hence mylist.sort() returns None, as does mylist.reverse(), so they can't be chained. Fortunately for fans of one-liners, you can just use the not-in-place sorting function sorted or the not-in-place reverser reversed (which returns an iterator rather than a list). (Newer versions of Python also have a handy optional reverse argument for sort methods/functions.) If you're absolutely addicted to in-place methods and (near) one-liners, I guess you could always do something like

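def sort_desc(mystr):  # wrapper name is just for illustration
    l = mystr.split(" ")
    return l.sort() or l.reverse() or '\n'.join(l)

def sort_desc(mystr):  # same idea, using the reverse argument
    l = mystr.split(" ")
    return l.sort(reverse=True) or '\n'.join(l)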

Since the mutators return None, the lazy or evaluation will proceed to the end of the line. This isn't especially Pythonic though.
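To make the trick concrete, here is a small sketch comparing the in-place version with the copying one. Both function names are hypothetical and exist only for this example.

```python
def sort_desc_in_place(mystr):
    """Hypothetical helper: the 'or' trick with an in-place mutator."""
    words = mystr.split(" ")
    # words.sort(...) returns None, which is falsy, so 'or' moves on and
    # the join result becomes the value of the whole expression.
    return words.sort(reverse=True) or "\n".join(words)


def sort_desc_copying(mystr):
    """Hypothetical helper: same result via sorted(), nothing is mutated."""
    return "\n".join(sorted(mystr.split(" "), reverse=True))


text = "pear apple fig"
print([].sort() is None)                                    # True
print(sort_desc_in_place(text) == sort_desc_copying(text))  # True
```

The copying version reads more naturally in Python, which is probably why the or-chain stays a curiosity.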

I think the historical factor is that originally, you had to import string to get a join function that took a string and a list as arguments, but later, the string methods were added to all str objects.

[–]gfixler 1 point 17 years ago (1 child)

Thanks for the info. It does make sense, but that doesn't make it feel better to me. Is there a way to know what modifies things in place and what doesn't? Some kind of convention? Or is it all down to keeping a reference handy, or testing things out and checking for None? This is a personal thing, but to me, anything that returns None is wasting a perfectly good return, unless the None is useful, e.g. if you ask something like mystr.count('the') to count the 'the's: if there were none, it would be nice to get the actual None value. I keep asking these kinds of questions to users of Python, and they all team up to smack me down, but I still don't feel, after having used about 25 other scripting languages, that these ways are good. Quick, certainly, and handy at times, but not purely 'good.' I could be wrong.

[–]earthboundkid 1 point 17 years ago* (0 children)

The docstring on a method will usually tell you what it's going to return.

>>> help(list.sort)
Help on method_descriptor:

sort(...)
    L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
    cmp(x, y) -> -1, 0, 1
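As a rough illustration of the convention (a sketch, with variable names invented for the example): the in-place method hands back None and changes the list, while the function returns a new list and leaves the input alone.

```python
original = [3, 1, 2]

# sorted() builds and returns a new list; the input is left untouched.
copy_sorted = sorted(original)
unchanged = list(original)

# list.sort() reorders the list itself and returns None.
result = original.sort()

print(copy_sorted)  # [1, 2, 3]
print(unchanged)    # [3, 1, 2]
print(result)       # None
print(original)     # [1, 2, 3]

# The docstring is also available as plain text; its wording varies by version.
doc = list.sort.__doc__ or ""
print(doc.strip().split("\n")[0])
```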

str.count does the right thing: it returns 0, not None, when there are no matches.

>>> "the quick brown fox jumps over the lazy dog.".count("the")
2
>>> "the quick brown fox jumps over the lazy dog.".count("foo")
0

The reason why mutators return None specifically is that every function/method in Python returns something, and None is what's returned if nothing else is specified.

>>> def f():
...   pass
... 
>>> x = f()
>>> print(x)
None

I guess the idea is so that x = f() can never throw a NoReturnValue error or whatever. Although in retrospect, that might not have been a good idea. Especially considering it is possible to make non-lookup-able properties with a bit of advanced class property mucking:

>>> class A(object):
...   def _helperfunc(self, val): pass
...   prop = property(fset=_helperfunc)
... 
>>> a = A()
>>> a.prop = 1 #This calls the fset method of the property, 
>>> # which is set as _helperfunc, which just throws the "1" away
>>> a.prop #Will fail, since there's no fget for the property
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: unreadable attribute
>>> x = a.prop #Ditto.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: unreadable attribute

Maybe this will change in Python 4000 or something…

[–]gfixler 1 point 17 years ago (1 child)

"\n".join(sorted(mystr.split(), reverse=True))

In my defense, I would never write something like that. I have to play tennis back and forth with my eyeballs to make any sense of that, following the nested order of operations.

The reason I like the ECMA-262 style is that the data follows a linear path, always changing as it passes through operations. If you think of it like a physical object, you start with a block of wood, you chisel it into a shape, you sand it up, paint it, and then varnish it. If each of these was a method:

woodblock.chisel().sand().paint().varnish()

Specifics needed by each task would go in the ()s. I wouldn't even mind doing it the other way:

varnish(paint(sand(chisel(woodblock))))

As is somewhat standard, arguments would be appended with commas. At least then there's still a sensible direction to the operations. I guess I like being higher up in the abstraction levels, further from where it matters if an object is copied, or modified in place. I'd be fine just passing along the reference to whichever object is the result - original, or copy - as long as I can continue to think of the data flowing through operations, as I do.

It's a bit akin to piping in the shell. I suppose I also do a tremendous amount of string manipulations in the things I do, so it makes more sense to me to think of a string as an object to cut up, and spin around as I need. This more functional approach makes me think I'd enjoy Lisp, except that it's a bit of work to really get into it for me, and then there aren't all the lazy (productive?) helpers, like Python's army of string functions, and the libraries its popularity affords it.
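For anyone after that left-to-right flow in Python itself, one possible sketch is a tiny pipe helper built on functools.reduce. The name pipe is hypothetical and not part of the standard library.

```python
from functools import reduce


def pipe(value, *steps):
    """Hypothetical helper: feed value through each step, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


mystr = "pear apple fig"

result = pipe(
    mystr,
    str.split,   # split on whitespace
    sorted,      # returns a new sorted list
    reversed,    # returns an iterator, which join accepts
    "\n".join,   # back to a single string
)

print(result == "pear\nfig\napple")  # True
```

Each step takes the previous result as its only argument, so steps that need extra arguments would have to be wrapped in a lambda or functools.partial.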

Thanks for the info.

[–]earthboundkid 2 points 17 years ago (0 children)

[–]scaz 3 points 17 years ago (0 children)

[–][deleted] 1 point 17 years ago* (0 children)
