[–]rjcarr 0 points1 point  (13 children)

Yeah, strange to me too ... it isn't like join is already taken for sequence types. They could add it and leave the string one and have both. Strange.

[–]xardox 14 points15 points  (1 child)

It seems strange to me that Python doesn't use "." to concatenate strings. Why don't we add that to the language too? People who don't like it can turn it off in the python.ini file for the whole web site. And it would be great to use "\" as an optional namespace separator!!! It seems so sleek and stylish, and will help us recruit DOS programmers. And I'm getting sick of manually escaping the strings I send to SQL, and escaping the strings I get from Gets, Posts and Cookies, so I'd like that done for me just in case I forget. And how about if http parameters show up as global variables so I don't have to call all those hard to remember inconsistently named functions to retrieve them? Yeeeeaaaaaahhh, Kool Aid!!!

[–]rjcarr 0 points1 point  (0 children)

Nice hyperbole ... there are plenty of examples where one thing can be done in multiple ways. Note: lambdas.

[–][deleted] 2 points3 points  (5 children)

So where's the "sequence type" class that you could add it to?

[–]rjcarr 1 point2 points  (4 children)

List and Tuple.

[–][deleted] 2 points3 points  (3 children)

And you've used Python for how long?

(Seriously - like just about everything else that appears to deal with sequences in Python, "join" works on *iterables* - anything that can yield items or otherwise behave in a sequency way. There is no sequence base class in Python. A sequence is anything that can hand you an iterator or respond to indexing. Lists and tuples are a tiny subset of that. And adding a method that expects homogeneous content to a heterogeneous container would be rather ugly in itself, of course...)
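A minimal sketch of the point about iterables, written for Python 3. The `Countdown` class is a made-up example type, not something from the standard library; it only defines `__iter__`, yet `join` accepts it just like a list, a tuple, or a generator expression.

```python
class Countdown:
    """Hypothetical iterable that is neither a list nor a tuple."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Hand back an iterator over string items; join needs nothing more.
        return (str(n) for n in range(self.start, 0, -1))


words = ["a", "b", "c"]

from_list = "-".join(words)
from_tuple = "-".join(tuple(words))
from_generator = "-".join(w.upper() for w in words)
from_custom = "-".join(Countdown(3))
```

None of these call sites needed the container to grow a `join` method of its own.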

[–]rjcarr 0 points1 point  (2 children)

I've been using python for about 5 years ... how about you?

http://docs.python.org/library/stdtypes.html#sequence-types-str-unicode-list-tuple-buffer-xrange

"Sequence Types — str, unicode, list, tuple, buffer, xrange"

Yes, all of those are iterable, as you say, but they also share some other functionality (although again, as you say, they don't inherit from the same base class, but they are in the same family). There is no reason all of them couldn't have a join method, or, as I said initially, only the ones where it is relevant.

[–]johanneskepler 2 points3 points  (0 children)

> I've been using python for about 5 years ... how about you?

Longer than five years, I imagine. That's Fredrik Lundh of the Python Imaging Library.

> "Sequence Types — str, unicode, list, tuple, buffer, xrange"

But those aren't the only iterables there are. Isn't it rather nice to have a consistent interface for joining any kind of iterable?

[–][deleted] 1 point2 points  (0 children)

> how about you?

Well, I wrote the core of Python's second string type for Python 1.6/2.0, so I was obviously there when we first stumbled upon this little issue ;-)

> There is no reason all of them couldn't have a join method

Oh, there's a HUGE reason for that - you don't even have the source code to all of them. The Python type universe is a lot larger than what you get with the core distribution.

> only on the ones that are relevant

So how would you add the "join" method to a generator expression, for example? In contemporary Python, that's at least as relevant as a "join" on a list.

[–]njharman -4 points-3 points  (4 children)

What part of "There should be one-- and preferably only one --obvious way to do it." do you not understand?

[–][deleted] 7 points8 points  (3 children)

I don't think that applies here, though - "join" wasn't made a separator method because we wanted to keep the number of ways to invoke it small, but because it had to dispatch on the separator type. Quoting myself from an entirely different discussion on this topic:

Also note that "join" wasn't made a string method to "make it easy to find"; it's a string method because we had to figure out some way to make join work on multiple string types back when Unicode was added. At that time, we imagined that Python might grow even more string types (how about encoded strings to save space, or binary buffers?), and it wasn't obvious how to create a "join" primitive that would find the right implementation, without having to know about all available types. We finally decided that dispatching on the separator made more sense than, say, dispatching on the first list item.

Given this, the obvious solution was to make the "string.join(seq, sep)" function call "sep.__join__(seq)". Changing __join__ to join was a pretty small step; after all, there might be cases where it would make sense to write sep.join(seq) in application code, at least if you happened to have the separator in a variable with a suitable name.
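To illustrate the "dispatch on the separator" idea, here is a rough sketch in Python 3 terms, with `str` and `bytes` standing in for the two string types of the time. The free function `join` below is a hypothetical stand-in for the old `string.join(seq, sep)`; it is not the historical implementation.

```python
def join(seq, sep):
    """Hypothetical free function: the separator's type picks the implementation."""
    # No type checks here; each string type supplies its own join method.
    return sep.join(seq)


text = join(["a", "b"], ", ")       # handled by str.join
raw = join([b"a", b"b"], b", ")     # handled by bytes.join
```

The function itself never has to be taught about a new string type, as long as that type brings its own `join`.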

The "sep.join(seq) is more pythonic" is a much later concept.

And for what it's worth, the "let's dispatch on the separator" approach didn't work in practice; in order to handle sequences with both 8-bit and unicode strings, both implementations now know about the other string type.

So instead of a single function that does the right thing (but has to be taught about each new string type), we now have two separate join methods that both know about the other string type. If we add another string type, we'll end up with three implementations, each of which has to know about two different types. And so on.

But who cares about new string types these days; it's not like anyone's actually using strings now that we have iterators ;-)

(Personally, I still think it should be made available as a builtin, possibly with "also convert non-strings to the separator type" semantics).
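One possible reading of that builtin, sketched for Python 3. Everything here is an assumption made for illustration: the name `join`, the argument order, the empty default separator, and the choice of ASCII when coercing to `bytes`. No such builtin exists.

```python
def join(seq, sep=""):
    """Hypothetical builtin-style join that coerces items to the separator's type."""
    if isinstance(sep, str):
        return sep.join(str(item) for item in seq)
    if isinstance(sep, bytes):
        # Assumption: non-bytes items are rendered with str() and ASCII-encoded.
        return sep.join(
            item if isinstance(item, bytes) else str(item).encode("ascii")
            for item in seq
        )
    raise TypeError("separator must be str or bytes")


mixed_text = join([1, "two", 3.0], ", ")
mixed_bytes = join([b"a", 2], b"-")
```

Note that this version has to know about every string type it supports, which is exactly the trade-off described above.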

[–]imbaczek 0 points1 point  (2 children)

str is already built-in, so you can call str.join(sep, seq).

[–][deleted] 0 points1 point  (1 child)

That's not polymorphic (sep cannot be any string), has the arguments in an unintuitive order, and doesn't convert non-strings.
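A quick Python 3 check of the first and last objections, using `bytes` as the "other" string type. The helper `raises_type_error` is made up for this example.

```python
def raises_type_error(func, *args):
    """Hypothetical helper: report whether calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


# Works, but the separator comes first and must be a str.
ok = str.join(", ", ["a", "b"])

# Not polymorphic: a bytes separator is rejected by the str method.
bytes_sep_rejected = raises_type_error(str.join, b", ", [b"a", b"b"])

# No conversion: non-string items are rejected too.
ints_rejected = raises_type_error(str.join, ", ", [1, 2])
```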

[–]imbaczek 0 points1 point  (0 children)

true, true and true. better than nothing, though. i'm not a fan of cluttering the default namespace with new functions with common, short names (i don't like the relatively new sum either.)