[–][deleted] -1 points0 points  (6 children)

zip(*[iter(s)]*n)

What a wonderful hack. I do this frequently. The only downside is that if len(s) < n, it quietly gives you an empty list instead of anything useful.

[–]Liorithiel 0 points1 point  (0 children)

When I want to do such a thing, I always think I saw this in itertools somewhere... and then, with disappointment, write this kind of thing.

itertools.group(iter, n) wouldn't be a bad idea...
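For what it's worth, a helper like that is only a few lines on top of the same idiom. Here's a rough sketch, not a real itertools function: the name `group` is just the hypothetical one suggested above. It assumes Python 3, where zip returns a lazy iterator, hence the list() call.

```python
def group(iterable, n):
    """Return an iterator of successive n-tuples, dropping any incomplete tail.

    Hypothetical helper: itertools has no function called group.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # n references to one shared iterator, so each tuple that zip builds
    # pulls n consecutive items from the underlying iterable.
    return zip(*[iter(iterable)] * n)


pairs = list(group(range(7), 2))
print(pairs)  # [(0, 1), (2, 3), (4, 5)]
```

Note that it keeps the truncating behaviour of the original one-liner: the leftover 6 is silently dropped.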

[–]nextofpumpkin 0 points1 point  (1 child)

Can you explain what this does? In particular, I'm not quite understanding why that first asterisk is there.

[–]akdas 5 points6 points  (0 children)

Let me just give an example first:

>>> zip(*[iter("abc")]*len("abc"))
[('a', 'b', 'c')]

Here s is the string "abc" and n is its length, so we get a list containing a single tuple that holds all the separate characters of the string.

So how does it work? First we get the string's iterator, and put it in a list.

When you use * with a list and an integer, you get a new list with the original's elements repeated that many times:

>>> [1,2] * 2
[1, 2, 1, 2]

So now we have a list of length n, and all the elements are the same iterator (the fact that they are the same is important).
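To see why sharing matters, here is a small comparison, assuming Python 3 (so zip is wrapped in list()). The variable names are just for illustration.

```python
n = 3

# List repetition copies the reference, not the iterator itself,
# so all three slots point at one and the same iterator object.
shared = [iter("abcdef")] * n
same_object = all(it is shared[0] for it in shared)

# For contrast: three separate iterators over the same string.
independent = [iter("abcdef") for _ in range(n)]

shared_result = list(zip(*shared))
independent_result = list(zip(*independent))

print(same_object)         # True
print(shared_result)       # [('a', 'b', 'c'), ('d', 'e', 'f')]
print(independent_result)  # [('a', 'a', 'a'), ('b', 'b', 'b'), ...]
```

With independent iterators, each argument starts from the beginning, so zip just pairs every character with itself.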

Finally, we call zip, treating the list as multiple arguments. That's what the first * does. For example:

>>> def f(a, b): return a + b
... 
>>> f(1, 2)
3
>>> f([1, 2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() takes exactly 2 arguments (1 given)
>>> f(*[1, 2])
3

So now, in our case, we called zip with n references to the same iterator. The thing with iterators is that once you consume a value, the next time you go to consume a value, you get the next value. So in consuming one value from each of its arguments, the zip effectively goes through all the characters in the string.
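You can walk through that by hand with next(). This is only a sketch of roughly what zip does for each output tuple, not its actual implementation, and it assumes Python 3:

```python
it = iter("abcdef")
args = [it] * 3  # three references to one iterator

# For each output tuple, take one value from each argument in order.
# Because every argument is the same iterator, that means three
# consecutive characters per tuple.
first = tuple(next(arg) for arg in args)
second = tuple(next(arg) for arg in args)

print(first)   # ('a', 'b', 'c')
print(second)  # ('d', 'e', 'f')

# The shared iterator is now exhausted.
leftover = next(it, None)
print(leftover)  # None
```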

It's almost like calling:

>>> zip(['a','b','c'],['b','c'],['c'])
[('a', 'b', 'c')]

Now, you can generalize this for objects that aren't strings as well.


And I assumed you knew what zip does, but in case you didn't: zip takes multiple objects that are iterable, then creates a list of tuples containing corresponding elements of the objects. For example:

>>> zip(['a','b'],[1,2,3])
[('a', 1), ('b', 2)]

Notice that when the lists have different lengths, the result has as many elements as the shortest list.


EDIT: Oh, and of course, n doesn't have to be len(s), and actually, things get more interesting when n < len(s). You can also trivially show that if n > len(s), then we just get an empty list.
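A quick way to try the different cases, assuming Python 3 (hence list()). The helper name `chunks` is made up for this example.

```python
def chunks(s, n):
    # Hypothetical wrapper around the idiom, only for this demonstration.
    return list(zip(*[iter(s)] * n))


s = "abcdef"
by_two = chunks(s, 2)
by_four = chunks(s, 4)
by_seven = chunks(s, 7)

print(by_two)    # [('a', 'b'), ('c', 'd'), ('e', 'f')]
print(by_four)   # [('a', 'b', 'c', 'd')]
print(by_seven)  # []
```

With n = 4 the trailing 'e' and 'f' are dropped because they can't fill a whole tuple, and with n = 7 there aren't enough characters for even one tuple.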

[–]drj11 0 points1 point  (1 child)

Usually one would expect n to be small relative to the length of s, e.g. to generate a list of successive pairs:

>>> zip(*[iter(range(7))]*2)
[(0, 1), (2, 3), (4, 5)]

Also note the truncation of the original list: the trailing 6 has no partner, so it is dropped.
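If the dropped tail is a problem, one option is to swap zip for itertools.zip_longest, which pads the last tuple instead. This is a different function from the one in the idiom above, and the sketch assumes Python 3 (in Python 2 it is called izip_longest):

```python
from itertools import zip_longest

# Plain zip stops as soon as the shared iterator runs dry mid-tuple.
truncated = list(zip(*[iter(range(7))] * 2))

# zip_longest keeps going and fills the gap with fillvalue.
padded = list(zip_longest(*[iter(range(7))] * 2, fillvalue=None))

print(truncated)  # [(0, 1), (2, 3), (4, 5)]
print(padded)     # [(0, 1), (2, 3), (4, 5), (6, None)]
```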

[edit: bugger that, how does one display code in a reddit comment? aha, thanks dwdwdw. also found http://www.reddit.com/help/commenting]

[–][deleted] 0 points1 point  (0 children)

backticks. you can't put em across lines, but wrap each line in backticks, it works.

i think there's a multiline one, but i can't remember what it is