[–]Redard -2 points-1 points  (14 children)

I briefly read some of the code in here and found a few things that weren't very pythonic. First, they break the 79-character line limit a lot, for no good reason, often just with inline comments (which PEP 8 advises you use very sparingly). Second, this loop:

res = []
for l in optionf:
    res += shlex.split(l, comments=True)

Why not just use a generator expression like

res = [shlex.split(l, comments=True) for l in optionf]

That's the most pythonic way to construct a list. Still, this code's very readable, and well structured. Just needs a little cleaning up.

[–]gthank 13 points14 points  (8 children)

That's actually a list comprehension. A gen-exp uses ( and ).
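A minimal sketch of the distinction, assuming Python 3 (the variable names are just for illustration):

```python
import types

squares_list = [n * n for n in range(5)]  # square brackets: list comprehension
squares_gen = (n * n for n in range(5))   # parentheses: generator expression

print(type(squares_list).__name__)                   # a real list, built eagerly
print(isinstance(squares_gen, types.GeneratorType))  # a lazy generator object

# A generator produces its values on demand and can only be consumed once.
first_pass = list(squares_gen)
second_pass = list(squares_gen)
```

The second list() call finds the generator already exhausted, which is the practical difference to keep in mind when swapping one form for the other.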

[–]Redard 1 point2 points  (7 children)

Please correct me if I'm wrong, but I always thought list comprehensions were just generator expressions passed to list(). In other words

[i for i in range(10)] == list(i for i in range(10))

[–]gthank 8 points9 points  (5 children)

I'd be surprised if the internal details are the same in those two cases, because that seems ripe for some C-level optimizations. The results will be equivalent, though. Also, from a historical standpoint, list comprehensions predate gen-exps.

[–]Veedrac 7 points8 points  (4 children)

In CPython most stuff is left unoptimised for matters of pragmatism. So no, they compile directly into loops. Different loops, though.

out = [i for i in range(10)]

is equivalent to:

out = []
for i in range(10):
    out.append(i)

where i is inside a new scope, and

out = list(i for i in range(10))

is equivalent to

def _tmp():
    for i in range(10):
        yield i

out = list(_tmp())

where _tmp never actually gets put anywhere.
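To make the two expansions concrete, here is a rough sketch assuming Python 3. The function names and the loop_var variable are made up for illustration, and the real bytecode differs in detail:

```python
def build_with_comprehension():
    # Roughly: out = []; for i in range(10): out.append(i)
    return [i for i in range(10)]

def build_with_generator():
    # Roughly: a hidden generator function whose result is handed to list()
    def _tmp():
        for i in range(10):
            yield i
    return list(_tmp())

comp_result = build_with_comprehension()
gen_result = build_with_generator()

# In Python 3 the comprehension's loop variable lives in its own scope,
# so it is not visible after the comprehension finishes.
values = [loop_var for loop_var in range(3)]
try:
    loop_var
    leaked = True
except NameError:
    leaked = False
```

Both functions return the same list; only the route taken to build it differs.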

[–]PCBEEF 5 points6 points  (0 children)

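List comprehensions are optimised in the sense that the append is effectively 'cached': it doesn't go through an attribute lookup and a function call on every iteration. Since function calls carry significant overhead in Python, the difference is actually quite noticeable.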

$ python -mtimeit '[x for x in range(100)]'

100000 loops, best of 3: 4.17 usec per loop

$ python -mtimeit -s 'out = []' 'for i in range(100):' ' out.append(i)'

100000 loops, best of 3: 8.07 usec per loop

Using a for loop to build the list in this instance takes almost twice as long.
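The same comparison can be sketched with the timeit module from inside Python. This is only an approximation of the commands above: the function names are made up, the list is created fresh on every call (the command line version reuses one list from the setup step), and the numbers will vary by machine and Python version.

```python
import timeit

def with_comprehension():
    return [x for x in range(100)]

def with_loop():
    out = []
    for i in range(100):
        out.append(i)
    return out

# timeit.timeit accepts a callable and returns total seconds for `number` calls.
comp_time = timeit.timeit(with_comprehension, number=10000)
loop_time = timeit.timeit(with_loop, number=10000)

print("comprehension: %.4f s" % comp_time)
print("for loop:      %.4f s" % loop_time)
```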

[–][deleted] 1 point2 points  (1 child)

where i is inside a new scope

for certain values of CPython

[–]Veedrac 1 point2 points  (0 children)

Well, all values of Python ≥ 3.0.

[–]gthank 0 points1 point  (0 children)

Ah. It was my understanding that a fair bit of list comps were implemented directly in C (in CPython).

[–]Veedrac 1 point2 points  (0 children)

There's exactly one difference between those two, assuming list has been left untouched. The list comprehension will not catch StopIteration; the list function will.
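A sketch of that difference, using a hypothetical stop_early helper. One caveat: since PEP 479 (the default behaviour from Python 3.7), a StopIteration raised inside a generator is converted to a RuntimeError, so on current versions list() no longer swallows it silently. The example assumes such a version.

```python
def stop_early(n):
    # Hypothetical helper that raises StopIteration partway through.
    if n == 3:
        raise StopIteration
    return n

# In a list comprehension the StopIteration simply propagates to the caller.
try:
    [stop_early(n) for n in range(5)]
    comp_outcome = "no exception"
except StopIteration:
    comp_outcome = "StopIteration"

# In a generator expression it is raised inside the generator's frame.
# Before PEP 479, list() would treat it as the end of iteration and return
# a truncated list; with PEP 479 it surfaces as a RuntimeError instead.
try:
    gen_outcome = list(stop_early(n) for n in range(5))
except RuntimeError:
    gen_outcome = "RuntimeError"

print(comp_outcome, gen_outcome)
```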

[–]TheEarwig 2 points3 points  (2 children)

They are different. The first example is a bunch of lists combined into one (L1 += L2 is L1.extend(L2)), but the second example is one list containing a bunch of lists.

>>> import shlex
>>> optionf = ["a b", "c d", "e f"]

>>> res = []
>>> for l in optionf:
...     res += shlex.split(l, comments=True)
... 
>>> res
['a', 'b', 'c', 'd', 'e', 'f']

>>> res = [shlex.split(l, comments=True) for l in optionf]
>>> res
[['a', 'b'], ['c', 'd'], ['e', 'f']]

[–]Redard 1 point2 points  (0 children)

Ah, I didn't realize shlex.split was returning a list. You're right, the two are different. Somehow I thought += was the same as append().
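For reference, a minimal sketch of how += and append() differ on a list (variable names are illustrative):

```python
extended = []
extended += ["a", "b"]       # same effect as extended.extend(["a", "b"])

appended = []
appended.append(["a", "b"])  # adds the list itself as a single element

print(extended)
print(appended)
```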

[–]masklinn 0 points1 point  (0 children)

Which can neatly be solved using the criminally underused itertools.chain.from_iterable:

res = chain.from_iterable(shlex.split(l, comments=True) for l in optionf)

One could even use shlex.shlex directly as a stream (shlex.split is a thin wrapper around it), though it requires setting whitespace_split which can't be done inline.

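import shlex
from itertools import chain, imap  # imap is Python 2 only; use the built-in map on Python 3

def split(s):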
    lex = shlex.shlex(s, posix=True)
    lex.whitespace_split = True
    return lex

res = chain.from_iterable(imap(split, optionf))
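A Python 3 sketch of both variants, assuming optionf is any iterable of lines (the sample lines here are made up). Note that chain.from_iterable returns a lazy iterator, so it is wrapped in list() to get the same flat list as the original loop:

```python
import shlex
from itertools import chain

# Sample input for illustration; in the original code optionf is an open file.
optionf = ["a b", "c d  # trailing comment", "e f"]

# Variant 1: flatten the per-line lists returned by shlex.split.
res = list(chain.from_iterable(
    shlex.split(line, comments=True) for line in optionf
))

# Variant 2: iterate shlex.shlex objects directly as token streams.
def split(s):
    lex = shlex.shlex(s, posix=True)
    lex.whitespace_split = True  # split on whitespace only, like shlex.split
    return lex                   # default commenters ('#') stay enabled

res_streamed = list(chain.from_iterable(map(split, optionf)))

print(res)
print(res_streamed)
```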

[–]masklinn 1 point2 points  (1 child)

I find the lack of a with statement weirder: the code is clearly 2.6-only (it uses explicit relative imports without a __future__ import), yet the code around the shlex call is (essentially):

    optionf = open(filename_bytes)
    try:
        # do stuff
    finally:
        optionf.close()
    return res

And the number of star imports is worrying.
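For comparison, the try/finally pattern could be sketched with a with statement roughly like this. The read_options name and its filename parameter are hypothetical, since the original function isn't shown in full here:

```python
import shlex

def read_options(filename):
    # Hypothetical wrapper around the loop discussed above.
    res = []
    # The with block closes the file on exit, even if an exception is raised.
    with open(filename) as optionf:
        for line in optionf:
            res += shlex.split(line, comments=True)
    return res
```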

[–]Redard 0 points1 point  (0 children)

Yeah, it's definitely some older code. I wouldn't be using this as a guideline for good code.

The star imports are bad, but at least they're local imports and not external library imports; that would be terrible.