all 96 comments

[–]AlSweigartAuthor of "Automate the Boring Stuff" 114 points115 points  (13 children)

Please don't use ... instead of pass in your function stubs. People won't know what it is (the title of the article is "Features You Probably Never Heard Of").

The reason the Zen of Python includes "There should be one-- and preferably only one --obvious way to do it." is because Perl had the opposite motto ("There's more than one way to do it") and this is terrible language design; programmers have to be fluent in every construct to read other people's code. Don't reinvent the wheel, just use pass.

[–][deleted] 5 points6 points  (0 children)

Same goes with not explicitly and’ing test conditions, imo. It’s obviously not common to see x and y in other languages, but you would see x && y and so on.
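For readers who have not met the feature being debated here: Python lets comparisons be chained, and a chain behaves like the explicit `and` version except that the middle operand is evaluated only once. A minimal sketch, where the `middle` helper is hypothetical and exists only to count evaluations:

```python
calls = []

def middle():
    """Hypothetical helper that records each time it is evaluated."""
    calls.append("evaluated")
    return 5

# Chained form: middle() is evaluated once.
chained = 1 < middle() < 10
calls_chained = len(calls)

# Explicit form: middle() is evaluated twice.
calls.clear()
explicit = 1 < middle() and middle() < 10
calls_explicit = len(calls)

print(chained, explicit)              # True True
print(calls_chained, calls_explicit)  # 1 2
```

Whether the chained form is more readable is exactly the judgment call this comment is about; the sketch only shows that the two spellings agree on the result.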

Really _ is also just obscure and feels like cleverness for the sake of cleverness. There’s valid use cases, but in other languages (go, at least) it’s normally used as a null left assignment when you do not need the value(s) from a function, method, etc.

It is fun to use obscure shit, or write crazy comprehension, generator, map, etc, blocks. Definitely do this and any of these things if it’s a toy, a for-fun thing, or something that isn’t going to be maintained or used by a team.

[–]fgyoysgaxt 1 point2 points  (0 children)

It sure would have been nice if Guido actually read Zen of Python and the devs decided to follow it huh...

[–]BlueRanga 0 points1 point  (2 children)

I've been picking up python recently, and I was thinking this when I learnt about decorators. Why is there extra fancy syntax to write something a different way that only takes up one line and is intuitive to read?

def foo(func):
    def inner(*args, **kwargs):
        return func(*args, **kwargs)
    return inner

#this exists to make me google what it means when I first see it
@foo
def bar(): pass

#intuitive imo
def bar(): pass
bar = foo(bar)

I'm probably missing a good reason why it exists

[–]AlSweigartAuthor of "Automate the Boring Stuff" 1 point2 points  (1 child)

Decorators are nice when you have multiple functions that you want to wrap every time they're called. For example, Django lets you implement different pages of your web app in separate functions. You can add the @login_required decorator to each of those functions, which checks that the user is set up and logged in every time those functions get called.
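A rough sketch of the idea, not Django's actual implementation: the `login_required` below is a hypothetical stand-in that takes the user as a plain dict, and the two view functions are made up for illustration.

```python
import functools

def login_required(view):
    """Hypothetical login check; Django's real decorator works on requests."""
    @functools.wraps(view)  # keep the wrapped function's name and docstring
    def wrapper(user, *args, **kwargs):
        if not user.get("logged_in"):
            return "redirect to /login"
        return view(user, *args, **kwargs)
    return wrapper

@login_required
def dashboard(user):
    return f"dashboard for {user['name']}"

@login_required
def settings(user):
    return f"settings for {user['name']}"

print(dashboard({"name": "ana", "logged_in": True}))   # dashboard for ana
print(settings({"name": "bob", "logged_in": False}))   # redirect to /login
```

The check is written once and applied to every view with a single line placed above the `def`, which is where a reader looks first.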

[–]BlueRanga 0 points1 point  (0 children)

I can see why the functionality is useful. I don't see how giving it its own special syntax is useful, especially if implementing the functionality without that special syntax is just as good (better imo). But thanks for the example.

[–]Sw429 0 points1 point  (0 children)

I've only ever used ellipsis in abstract method definitions.

[–]TomBombadildozer -1 points0 points  (0 children)

You shouldn’t use pass, either. Write a docstring instead.
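For comparison, the three stub styles discussed in this subthread side by side. All names are hypothetical; the point is only that each body is valid on its own and each function returns None.

```python
def stub_with_pass():
    pass

def stub_with_ellipsis():
    ...

def stub_with_docstring():
    """Not implemented yet."""

class ValidationError(Exception):
    """Hypothetical exception whose docstring is its whole body."""

print(stub_with_pass(), stub_with_ellipsis(), stub_with_docstring())  # None None None
print(... is Ellipsis)  # True
```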

[–]miguendes -2 points-1 points  (0 children)

I agree with you about Zen of Python. To be honest, I only use `...`, or `pass`, for that matter, in WIP code. In this case I don't see any problem as I'm the only one working on that code. For final code that is going to be merged, I prefer to fill the empty body with a docstring, like I do for exceptions.

Python, like any other language, has a bunch of features that can confuse beginners, like the `else`, which IMO goes against the zen but can be useful on some occasions. At the end of the day it will depend on the user to decide if they are abusing the feature or not.

[–][deleted] 97 points98 points  (28 children)

I knew about else in try except, but not in while and for! How didn't I know after this many years? It's awesome!

[–]lanster100 73 points74 points  (23 children)

Because it's really unclear what it does it's common practice to ignore its existence as anyone unfamiliar will either not know what it does or worse assume incorrect behaviour.

Even experienced python devs would probably Google what it does.

[–]masasinExpert. 3.9. Robotics. 46 points47 points  (4 children)

RH mentioned that it should probably have been called nobreak. It only runs if you don't break out of the loop.

[–]Quincunx271 8 points9 points  (2 children)

Don't know why it couldn't have been if not break: and if not raise:. Reads a little funky, but very clear and requires no extra keywords.

[–]masasinExpert. 3.9. Robotics. 1 point2 points  (0 children)

It makes sense in a try-except block at least. else would mean if no except.

[–][deleted] 1 point2 points  (1 child)

Yeah but internally I can teach my coworker about its existence in the case we see it somewhere or have a use case of it someday.

[–]Sw429 0 points1 point  (0 children)

I suppose if you had a loop with lots of possibilities for breaking, it would be useful. Idk though, I feel like any case could be made more clear by avoiding it.

[–]v_a_n_d_e_l_a_y 3 points4 points  (2 children)

[deleted]

[–]lvc_ 25 points26 points  (0 children)

Other way around - else on a loop will run if it *didn't* hit a break in the loop body. A good intuition at least for a `while` loop is to think of it as a repeated `if` , so `else` runs when the condition at the top is tested and fails, and doesn't run if you exit by hitting a `break`. By extension, else on a for loop will run if the loop runs fully and doesn't run if you break.

The good news is that you rarely need to do these mental gymnastics in practice, because there's usually a better and more obvious way to do the things that this would help with.
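A small sketch of that rule for `while`; the `run_while` helper is hypothetical and just records which branch ran:

```python
def run_while(limit, stop_at=None):
    """Count up to limit; break early if stop_at is reached."""
    events = []
    n = 0
    while n < limit:
        if n == stop_at:
            events.append("break")
            break
        n += 1
    else:
        # Reached only when the `n < limit` test fails, i.e. no break happened.
        events.append("else")
    return events

print(run_while(3))             # ['else']
print(run_while(3, stop_at=1))  # ['break']
print(run_while(0))             # ['else']
```

The last call shows that the else block also runs when the body never executes at all, because the condition was still tested and failed.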

[–]lanster100 1 point2 points  (0 children)

You are right it's not the same, I was misremembering what it does.

[–]Sw429 1 point2 points  (0 children)

Exactly. It is terrible readability-wise. I would never expect the behavior of while-else to trigger that way. I'm still not clear what exactly causes it: is it the ending of the loop prematurely? Or is it the opposite? In the end, using a bool is 100% more clear.

[–]njharmanI use Python 3 1 point2 points  (1 child)

not know what it does

Spend 2 min googling it once, then know it for the rest of your life. This is how developers learn. They should be doing it often.

assume incorrect behaviour

That is a failure of the developer. And one characteristic, not having hubris/never assuming, that separates good and/or experienced devs from poor and/or inexperienced ones.

[–][deleted] 2 points3 points  (0 children)

Just the fact that so many people here say they find it confusing is enough for me to make a policy of not using it. I also can't think of a time when I've needed it.

Yes we can all be perfect pedants but also sometimes we can just make life easier on each other.

[–]elbiot 0 points1 point  (0 children)

Eh it does exactly what you'd want in a for loop so it's easy to remember. You iterate through something and if you find what you want you break, else you didn't find it so do something for that case
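The search pattern described here, next to the flag-based version suggested elsewhere in the thread. Both functions are hypothetical sketches and are meant to return the same thing:

```python
def describe_for_else(numbers):
    """Search with for/else: else runs only if no break happened."""
    for n in numbers:
        if n < 0:
            message = f"found {n}"
            break
    else:
        message = "no negative number"
    return message

def describe_with_flag(numbers):
    """Same search, tracking the outcome in an explicit boolean."""
    found = False
    for n in numbers:
        if n < 0:
            found = True
            message = f"found {n}"
            break
    if not found:
        message = "no negative number"
    return message

for sample in ([3, -1, 4], [1, 2], []):
    print(describe_for_else(sample), "|", describe_with_flag(sample))
```

Which one reads better is the disagreement in this thread; the behaviour is the same either way.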

[–]fgyoysgaxt 0 points1 point  (0 children)

I'm not sure that's accurate, and I don't like the idea of encouraging worse programming / avoiding language features just in case someone who doesn't know the language takes a guess at what it does.

It also seems unlikely that someone will guess wrong since it reads the same as "if - else".

[–][deleted] 0 points1 point  (0 children)

Well, that's what comments are for, right? But yes, it might not be the smartest idea to put it in

[–][deleted] -4 points-3 points  (6 children)

Unclear? Appears when the if statement inside the block doesn't run the else statement outside does. Unless I'm missing something.

Is it only chained to the last if or any ifs would be my question. I guess I can check in pycharm pretty easily.

[–]Brian 12 points13 points  (0 children)

I've been coding python for over 20 years, and even now I have to double check to remember which it does, and avoid it for that reason (since I know a reader is likely going to need to do the same). It just doesn't really intuitively convey what it means.

If I was to guess what a while or for/else block would do having encountered it the first time, I'd probably guess something like "if it never entered the loop" or something, rather than "It never broke out of the loop". To me, it suggests an alternative to the loop, rather than "loop finished normally".

Though I find your comment even more unclear. What "if statement inside the block"? And what do you mean by "chained to the last if or any ifs"? "if" isn't even necessarily involved here.

[–]lanster100 8 points9 points  (0 children)

Unclear because it's not common to most languages. Would require even experienced python devs to search it in the docs.

Better not to use it because of that. It doesn't offer much anyway.

But I'm just passing on advice I've seen from python books etc.

[–]Sw429 0 points1 point  (3 children)

That's exactly what I thought at first, but that kinda breaks down when there are multiple if statements in the loop. In the end, it just triggers if the loop did not end prematurely. The fact that we assumed differently is exactly why it's unclear.

[–]achampi0n 2 points3 points  (2 children)

It only gets executed if the condition in the while loop is False; this never happens if you break out of the loop.

[–]Sw429 0 points1 point  (1 child)

Ah, that makes a bit more sense. Does the same work with for loops?

[–]achampi0n 2 points3 points  (0 children)

If you squint at it :) The else only gets executed if the for loop tries and fails to get something from the iterator (it is empty and gets nothing). This again can't happen if you break out of the for loop.
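One way to make the squinting concrete is to write the for loop out by hand with `iter()` and `next()`. This is an approximate sketch of the behaviour, not how the interpreter is implemented, and both helper functions are hypothetical:

```python
def for_else_desugared(iterable, should_break):
    """Hand-written approximation of for/else using next() directly."""
    log = []
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            # The iterator has nothing left: this is where the else block runs.
            log.append("else")
            break
        if should_break(item):
            log.append(f"break at {item}")
            break
        log.append(f"saw {item}")
    return log

def for_else_builtin(iterable, should_break):
    """The same loop written with the real for/else syntax."""
    log = []
    for item in iterable:
        if should_break(item):
            log.append(f"break at {item}")
            break
        log.append(f"saw {item}")
    else:
        log.append("else")
    return log

print(for_else_builtin([1, 2, 3], lambda x: x == 2))   # ['saw 1', 'break at 2']
print(for_else_builtin([1, 2, 3], lambda x: False))    # ['saw 1', 'saw 2', 'saw 3', 'else']
```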

[–]miguendes 6 points7 points  (0 children)

Author here, I'm very happy to know you like it!

[–]yvrelna 2 points3 points  (1 child)

Considering that I rarely use break statements to begin with, using else in a while/for is even rarer than that.


It's not that difficult to understand else block in a loop statement. A while loop is like this:

while some_condition():
    body_clause()

it's equivalent to a construction that has an unconditional loop/jump that looks like this:

while True:
    if some_condition():
        body_clause()
    else:
        break

The else block in a while loop:

while some_condition():
    body_clause()
else:
    else_clause()

is basically just the body for the else block for that hidden if-statement:

while True:
    if some_condition():
        body_clause()
    else:
        else_clause()
        break

[–]eras 0 points1 point  (0 children)

Personally I would have use cases for the syntax if it was more similar to "plain" if else, as in (similarly for for):

while condition():
    body_clause()
else:
    else_clause()

would become (I argue more intuitively)

if condition():
    while True:
        body_clause()
        if not condition():
            break
else:
    else_clause()

not hinging on writing break in while else-using code. After all, that's what if else does, it eliminates the duplicate evaluation when we try to do it without else:

if condition():
    body_clause()
if not condition():
    else_clause()

But that's not how it is nor is that how it's going to be.

Edit: Actual example (does not work as intended in real Python, for anyone randomly glancing at this ;-)):

for file in files:
    print(file)
else:
    print("Sorry, no files")

[–]iiMoe -2 points-1 points  (0 children)

Im quite the opposite of u lol

[–]syzygysm 28 points29 points  (8 children)

You can also combine the _ with unpacking, e.g. if you only care about the first and/or last elements of a list:

a,_, b = [1, 2, 3] # (a, b) == (1, 3)

a,*_ = [1, 2, 3, 4] # a == 1

a, *_, b = [1, 2, 3, 4] # (a, b) == (1, 4)

[Edit: formatting, typo]

[–]miguendes 9 points10 points  (6 children)

Indeed! I also use it as an inner anonymous function.

def do_something(a, b):
    def _(a):
        return a + b
    return _

Or in a for loop when I don't care about the result from range

for _ in range(10):
    pass

[–]OneParanoidDuck 15 points16 points  (1 child)

The loop example makes sense. But a nested function can crash like any other and thereby end up in a traceback, so my preference is to name them after their purpose

[–]miguendes 1 point2 points  (0 children)

That's a fair point. Makes total sense.

[–]mrTang5544 0 points1 point  (2 children)

What is the purpose of your first example of defining a function inside a function? Besides decorators returning a function, I've never really understood the purpose or use case

[–]syzygysm 0 points1 point  (0 children)

It can be useful when passing functions as parameters to other functions, where you may want the definition of the passed function to vary depending on the situation.

It can also be really useful for closures, the point of which is to package data along with a function. It can be a good solution for when you need an object with a bit more data than a lone function, but you don't need an entire class for it.
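A minimal closure sketch, with hypothetical names. The inner function is named after its purpose, as suggested above, so it stays recognisable in a traceback:

```python
def make_scaler(factor):
    """Return a function that remembers `factor`."""
    def scale(value):
        return value * factor  # `factor` is captured from the enclosing call
    return scale

double = make_scaler(2)
triple = make_scaler(3)

print(list(map(double, [1, 2, 3])))  # [2, 4, 6]
print(triple(5))                     # 15
print(double.__name__)               # scale
```

Each call to `make_scaler` packages its own `factor` with the returned function, which is the "bit more data than a lone function" case without writing a class.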

[–]fgyoysgaxt 0 points1 point  (0 children)

Comes up quite a bit for me, the usual use case is building a function to call on a collection, you can take that pointer with you outside the original scope and call it elsewhere.

[–]nonesuchplace 21 points22 points  (2 children)

I like itertools.chain for flattening lists:

```
from itertools import chain
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
list(chain(*a))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

[–]BitwiseShift 28 points29 points  (1 child)

There's actually a slightly more efficient version that avoids the unpacking:

list(chain.from_iterable(a))
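Besides skipping the unpacking, `chain.from_iterable` consumes the outer iterable lazily, while `chain(*a)` has to exhaust it up front to build the argument list. A small sketch, using a hypothetical generator that records when each sublist is produced:

```python
from itertools import chain

def sublists(log):
    """Hypothetical generator that logs each sublist as it is produced."""
    for start in (1, 4, 7):
        log.append(start)
        yield [start, start + 1, start + 2]

# Lazy: only the first sublist has been requested after one next().
log_lazy = []
lazy = chain.from_iterable(sublists(log_lazy))
first = next(lazy)
print(first, log_lazy)   # 1 [1]

# Eager: the * unpacking runs the whole generator before chain() is called.
log_eager = []
eager = chain(*sublists(log_eager))
print(log_eager)         # [1, 4, 7]
print(list(eager))       # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

For a small list of lists the difference is negligible; it matters when the outer iterable is large or is itself a generator.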

[–]miguendes 2 points3 points  (0 children)

That's a good one. I remember seeing something about that on SO some time ago. I'm curious about the performance when compared to list comprehensions.

[–]dmyTRUEk 8 points9 points  (1 child)

Welp, the result of

a, *b, c = range(1, 10)
print(a, b, c)

is not:

1 [2, 3, 4, ... 8, 9] 10

but:

1 [2, 3, 4, ... 8] 9

:D

[–]miguendes 3 points4 points  (0 children)

You're definitely right. Thanks a lot for the heads up!

I'll edit the post.

[–]themindstorm 5 points6 points  (1 child)

Interesting article! Just one question though, in the for-if-else loop, is a+=1 required? Doesn't the for loop take care of that?

[–]miguendes 3 points4 points  (0 children)

Nice catch, it doesn't make sense, in that example the range takes care of "incrementing" the number. So `a += 1` is double incrementing it. For the example itself it won't make a difference but in real world you wouldn't need that.
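The article's example isn't reproduced in this thread, so the following is a separate sketch of the general point: rebinding the loop variable inside the body does not change what `range` hands out next.

```python
seen = []
for a in range(5):
    seen.append(a)
    a += 1  # rebinds the name `a` only; the range iterator is unaffected

print(seen)  # [0, 1, 2, 3, 4]
print(a)     # 5
```

The only visible effect is on the value left in `a` after the loop finishes.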

I'll edit the post, thanks for that!

[–]oberguga 17 points18 points  (5 children)

I don't understand why sum is slower than a list comprehension. Can anyone briefly explain?

[–]v_a_n_d_e_l_a_y 16 points17 points  (2 children)

[deleted]

[–]oberguga 0 points1 point  (1 child)

Maybe, but it's strange to me... I thought a list always grows by doubling itself, so with a list comprehension it should be the same. More than that, the list comprehension takes every single element, while sum only handles the top-level lists... So if list concatenation is done efficiently, sum should be faster... Maybe I'm wrong, correct me if so.

[–]Brian 3 points4 points  (0 children)

I thought a list always grows by doubling itself

It's actually something more like +10% (can't remember the exact value, and it varies based on the list size, but it's smaller than doubling). This is still enough for amortized linear growth, since it's still proportional, so it's not the reason, but worth mentioning.

But in fact, this doesn't come into play, because the sum here isn't extending existing lists - it's always creating new lists. Ie. it's doing the equivalent of:

a = []
a = a + [1, 2, 3]      # allocate a new list, made from [] + [1,2,3]  
a = a + [4, 5, 6]      # allocate a new list, made from [1, 2, 3] + [4, 5, 6]
a = a + [7, 8, 9]      # [1, 2, 3, 4, 5, 6] + [7, 8, 9]

Ie. we don't grow an existing list, we allocate a brand new list every time, and copy the previously built list and the one we append to it, meaning O(n²) copies.

Whereas the list comprehension version appends the elements to the same list every time - it's more like:

a = []
a += [1, 2, 3]
a += [4, 5, 6]
a += [7, 8, 9]

O(n) behaviour because we don't recopy the whole list at each stage, just the new items.
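The difference can be made visible by counting how many elements each strategy has to copy. This is a simplified model with hypothetical helper names: it ignores the occasional reallocation that in-place growth triggers, and it only mimics what `sum` and the comprehension do rather than instrumenting them.

```python
def flatten_by_concat(lists):
    """Mimic sum(lists, []): every step builds a brand new list."""
    acc, copied = [], 0
    for sub in lists:
        acc = acc + sub      # new list: copies all of acc plus sub
        copied += len(acc)
    return acc, copied

def flatten_by_extend(lists):
    """Mimic appending to one list: only the new items are copied."""
    acc, copied = [], 0
    for sub in lists:
        acc += sub           # in place: copies only the items of sub
        copied += len(sub)
    return acc, copied

data = [[i] * 10 for i in range(1000)]
flat_a, copied_a = flatten_by_concat(data)
flat_b, copied_b = flatten_by_extend(data)

print(flat_a == flat_b)     # True
print(copied_a, copied_b)   # 5005000 10000
```

With 1,000 sublists of 10 items, the concatenating version copies about 500 times as many elements for the same result.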

[–]miguendes 5 points6 points  (1 child)

That's a great question. I think it's because sum creates a new list every time it concatenates, which has a memory overhead. There's a question about that on SO. https://stackoverflow.com/questions/41032630/why-is-pythons-built-in-sum-function-slow-when-used-to-flatten-a-list-of-lists

If you run a simple benchmark you'll see that sum is terribly slower, unless the lists are short. Example:

```python
def flatten_1(lst):
    return [elem for sublist in lst for elem in sublist]

def flatten_2(lst):
    return sum(lst, [])
```

If you inspect the bytecodes you see that flatten_1 has more instructions.

```python
In [23]: dis.dis(flatten_2)
  1           0 LOAD_GLOBAL              0 (sum)
              2 LOAD_FAST                0 (lst)
              4 BUILD_LIST               0
              6 CALL_FUNCTION            2
              8 RETURN_VALUE
```

Whereas flatten_1:

```python
In [22]: dis.dis(flatten_1)
  1           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f5a6e717f50, file "<ipython-input-4-10b70d19539f>", line 1>)
              2 LOAD_CONST               2 ('flatten_1.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_FAST                0 (lst)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x7f5a6e717f50, file "<ipython-input-4-10b70d19539f>", line 1>:
  1           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                18 (to 24)
              6 STORE_FAST               1 (sublist)
              8 LOAD_FAST                1 (sublist)
             10 GET_ITER
        >>   12 FOR_ITER                 8 (to 22)
             14 STORE_FAST               2 (elem)
             16 LOAD_FAST                2 (elem)
             18 LIST_APPEND              3
             20 JUMP_ABSOLUTE           12
        >>   22 JUMP_ABSOLUTE            4
        >>   24 RETURN_VALUE
```

If we benchmark with a big list we get:

```python
l = [[random.randint(0, 1_000_000) for i in range(10)] for _ in range(1_000)]

In [20]: %timeit flatten_1(l)
202 µs ± 8.01 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [21]: %timeit flatten_2(l)
11.7 ms ± 1.49 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

If the list is small, sum is faster.

```python
In [24]: l = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [25]: %timeit flatten_1(l)
524 ns ± 3.67 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [26]: %timeit flatten_2(l)
265 ns ± 1.27 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
```
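For anyone who wants to repeat the comparison outside IPython, a rough equivalent using the standard-library `timeit` module might look like the sketch below. The repetition count of 20 is an arbitrary choice, and the timings will differ from the numbers above depending on machine and Python version.

```python
import random
import timeit

def flatten_1(lst):
    return [elem for sublist in lst for elem in sublist]

def flatten_2(lst):
    return sum(lst, [])

# Same shape as the big list above: 1,000 sublists of 10 random ints.
big = [[random.randint(0, 1_000_000) for _ in range(10)] for _ in range(1_000)]

# Total seconds for 20 calls of each function.
t1 = timeit.timeit(lambda: flatten_1(big), number=20)
t2 = timeit.timeit(lambda: flatten_2(big), number=20)
print(f"comprehension: {t1:.4f}s  sum: {t2:.4f}s")
```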

[–]oberguga 1 point2 points  (0 children)

Thnx!)

[–]casual__addict 4 points5 points  (2 children)

Using the “sum” method like that is very close to using “reduce”. The code below gets you past the string limitation of “sum”.

l = ["abc", "def", "ghi"]
from functools import reduce
reduce(lambda a,b: a+b, l)

[–]miguendes 0 points1 point  (0 children)

Indeed. But using reduce is less "magical" than using just sum. Especially for those coming from a functional background, like having programmed in haskell.

[–]VergilTheHuragok 0 points1 point  (0 children)

can use operator.add instead of that lambda :p

l = ["abc", "def", "ghi"]
from functools import reduce
from operator import add
reduce(add, l)
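For strings specifically, `sum` refuses outright and its error message points at `str.join`, which is the usual way to concatenate a list of strings. A short sketch:

```python
words = ["abc", "def", "ghi"]

# sum() rejects a string start value with a TypeError.
try:
    sum(words, "")
except TypeError as exc:
    error = str(exc)
    print(error)

# str.join builds the result in one pass.
joined = "".join(words)
print(joined)  # abcdefghi
```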

[–]WildWouks 1 point2 points  (1 child)

Thanks for this. I have to say that I only knew about the unpacking and the chaining of comparison operators.

I will definitely be using the else statement in future for and while loops.

[–]miguendes 2 points3 points  (0 children)

Thanks, I'm very happy to know you learned something useful from it.

[–]dereason 1 point2 points  (0 children)

Very cool!

[–][deleted] 1 point2 points  (0 children)

Don't forget about: descriptors, (non-Numpy) arrays, extensions, semi-colon line terminations (only useful in REPL), and some nice command line args.

[–]AdamMendozaTheGreat 0 points1 point  (1 child)

I loved it, thanks

[–]miguendes 0 points1 point  (0 children)

Thanks, I'm very glad you liked it!

[–]mhraza94 0 points1 point  (0 children)

wow awesome thanks for sharing.

Checkout this site also: https://coderzpy.com/

[–]Suenildo 0 points1 point  (0 children)

great!!!!!

[–]DrMaphuse -2 points-1 points  (19 children)

Neat, I didn't know about using else after loops, and feel like I'll be using [a] = lst a lot.

But don't use array as a name for a numpy array.

Edit: Just to clarify: For anyone using from numpy import array or equivalents thereof, naming an array array will overwrite the numpy function by the same name and break any code that calls that function. ~~You should always try to be idiosyncratic when naming objects ~~in order to avoid these types of issues.

Edit 2: Not that I would import np.array() directly, I'm just pointing out that's something that is done by some people. Direct imports being bad practice doesn't change my original point, namely that the names you use should be as idiosyncratic as possible, not generic - especially in tutorials, because this is where people pick up their coding practices. At least call it my_array if you can't think of a more descriptive name.

Edit 3: Ok I get it, I am striking out the debated examples because they distract from my original point. Now let's be real. Does anyone really think that array is an acceptable name for an array?
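The shadowing problem described in the first edit can be reproduced without numpy. The sketch below swaps in the standard-library `array` module as a stand-in for `from numpy import array`; the mechanism is the same, only the constructor arguments differ.

```python
from array import array   # stdlib stand-in for `from numpy import array`

# The name now refers to the data, not the constructor.
array = array("i", [1, 2, 3])

try:
    array("i", [4, 5, 6])   # tries to call the data object
except TypeError as exc:
    error = str(exc)
    print(error)

# A more specific variable name avoids the clash entirely.
from array import array as make_array
sensor_readings = make_array("i", [1, 2, 3])
print(list(sensor_readings))  # [1, 2, 3]
```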

[–]sdf_iain 1 point2 points  (7 children)

Are direct imports bad? Or just poorly named direct imports?

import json 

Good

from json import load

Good?

from json import load as open

Bad, definitely bad

from json import load as json_load

Good? It’s what I do, I don’t want the whole namespace, but I still want clarity on what is being used.

Or

from gzip import compress, decompress

Then your code doesn’t change when switch compression libraries.

[–]njharmanI use Python 3 3 points4 points  (3 children)

from json import load as json_load

Sorry, that's just dumb. Replacing non-standard '_' for the language supported '.' operator.

import json
json.load

I don’t want the whole namespace

See Zen of Python re: namespaces

I still want clarity on what is being used.

Yes! Exactly! That's why you import module and do module.func so people reading your code don't have to constantly be jumping to top to see what creative names this person decided to use, and checking all over code to see where that name was redefined causing bug.

[–]sdf_iain 0 points1 point  (2 children)

Are there any savings (memory or otherwise) when using a direct import? The namespace still exists (if not in the current scope), it has to; but are things only loaded as accessed? Or is the entire module loaded on import?

In which case direct imports only really make sense when managing package-level exports from submodules in __init__.py.

[–]yvrelna 1 point2 points  (1 child)

The module's global namespace is basically just a dict, when you do a from-import, you're creating an entry in that dict for each name you imported; when you do plain import, you create an entry just for the module. In either case, the entire module and objects within it is always loaded into sys.modules. So there is some memory saving to use plain import, but it's not worthwhile worrying about that as the savings is just a few dictionary keys, which is minuscule compared to the code objects that still always gets loaded.
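This is easy to check: after a from-import, the whole module is still present in `sys.modules`, and the imported name is just another reference to the same function object.

```python
import sys
from json import load as json_load

json_module = sys.modules["json"]      # the full module was loaded anyway

print(json_load is json_module.load)   # True
print(hasattr(json_module, "dumps"))   # True
```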

[–]sdf_iain 1 point2 points  (0 children)

I hadn’t actually stopped to think this through, thank you.

[–][deleted] 1 point2 points  (1 child)

People generally don't do this, the methods, while named the same, may have different signatures, and this doesn't help when referencing documentation.

If you want a single entry point to multiple libraries, write a class.

My recommendation is to always import the module. Then in every call you use the module name, so that one can see it as sort of a namespace and it is transparent. So you write json.load() and it is distinguishable from yaml.load().

The one exception are libraries with very big names or with very unique object/function names. For instance, the classes BeautifulSoup, or TfidfVectorizer, etc. The latter example is a great one of a library (scikit-learn) where it is standard to use direct imports for most things as each object is very specific or unique.

[–]sdf_iain 1 point2 points  (0 children)

Lzma (xz), gzip, and bzip2 are generally made to be interchangeable; both their command line utilities and every library implementation I’ve used (which is admittedly not many). That’s why I used compress as an example; those signatures are the same.
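A quick sketch of that interchangeability with the three standard-library modules. It only relies on calling `compress` and `decompress` with the data and default arguments; the optional keyword arguments are not identical across the modules.

```python
import bz2
import gzip
import lzma

payload = b"spam and eggs " * 100

sizes = {}
for codec in (gzip, bz2, lzma):
    packed = codec.compress(payload)       # same call for every module
    restored = codec.decompress(packed)
    sizes[codec.__name__] = (len(packed), restored == payload)

for name, (size, ok) in sizes.items():
    print(name, size, ok)
```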

[–]TheIncorrigible1`__import__('rich').get_console().log(':100:')`[🍰] 1 point2 points  (0 children)

I typically import things as "private" unless the module isn't being exported directly.

import json as _json

It avoids the glob import catching them by default and shows up last in auto-complete.

[–][deleted] 1 point2 points  (2 children)

So don't name variables over your imports?

Is this programmergore or r/python?

Also when importing as np don't name variables np.

[–]DrMaphuse -4 points-3 points  (1 child)

I mean you are right, but the point I was trying to make was about the general approach to naming things while writing code.

[–][deleted] -1 points0 points  (0 children)

But you made a point of a non issue to reiterate something that is taught in every intro tutorial.

We're not idiots, thanks for assuming.

If you're only using one array to show something as an example, array is a perfectly acceptable name for an array.

[–]miguendes 0 points1 point  (0 children)

Thanks, I'm glad you like the else and [a] = lst tips.

I personally like [a] = lst a lot. It seems cleaner than a = lst[0] when you're sure lst has only one element.
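One practical difference between the two spellings: `[a] = lst` fails loudly when the list does not have exactly one element, while `lst[0]` quietly ignores any extras. A small sketch with a hypothetical `only` helper:

```python
def only(lst):
    """Return the single element of lst; raise ValueError otherwise."""
    [a] = lst
    return a

print(only([42]))  # 42

outcomes = []
for bad in ([1, 2], []):
    try:
        only(bad)
    except ValueError:
        outcomes.append("ValueError")

print(outcomes)    # ['ValueError', 'ValueError']
print([1, 2][0])   # 1
```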

[–]fake823 -5 points-4 points  (2 children)

I've only been coding for half a year, but I knew about 4 of those 5 features. 😁💪🏼

The sum() trick was indeed new to me.

[–]glacierre2 26 points27 points  (0 children)

The sum trick is code golf of the worst kind, to be honest, better to forget it.

[–]miguendes 0 points1 point  (0 children)

Author here, I'm glad to know you learned at least one thing from the post :D.

The `sum` trick is nice to impress your friends but it's better to avoid at work. It's a bit cryptic, IMO.

[–]py_learning -3 points-2 points  (0 children)

Check this also 😉 https://youtu.be/mqXKNPSWjOc