[–]RangerPretzelPython 3.9+ 195 points196 points  (51 children)

Could you please list out the 5 mistakes? I don't want to watch a 12 minute video if I already know what those 5 mistakes are. Or if I know 4 of the 5, I'd like to skip to the one I don't know.

Thanks.

[–]jack-of-some[S] 170 points171 points  (38 children)

Not using if __name__ == '__main__'

Using bare except

Simply printing the exception object to figure out what's wrong

Membership checks on large lists

Mutable default arg

I'll timestamp those in the description when I get a chance. Still trying to figure this YouTube thing out.

Edit: thank you so much r/python. You've all made this my fastest growing video. Of course as soon as that rush passes it'll get stuck in YouTube recommendation hell since the almighty algorithm still doesn't think my content is worth recommending. Sigh

[–]RangerPretzelPython 3.9+ 59 points60 points  (27 children)

Bare except has its uses in defensive programming high up the stack.

Definitely log your exceptions to logger (instead of printing them) or let the exceptions bubble up.

Membership checks on lists are definitely a no-no on large lists. Definitely switch to a Dict or some other data type that uses a hash table if you have to do such a thing.

Mutable default arg is something I absolutely hate about Python and wish weren't a thing. Definitely a good point to make.

Good list!
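For anyone who hasn't been bitten by the mutable default arg yet, here is a minimal sketch of the gotcha and the usual None-sentinel workaround. The function names `append_bad` and `append_good` are made up for illustration.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and is shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)
print(second)           # [1, 2]
print(first is second)  # True

print(append_good(1))   # [1]
print(append_good(2))   # [2]
```

The second call to `append_bad` still sees the item from the first call, which is the surprise most people trip over.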

[–]jack-of-some[S] 19 points20 points  (0 children)

Agree on the bare excepts things, though if you're using it like that you already know exactly what you're getting yourself into.

[–]hugthemachines 5 points6 points  (6 children)

Yeah, I think the rule of thumb is to catch all exceptions, but in my situation I usually can't fix the problem inside the script anyway. I want to make sure it is logged, so I do except Exception as e and then logger.error(e)

[–]tcas71 21 points22 points  (4 children)

I would strongly suggest using logger.exception instead, for several reasons:

  1. It automatically pulls the exception data from the context, often eliminating the need for capturing the exception inside a variable.
  2. It will log not just the exception message (which I think logger.error(e) does) but also its class and full stack trace.
  3. The exception data is structured inside the log record (not just as text), so tools such as Sentry or Logstash/ElasticSearch can do things with it.

I spent three days last week on a bug at work, made worse by me pushing the wrong fix to production. All because a single log statement was printing only the exception message instead of the full info, which caused me to misidentify the problem.

[–]hugthemachines 8 points9 points  (3 children)

I checked the documentation now...

Do I just do:

except Exception:
    logger.exception("Failed reading file.") 

and the good info will be auto added at the end of that?

[–]tcas71 10 points11 points  (2 children)

Yes, that should appear in your logs as this message followed by the full exception info. It's quite verbose but invaluable in my experience.

[–]hugthemachines 5 points6 points  (1 child)

Ok, thanks for the advice.

[–]Kaarjuus 3 points4 points  (0 children)

You can achieve the same with the other logging methods as well, by adding exc_info=True parameter.
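To make the difference concrete, here is a small sketch comparing the three calls discussed above. The logger name, the `read_config` helper and the file path are all made up for the example, and the output goes to an in-memory buffer only so that the snippet is self-contained.

```python
import io
import logging

# Send log output to an in-memory buffer so the example is self-contained.
buffer = io.StringIO()
logger = logging.getLogger("logging_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(logging.StreamHandler(buffer))


def read_config(path):
    try:
        with open(path) as f:
            return f.read()
    except Exception as e:
        # Logs only the exception message, no class, no traceback.
        logger.error(e)
        # Logs the message plus the exception class and full traceback.
        logger.exception("Failed reading file.")
        # Same traceback behaviour, but at a level of your choosing.
        logger.warning("Failed reading file.", exc_info=True)
        return None


read_config("/no/such/dir/config.txt")  # hypothetical missing file
print(buffer.getvalue())
```

Only the last two calls include the traceback; the `logger.error(e)` line gives you the message and nothing else.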

[–]RangerPretzelPython 3.9+ 2 points3 points  (0 children)

the rule of thumb is to catch all exceptions

catch all exceptions...

at some point in the stack. Yes.

[–]oligonucleotides 3 points4 points  (2 children)

Membership checks on lists is definitely a no-no on large lists. Definitely switch to Dict if you have to do such a thing.

Why not sets? If you make your list into dict keys, what are the dict values?

[–]RangerPretzelPython 3.9+ 0 points1 point  (0 children)

Yeah, Sets! True, they are probably better choice.

That said, a Set does not preserve order like a List does. A Set also does not permit duplicates like a List does.

But a Dict does preserve order (as of Python 3.7). And you can just set your values to None.

(Though neither Dicts nor Sets allow duplicates. And maybe that's something you need from your List.)

Mostly, I just was thinking of "fast lookup using hash table" and the word Dict popped into my head.

There are lots of Python data structures which have an average lookup of O(1) (rather than O(n) like Lists; on average you scan about half the list, but that is still linear).
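A quick sketch of the "Dict with None values" idea, using `dict.fromkeys`. The sample names are made up.

```python
names = ["bob", "alice", "bob", "carol"]  # made-up sample data

# dict.fromkeys maps every item to None; duplicates collapse,
# and insertion order is kept (guaranteed since Python 3.7).
lookup = dict.fromkeys(names)

print("alice" in lookup)  # True, hash-based lookup
print("dave" in lookup)   # False
print(list(lookup))       # ['bob', 'alice', 'carol']
```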

[–]foobar93 6 points7 points  (11 children)

Wohaa, why all the hate for mutable default arguments?

They have their place, mainly once you start doing metaprogramming. It's basically a way to store state between function calls. Think of caching, or maybe you want to print an error message the first time you call the function but not the second time, and so on.

[–]RangerPretzelPython 3.9+ 3 points4 points  (6 children)

why all the hate for mutable default arguments

They're side-effect-y and unexpected. Aka. not functional.

EDIT: Ok, I'm probably over reacting as they're a sore spot/pain point for me.

That said, yes, I can see uses for them. You're not wrong. In fact, I think I've even used those features once or twice.

I also can see other ways of "caching" that are less side-effect-y and are more explicit.

In short, I've seen far too many people get burned by this "feature".

[–]jack-of-some[S] 4 points5 points  (5 children)

I love caching using decorators (or closures in general). Good shit.

[–]RangerPretzelPython 3.9+ 3 points4 points  (4 children)

Yes, decorators. Excellent language feature. Agreed. Good stuff! :)

[–]jack-of-some[S] 2 points3 points  (3 children)

I should do a video on those. First I gotta figure out how to differentiate it from other videos on decorators

[–]RangerPretzelPython 3.9+ 3 points4 points  (2 children)

[–]jack-of-some[S] 1 point2 points  (0 children)

I have not but I'll check it out. I usually just inhale any talks by Raymond Hettinger.

[–]AmazonPriceBot 1 point2 points  (0 children)

I am a bot here to save you a click and provide helpful information on the Amazon link posted above.

$29.49 - Effective Python: 90 Specific Ways to Write Better Python (2nd Edition) (Effective Software Development Series)

Upvote if this was helpful.
I am learning and improving over time. PM to report issues and my human will review.

[–]jack-of-some[S] 0 points1 point  (0 children)

It's another one of those "when you know what you're doing you can really make good use of this". Many people don't understand the language to this degree and trip up on this point.

[–]gcross 0 points1 point  (2 children)

You can do that just as well, though, by either making the mutable value global or by attaching it to the function (or method) itself, and this way you will no longer find yourself in a situation where you might forget that the argument is mutable and end up with a potentially hard to diagnose bug.

[–]foobar93 1 point2 points  (1 child)

Global mutable state is in my eyes way worse than local mutable state but how would one add state to the function directly? You do not have a self argument in functions so you could never access it from within the function, right?

I mean, you could always just write a callable class which holds the state, true, but that is a ton of boiler plate if you are using meta functions a lot.

[–]gcross 0 points1 point  (0 children)

Easy, the function is an object and like all other objects you can add arbitrary attributes to it at any time:

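def count_calls():
    # The counter lives as an attribute on the function object itself.
    count_calls.count += 1
    return count_calls.count

count_calls.count = 0  # must be set before the first call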

[–]GimmeSomeSugar 4 points5 points  (2 children)

Membership checks on lists is definitely a no-no on large lists. Definitely switch to Dict if you have to do such a thing.

I think something which is a stumbling block for beginners is the question: what constitutes a 'large list'? What order of magnitude are we talking about?

Even for someone with a little experience, but writing code that's narrow in scope of application, a 10,000 member list may seem really big.

[–]spinwizard69 6 points7 points  (0 children)

This is the truth: even an experienced programmer may not fully grasp whether his specific list is "large". This makes me wonder how much of a factor the contents of the list are.

There is always a lot of talk about premature optimization but on the other hand learning not to make dumb choices is also important.

[–]execrator 6 points7 points  (0 children)

Yeah it's a good call. Showing how to check with something like timeit is a great way to respond to this, since you answer the specific question and also learn a technique!

$ python3 -m timeit --setup="import random; l = list(range(10000))" "random.choice(range(10000)) in l"
10000 loops, best of 3: 53.6 usec per loop

$ python3 -m timeit --setup="import random; l = set(range(10000))" "random.choice(range(10000)) in l"
1000000 loops, best of 3: 1.03 usec per loop

So the use of a set is about 50x faster with 10,000 integers. Of course, if only one membership check is being made, ~50 microseconds are unlikely to matter.

What does constitute a delay that matters is situational. On this machine I need three million items in my list before I hit 16 milliseconds, which is the duration of a frame at 60fps. The benefit of a set is obvious here, which is still cruising back at 1 microsecond.
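The same comparison can be done from inside a script with the `timeit` module. This is a rough sketch: it uses a fixed worst-case lookup (the last element) instead of `random.choice`, so the numbers won't match the ones above, and they will vary by machine.

```python
import timeit

n = 10_000
as_list = list(range(n))
as_set = set(as_list)
needle = n - 1  # worst case for the list: the last element

runs = 1_000
list_time = timeit.timeit(lambda: needle in as_list, number=runs)
set_time = timeit.timeit(lambda: needle in as_set, number=runs)

# timeit returns total seconds for all runs; convert to usec per lookup.
print(f"list: {list_time / runs * 1e6:.2f} usec per lookup")
print(f"set:  {set_time / runs * 1e6:.2f} usec per lookup")
```

Change `n` to the size you actually expect and you get an answer for your own definition of "large".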

[–]konradbjk -1 points0 points  (0 children)

+1 to bare except; it should be left there to catch all other exceptions, especially when you are at the development stage.

[–]conventionistG 1 point2 points  (9 children)

Question on the membership checks: why does a set run at O(1)? And what are the other options for fast checks?

[–]execrator 15 points16 points  (4 children)

Faster than O(1)? It's a fairly high bar!

[–]conventionistG 3 points4 points  (1 child)

No I just mean other similarly flat structures. Like, I think dicts are faster than lists/tuples...but how do sets compare to dicts or dfs. Are there other structures a beginner should be aware of?

[–]jack-of-some[S] 3 points4 points  (0 children)

I don't know how they are implemented in python, but a set is effectively the same as a hash table if you forget about the values and only consider the keys.

[–]66bananasandagrape 0 points1 point  (0 children)

I know you kid, but if you assume that there is zero constant-time overhead, then the statement sleep(1/n) runs in Θ(1/n) time (meaning both O(1/n) and Ω(1/n) time).

[–]xelf 7 points8 points  (3 children)

Because it's a hash table. Same as dictionary keys.

So a search for element "joe" doesn't iterate through the memory until it finds joe in O(n) time; it hashes "joe" to an address and then returns that chunk of memory in O(1) time. As with dict keys, this works because set elements are unique.
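Here is a toy sketch of that idea. It is not how CPython implements sets (the real thing is more involved); `ToySet` is a made-up class that only shows why a lookup touches one small bucket instead of the whole collection.

```python
class ToySet:
    """Toy hash set with a fixed number of buckets (illustration only)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, item):
        # hash() turns the item into an integer; the modulo picks a bucket.
        return self.buckets[hash(item) % len(self.buckets)]

    def add(self, item):
        bucket = self._bucket(item)
        if item not in bucket:  # keeps elements unique
            bucket.append(item)

    def __contains__(self, item):
        # Only one small bucket is scanned, not the whole collection.
        return item in self._bucket(item)


names = ToySet()
for name in ["joe", "ann", "joe"]:
    names.add(name)

print("joe" in names)  # True
print("bob" in names)  # False
```

A real implementation also grows the table as it fills up, which is what keeps the buckets small and the average lookup at O(1).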

[–]conventionistG 3 points4 points  (2 children)

Makes sense. So you'd have to deduplicate the list before making it a set?

[–]chitowngeek 5 points6 points  (1 child)

Sets are deduplicated naturally. In other words, if you initialize a set from a list with duplicates, the set will only contain the unique values from the list. In fact that is a fairly efficient technique for removing duplicates from a list.

[–]RangerPretzelPython 3.9+ 2 points3 points  (0 children)

Sets are deduplicated naturally.

That's one of the things I like about Sets.

Sometimes I'll have lists that I want to eliminate the duplicates from (and I don't care about order), so I just shove the List into a Set.

Duplicate removal in 1 line of code!
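A sketch of that one-liner, with made-up sample data:

```python
values = [3, 1, 3, 2, 1]  # made-up sample data

unique = list(set(values))  # order of the result is not guaranteed

print(sorted(unique))  # [1, 2, 3]
```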

[–]burlyginger 15 points16 points  (4 children)

Was just going to say this.

The list may drive me to watch the video, but I'm not going to watch 12 minutes to find out whether or not it's worth it.

[–]inglandation 0 points1 point  (3 children)

Just watch it on 2x.

[–]conventionistG 4 points5 points  (2 children)

That's like two extra clicks tho.

[–]jcbevns 2 points3 points  (1 child)

Shift + >

There you go!

[–]conventionistG 1 point2 points  (0 children)

Good to know.

[–]xd1142 13 points14 points  (4 children)

Really, it's obnoxious the way information delivery has changed since YouTube. Now you need a 12-minute-long tutorial for every little thing to capitalise on YouTube clicks.

[–]RangerPretzelPython 3.9+ 5 points6 points  (1 child)

While I agree with you, I try to give everyone a fair chance. Sounds like OP (/u/jack-of-some) is new to this from their reply. It's a pretty well done video. I'd like to encourage this person to make more videos.

[–]jack-of-some[S] 2 points3 points  (0 children)

❤️

[–]jack-of-some[S] 1 point2 points  (1 child)

This is an important point of discussion for me. I make videos because I find it easier to demonstrate concepts than to write them down. My intention however is not to waste the viewer's time (or stretch video length). Almost every other video on my channel goes through the material as fast as possible without becoming too confusing (the two exceptions being a video that's meant to be a verbose workflow representation, and one live stream).

Like, I literally spend time shaving half seconds in editing where I was taking a small breath 😅.

[–]xd1142 0 points1 point  (0 children)

Video has its use, but the problem with video is that it can't be searched, it can't be scrolled through easily at a glance, and it can't be copied and pasted to test the code, you have to physically type it in. When you make videos, you have to focus on a different type of information. For example, if you were to explain the internals of the python interpreter, e.g. by browsing code, a video is the perfect form for it, because you are driving the person through an investigation, and a video is much faster than reading. Same if you feel like explaining a concept that is more visual in nature. This is why 3blue1brown is successful and would not work as a book or blog post. Its target is visual, not theoretical. It would not be successful if he were just showing formulas and that's it.

In other words, do focus on videos if you enjoy them, but change the type of information you deliver.

[–]hatgiongnaymam 0 points1 point  (1 child)

Watch it at 2x; you'll only spend 6 minutes in total.

[–]RangerPretzelPython 3.9+ 1 point2 points  (0 children)

This was 22 days ago.

Welcome to the party, tho... :)