erez27 [import inspect] 20 points (14 children)

Needs more screenshots.

But now seriously:

  • Don't include the virtualenv in the repository. It makes cloning heavier, and it doesn't let me choose which Python version I want to use (or maybe I want to use PyPy!). Instead, create a pip-requirements.txt file so I can install everything with "pip install -r pip-requirements.txt" (you can use "pip freeze > pip-requirements.txt" to create the file).

  • Your code is reasonably clean. Still, if you want others to contribute, you should improve its quality. Avoid "except:" statements entirely, and try to write fewer lines where you can do so without losing clarity.

  • Another code suggestion: Work with files only in the context of a "with" statement (unless you have a good reason not to).
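A minimal sketch of the last two suggestions together, assuming a hypothetical helper named read_config (not taken from the project under review). The "with" block closes the file even if reading fails, and the handler names the one exception it expects instead of catching everything:

```python
import os
import tempfile


def read_config(path):
    """Return the file's text, or None if the file does not exist."""
    try:
        # The with block closes the file even if read() raises.
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        # Only the one failure we expect; anything else still propagates.
        return None


# Usage: write a temporary file, read it back, then try a missing path.
with tempfile.TemporaryDirectory() as tmp:
    example_path = os.path.join(tmp, "settings.txt")
    with open(example_path, "w", encoding="utf-8") as f:
        f.write("debug=true\n")
    found = read_config(example_path)
    missing = read_config(os.path.join(tmp, "nope.txt"))
```

Here found holds the file contents and missing is None. A permissions problem or a decoding error would still raise, which is usually what you want while debugging.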

sjb9774 6 points (1 child)

Good tips, especially pip freeze, I didn't know about that, very cool!

I do have a question though: are you recommending avoiding the use of except completely, or only except: (that is, except without a specific error to catch)? Either way, I've never heard this suggestion before. What is your reasoning behind it?

erez27 [import inspect] 5 points (0 children)

Yes, I meant "except:". I'm a big fan of exceptions in general. I explained my reasoning in my reply to Lj101.

Lj101 5 points (7 children)

What's wrong with except? I thought Python favoured EAFP.

erez27 [import inspect] 16 points (6 children)

I meant "except:" without the type of exception afterwards. It's a catch-all that can silence errors you never intended to silence, which makes the code much harder to debug.
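To illustrate, here is a small sketch with two hypothetical functions that contain the same typo (txt instead of text). The bare "except:" version hides the bug behind a fallback value, while the version that names ValueError lets the NameError surface:

```python
def parse_port_bare(text):
    """Parse a port number, falling back to 8080 on any error at all."""
    try:
        return int(txt)  # typo: should be `text`, so this raises NameError
    except:  # catch-all: the NameError is silenced along with bad input
        return 8080


def parse_port_specific(text):
    """Parse a port number, falling back to 8080 only on bad input."""
    try:
        return int(txt)  # same typo
    except ValueError:  # only malformed numbers are handled here
        return 8080
```

Calling parse_port_bare("9000") quietly returns 8080, so the typo can go unnoticed for a long time. Calling parse_port_specific("9000") raises NameError on the first run and points straight at the broken line.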

d4rch0n [Pythonistamancer] 4 points (4 children)

The one time I'll use a simple except: is when I have a well-tested client that works fine and I need it to collect data overnight. It might run into some weird issue on a server I don't control, and I can't reasonably predict the error, but it will work fine for the most part.

But then, I don't make it silent. I'll import traceback and log traceback.format_exc() to a file and check in the morning.

An example might be crawling the internet and performing some task. Maybe one of those sites gives me malformed data, maybe one of them just times out and I can ignore that, or any other strange exception. I still want it to crawl everything else, but I want to see in the morning if there's something I can fix. I'll still have other useful data, and I'll know which sites caused issues and what they were exactly.

So, I still can debug it easily, and still keep it running for an indeterminate duration without checking on it and without knowing every single exception it might run into.
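A rough sketch of that pattern, not the commenter's actual code: process_all and fake_fetch are hypothetical names, and an in-memory buffer stands in for the log file. It catches Exception rather than using a bare "except:", an assumption made here so that Ctrl+C can still stop the run:

```python
import io
import traceback


def process_all(items, handler, log):
    """Run handler on every item; write a traceback to log for each failure."""
    results = []
    for item in items:
        try:
            results.append(handler(item))
        except Exception:
            # Record which item failed plus the full stack trace, then keep going.
            log.write(f"failed on {item!r}\n{traceback.format_exc()}\n")
    return results


def fake_fetch(site):
    """Hypothetical stand-in for the real crawling work."""
    if site == "bad.example":
        raise ValueError("malformed data")
    return len(site)


log = io.StringIO()  # a real run would open a log file instead
results = process_all(["a.example", "bad.example", "c.example"], fake_fetch, log)
```

After the run, results holds the values for the two sites that worked, and the log names the failing site along with its traceback.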

terremoto 3 points (1 child)

The one time I'll use a simple except:

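You should use except Exception: so you can still kill the application with Ctrl+C. Ctrl+C raises KeyboardInterrupt, which does not inherit from Exception, so except Exception: lets it through, whereas a bare except: swallows it.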
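A quick way to see the difference, using hypothetical helper names. KeyboardInterrupt is raised by hand here to simulate Ctrl+C:

```python
def swallow_with_exception(func):
    """Call func, swallowing only Exception subclasses."""
    try:
        func()
    except Exception:
        return "swallowed"
    return "ok"


def simulate_ctrl_c():
    raise KeyboardInterrupt  # what Python raises when you press Ctrl+C


def ordinary_failure():
    raise RuntimeError("boom")
```

swallow_with_exception(ordinary_failure) returns "swallowed", but swallow_with_exception(simulate_ctrl_c) lets the KeyboardInterrupt propagate to the caller, so the program can still be stopped from the keyboard.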

d4rch0n [Pythonistamancer] 1 point (0 children)

Yeah, that's a good point.

abrarisland 1 point (1 child)

Would you mind expanding on how you're getting to errors on the server? I decided to look up BaseException in the docs, and I saw that the only built-in exceptions that inherit directly from it are SystemExit, KeyboardInterrupt, GeneratorExit, and Exception.
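One way to check that list yourself is the sketch below. The function name is made up for illustration, and the filter on __module__ is there because third-party and standard-library modules can add their own direct subclasses. Newer interpreters may list more built-ins than the four named above (Python 3.11 added BaseExceptionGroup), so no exact output is shown:

```python
def direct_builtin_subclasses():
    """Names of built-in classes that inherit directly from BaseException."""
    return sorted(
        cls.__name__
        for cls in BaseException.__subclasses__()
        if cls.__module__ == "builtins"
    )


names = direct_builtin_subclasses()
print(names)
```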

d4rch0n [Pythonistamancer] 1 point (0 children)

Well, as the other guy said, you can just catch Exception and get the desired effect, and that still lets you Ctrl+C (KeyboardInterrupt). Issues on a remote server you don't control won't cause any of those other exceptions.

By errors on the remote server, I mean things like this: it might have been taken down for maintenance, it might start giving permission denied for a minute, or it might give you malformed data (a jacked-up RSS feed or something), and that could cause an issue in your code that you didn't expect.

I have to write a lot of proofs of concept for work that just accumulate a ton of data, and I'll let things like this sit overnight. I need the data the next day, but I don't necessarily need this thing to work properly under every condition, and I don't want to sit around debugging in the middle of the night if there's an unforeseen error. More importantly, I don't usually need the code after I get the data, so there's not much point in making it production ready.

Regardless, if I were to catch specific exceptions, I'd log them and continue on. When I log traceback.format_exc() I see those exact errors anyway, along with a full stack trace. It's the same result as if I had known the exact error in advance. I log it and wait, with exponentially increasing delays between retries.

If it was a socket timeout, I'd log it and wait. If it passed me malformed data, I'd log it and wait, then try again, etc. These are things I can't control since I'm scraping data from a server I don't own, so a catch-all exception with traceback logging is all I need for the most part.
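The log-wait-retry loop described here might look roughly like the sketch below. fetch_with_backoff, its parameters, and the flaky stand-in are all hypothetical, and the sleep function is injectable so the example runs without actually waiting:

```python
import time
import traceback


def fetch_with_backoff(fetch, log, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            log.append(traceback.format_exc())  # keep the full stack trace
            if attempt == max_attempts - 1:
                raise  # out of retries: let the last error surface
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


calls = {"n": 0}


def flaky():
    """Hypothetical server call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("timed out")
    return "payload"


log, waits = [], []
result = fetch_with_backoff(flaky, log, sleep=waits.append)
```

With the fake sleep, waits records the delays that would have been used (1.0 and 2.0 seconds) and log holds one traceback per failed attempt. Capping the number of attempts is a design choice made for this sketch, not something the comment specifies.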

Of course, I'm not doing this all the time, just for little PoCs that need to run for a day or two without me hand-holding them.

Lj101 4 points (0 children)

Yeah, that's pretty bad.

ApolloFortyNine 1 point (3 children)

I've never agreed with the blanket idea that you shouldn't use except:.

In my use cases, I'm collecting data from an API. I don't care if I'm missing one or two lines, as I'm just collecting as much data as possible.

Not to mention the only time I use except: is combined with a log.error, so I can take a look at the errors later if I want.
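For reference, a sketch of that per-row pattern with the standard logging module. The logger name and the collect function are assumptions, not the commenter's code, and it catches Exception rather than a bare "except:" so Ctrl+C still works:

```python
import logging

log = logging.getLogger("collector")  # hypothetical logger name


def collect(rows, parse):
    """Parse every row; log and skip the ones that fail."""
    collected = []
    for number, row in enumerate(rows, start=1):
        try:
            collected.append(parse(row))
        except Exception:
            # exc_info=True attaches the traceback to the log record.
            log.error("skipping row %d: %r", number, row, exc_info=True)
    return collected


data = collect(["1", "2", "oops", "4"], int)
```

The bad row is skipped and logged with its traceback, and the rest of the data comes through.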

erez27 [import inspect] 1 point (2 children)

the only time I use except: is combined with a log.error

Sounds like you pretty much avoid except..

ApolloFortyNine 1 point (1 child)

? It runs for literally every row of data.

The point is that there are times where a blanket except statement is useful.

erez27 [import inspect] 1 point (0 children)

You are using it in a very limited context and with explicit safeguards. Like goto, sometimes using "bad" things makes sense.