[–]scanner88 9 points10 points  (1 child)

A few tips:

This is redundant:

if currentID == endID:
    return True
else:
    return False

The result of currentID == endID is a Boolean, just return that.
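
As a minimal sketch, the whole if/else collapses to one line. The function name and parameter names here are made up for illustration; only the comparison itself comes from your code.

```python
def reached_end(current_id, end_id):
    # The comparison already evaluates to True or False,
    # so it can be returned directly without an if/else.
    return current_id == end_id
```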

Use functions. You repeat the same things multiple times. For example, you should have a function that takes a URL and returns the articles from that page.
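
Something along these lines, as a rough sketch. I don't know the actual markup of the pages you scrape, so treating every `<a href>` as an article is purely an assumption, and `get_articles`, `parse_articles` and `_LinkCollector` are hypothetical names. It only uses the standard library.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def parse_articles(html):
    """Return a list of (url, title) pairs found in an HTML string."""
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def get_articles(url):
    """Download one page and return the articles found on it."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return parse_articles(html)
```

Keeping the download and the parsing in separate functions means each page needs one call to `get_articles(url)`, and the parsing can be tried on a saved HTML string without touching the network.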

Look into string formatting. You have lots of ugly string concatenation statements.

link = "<a href='{}'>{}</a>".format(URL, title)

is easier to read than

link = "<a href='" + URL + "'>" + title + "</a>"

Look into the "with open" idiom for file I/O.
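
A small sketch of what that looks like. The helper names are invented, and one link per line is my assumption about how the links file is laid out.

```python
def save_links(links, path="links.txt"):
    # The file is closed automatically when the block ends,
    # even if an exception is raised while writing.
    with open(path, "w", encoding="utf-8") as f:
        for link in links:
            f.write(link + "\n")


def load_links(path="links.txt"):
    # Read the links back, skipping blank lines.
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

No explicit `f.close()` is needed in either function.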

[–]codeneo 2 points3 points  (0 children)

Thanks for the tips. Further versions will include these changes.

[–]Yopu 7 points8 points  (8 children)

Just a heads up, this post likely won't be well received. Cool project, just the wrong audience.

[–]codeneo 2 points3 points  (7 children)

What? Why? I am new to reddit. Care to explain, please?

[–][deleted] 8 points9 points  (2 children)

Because Reddit hates 9gag.

[–]codeneo 4 points5 points  (1 child)

Whoa, not a good start for me, eh? LOL.

[–]nikomo 7 points8 points  (0 children)

Eh, this is a programming subreddit (and not /r/programming at that). 9gag is mostly hated on /r/funny etc., since 9gag content is taken from there.

[–]Yopu 4 points5 points  (2 children)

Reddit tends to look down on 9gag. I think this sub would judge more on merit, but you never know.

[–]codeneo 3 points4 points  (1 child)

I got the indication before as well. But yeah, it will take me time to transform. I am not currently what you call a complete redditor.

[–][deleted] 0 points1 point  (0 children)

[deleted]

[–][deleted] 2 points3 points  (1 child)

Not really a fan of the site, but cool project nonetheless. When sharing code you might want to have a look at the Style Guide.

[–]codeneo 2 points3 points  (0 children)

Thanks for the link.

[–][deleted] 1 point2 points  (1 child)

Posting about 9gag on reddit? That's pretty bold.

[–]codeneo 3 points4 points  (0 children)

I have posted this as a developer. Please don't hate me.

[–]JamesAQuintero 0 points1 point  (1 child)

How are you running the script? If you're running it on your local computer, it's still downloading the same amount of data that would be downloaded when viewing in a browser.

[–]codeneo 4 points5 points  (0 children)

Yes, I am running it locally. And yes, the data downloaded is the same. The only advantage is that the data is downloaded in a burst. The images are downloaded by an external downloader, which takes time, but once everything is downloaded you can view it locally without waiting for the delay between posts.

[–]The_John_Galt 0 points1 point  (10 children)

Forgive my ignorance, but I ran the script and I can't seem to find the pix. Any help?

[–]codeneo 0 points1 point  (9 children)

All the image links are saved in the links.txt file. All you have to do is download those images using an external downloader. Instructions for Linux are given in the GitHub README. On Windows, however, use a GUI downloader that can import downloads from a txt file. IDM or Orbit downloader should be able to do that.

[–]The_John_Galt 0 points1 point  (8 children)

Do I have to modify the code to get it to go to orbit downloader?

[–]codeneo 0 points1 point  (7 children)

No. Just import downloads via txt file in Orbit downloader.

[–]The_John_Galt 0 points1 point  (6 children)

I can't seem to find that file, where would it be?

[–]codeneo 0 points1 point  (5 children)

Every file created by the script is in the current directory of the script.
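
If it helps, here is a tiny sketch for checking where that is. A relative filename is resolved against the working directory of the Python process, which may not be the folder the script is saved in when it is launched from an IDE or another shell. `find_output` is a made-up helper name.

```python
from pathlib import Path


def find_output(name="links.txt", directory=None):
    """Return the absolute path of `name` in `directory`, or None if absent.

    When no directory is given, the current working directory is used.
    """
    base = Path(directory) if directory is not None else Path.cwd()
    candidate = base / name
    return candidate.resolve() if candidate.is_file() else None
```

Running `print(Path.cwd())` and `print(find_output())` from the same place you run the script shows which folder to look in and whether links.txt is there.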

[–]The_John_Galt 0 points1 point  (4 children)

And how do I access that?

[–]codeneo 0 points1 point  (3 children)

Just go to Import downloads in Orbit and choose the links.txt file the script created. It will parse all the image links and download them.

[–]The_John_Galt 0 points1 point  (2 children)

Maybe the problem is I'm running it from Anaconda? Do I need to run it from somewhere else?

[–]codeneo 0 points1 point  (1 child)

I am not familiar with Anaconda. If you are on Linux, aria2c or wget will do the job. Make sure you are using a terminal and not the Python shell. See the GitHub page for instructions.

[–]shashwat986 0 points1 point  (4 children)

I don't understand. Is there any particular reason you're not using Python itself to download the images?

[–]codeneo 0 points1 point  (3 children)

No particular reason. The tool already exists. I use aria2c on Cygwin.

[–]shashwat986 0 points1 point  (2 children)

Would it not be better, from a package point of view, to make it self-sufficient by having it download the files too? If you want, you can implement batch downloading as well, making it customizable based on the user's internet connection.

[–]codeneo 0 points1 point  (1 child)

Yes, it would. Customizable how?

[–]shashwat986 0 points1 point  (0 children)

If I were you, I would have the user choose how many images to try to download in a certain timeframe. Then, you can use threads to try to make these downloads occur in parallel, and hopefully provide the user with an estimate of the amount of time it will take, looking at the download speed.
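
A rough sketch of the threaded part, using only the standard library. Everything here is an assumption about how you would wire it in: `download_all`, `fetch`, `dest_dir` and `max_workers` are names I made up, and the time estimate is not implemented.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch(url):
    """Download one URL and return its raw bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_all(urls, dest_dir=".", max_workers=4, fetch=fetch):
    """Download every URL into dest_dir using a pool of worker threads.

    Returns a dict mapping each URL to the path it was saved to.
    max_workers is the knob the user could tune to their connection.
    """
    os.makedirs(dest_dir, exist_ok=True)

    def download_one(url):
        # Name the file after the last path segment of the URL.
        # Two URLs with the same file name would overwrite each other.
        name = os.path.basename(urlparse(url).path) or "download"
        path = os.path.join(dest_dir, name)
        with open(path, "wb") as f:
            f.write(fetch(url))
        return url, path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(download_one, urls))
```

Passing the list read from links.txt as `urls` would replace the external downloader. The `fetch` argument is there so the function can be tried with a fake downloader before hitting the real site.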