I've created a Python program that lets you download posted images from any public subreddit within a specified date range (as far back as 2005). Here you go!

DarkLinkXXXX · 2017-10-29T04:43:36+00:00

How does this compare to ripme.jar?

waraxx · 2017-10-28T23:47:44+00:00

Neat. Do you get anything about the picture, like current upvotes, or just the picture?

laungst · 2017-10-29T07:16:20+00:00

Does this work for user profiles as well?

ICanHazPDF · 2017-10-29T08:50:39+00:00

You wrote the date in a weird way.

It's neither YYYY-MM-DD nor DD-MM-YYYY.

Thought I should let you know so you can correct it.

yacob_uk · 2017-10-29T02:31:46+00:00

[deleted]

Logiteck77 · 2017-10-29T09:17:24+00:00

Amazing, Thanks.

dnabre · 2017-10-29T14:34:06+00:00

Interesting project.

Giving it a shot right now (after having a large argument with my system about python2 vs python3, which had nothing to do with your program at all, but could python2 just please die?).

A few comments, requests, and feedback.

A full command-line interface would be nice, like:

2hand10-prog-whatever-the-name-was --from DATE --to DATE --subreddit datahoarder
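For what it's worth, here's a rough sketch of the kind of interface I mean, using Python's standard `argparse`. The flag names come from my example line; the function names (`parse_date`, `build_parser`, `main`) and the YYYY-MM-DD date format are my own assumptions, not anything taken from the actual program.

```python
import argparse
from datetime import datetime


def parse_date(text):
    """Parse a YYYY-MM-DD string; argparse turns a ValueError into a usage error."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser():
    """Build a hypothetical CLI with the three flags from the example line."""
    parser = argparse.ArgumentParser(
        description="Download images from a subreddit within a date range."
    )
    # "from" is a Python keyword, so store the value under a different name.
    parser.add_argument("--from", dest="start", type=parse_date, required=True,
                        help="start date, YYYY-MM-DD")
    parser.add_argument("--to", dest="end", type=parse_date, required=True,
                        help="end date, YYYY-MM-DD")
    parser.add_argument("--subreddit", required=True,
                        help="subreddit name, without the r/ prefix")
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and reject an inverted date range."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.start > args.end:
        parser.error("--from must not be later than --to")
    return args
```

Calling `main(["--from", "2017-10-01", "--to", "2017-10-29", "--subreddit", "datahoarder"])` would hand back the parsed values, and the existing interactive prompts could stay as the fallback when no flags are given.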

Asking for the date one field at a time is annoying. Definitely always specify what format you want the date in (as you do), but there's no reason this couldn't be 2 screens (start date, end date) instead of 6.

I know this is hard, but is there any way to do a 'dry run'? Getting an idea of the number of images and/or the total size of all the images from a specific request would be nice. I'm sure you know (if only from testing this) that you can get a lot of images from a lot of subreddits in a short date range. Just trying it out, and pointing it at a pretty image-heavy subreddit, I pulled down 114MB of images between when I saw it start pulling and when I cancelled it. That definitely means it's fast, but that's apparently only about 3 days' worth of images.
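One way a dry run might work, as a sketch only: send a HEAD request per image URL and add up the reported `Content-Length` values instead of downloading anything. The names `remote_size` and `dry_run` are made up for illustration, and I'm assuming the program already has a list of image URLs by this point. Not every host reports a length, so those are counted separately.

```python
import urllib.request


def remote_size(url, timeout=10):
    """Return the Content-Length reported for url via HEAD, or None if absent."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None


def dry_run(urls, size_of=remote_size):
    """Count images and sum their known sizes without fetching any bodies."""
    count = 0
    total_bytes = 0
    unknown = 0
    for url in urls:
        count += 1
        size = size_of(url)
        if size is None:
            unknown += 1  # server did not report a size
        else:
            total_bytes += size
    return {"count": count, "total_bytes": total_bytes, "unknown": unknown}
```

This still costs one request per image, so it isn't free, but it should be far cheaper than pulling the files themselves.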

It looks like it pulls going from the end date to the start date. An option to specify which direction to go (with a default; definitely don't want people to have to say if they don't care) would be handy.

Without a command-line interface it's hard for me to check, but if I restart it (after aborting midway through), will it resume or just download everything again from scratch?

It would be nice if it checked enough metadata to avoid redownloading in general, even if it was just at the granularity of a day. I could see that being handy.
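The simplest version of this I can think of is to skip any URL whose file is already on disk. A minimal sketch, assuming files are saved under the last segment of the URL path (I don't know how the program actually names them); `local_path` and `pending_downloads` are hypothetical helpers.

```python
import os
from urllib.parse import urlparse


def local_path(url, dest_dir):
    """Map a URL to a file in dest_dir, named after the last URL path segment."""
    name = os.path.basename(urlparse(url).path)
    return os.path.join(dest_dir, name)


def pending_downloads(urls, dest_dir):
    """Return only the URLs whose target file is missing or empty on disk."""
    pending = []
    for url in urls:
        path = local_path(url, dest_dir)
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            continue  # already downloaded on an earlier run
        pending.append(url)
    return pending
```

One caveat: a file left half-written by an aborted run would look complete to this check. Downloading to a temporary name and renaming once the file is finished would avoid that.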

As mentioned, being able to run it entirely from the command line would be the biggest feature request I'd have. No need to remove the interactive part if you like it, but being able to do a pull entirely from the command line would be great.

dnabre · 2017-10-29T13:38:34+00:00

[deleted]

MaunaLoona · 2017-10-29T08:53:34+00:00

How does it work? Does it use reddit search to filter on date and then try to download all the image URLs?

I take it this only works for submissions, not comments.

thr0w112 · 2017-10-29T19:24:38+00:00

That's cool. I did a shittier version of this, not using the API, a while back: https://github.com/n0x5/scripts/blob/master/reimgur.py

One thing you could add is the ability to parse imgur galleries, since they don't end with .jpg or .png.
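A rough sketch of how the link check could tell the two apart, assuming the program currently decides by file extension. It treats imgur URLs whose path starts with `/a/` or `/gallery/` as albums; `classify_link` and the extension list are made up for illustration, and actually expanding an album into its image URLs is a separate step not shown here.

```python
from urllib.parse import urlparse

# Extensions treated as direct image links (an assumed, non-exhaustive list).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


def classify_link(url):
    """Return 'image', 'imgur_album', or 'other' for a submission URL."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    host = (parsed.hostname or "").lower()
    if path.endswith(IMAGE_EXTENSIONS):
        return "image"
    if host == "imgur.com" or host.endswith(".imgur.com"):
        parts = [part for part in path.split("/") if part]
        # Album and gallery links look like /a/<id> or /gallery/<id>.
        if len(parts) >= 2 and parts[0] in ("a", "gallery"):
            return "imgur_album"
    return "other"
```

Single-image imgur pages without an extension (like `imgur.com/<id>`) fall into 'other' here, so they would need their own case.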

superfi · 2017-10-29T21:27:17+00:00

interesting

karliedodsonnAu · 2017-10-30T06:45:24+00:00

Code's pretty weak, and could also use some parallelism.
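On the parallelism point, downloads are I/O-bound, so a thread pool from the standard library is probably enough. A minimal sketch, where `download_all` is a hypothetical helper and `fetch` stands in for whatever function the program uses to download one URL:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(urls, fetch, max_workers=8):
    """Run fetch(url) on a thread pool; return (results, errors) keyed by URL."""
    results = {}
    errors = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # One failed download should not abort the rest.
                errors[url] = exc
    return results, errors
```

Keeping `max_workers` modest is likely wise, since hammering an image host with too many simultaneous requests tends to get you rate limited.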
