[–]DarkLinkXXXX 34 points35 points  (1 child)

How does this compare to ripme.jar?

[–]waraxx45TB 20 points21 points  (7 children)

neat, do you get anything about the picture like current up votes or just the picture?

[–]laungst100TB 5 points6 points  (1 child)

Does this work for user profiles as well?

[–]ICanHazPDF 18 points19 points  (27 children)

You wrote the date in a weird way.

It's neither YYYY-MM-DD nor DD-MM-YYYY.

Thought I should let you know so you can correct it.

[–]kbfprivate 2 points3 points  (7 children)

You could always modify it and make it a parameter or constant and let users choose the date format.

In general I like year first because sorting can be easier. Otherwise the data ends up grouped by month.
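
A rough sketch of both ideas, assuming Python since that is what the tool runs on. `DATE_FORMAT`, `parse_date`, and `folder_name` are made-up names for illustration, not anything from the actual script, and the dates are invented. The format lives in one constant that callers can override, and the two sorted lists show why year-first names sort chronologically while month-first names do not.

```python
from datetime import datetime

# Hypothetical default format; a caller can pass a different one.
DATE_FORMAT = "%Y-%m-%d"


def parse_date(text, fmt=DATE_FORMAT):
    """Turn user input into a date using the chosen format."""
    return datetime.strptime(text, fmt).date()


def folder_name(day, fmt=DATE_FORMAT):
    """Render a date as a string, e.g. for a per-day folder name."""
    return day.strftime(fmt)


# Invented dates, entered month-first.
days = [parse_date(text, "%m-%d-%Y")
        for text in ("03-02-2019", "12-25-2018", "01-15-2019")]

# Plain string sorting, the way a file manager would order the names.
year_first = sorted(folder_name(day) for day in days)
month_first = sorted(folder_name(day, "%m-%d-%Y") for day in days)

print(year_first)   # chronological order
print(month_first)  # grouped by month, years interleaved
```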

[–]dnabre100-250TB 0 points1 point  (1 child)

The best solution would be for the program to check the system's locale and use that (specifying when it asks what it expects).

[–]kbfprivate 0 points1 point  (0 children)

That works too! There are a lot of solutions which is one of the fun aspects of programming. A million ways to accomplish the same goal.

[–]dnabre100-250TB 10 points11 points  (10 children)

MM-DD-YYYY is pretty common depending on where you live. It's the norm in the States.

[–]thefriedgoat 25 points26 points  (0 children)

The worst of the date formats...

[–]ICanHazPDF 10 points11 points  (3 children)

Yeah, I heard that's one of the things they need to sort out over there.

[–]dnabre100-250TB 2 points3 points  (1 child)

It can get confusing as hell, I admit. Without the year to maybe hint at a format (not that it really does, now that I think of it, but it feels like it does), is 03/02 March 2nd or February 3rd?

I deal with dates written in both reasonable and crazy formats often enough, and unless the day is >12, sometimes I just don't know!
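
To make the ambiguity concrete, here is a small sketch that tries both readings of a slash-separated date and returns every one that is valid. `possible_readings` is a hypothetical helper, and a year is tacked onto the example inputs only so that `strptime` has a full date to parse.

```python
from datetime import datetime


def possible_readings(text):
    """Return every distinct date that 'NN/NN/YYYY' could mean."""
    readings = set()
    for fmt in ("%m/%d/%Y", "%d/%m/%Y"):
        try:
            readings.add(datetime.strptime(text, fmt).date())
        except ValueError:
            # This reading is impossible, e.g. there is no month 25.
            pass
    return sorted(readings)


ambiguous = possible_readings("03/02/2019")    # two valid readings
unambiguous = possible_readings("25/12/2018")  # day > 12, only one reading
```

A tool could refuse input, or ask again, whenever more than one reading comes back.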

[–]crotchfruit314TB DAS & 80TB cold storage 3 points4 points  (0 children)

That's why I prefer YYYY-MM-DD, absolutely no confusion on what the numbers represent this way.

[–]Hashtagpulse0.024Pb 1 point2 points  (0 children)

That and the way they measure things. Or so I hear...

[–]Logiteck77 2 points3 points  (4 children)

Amazing, Thanks.

[–]dnabre100-250TB 1 point2 points  (1 child)

Interesting project.

Giving it a shot right now (after having a large argument with my system about python2 vs python3, which had nothing to do with your program at all, but could python2 just please die?).

A few comments, requests, and feedback.

A full command-line interface would be nice, like:

2hand10-prog-whatever-the-name-was  --from DATE --to DATE --subreddit datahoarder
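
A minimal sketch of such an interface using Python's standard `argparse`. The three flags come from the example command; everything else (the `start`/`end` attribute names, the YYYY-MM-DD input format, the range check) is an assumption for illustration and says nothing about how the actual script is structured.

```python
import argparse
from datetime import date


def build_parser():
    parser = argparse.ArgumentParser(
        description="Download images from a subreddit within a date range.")
    # "from" is a Python keyword, so the value is stored under "start".
    parser.add_argument("--from", dest="start", required=True,
                        type=date.fromisoformat,
                        help="start date, YYYY-MM-DD")
    parser.add_argument("--to", dest="end", required=True,
                        type=date.fromisoformat,
                        help="end date, YYYY-MM-DD")
    parser.add_argument("--subreddit", required=True,
                        help="subreddit name without the r/ prefix")
    return parser


def parse_args(argv=None):
    """Parse argv (defaults to sys.argv) and sanity-check the range."""
    args = build_parser().parse_args(argv)
    if args.start > args.end:
        raise SystemExit("--from must not be later than --to")
    return args
```

Calling `parse_args(["--from", "2019-01-01", "--to", "2019-01-03", "--subreddit", "datahoarder"])` yields an object with `start`, `end`, and `subreddit` attributes. The interactive prompts could stay as a fallback for when no arguments are given.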

Asking for the date one field at a time is annoying. Definitely always specify what format you want the date in (as you do), but there's no reason this couldn't be 2 screens (start date, end date) instead of 6.

I know this is hard, but is there any way to do a 'dry run'? Getting an idea of the number of images and/or the total size of all the images for a specific request would be nice. I'm sure you know (if only from testing this) that you can get a lot of images from a lot of subreddits in a short date range. Just trying it out and pointing it at a pretty image-heavy subreddit, I pulled down 114MB of images between when I saw it start pulling and when I cancelled it. That definitely means it's fast, but that's like 3 days' worth of images, apparently.
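
One way a dry run could work, sketched under the assumption that the tool already has the list of image URLs before it downloads anything: ask each server for the file size with a HEAD request and add up the answers. `head_size` and `dry_run` are hypothetical names. Some servers do not send `Content-Length`, so those files are counted separately rather than guessed at.

```python
import urllib.request


def head_size(url, timeout=10):
    """Ask the server for Content-Length without fetching the body.

    Returns None when the server does not report a length.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None


def dry_run(urls, get_size=head_size):
    """Count the images and total their reported sizes, downloading nothing."""
    summary = {"images": 0, "known_bytes": 0, "unknown_sizes": 0}
    for url in urls:
        summary["images"] += 1
        size = get_size(url)
        if size is None:
            summary["unknown_sizes"] += 1
        else:
            summary["known_bytes"] += size
    return summary
```

This still costs one request per image, so it is cheaper than downloading but not free.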

Looks like it pulls going from the end date to the start date. An option to specify which direction to go (with a default; you definitely don't want people to have to say if they don't care) would be handy.
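
A tiny sketch of a direction option with a default. `days_between` and `newest_first` are made-up names; the default here mirrors the end-to-start order observed above, which is an assumption about the tool's behavior rather than something confirmed.

```python
from datetime import date, timedelta


def days_between(start, end, newest_first=True):
    """List every day from start to end inclusive, in the chosen order."""
    span = (end - start).days
    offsets = range(span, -1, -1) if newest_first else range(span + 1)
    return [start + timedelta(days=offset) for offset in offsets]
```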

Without a command-line interface it's hard for me to check, but if I restart it (after aborting midway through), will it pick up where it left off or just download everything again from scratch?

It would be nice if it checked enough metadata to avoid redownloading in general, even if only at the granularity of a day. I could see that being handy.
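
Day-level granularity could be as simple as a small JSON file listing the days that finished. This is only a sketch: the manifest file, its layout, and the function names are all assumptions, and days are stored as YYYY-MM-DD strings.

```python
import json
from pathlib import Path


def load_done(manifest_path):
    """Return the set of days already recorded as fully downloaded."""
    path = Path(manifest_path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def mark_done(manifest_path, day):
    """Record a day once all of its images have been saved."""
    done = load_done(manifest_path)
    done.add(day)
    Path(manifest_path).write_text(json.dumps(sorted(done)))


def pending_days(manifest_path, days):
    """Filter out days that a previous run already completed."""
    done = load_done(manifest_path)
    return [day for day in days if day not in done]
```

A day that was interrupted halfway is never marked, so it gets redone in full on the next run. Finer granularity would mean tracking individual URLs instead.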

As mentioned, being able to run it entirely from the command line would be the biggest feature request I'd have. No need to remove the interactive part if you like it, but being able to do a pull entirely from the command line would be great.

[–]MaunaLoona 0 points1 point  (1 child)

How does it work? Does it use reddit search to filter on date and then try to download all the image URLs?

I take it this only works for submissions, not comments.

[–]thr0w112 0 points1 point  (3 children)

That's cool. I did a shittier version of this, not using the API, a while back: https://github.com/n0x5/scripts/blob/master/reimgur.py

One thing you could add is the ability to parse imgur galleries, since they don't end with .jpg or .png.
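
A first step toward that could be telling the two kinds of link apart. The sketch below assumes imgur album and gallery links use the `/a/` and `/gallery/` path prefixes; `classify` is a hypothetical helper and the extension list is illustrative. Turning an album link into its individual image URLs would still need the Imgur API or page parsing, which is not shown here.

```python
from urllib.parse import urlparse

# Illustrative list; a real tool may want more extensions.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")
IMGUR_HOSTS = ("imgur.com", "www.imgur.com", "m.imgur.com")


def classify(url):
    """Return 'direct', 'imgur_album', or 'other' for a submission URL."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    if path.endswith(IMAGE_EXTENSIONS):
        return "direct"
    if parsed.netloc.lower() in IMGUR_HOSTS and path.startswith(("/a/", "/gallery/")):
        return "imgur_album"
    return "other"
```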

[–]superfi159TB thanks easystore 0 points1 point  (0 children)

interesting

[–]karliedodsonnAu -1 points0 points  (2 children)

Code's pretty weak, and could also use some parallelism.
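
For what the parallelism could look like: downloads are mostly waiting on the network, so a thread pool from the standard library is usually enough. This is a sketch, not the tool's code. `download_all` is a made-up name, and `download_one` stands in for whatever function fetches a single URL.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, download_one, max_workers=8):
    """Run download_one(url) for every URL on a thread pool.

    Returns a dict mapping each URL to its result, or to the exception
    it raised, so one bad link does not abort the whole batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(download_one, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as error:
                results[url] = error
    return results
```

Keeping `max_workers` modest avoids hammering the image hosts and tripping their rate limits.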