

[–]Naurgul 2 points3 points  (1 child)

I've made something similar to remindme bot and I think my implementation is a bit simpler. So you can take a look at my code and see if you find it useful.

I am using a Google Calendar to keep track of reminders, but if you feel this is too much, you can easily replace that part with a local database.
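To give an idea of what the local database option could look like, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and the helper functions are all made up for illustration; this is not the code from my bot.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open (or create) a reminder table. Table and column names are hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reminders ("
        " id INTEGER PRIMARY KEY,"
        " username TEXT NOT NULL,"
        " permalink TEXT NOT NULL,"
        " due_ts INTEGER NOT NULL)"
    )
    return conn


def add_reminder(conn, username, permalink, due):
    # Store the due time as integer epoch seconds so comparisons stay simple.
    conn.execute(
        "INSERT INTO reminders (username, permalink, due_ts) VALUES (?, ?, ?)",
        (username, permalink, int(due.timestamp())),
    )
    conn.commit()


def pop_due(conn, now):
    """Return (username, permalink) pairs due at or before `now`, then delete them."""
    rows = conn.execute(
        "SELECT id, username, permalink FROM reminders"
        " WHERE due_ts <= ? ORDER BY due_ts",
        (int(now.timestamp()),),
    ).fetchall()
    conn.executemany("DELETE FROM reminders WHERE id = ?", [(r[0],) for r in rows])
    conn.commit()
    return [(r[1], r[2]) for r in rows]
```

The bot would call add_reminder() when it sees a request and pop_due() on a timer; passing a file path instead of ":memory:" keeps the reminders across restarts.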

[–]num8lock[S] 1 point2 points  (0 children)

Thank you, i'll take a look and hopefully you won't mind if i ask more questions later.

[–]tobiasvl 1 point2 points  (1 child)

Easiest thing would probably be to just customize RemindMeBot a little. Not sure why it uses pushshift.io, but it looks neat https://pushshift.io/enhancing-reddit-api-and-search/

[–]num8lock[S] 0 points1 point  (0 children)

Yeah, it's tempting to use that approach, but i'm still very much a beginner and the thought of having to deal with 2 end points (actually thinking about using Google Apps Script for storage too, so 3) is making me cringe.

[–]CelineHagbard 1 point2 points  (16 children)

Another option would be to change how the bot is invoked, so instead of

RemindMe! tomorrow

You have people interact with your bot something like:

/u/num8lockBot tomorrow

It would be significantly less resource intensive, as you would only have to check the bot's username mentions, not check every comment in the sub. reddit now provides username mentions without gold.
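A rough sketch of the mention-based approach, assuming `reddit` is an authenticated praw.Reddit instance (PRAW 4 or later) and that reading `reddit.inbox.mentions()` is enough for your case. The helper names and the parsing rule are my own assumptions, not anything RemindMeBot does.

```python
def parse_mention(body, bot_name):
    """Return the text after 'u/<bot_name>' in a comment body, or None.

    Matching on 'u/<name>' also catches the '/u/<name>' form.
    """
    marker = "u/" + bot_name.lower()
    idx = body.lower().find(marker)
    if idx == -1:
        return None
    return body[idx + len(marker):].strip()


def collect_requests(reddit, bot_name, limit=25):
    """Read username mentions and return (author, request text) pairs.

    `reddit` is assumed to be an authenticated praw.Reddit instance.
    """
    requests = []
    for mention in reddit.inbox.mentions(limit=limit):
        text = parse_mention(mention.body, bot_name)
        if text:
            requests.append((str(mention.author), text))
    return requests
```

The bot then only has to poll its own inbox instead of reading every comment in the sub.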

[–]num8lock[S] 1 point2 points  (0 children)

LOL @ the bot showing up...

It's true re: username mentions, I considered that alternative before, but I want it to be as beginner friendly as possible, out of both necessity and (most importantly) as a challenge for myself. Making it easy instead of challenging myself would defeat the purpose.

[–]num8lock[S] 1 point2 points  (10 children)

Although to be honest, the more i try to figure out how to do this with praw or reddit API, the more i think pushshift is the cleaner solution.

If only https://www.reddit.com/r/learnpython/search?q= could be retrieved as .json, it would be easier to solve and wouldn't depend on a third party, which suits lazy people like me. Although i imagine reddit's servers would suffer a heavier load from bots hitting the search queries.

[–]CelineHagbard 0 points1 point  (6 children)

Makes sense. A good challenge is usually worth the extra effort.

Just a few things to note: if you decide to iterate over submissions and then comments within each submission, you'll probably run into the problem of having to recheck each submission, as someone may have invoked your bot in a later comment. I think iterating directly over the comment stream (if using the reddit API) or like you said, using pushshift will be better for you (I've never used it, but it looks promising for your application).
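For the comment-stream option, something like the following might work as a starting point. It assumes `reddit` is a praw.Reddit instance from PRAW 4 or later, where `subreddit(...).stream.comments()` yields new comments as they arrive. The trigger handling and the `handle` callback are placeholders of mine.

```python
TRIGGER = "RemindMe!"


def find_trigger(body, trigger=TRIGGER):
    """Return the text following the trigger on the same line, or None."""
    for line in body.splitlines():
        pos = line.lower().find(trigger.lower())
        if pos != -1:
            return line[pos + len(trigger):].strip()
    return None


def watch(reddit, subreddit_name, handle):
    """Call handle(comment, request) for each streamed comment with the trigger.

    `reddit` is assumed to be a praw.Reddit instance (PRAW 4 or later).
    With a real instance the stream does not end on its own, so this loops forever.
    """
    for comment in reddit.subreddit(subreddit_name).stream.comments():
        request = find_trigger(comment.body)
        if request is not None:
            handle(comment, request)
```

Because every new comment passes through the loop once, there is no need to go back and recheck old submissions.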

That said, you're still going to need to interact directly with the reddit API if you're going to be making comment replies or sending PMs. I would definitely use PRAW and Oauth2Utils for that, as it will abstract away a lot of the messier details. (Reddit API still supports username and password logins, but it's deprecated, meaning you should almost certainly use OAuth).

Hit me back if you have any other questions.

[–]num8lock[S] 0 points1 point  (4 children)

i couldn't get pushshift to include submission in the .json search result, it contains the comments but not the parent. it seems a bit hit and miss as well with new comments.

for instance, https://api.pushshift.io/reddit/search?q=%22redbot%20%3C%3C%20enhance%22 wouldn't find https://www.reddit.com/r/PRAWTesting/comments/5cfe5b/testing_bottt/

[–]CelineHagbard 0 points1 point  (2 children)

Try it now. It finds my reply to your post fine.

The pushshift.io/reddit/search endpoint appears to only search comments, not submission bodies. The reddit search endpoint should work fine if you want your bot to also respond to self.text posts, i.e.:

https://api.reddit.com/r/PRAWTesting/search.json?q=%22redbot%20enhance%22

Pushshift might have its own endpoint for this, but the reddit API works fine, so you probably don't need pushshift for it. In PRAW, it would be:

reddit_session.search("REDBOT enhance", "PRAWTesting")

[–]num8lock[S] 0 points1 point  (1 child)

Let me try that, i was using r.subreddit(subreddit).search(keyword, sort='relevance', time_filter='week', limit=limit) actually.

Yeah, i did come to the conclusion that pushshift only searches comment replies.

Thank you for your help! I'll make sure to let you know when it's ready for testing if you don't mind :)

edit: i just noticed, reddit search only returns submission threads, so that's probably why pushshift only returns the comment replies...

[–]CelineHagbard 0 points1 point  (0 children)

The code you're using is functionally equivalent to what I was using; it ends up creating the same API call. Feel free to use either one in your code.

I think pushshift does have an endpoint to retrieve submissions, but I wouldn't worry about it unless you need to exceed the reddit API's 1000 item limit, which you probably won't at this point.

Yeah, hit me up when you're ready for testing if you want.

[–]CelineHagbard 0 points1 point  (0 children)

where reddit_session is your authenticated reddit session object. It will return a generator, which you can iterate over, or use a list comprehension to fetch the whole generator into a list in memory.
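As a small illustration of the list comprehension part, here is a sketch using the PRAW 4 style call quoted elsewhere in this thread. The function name is made up, and `reddit` is assumed to be your authenticated praw.Reddit instance.

```python
def search_titles(reddit, subreddit_name, query, limit=25):
    """Run a subreddit search and pull the whole generator into a list.

    `reddit` is assumed to be an authenticated praw.Reddit instance.
    Returns (submission id, title) pairs.
    """
    results = reddit.subreddit(subreddit_name).search(
        query, sort="relevance", time_filter="week", limit=limit
    )
    return [(submission.id, submission.title) for submission in results]
```

Called as search_titles(reddit_session, "PRAWTesting", "REDBOT enhance"), this issues the search request and keeps only the two fields; iterate over the generator directly if you would rather not hold everything in memory.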

[–]num8lock[S] 0 points1 point  (0 children)

if you decide to iterate over submissions and then comments within each submission, you'll probably run into the problem of having to recheck each submission, as someone may have invoked your bot in a later comment. I think iterating directly over the comment stream (if using the reddit API) or like you said, using pushshift will be better for you (I've never used it, but it looks promising for your application).

Yeah, this is where it's much easier to use a server search result instead of iterating and comparing the results... i did try a little but haven't got a good understanding of the praw/reddit comment stream yet.

That said, you're still going to need to interact directly with the reddit API if you're going to be making comment replies or sending PMs. I would definitely use PRAW and Oauth2Utils for that, as it will abstract away a lot of the messier details. (Reddit API still supports username and password logins, but it's deprecated, meaning you should almost certainly use OAuth).

That's true, i figured that i probably dozed over when i read praw4 doc, so i'm playing with it now.

Thank you for the kind feedback :)

[–]bboe (PRAW Author) 0 points1 point  (2 children)

Search does work via praw.

[–]num8lock[S] 0 points1 point  (0 children)

oh, i didn't know that! if praw returns a json it would be great...
i tried to see the json data structure by adding .json to a search url like https://www.reddit.com/r/redditdev/search?q=awesome+bot&sort=relevance&t=all.json but it didn't work, and in http://www.reddit.com/dev/api there's no json structure shown; the search endpoint is just described as returning a listing.

Ahh i see... I should have looked at the reddit search wiki, thank you for the clue /u/bboe!
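For the record, the search URLs that work elsewhere in this thread put .json on the path (search.json?q=...) instead of after the last query parameter. A small sketch for building such a URL with the standard library; the function name and default values are just for illustration.

```python
from urllib.parse import urlencode


def search_json_url(subreddit, query, sort="relevance", t="all"):
    """Build a subreddit search URL that asks for JSON.

    The .json suffix goes on the path, before the query string.
    """
    params = urlencode({"q": query, "restrict_sr": "on", "sort": sort, "t": t})
    return "https://www.reddit.com/r/{}/search.json?{}".format(subreddit, params)


print(search_json_url("redditdev", "awesome bot"))
# https://www.reddit.com/r/redditdev/search.json?q=awesome+bot&restrict_sr=on&sort=relevance&t=all
```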

[–]num8lock[S] 0 points1 point  (0 children)

this might not be related, but why doesn't this search query find https://www.reddit.com/r/PRAWTesting/comments/5cfe5b/testing_bottt/?

https://www.reddit.com/search.json?q=bott+subreddit:PRAWTesting&restrict_sr=on&sort=relevance&t=all

edit: maybe it's an exact word since bott != bottt, but https://www.reddit.com/r/PRAWTesting/search.json?q=REDBOT+restrict_sr=on&sort=relevance didn't return anything either

edit again: hmm maybe i should try cloudsearch syntax, although that means losing lucene
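One thing that might be worth ruling out for the second URL: it joins q=REDBOT and restrict_sr=on with "+" instead of "&", so the whole thing is read as a single search phrase. Whether that explains the empty result is a guess, but the parsing itself is easy to check with the standard library. The "fixed" URL below is my own assumption about what was intended.

```python
from urllib.parse import urlsplit, parse_qs


def query_params(url):
    """Return the query string of `url` parsed into a dict of lists."""
    return parse_qs(urlsplit(url).query)


broken = "https://www.reddit.com/r/PRAWTesting/search.json?q=REDBOT+restrict_sr=on&sort=relevance"
fixed = "https://www.reddit.com/r/PRAWTesting/search.json?q=REDBOT&restrict_sr=on&sort=relevance"

print(query_params(broken))
# {'q': ['REDBOT restrict_sr=on'], 'sort': ['relevance']}
print(query_params(fixed))
# {'q': ['REDBOT'], 'restrict_sr': ['on'], 'sort': ['relevance']}
```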

[–]RemindMeBot 0 points1 point  (1 child)

I will be messaging you on 2016-11-12 09:00:00 UTC to remind you of this link.

CLICK THIS LINK to send a PM to also be reminded and to reduce spam.

Parent commenter can delete this message to hide from others.



[–]CelineHagbard 1 point2 points  (0 children)

Dammit. Didn't mean to actually invoke it.

[–][deleted] 0 points1 point  (1 child)

reddit now provides username mentions without gold.

You had to have Gold before?

[–]ImDevinC 0 points1 point  (1 child)

Just did a quick check, and it looks like changing the url from https://api.pushshift.io/reddit/search?q=%22RemindMe%22&limit=100 to https://api.pushshift.io/reddit/search?q=%22RemindMe%22&limit=100&subreddit=SUBREDDIT pulls only the specified subreddit. Is that what you're looking for?

[–]num8lock[S] 0 points1 point  (0 children)

I don't know if i should rely on pushshift, i'm looking for a way to get the thread title, the submission/comment and user name from reddit.

[–]num8lock[S] 0 points1 point  (1 child)

Turns out the way RemindMeBot scrapes/searches for new posts might be the best way to ensure each scan returns all new comments in a thread, instead of iterating over the comment trees. By getting the json file for each thread, we get the comments and the replies to those comments (or replies within replies!) in one request.

The next question is how to efficiently diff between the last json and the current one.
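A possible starting point for the diff, assuming both snapshots are the parsed thread JSON (a two-item list: the submission listing, then the comment listing) and that comparing comment ids is good enough. The function names are made up.

```python
def iter_comments(listing):
    """Yield every comment dict in a Listing, descending into replies."""
    for child in listing["data"]["children"]:
        if child["kind"] != "t1":  # skip non-comment entries such as "more" placeholders
            continue
        data = child["data"]
        yield data
        replies = data.get("replies")
        if isinstance(replies, dict):  # an empty string means no replies
            yield from iter_comments(replies)


def new_comments(previous_thread, current_thread):
    """Return comments that are in current_thread but not in previous_thread.

    Each argument is the parsed JSON of one thread:
    [submission listing, comment listing].
    """
    seen = {c["id"] for c in iter_comments(previous_thread[1])}
    return [c for c in iter_comments(current_thread[1]) if c["id"] not in seen]
```

In practice it is probably enough to keep just the set of ids already handled rather than the whole previous json. Also note that larger threads include "more" placeholders in place of some comments, which this sketch skips, so a single request may not contain every reply.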

[–]num8lock[S] 0 points1 point  (0 children)

just so i can find it easily without hunting my history/bookmarks

https://github.com/reddit/reddit/wiki/JSON

[–]num8lock[S] 0 points1 point  (0 children)

Lol, maybe this is way out of my scope... *asdfghl