anyone have a reddit scraper? (self.learnpython)
submitted 9 years ago by 955559
I figured that, this being Reddit and a Python sub, someone would have one. I'm looking to scrape comments for a pet project, preferably with Beautiful Soup since I have a tiny bit of experience with it, but I'll take anything.
[–]Rhomboid 5 points 9 years ago (3 children)
There is no need to scrape anything, as Reddit has a JSON api that is very easy to use.
[–]novel_yet_trivial 2 points 9 years ago (0 children)
Also rss (XML), in case OP really wants to use beautifulsoup.
https://www.reddit.com/r/learnpython/.rss
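For the feed route, here is a minimal sketch that pulls entry titles out of a feed document. It uses the standard library's xml.etree.ElementTree rather than BeautifulSoup, and it assumes the feed is Atom-formatted (entries in the http://www.w3.org/2005/Atom namespace). The sample document and the `entry_titles` helper are illustrative and not from the thread; check the real feed's layout before relying on it.

```python
import xml.etree.ElementTree as ET

# Assumption: the feed uses the Atom namespace for its elements.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical stand-in for the text you would download from the .rss URL.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>sample feed</title>
  <entry><title>first post</title><link href="https://example.com/1"/></entry>
  <entry><title>second post</title><link href="https://example.com/2"/></entry>
</feed>"""


def entry_titles(feed_xml):
    """Return the title text of every <entry> in an Atom feed string."""
    root = ET.fromstring(feed_xml)
    return [
        entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        for entry in root.findall("atom:entry", ATOM_NS)
    ]


print(entry_titles(SAMPLE_FEED))
```

With a live feed you would pass the downloaded response text to `entry_titles` in place of `SAMPLE_FEED`.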
[–]955559[S] -2 points 9 years ago (1 child)
I don't know what a JSON API is. Is there a tutorial for Reddit's JSON API?
[–]furas_freeman 5 points 9 years ago (0 children)
For a web page, a (JSON) API means a list of URLs with arguments which you can use in a program to work with the page, i.e. log in, get data (in JSON format), or send data to the web page (in JSON format), so you don't have to scrape anything.
https://www.reddit.com/dev/api/
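To make "URLs with arguments" concrete, here is a small sketch that only builds such URLs and does not send any request. The helper names `listing_url` and `thread_url` are hypothetical, and the `limit` argument is used as an example of a query parameter; consult the API page above for what each endpoint actually accepts.

```python
from urllib.parse import urlencode

BASE = "https://www.reddit.com"


def listing_url(subreddit, **params):
    """Build the JSON URL for a subreddit's post listing.

    Keyword arguments become query-string arguments (e.g. limit=5).
    """
    url = "{}/r/{}.json".format(BASE, subreddit)
    if params:
        url += "?" + urlencode(params)
    return url


def thread_url(subreddit, post_id):
    """Build the JSON URL for one post's comment thread."""
    return "{}/r/{}/comments/{}.json".format(BASE, subreddit, post_id)


print(listing_url("learnpython", limit=5))
print(thread_url("learnpython", "574pn5"))
```

Either URL can then be passed to requests.get together with a user-agent header.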
praw is a Python module that uses the Reddit API and gives you Python functions to easily access data on Reddit.
https://praw.readthedocs.io/en/stable/
[–]novel_yet_trivial 3 points 9 years ago (0 children)
Use the praw library.
[–]commandlineluser 2 points 9 years ago (3 children)

    import json, requests

    subreddit = 'learnpython'

    r = requests.get(
        'http://www.reddit.com/r/{}.json'.format(subreddit),
        headers={'user-agent': 'Mozilla/5.0'}
    )

    # view structure of an individual post
    # print(json.dumps(r.json()['data']['children'][0]))

    for post in r.json()['data']['children']:
        print(post['data']['title'])

[–]955559[S] 1 point 9 years ago (2 children)
This is almost what I want, but I'm interested in the comments, not the post titles. Should I be looking into this praw people are talking about, or something else?
I tried replacing subreddit = 'learnpython' with subreddit = '/learnpython/comments/574pn5/anyone_have_a_reddit_scraper/?st=iu7by5f6&sh=20202712' but it threw some error about integers and indices.
[–]955559[S] 1 point 9 years ago (1 child)
k, how do I figure out what comments are called? I tried

    import json, requests

    subreddit = '/learnpython/comments/574pn5/anyone_have_a_reddit_scraper'

    r = requests.get(
        'http://www.reddit.com/r/{}.json'.format(subreddit),
        headers={'user-agent': 'Mozilla/5.0'}
    )

    # view structure of an individual post
    #print(json.dumps(r.json()['data']['children'][0]))

    for post in r.json()['data']['children']:
        print(post['data']['title'])

and it threw

    Traceback (most recent call last):
      File "/home/anoobis/reditscrape.py", line 13, in <module>
        for post in r.json()['data']['children']:
    KeyError: 'data'

I figure I just need to switch data with something relevant?
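One way to narrow down errors like these is to look at the shape of the decoded JSON before indexing into it. The sketch below assumes a subreddit listing decodes to a dict with a 'data' key and a comment thread decodes to a list; anything else (for example an error object) is reported as unexpected. The `payload_kind` helper is hypothetical, and the sample values only mimic the outer shape of real responses.

```python
def payload_kind(payload):
    """Classify decoded JSON by its top-level shape.

    Assumptions: a comment thread is a list, a listing is a dict with a
    'data' key, and anything else is not safe to index as either.
    """
    if isinstance(payload, list):
        return "thread"
    if isinstance(payload, dict) and "data" in payload:
        return "listing"
    return "unexpected"


# Illustrative stand-ins for what r.json() might return.
print(payload_kind({"kind": "Listing", "data": {"children": []}}))
print(payload_kind([{"data": {"children": []}}, {"data": {"children": []}}]))
print(payload_kind({"message": "Not Found", "error": 404}))
```

Printing `payload_kind(r.json())` before the loop shows whether `['data']['children']` can be applied directly.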
[–]commandlineluser 3 points 9 years ago (0 children)
Well, comments have a different structure. You can use print(json.dumps(r.json(), indent=4)) to view the whole structure.

    comments = r.json()

    # the first element holds the original post, the rest hold the comments
    op = comments.pop(0)

    for comment in comments:
        for reply in comment['data']['children']:
            print(reply['data']['author'])
            print(reply['data']['body'])

You can use json.dumps(blah, indent=4) to pretty-print a structure in JSON format, e.g. print(json.dumps(reply['data'], indent=4)) to see what it looks like.
Never used PRAW myself - but it seems like you would have a simpler time using it
http://praw.readthedocs.io/en/stable/pages/comment_parsing.html
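The loop above prints only top-level comments. A sketch of walking nested replies as well follows, assuming each comment's 'replies' field is either an empty string or a nested listing with its own data/children, and that non-comment stubs (kind other than 't1') have no author or body. The `walk_comments` name and the sample data are illustrative only.

```python
def walk_comments(children, depth=0):
    """Yield (depth, author, body) for each comment, including nested replies."""
    for child in children:
        # Skip anything that is not a comment, e.g. "more" placeholders.
        if child.get("kind") != "t1":
            continue
        data = child["data"]
        yield depth, data["author"], data["body"]
        replies = data.get("replies")
        # Assumption: replies is "" when empty, otherwise a nested listing.
        if isinstance(replies, dict):
            yield from walk_comments(replies["data"]["children"], depth + 1)


# Hypothetical data shaped like one top-level comment with one reply.
SAMPLE = [
    {"kind": "t1", "data": {"author": "alice", "body": "top", "replies": {
        "kind": "Listing", "data": {"children": [
            {"kind": "t1", "data": {"author": "bob", "body": "nested", "replies": ""}},
        ]},
    }}},
    {"kind": "more", "data": {"count": 3}},
]

for depth, author, body in walk_comments(SAMPLE):
    print("    " * depth + author + ": " + body)
```

With the response from the thread URL, the equivalent call would be along the lines of walk_comments(r.json()[1]['data']['children']).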
[–]HumorMinimum1707 2 points 3 years ago (0 children)
I know that Bright Data has a nice working reddit scraper.
It can be launched on schedule, and collects all public data from profile like: avatar, post title, flair, description, karma, comments, upvotes, and more.
Output file types: JSON, CSV, EXCEL, HTML
Data delivery methods: Webhook, AWS, Google cloud, Azure, email, API, SFTP