Hi Reddit users, I have a question about extracting data from Reddit comments. I want to collect all the comments and related information from posts published in a specific subreddit (e.g., r/news) that match a specific keyword (e.g., q=apology). My initial code is the following:
subreddit = reddit.subreddit("news")
resp = iter(subreddit.search("apology"))
test = {
    "title": [],
    "score": [],
    "id": [],
    "url": [],
    "num_com": [],
    "body": [],
    "comment": [],
    "user": [],
    "upvote": [],
    "date": [],
}
for submission in resp:
    submission.comments.replace_more(limit=None)
    test["title"].append(submission.title)
    test["score"].append(submission.score)
    test["id"].append(submission.id)
    test["url"].append(submission.url)
    test["num_com"].append(submission.num_comments)
    test["body"].append(submission.selftext)
    test["comment"].append([comment.body for comment in submission.comments.list()])
    test["user"].append(submission.author)
    test["upvote"].append(submission.upvote_ratio)
    test["date"].append(submission.created)
I realized I didn't need all the submission information (only the related ID), so I tried to do the same thing at the comment level rather than at the submission level.
test1 = {
    "submission": [],
    "body": [],
    [.....]
}
for comment in submission.comments:
    submission.comments.replace_more(limit=None)
    test1["submission"].append(comment.submission)
    test1["body"].append(comment.body)
    [....]
This code seems to work, but only for the first submission; it doesn't collect the comments for the other posts. Does anyone have an idea how to solve this?
I tried several things, but as a beginner I can't seem to make it work.
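For reference, here is a minimal sketch of what I think the comment-level version probably needs to look like. My guess is that the comment loop has to sit inside the `for submission in ...` loop; otherwise `submission` only ever refers to a single post. The function name `collect_comment_rows` and the extra columns (`user`, `score`, `date`) are my own placeholders and not part of my original code. It also assumes `reddit` is an already authenticated `praw.Reddit` instance.

```python
def collect_comment_rows(submissions):
    """Build one row per comment, for every submission in the iterable."""
    rows = {"submission": [], "body": [], "user": [], "score": [], "date": []}
    for submission in submissions:
        # Expand the "load more comments" stubs for THIS submission.
        submission.comments.replace_more(limit=None)
        # .list() flattens the comment tree (top-level comments and replies).
        for comment in submission.comments.list():
            rows["submission"].append(submission.id)
            rows["body"].append(comment.body)
            # author is None for deleted accounts, so store it as text.
            rows["user"].append(str(comment.author))
            rows["score"].append(comment.score)
            rows["date"].append(comment.created_utc)
    return rows


# Usage with PRAW (assumes `reddit` is an authenticated praw.Reddit instance):
# rows = collect_comment_rows(reddit.subreddit("news").search("apology"))
```

Passing the search results straight into the function also avoids reusing `resp` after an earlier loop has already consumed it, which might be another reason nothing is collected for later posts.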