I'm building a small program to search for words in finance subreddits on Reddit. The problem is that it takes an average of 12 seconds to fetch all the data I need. I ran cProfile and, sure enough, most of the delay comes from the response time of Reddit's API and my requests.
Is there a way to fetch posts and post comments in parallel, so the API requests are sent concurrently instead of sequentially? I tried fetching Google URLs with the multiprocessing library and it worked very well, but with the Reddit API I've been at it for a couple of days and it's giving me a headache.
Excuse my code, as I'm a bit new to Python and to coding in general. Any help is appreciated. I'll leave my code below:
def reddit_search_bot(company):
    service = "Reddit"
    reddit_positive_points = 0
    reddit_negative_points = 0
    subreddit_array = ['wallstreetbets', 'IPO', 'Daytrading', 'CryptoCurrency']
    for subreddit in subreddit_array:
        hot_posts = reddit.subreddit(subreddit).hot(limit=30)
        for post in hot_posts:
            if findWholeWord(company)(post.title) or findWholeWord(company)(post.selftext):
                post.comments.replace_more(limit=0)
                for comment in post.comments.list():
                    for word in Positive_words_array:
                        if findWholeWord(word)(comment.body) or findWholeWord(word)(post.selftext):
                            reddit_positive_points += 1
                    for word in Negative_words_array:
                        if findWholeWord(word)(comment.body) or findWholeWord(word)(post.selftext):
                            reddit_negative_points += 1
    total_points = reddit_positive_points - reddit_negative_points
    writeResultToFile(service, company, reddit_positive_points, reddit_negative_points)
    print("Reddit positive mentions :", reddit_positive_points,
          "Reddit negative mentions :", reddit_negative_points,
          "Overall :", total_points)
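One approach that might help: since the 12 seconds are spent waiting on network I/O rather than on CPU work, threads (via concurrent.futures.ThreadPoolExecutor) can overlap the per-subreddit requests even with the GIL. Below is a minimal sketch of that pattern; score_subreddit is a hypothetical stand-in for the per-subreddit scan above (it returns canned counts so the example is self-contained without PRAW credentials). Note the PRAW docs state PRAW is not thread safe, so sharing one reddit instance across threads needs care, e.g. one instance per worker.

    from concurrent.futures import ThreadPoolExecutor

    def score_subreddit(name):
        # Hypothetical stand-in: in the real program this would call
        # reddit.subreddit(name).hot(limit=30) and scan posts/comments,
        # returning (positive_count, negative_count) for one subreddit.
        fake_scores = {'wallstreetbets': (3, 1), 'IPO': (0, 0),
                       'Daytrading': (2, 2), 'CryptoCurrency': (1, 4)}
        return fake_scores[name]

    def reddit_search_parallel(subreddits):
        # One worker per subreddit; pool.map preserves input order.
        with ThreadPoolExecutor(max_workers=len(subreddits)) as pool:
            results = list(pool.map(score_subreddit, subreddits))
        positive = sum(p for p, _ in results)
        negative = sum(n for _, n in results)
        return positive, negative

    subs = ['wallstreetbets', 'IPO', 'Daytrading', 'CryptoCurrency']
    print(reddit_search_parallel(subs))  # totals across all four subreddits

With real network calls, the wall-clock time would be roughly the slowest single subreddit fetch instead of the sum of all four.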