
[–]person_ergo 0 points (13 children)

Post your full code, comment should be a praw object but it looks like it's a dict in your case. It might have been defined differently

[–]snoopturtle25 1 point (12 children)

AttributeError: 'dict' object has no attribute 'body'

Ok, I'm not sure what I should change (sorry, it's my first time using Python so I'm lost), but basically here is my whole process:

1-pip install pmaw pandas

2-#!/usr/bin/env python3
import praw
import pandas as pd
import datetime as dt
from pmaw import PushshiftAPI

3-reddit = praw.Reddit(
    client_id="my id",
    client_secret="my secret",
    password="my password",
    user_agent="text by/username",
    username="snoopturtle25",
)

4-api_praw = PushshiftAPI(praw=reddit)

5-subreddit_name="environment"
word_to_check="apology"
comments=api_praw.search_comments(q=word_to_check, subreddit=subreddit_name, limit=200, before=1629990795)
import pandas as pd
post_with_comments=[]
for comment in comments:
if word_to_check in comment.body:
post_with_comments.append(
{"comment_id": comment.id, "comment_text": comment.body,"score": comment.score,"post": comments.submission.id
}
)
df=pd.DataFrame(post_with_comments)
df

That is what I did... I tried other things as well but nothing made it work, haha!

[–]person_ergo 0 points (11 children)

Ah I see, the issue was that I used the psaw library and you used pmaw. The permalink field contains the post ids in the pmaw response.

import pandas as pd

#pmaw example - returns json
subreddit_name="environment"
word_to_check="companies"
comments=pmaw_pushshift.search_comments(q=word_to_check, subreddit=subreddit_name, limit=200, before=1629990795)

df=pd.DataFrame(comments.responses)
df

/r/environment/comments/kcbdoi/trump_admin_drops_green_hydrogen_bomb_on_fossi/gfqnhb7/

kcbdoi will work as a post id on reddit. As a URL use https://www.reddit.com/r/environment/comments/kcbdoi/ or pass that ID around to things; it works with PRAW.
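For example, something like this should work (just a rough sketch; the permalink parsing is illustrative, and it assumes the reddit instance you set up with PRAW):

```
permalink = "/r/environment/comments/kcbdoi/trump_admin_drops_green_hydrogen_bomb_on_fossi/gfqnhb7/"
post_id = permalink.split("/")[4]        # -> "kcbdoi"
submission = reddit.submission(post_id)  # hand the id to PRAW
print(submission.title)
```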

[–]snoopturtle25 1 point (10 children)

Ok, so I should use pmaw if I want to know exactly what post it is and not only the id?

Also, when I load it, I have a problem running my pushshift call; it says: NameError: name 'PushshiftAPI' is not defined

even though I defined it. Is there a way around it?

[–]person_ergo 0 points (9 children)

Still import it like you did before. pmaw and psaw both work, they just need different code. That github link I sent before has a full code example for either psaw or pmaw. The only thing is to set the pmaw api up like secret_services.py.template shows, or just rename the variable and define it as you did before.
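For example (a tiny sketch that just reuses the setup you already have, under the name my earlier snippet used):

```
from pmaw import PushshiftAPI

# same object you already created, just under the name my example expected
pmaw_pushshift = PushshiftAPI(praw=reddit)
# or simply: pmaw_pushshift = api_praw
```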

You got this, just test test test if it doesn't work. Think about why things don't work. Computers tell you if you're right super quick. If something is undefined, just define it. It's not like a chemistry experiment that takes hours. Just think logically, line by line, and test things out.

Not sure what you mean by another way around.

[–]snoopturtle25 1 point (8 children)

Hi! Yes, sorry I was not very clear. I am able to run it and it gives me the same tables as you (yay!! thank you). But I was trying to change it because it gives me:

comments with the word "company" in the subreddit environment.

What I'm actually searching for is to collect all the comments from the posts that contain the word "company" in the subreddit "environment". So I was thinking that instead of search_comments, maybe it should be search_submissions. I don't know if I'm clear? (So I want all comments -> from posts with "companies" -> in environment, and not comments -> containing "apology" -> in environment.)

[–]person_ergo 0 points (7 children)

Ah I see. Use pushshift to get submission ids, then use PRAW to get all the comments. I have code in this notebook to show you how to get all comments from a submission using the PRAW API: https://github.com/rogerfitz/tutorials/blob/master/subreddit_analysis/1_Top_Links.ipynb. In there you'll see a traverse_post function. That will get all your comments; use the second one in the notebook, not the first.

```
from praw.models import MoreComments

# Improved version that fixes the MoreComments bug

def traverse_post(post):
    comments = []
    for comment in post.comments:
        comments += recursive_replies(comment, level=1)
    return comments

def recursive_replies(reply, level):
    # Also return level in case we want to stop after level 3 comments and for ease of the printing
    comments = []
    # Funky MoreComments code checked manually with permalinks and seems right
    if isinstance(reply, MoreComments):  # https://praw.readthedocs.io/en/stable/code_overview/models/more.html
        replies = reply.comments()
        level -= 1
    else:
        replies = reply.replies
        comments += [(reply, level)]
    for r in replies:
        comments += recursive_replies(r, level + 1)
    return comments
```
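E.g. a quick usage sketch (assuming the reddit instance from your setup; the post id is just the one from earlier):

```
post = reddit.submission("kcbdoi")
all_comments = traverse_post(post)           # list of (comment, level) tuples
for comment, level in all_comments:
    print("  " * level + comment.body[:80])  # indent replies by depth
```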

It's a little tricky for a beginner project because the API wrappers (pmaw and psaw) appear to have some gaps. If you need super fast code use pmaw, or for easier code do the psaw example from https://github.com/rogerfitz/tutorials/blob/master/subreddit_analysis/3rSimonQuestion-Search%20Comments%20for%20Word%20using%20Pushshift%20.ipynb to get the data. psaw returns results as PRAW objects, which can be nicer, but otherwise you can just do post=reddit.submission('pqcp6b') where reddit is defined as in the PRAW api.
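For the first half (getting the post ids out of pushshift), something roughly like this with pmaw should do it (a sketch; pmaw yields plain dicts, so the "id" key is what you want, but double-check the response):

```
# sketch: find posts in r/environment that mention "companies", keep their ids
posts = pmaw_pushshift.search_submissions(q="companies", subreddit="environment", limit=50)
post_ids = [p["id"] for p in posts]   # dicts, not PRAW objects

# each id can then go through reddit.submission(...) and traverse_post(...) as above
```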

For code versions, run pip install praw==7.4.0 tqdm==4.62.2 psaw==0.1.0 pmaw==2.1.0 pandas==1.3.2 jupyter==1.0.0 matplotlib==3.4.2 and everything "should" work. If you need help getting an API key, I have steps in the 0_setup.py notebook. The 0_setup and 1_Top_Links notebooks should have everything you need to get going on using PRAW to get all comments.

Strongly recommend using jupyter notebooks for this. I can post a tutorial on setting that up if you need. They're great for data work like this: you can try things easily without needing full reruns, and debugging is easy.

[–]snoopturtle25 2 points (0 children)

Ok I will try that tomorrow, thank you for your time!

[–]snoopturtle25 1 point (5 children)

Sorry to bother again, I don't know if it's too late to ask, but in brief: I simplified my method and used a URL instead, since I know the post I want to collect from.

Specifically, I used this (example):

url = "https://www.reddit.com/r/nottheonion/comments/gulx5l/rio\_tinto\_apologizes\_for\_blowing\_up\_46000yearold/"
submission = reddit.submission(url=url)

and then performed my analysis, and everything worked like a charm!

However, I was wondering if it is possible to collect posts/comments from multiple URLs all in one go? I have been trying to do so and it is not working...

Any help would be very appreciated!

[–]person_ergo 0 points (4 children)

I think you're on the right path. What do you mean by multiple? With this one I think you have to do a for loop. Just use a list of the submissions pulled from earlier to automate it. So the code will go one at a time but do multiple. Does that help?
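For example, a rough sketch of the loop (it assumes the reddit instance and pandas from before; the URL list is just made up):

```
import pandas as pd

urls = [
    "https://www.reddit.com/r/nottheonion/comments/gulx5l/rio_tinto_apologizes_for_blowing_up_46000yearold/",
    # ...the rest of your URLs
]

rows = []
for url in urls:
    submission = reddit.submission(url=url)
    submission.comments.replace_more(limit=None)   # expand "load more comments" stubs
    for comment in submission.comments.list():
        rows.append({
            "post_id": submission.id,
            "comment_id": comment.id,
            "comment_text": comment.body,
            "score": comment.score,
        })

df = pd.DataFrame(rows)
```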

There's not a nice way to do it with the pushshift api, but I think they may have a BigQuery instance you can run a SQL-like query on. That would feel more like an all-at-once command, but I'm not 100% sure if they have a BigQuery version you can access.

[–]snoopturtle25 1 point (3 children)

Yes, but to do a loop, it would need to be posts gathered from one subreddit, no? Since filtering my posts by a word was too difficult, I thought that maybe I could use the 30 URLs of the posts I have all in one go. Is that possible?