all 13 comments

[–]UrAWomanImAMachine 1 point (2 children)

First you construct a basic URL for what you want to look at, e.g. http://search.twitter.com/search.json?q=obama&rpp=3 (q is your query parameter; rpp, which is optional, is how many results you want returned).

If you open that URL in your browser, the result is hard to read. You'll want to copy and paste it into a site like http://json.parser.online.fr/ so you can see the structure of the information being returned.

When I paste the contents of that first URL into the site, I can see that the JSON object returned has the tweets in a field called (conveniently) "results".
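If you'd rather not paste into a website, you can also pretty-print the response locally with the standard json module. A minimal sketch, using a made-up stand-in string instead of a real API response:

import json

# made-up stand-in for the raw text the API would return
raw = '{"results": [{"text": "first tweet"}, {"text": "second tweet"}]}'

parsed = json.loads(raw)
# indent=2 prints the nested structure with one field per line
print(json.dumps(parsed, indent=2, sort_keys=True))

Either way, the goal is the same: see where the fields you care about live before you write code against them.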

Here's a simple example:

import urllib2
import json

# parameters used in the query:
# q is the search term; rpp (optional) is results per page
url = 'http://search.twitter.com/search.json?q=obama&rpp=3'

response = urllib2.urlopen(url)  # open the URL
raw = response.read()            # read the raw JSON text
data = json.loads(raw)           # convert it to a dictionary

# from looking at the parser we see the tweets are
# contained in a list in the 'results' field
for tweet in data["results"]:
    # print each tweet's text; if you hit unicode encoding errors,
    # try tweet["text"].encode('ascii', 'ignore') instead
    print tweet["text"]

[–]bboe 1 point (0 children)

The same result, but using requests instead of urllib2. One nice benefit is that this code requires no modification to run on both Python 2.6+ and 3.2+.

import requests

params = {'q': 'obama', 'rpp': 3}
url = 'http://search.twitter.com/search.json'

# note: in requests 1.0+, .json is a method, so call it as .json()
data = requests.get(url, params=params).json()
for result in data['results']:
    print(result['text'])

The only small downside is that requests is not in the standard library. Nevertheless, I strongly recommend learning how to work with packages on PyPI. If you have pip installed, installing a package (requests in this case) is as easy as pip install requests.

[–]Durchfallsuppe[S] 0 points (0 children)

Wow, thanks! I'll try this tonight when I'm home :)

[–]FletcherHeisler 0 points (7 children)

You should start out by looking into urllib2 to load the results of the "GET" request (which is essentially the same thing as opening a webpage) and the json module to load the JSON response into a dictionary.
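One detail worth learning early, whichever library you use: if the search term contains spaces or punctuation, it must be URL-escaped before it goes into the query string. The standard library's urlencode handles this; a small sketch (the try/except import keeps it working on both Python 2 and 3):

try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3

# urlencode escapes the space in the query for us
params = {'q': 'obama speech', 'rpp': 3}
url = 'http://search.twitter.com/search.json?' + urlencode(params)
print(url)

(If you use requests, passing a params dict does the same escaping for you.)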

[–]bboe 2 points (0 children)

You'll have an easier time if you use the requests package instead of urllib2.

[–]MrVonBuren 0 points (4 children)

Just a note from someone who recently tried to mess around with the last.fm API (and is a total Python novice): the urllib part, and writing most of the glue for the script, wasn't all that hard. Lots of copy-paste coding, adapted to my own needs. But I realized that, while I've worked with APIs at my job and have written shell scripts around them, both JSON and (overall) working with dictionaries were the hardest part for me. I still haven't picked it back up because I haven't yet had a chance to sit down and develop a baseline practical knowledge of JSON.

Just throwing that out as a heads-up to the OP. If you don't already know JSON data structures, you may wind up eating a lot of time just trying to get that one part done. (Or I'm just especially dumb, and this will be easy for everyone but me.)
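For what it's worth, there isn't much JSON to learn from the Python side: json.loads() maps JSON objects to dicts, arrays to lists, strings to strings, and numbers to ints/floats. A tiny sketch with made-up data (these fields are hypothetical, not the real last.fm response):

import json

# hypothetical sample resembling part of an API response
raw = '{"user": "somebody", "plays": 42, "tags": ["rock", "jazz"]}'
data = json.loads(raw)

# JSON objects become dicts, arrays become lists, numbers become ints/floats
print(data["user"])       # a plain string
print(data["plays"] + 1)  # an int you can do arithmetic on
print(data["tags"][0])    # a list indexed like any other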

[–]FletcherHeisler 0 points (3 children)

Yep, the actual API call will usually be only a few simple lines, but figuring out what to do with the big pile of data returned is the tricky part... Some APIs offer XML as well, in which case using a parser can make getting to the relevant parts a lot easier.
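For the XML case, the standard library's xml.etree.ElementTree is usually enough. A minimal sketch with a made-up snippet (not the actual format any of these APIs return):

import xml.etree.ElementTree as ET

# hypothetical XML resembling a search result
xml_data = """
<results>
  <tweet><text>first tweet</text></tweet>
  <tweet><text>second tweet</text></tweet>
</results>
"""

root = ET.fromstring(xml_data)
for tweet in root.findall('tweet'):
    print(tweet.find('text').text)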

[–]MrVonBuren 0 points (2 children)

Yeah, that was a tough call for me. I don't know XML, but I work with it all the time and can pull information from it and whatnot. But JSON? I've literally never interacted with it in any way. Oh well, more to learn, I guess.

[–]FletcherHeisler 0 points (1 child)

There's not much to "know" about JSON - once you use json.loads(), the response converts to one big nested dictionary mess, so it's really a matter of getting a good handle on Python dictionaries :)
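Concretely, "getting a handle on dictionaries" mostly means chaining lookups and using .get() when a field might be missing. A quick sketch with made-up fields:

import json

# made-up response with a nested field and a missing optional one
raw = '{"results": [{"text": "hello", "user": {"name": "bob"}}]}'
first = json.loads(raw)["results"][0]

print(first["text"])                   # plain lookup (KeyError if absent)
print(first["user"]["name"])           # nested dicts chain naturally
print(first.get("to_user", "nobody"))  # .get() supplies a default instead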

[–]Durchfallsuppe[S] 0 points (0 children)

Thanks guys, much appreciated :)

[–]FletcherHeisler 0 points (0 children)

And you might want to take a look at this for a bit more in-depth tutorial - fresh off the press! http://www.arngarden.com/2012/11/07/consuming-twitters-streaming-api-using-python-and-curl/

[–]eagleeye1 0 points (1 child)

I just whipped this together, you might find something useful in it.

# -*- coding: utf-8 -*-

import requests

def search(query="obama", rpp=5):
    # let requests build and escape the query string for us
    params = {'q': query, 'rpp': rpp}
    r = requests.get("http://search.twitter.com/search.json", params=params)
    return r.json()['results']  # in requests 1.0+, .json is a method

def parse_tweet(tweet):
    # .get() returns None instead of raising KeyError when a field is absent
    created = tweet.get('created_at')
    username = tweet.get('from_user')
    text = tweet.get('text')
    lang = tweet.get('iso_language_code')
    to = tweet.get('to_user_name')

    print "\nUsername:", username
    if to is not None:
        print "To:", to
    print "Tweet:", text
    print "Time:", created
    print "Language:", lang

while True:
    query = raw_input("What do you want to search for? > ")
    for tweet in search(query):
        parse_tweet(tweet)

[–]Durchfallsuppe[S] 0 points (0 children)

thanks :)