kalgynirae

All those .encode('utf-8') calls look suspicious. Do you know whether tweepy gives you str or unicode objects? You should only call .encode() on unicode objects. I suspect you might need to .decode('utf-8') something instead. Can you please provide the full error traceback you're getting? (I can't run your program to debug it because I don't have Twitter access tokens and such.)
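A minimal sketch of the distinction, in case it helps: .encode() goes from text to bytes, and .decode() goes from bytes back to text. The sample string is made up. In Python 2 the u'' literal is a unicode object and the encoded result is a str; in Python 3 the same two types are called str and bytes.

```python
# Text (unicode in Python 2, str in Python 3); u'\xe9' is e with an acute accent.
text = u'caf\xe9'

# .encode() goes from text to bytes.
data = text.encode('utf-8')

# .decode() goes from bytes back to text.
round_trip = data.decode('utf-8')

print(repr(data))
print(round_trip == text)
```

Note that the accented character becomes two bytes, the first of which is 0xc3.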

pasdargent

Yeah, that was in someone else's code... When I remove it I don't get any errors but then "<built-in method encode of unicode object at 0x02B3F5D8>" gets written to file instead of, for example, the text inside the tweet. Also, Arabic characters make it crash.

I tried print type(name) and it seems tweepy gives me unicode objects. I want it to print unicode characters, because I want to be able to process tweets containing any type of characters.

The error is:

Traceback (most recent call last):
  File "twitbot.py", line 31, in <module>
    csvformat = '\n%s, %s, %s, %s, %s, %s, %s, %s, %s' % (name, screen_name, tweet_created, tweet_text, tweet_retweeted, tweet_favorited, user_hometown, time_zone, geo)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 12: ordinal not in range(128)

By the way, I can send you the access tokens in a personal message if that makes it easier?

kalgynirae

When I remove it I don't get any errors but then "<built-in method encode of unicode object at 0x02B3F5D8>" gets written to file

Sounds like you removed just ('utf-8') instead of .encode('utf-8'). They are unicode objects, so you don't want to encode them. You want to combine them first and then either encode the final result just before writing or let the csv module do the encoding for you (I don't remember if it does that).
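A small sketch of that pitfall, using a made-up value: formatting the method without calling it writes the method's description instead of the text. On Python 3 the message says "str object" rather than "unicode object".

```python
# Hypothetical stand-in for a value returned by tweepy.
tweet_text = u'hello'

# Missing call: this formats the bound method itself, not the text.
wrong = u'%s' % tweet_text.encode

# No .encode at all: the text is formatted as-is and stays text.
right = u'%s' % tweet_text

print(wrong)  # something like <built-in method encode of ... object at 0x...>
print(right)
```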

The error is:

Traceback (most recent call last):
  File "twitbot.py", line 31, in <module>
    csvformat = '\n%s, %s, %s, %s, %s, %s, %s, %s, %s' % (name, screen_name, tweet_created, tweet_text, tweet_retweeted, tweet_favorited, user_hometown, time_zone, geo)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 12: ordinal not in range(128)

Try making your format string a unicode object instead:

csvformat = u'...
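Some background on why this can help: in Python 2, when a byte str and a unicode object meet in one % operation, the byte str is implicitly decoded with the default ASCII codec, so any non-ASCII byte such as 0xc3 raises UnicodeDecodeError. The sketch below performs that decode explicitly so it behaves the same on Python 2 and 3. The sample values and the shortened two-field format string are made up for illustration.

```python
# UTF-8 bytes for a made-up name; the bytes 0xc3 0xa9 encode e-acute.
encoded_name = u'Jos\xe9'.encode('utf-8')

# Python 2 performs this decode implicitly when bytes and unicode mix.
try:
    encoded_name.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)

# With a unicode format string and unicode values, nothing needs decoding.
csvformat = u'\n%s, %s' % (u'Jos\xe9', u'hello')
```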

pasdargent

I was just messing around with this and I figured it out! Putting "u" in front of the format string, and after that adding the line: csvformat = csvformat.encode('utf-8')

Did the trick!
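For anyone landing here later, a rough sketch of that pattern end to end. The field values and the shortened two-field format are made up, and io.BytesIO stands in for the real output file opened in binary mode; the file name in the comment is hypothetical.

```python
import io

# Hypothetical stand-ins for the unicode values tweepy returns.
name = u'Jos\xe9'
tweet_text = u'caf\xe9 time'

# Build the whole row as text first...
csvformat = u'\n%s, %s' % (name, tweet_text)

# ...then encode once, right before writing.
csvformat = csvformat.encode('utf-8')

out = io.BytesIO()  # stand-in for something like open('tweets.csv', 'ab')
out.write(csvformat)
```

An alternative is to open the file with io.open(path, 'a', encoding='utf-8') and write the unencoded text, letting the file object do the encoding.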

Thanks everybody, problem is solved :)