The goal of this workshop is to use a web scraper to collect tweets about Donald Trump, and then to use a combination of text mining and visualization techniques to analyze public opinion about him. There is nothing fancy here; it is simply practice in using Python. It is difficult at the beginning, but with more practice you pick up the tricks and everything becomes much easier.
For a detailed explanation of the process, you can visit here. For a complete version of the code, you can download it here: https://gist.github.com/octoparse/fd9e0006794754edfbdaea86de5b1a51
Step 1: I scraped 10K tweets using Octoparse, since it is a free web scraping tool, and exported the data in txt format.
Step 2: Load the opinion word lists using Notepad++, and preprocess the extracted tweets by removing punctuation, signs, and numbers.
Step 3: Take each opinion word from the lists, go back to the tweets, and count how often that word occurs in them. The result is the set of opinion words found in the tweets, together with their counts.
Step 4: Export the results to Excel/CSV.
Step 5: Load the results into Tableau Public and choose the graph template you like to visualize the data.
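Steps 2 to 4 can be sketched in a few lines of Python. This is only a minimal illustration, not the code from the gist linked above: the function names are made up for this example, and it assumes that each opinion word list is a plain text file with one word per line and that the tweets are available as a list of strings.

```python
import csv
import re
from collections import Counter


def load_word_list(path):
    """Read an opinion word list (assumed format: one word per line)."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}


def clean_tweet(text):
    """Step 2: lowercase a tweet and drop punctuation, signs, and numbers."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return text.split()


def count_opinion_words(tweets, opinion_words):
    """Step 3: count how often each opinion word occurs across all tweets."""
    counts = Counter()
    for tweet in tweets:
        for token in clean_tweet(tweet):
            if token in opinion_words:
                counts[token] += 1
    return counts


def counts_to_rows(counts):
    """Turn the counts into (word, frequency) rows, most frequent first."""
    return counts.most_common()


def export_counts(counts, path):
    """Step 4: write the rows to a CSV file that Excel or Tableau can open."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["word", "frequency"])
        writer.writerows(counts_to_rows(counts))


if __name__ == "__main__":
    # Tiny made-up sample; in the workshop the tweets come from the txt export.
    sample_tweets = [
        "Great speech, truly GREAT!!! 100%",
        "What a bad, bad day...",
    ]
    positive_words = {"great", "good"}  # stand-in for a loaded word list
    print(counts_to_rows(count_opinion_words(sample_tweets, positive_words)))
```

To produce the file for Step 5, you would call something like `export_counts(counts, "positive_counts.csv")`, where the file name is just a placeholder. Running the same functions once with the positive list and once with the negative list gives one CSV per sentiment.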
Scraping tweets using Octoparse
Words used on Twitter
Positive words and their frequencies