My “large scale data analysis” class this semester has me thinking about ways to use Twitter in introductory Python courses. After some preliminary searching for ways to use the Twitter API, I came across this excellent post about tracking “happiness” on Twitter by collecting tweets containing emoticons. I especially like it because the program also incorporates matplotlib. When I ran the program, here is the matplotlib graph I got (where Twitter is queried every 20 seconds for new tweets containing emoticons):
There is also an excellent walkthrough about how to set up a Twitter developer account, so I won’t reiterate all that here.
Now I’m trying to brainstorm ways we could use this in CSE 231. The most obvious idea that comes to mind is simply parsing the tweet text as it streams in, storing the words in a dictionary (keyed by the word), and then calculating word frequencies. But the word frequency concept can be a bit overdone in 231 because it is one of the most straightforward uses of dictionaries for beginners. Instead, perhaps we could store the incoming tweets somehow and then do (very naive) comparisons and sentiment analysis on the strings. For example, near election time, we could collect tweets containing “democrat” and “republican.” Students could then look at how often each word is being tweeted and do a simple analysis of positive/negative sentiment in each tweet.
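To make the idea concrete, here is a minimal sketch of what that very naive analysis might look like once the tweets have been collected as plain strings. It leaves out the Twitter API entirely. The word lists, the function names (tokenize, naive_sentiment, summarize), and the sample tweets are all made up for illustration, and the scoring rule (positive words minus negative words) is just one simple assumption about what "sentiment" could mean for beginners.

```python
import string

# Hypothetical word lists for illustration; a real list would be much longer.
POSITIVE_WORDS = {"good", "great", "love", "win", "happy"}
NEGATIVE_WORDS = {"bad", "awful", "hate", "lose", "sad"}


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in words if w]


def naive_sentiment(text):
    """Return (number of positive words) - (number of negative words)."""
    words = tokenize(text)
    positive = sum(1 for w in words if w in POSITIVE_WORDS)
    negative = sum(1 for w in words if w in NEGATIVE_WORDS)
    return positive - negative


def summarize(tweets, keywords):
    """Map each keyword to how many tweets contain it and their total score."""
    summary = {k: {"count": 0, "score": 0} for k in keywords}
    for tweet in tweets:
        lowered = tweet.lower()
        score = naive_sentiment(tweet)
        for k in keywords:
            # Substring match, so "democrats" also counts for "democrat".
            if k in lowered:
                summary[k]["count"] += 1
                summary[k]["score"] += score
    return summary


# Made-up sample tweets standing in for text collected from Twitter.
sample_tweets = [
    "I love the Democrat proposal, great idea!",
    "Republican debate was awful. Sad night.",
    "Happy to see a Republican and a Democrat agree",
    "Bad weather today",
]

if __name__ == "__main__":
    for keyword, stats in summarize(sample_tweets, ["democrat", "republican"]).items():
        print(keyword, stats["count"], stats["score"])
```

Students still get practice with dictionaries (here a dictionary of dictionaries), but the interesting part becomes deciding which words count as positive or negative and seeing how easily such a naive score is fooled by things like negation or sarcasm.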