Exploring the Sugar Tax Debate on Reddit and Twitter: A Sentiment Analysis Approach

Zhenning Xu

02.27.2020

1 Research questions

  • What sugar-tax-related topics do people discuss on Twitter and Reddit?
  • How can we identify different sugar-related topics that people talk about?
  • The “sin” tax on sugary drinks - do people support it or not?
  • How do “black opinion markets” contribute to the sugar tax debate?

2 Literature review

  • Social media data differ from traditional data, yet they are an important source for businesses seeking to gain and maintain a competitive edge (Davenport and Harris, 2007).

  • A “topic” consists of a cluster of words that frequently occur together.

  • Topic models can connect words with similar meanings and distinguish between uses of words with multiple meanings (Steyvers and Griffiths, 2007, “Probabilistic Topic Models”).

  • Example: https://utjimmyx.shinyapps.io/shinynlp/

3 Why is it important?

plot of chunk unnamed-chunk-1

4 Short Brief

  • Unlocking value through social media

  • Using Information Retrieval and Natural Language Processing techniques, marketers can efficiently retrieve and analyze unstructured data for marketing insights.

  • Tweets embedded on sites other than Twitter receive approximately 185 billion views per quarter (Koh, 2014).

5 Methodology

  • Crawling Tweets and Reddit comments via the Twitter Application Programming Interface (API) and the Reddit API
  • Data processing (converting tweets to a data frame; converting all letters to lower case; removing punctuation, numbers, and stop words; stemming; and identifying synonyms)
  • Sentiment Score Analysis - analyzing the distribution of emotions
  • Term Frequency analysis - how frequently a word occurs in a document
  • LDA (Latent Dirichlet Allocation) topic modeling - identifying the hidden topics in these Tweets and comments and the proportion of each topic.
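
The data processing step above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the stop-word list is a tiny illustrative sample, `crude_stem` is a hypothetical suffix stripper standing in for a real stemmer, and synonym handling is omitted.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "on", "of", "to", "in", "this", "that", "for"}


def crude_stem(word):
    """Strip a few common English suffixes; a rough stand-in for a real stemmer."""
    for suffix in ("ing", "es", "ed", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lower-case, drop punctuation and numbers, remove stop words, then stem."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # punctuation and digits become spaces
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [crude_stem(t) for t in tokens]


print(preprocess("The sugar tax is taxing 2 drinks!"))
```

Each tweet or comment becomes a list of normalized tokens, which is the form the later term-frequency and topic-modeling steps expect.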

6.1 Sentiment analysis using Twitter data (t1)

plot of chunk unnamed-chunk-2
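
One way a distribution of emotions like the one plotted here could be computed is by lexicon lookup. The sketch below is an assumption about the general technique, not the package used for these slides; `EMOTION_LEXICON` is a hypothetical mini-lexicon with invented word-to-emotion assignments, and a real analysis would load a published emotion lexicon.

```python
from collections import Counter

# Hypothetical mini-lexicon; the word-to-emotion assignments are illustrative only.
EMOTION_LEXICON = {
    "scrap": ["anger"],
    "waste": ["anger", "sadness"],
    "guilt": ["sadness"],
    "obesity": ["sadness"],
    "solve": ["trust"],
    "encourage": ["trust", "anticipation"],
}


def emotion_distribution(tweets):
    """Count lexicon emotions over all tweets and return their shares."""
    counts = Counter()
    for tweet in tweets:
        for word in tweet.lower().split():
            word = word.strip(".,!?\"'")  # drop punctuation attached to the word
            counts.update(EMOTION_LEXICON.get(word, []))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {emotion: n / total for emotion, n in counts.items()}


sample = [
    "Scrap the sugar tax",
    "This is just a way to make money, not to solve obesity.",
]
print(emotion_distribution(sample))
```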

6.2 Sentiment analysis using Twitter data (t2)

plot of chunk unnamed-chunk-3

6.3 Sample raw tweets

  • Figure 1 shows that the most dominant emotions are sadness, trust, and anticipation. The following raw tweets about the sugar tax offer some anecdotal evidence of the complexity of this issue.
  • “Yes and how sugar tax doesn't actually do anything that it is intended to do. It creates the opposite effect.”
  • “Did you read the details of the tax? It wasn't on sugar - it was on all sweetened drinks - including zero calorie diet drinks. Therefore did nothing to encourage consumer switching to 0cal drinks.”
  • “It's a just a new tax built on the back of guilt. People should have less sugar, preferably none but this is just a way for the govt to make money, not to solve obesity.”

6.4 Sample raw tweets

  • “Scrap the sugar tax Theresa May.”
  • “Childhood obesity rates have established, having peaked in 2003/04.”
  • “### said he would tax the poor because they waste too much money on crazy stuff like diet soda. Taxes would keep them from wasting money and will keep them poor. Did this ### go to ### U?”

6.5 Sample raw tweets

plot of chunk unnamed-chunk-4

6.6 Sentiment analysis using Reddit data

plot of chunk unnamed-chunk-5

7 Term frequency analysis using Twitter data

plot of chunk unnamed-chunk-6
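
Term frequency, as defined in the methodology, is how often a word occurs in a document. A minimal sketch, assuming whitespace-tokenized and already-cleaned text; the function names are illustrative and not taken from the original analysis:

```python
from collections import Counter


def term_frequencies(documents):
    """For each document, return each term's count divided by the document length."""
    result = []
    for doc in documents:
        tokens = doc.lower().split()
        counts = Counter(tokens)
        result.append({term: n / len(tokens) for term, n in counts.items()})
    return result


def top_terms(documents, k):
    """Return the k most frequent terms across all documents with raw counts."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.lower().split())
    return counts.most_common(k)


docs = ["sugar tax sugar", "tax on sugar"]
print(term_frequencies(docs))
print(top_terms(docs, 2))
```

Corpus-level counts such as those returned by `top_terms` are the usual input for a bar chart of frequent terms or a word cloud.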

8.1 Topic modeling using LDA (Blei, Ng, and Jordan, 2003)

plot of chunk unnamed-chunk-7
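
To make the idea behind LDA concrete, the sketch below fits a small model by collapsed Gibbs sampling in plain Python. This is one common inference method for LDA and is shown for illustration only; the slides do not state which implementation or hyperparameters were used, so `alpha`, `beta`, `iterations`, and the toy documents are all assumptions.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab):
    doc_topic[d][k] is the estimated share of topic k in document d, and
    topic_word[k][v] is the estimated probability of vocab[v] under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics

    n_dk = [[0] * K for _ in docs]      # topic counts per document
    n_kw = [[0] * V for _ in range(K)]  # word counts per topic
    n_k = [0] * K                       # total tokens assigned to each topic
    z = []                              # topic assignment of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(K)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Resample: weight is proportional to
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
        for k in range(K)
    ]
    return doc_topic, topic_word, vocab


toy_docs = [
    ["sugar", "tax", "soda", "tax"],
    ["obesity", "health", "sugar", "health"],
    ["tax", "government", "money", "tax"],
]
doc_topic, topic_word, vocab = lda_gibbs(toy_docs, num_topics=2)
for k, row in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda v: row[v], reverse=True)[:3]
    print("topic", k, [vocab[v] for v in top])
```

The rows of `doc_topic` give the proportion of each topic in each document, and the highest-probability words in each row of `topic_word` are what a topic plot typically displays.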

8.2 Topic modeling using LDA (Blei, Ng, and Jordan, 2003)

plot of chunk unnamed-chunk-8

9 Social network analysis

plot of chunk unnamed-chunk-9

10 Word cloud

plot of chunk unnamed-chunk-10

11 Discussion/Conclusions

  • Our analysis allows a natural observation of people's reactions to the sugar tax as more states consider levying taxes on sugar-sweetened beverages.
  • Sentiment analysis using several other text mining packages also reveals that the evolution of topics on Twitter seems to be highly correlated with news media coverage and public attention.
  • Not all news receives equal attention on Twitter. Political events seem to dominate the conversation about the sugar tax on Twitter. More research is required to determine whether there are causal relationships between sentiment and policy movement on the sugar tax.