Basic Investigation Into Tweet Analysis: Climate Emergency v Climate Hoax

Q: How responsible is Trump for the concept of ‘Climate Hoax’?

Introduction

Tweet analysis is a complex area of research, but it could play a vital role both in identifying key trends within our society and in tracing the origins of cultural memes; in this instance, ‘Climate Emergency’ and ‘Climate Hoax’. For the purpose of this basic analysis we will look at:

- Location
- Tweet time series
- Associated hashtags
- Associated key words

1: Get data using rtweet and the Twitter API

## how the code looks
#climate_hoax <- search_tweets('climate change hoax', n=10000, lang = 'en',
                         #include_rts = FALSE)

2: Load the data (Markdown doesn’t seem to allow the Twitter API to run)
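
Because the API call cannot run when the document is knitted, one option is to save the search results once in an interactive session and read them back here. The following is a minimal sketch under that assumption, not the actual loading code: the file name and the toy data frame are hypothetical stand-ins. RDS is used because it preserves column types, including list columns.

```r
# Toy stand-in for a data frame returned by search_tweets() (hypothetical values)
climate_hoax <- data.frame(
  text = c("example tweet one", "example tweet two"),
  location = c("Somewhere", ""),
  stringsAsFactors = FALSE
)

# Hypothetical file path; in practice this would sit alongside the .Rmd file
hoax_path <- file.path(tempdir(), "climate_hoax.rds")

# Run once, interactively, straight after the API call
saveRDS(climate_hoax, hoax_path)

# Run inside the knitted document instead of calling the API
climate_hoax_loaded <- readRDS(hoax_path)
```

The same pattern would apply to the climate emergency data set.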

3: Clean the data as much as possible, removing NAs and null values

library(dplyr)

# Climate emergency: keep the location column, treat placeholder strings as missing, drop NAs
users2 <- data.frame(location = climate_emergency$location, stringsAsFactors = FALSE) %>%
  mutate(location = ifelse(location %in% c("N/A", "null", ""), NA, location)) %>%
  na.omit()

# Climate hoax: same cleaning steps
users1 <- data.frame(location = climate_hoax$location, stringsAsFactors = FALSE) %>%
  mutate(location = ifelse(location %in% c("N/A", "null", ""), NA, location)) %>%
  na.omit()

4: Attempt to visualise the data. Where are the tweets coming from?
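
The plotting code is not shown in this write-up. A minimal sketch of one way to produce a top-locations bar chart, assuming a cleaned data frame with a `location` column like `users1` from step 3 and using dplyr with ggplot2; the toy values are hypothetical:

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the cleaned location data (hypothetical values)
users1 <- data.frame(
  location = c("Place A", "Place A", "Place B", "Place A", "Place B", "Place C"),
  stringsAsFactors = FALSE
)

# Count tweets per location and keep the ten most frequent
top_locations <- users1 %>%
  count(location, sort = TRUE) %>%
  slice_max(order_by = n, n = 10)

# Horizontal bar chart with the most frequent location at the top
location_plot <- ggplot(top_locations, aes(x = reorder(location, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(x = "Location", y = "Number of tweets")
```

Calling `print(location_plot)` draws the chart; repeating the steps with `users2` would give the climate emergency equivalent.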

Notes on Location:

The location field is free text entered by users, so further cleaning to group together regions and variant spellings of the same place would give a more reliable analysis.

Key points on Location:

This basic analysis suggests a high frequency of climate hoax tweets from the US and of climate emergency tweets from the UK.

4: Look at a time-series to evaluate the frequency of both sets of tweets
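
A minimal sketch of how daily tweet counts for the two searches might be compared, assuming each data frame has a `created_at` timestamp column (as rtweet search results do). The toy timestamps are hypothetical and the code is an illustration, not the code behind the original plots:

```r
library(dplyr)
library(ggplot2)

# Toy stand-ins for the two search results (hypothetical timestamps)
climate_hoax <- data.frame(created_at = as.POSIXct(
  c("2020-01-02 09:00:00", "2020-01-02 10:00:00",
    "2020-01-02 11:00:00", "2020-01-03 09:00:00"), tz = "UTC"))
climate_emergency <- data.frame(created_at = as.POSIXct(
  c("2020-01-01 09:00:00", "2020-01-02 09:00:00",
    "2020-01-03 09:00:00"), tz = "UTC"))

# Stack both sets, label each row with its search term, and count tweets per day
daily_counts <- bind_rows(
  "climate hoax" = climate_hoax,
  "climate emergency" = climate_emergency,
  .id = "term"
) %>%
  mutate(day = as.Date(created_at, tz = "UTC")) %>%
  count(term, day)

# One line per search term
frequency_plot <- ggplot(daily_counts, aes(x = day, y = n, colour = term)) +
  geom_line() +
  labs(x = "Day", y = "Number of tweets", colour = "Search term")
```

Days with no tweets are simply absent from `daily_counts`, so a term that only appears in bursts will show as isolated peaks.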

Key points on Time-Series:

The two search terms show very different frequency patterns.

‘Climate Hoax’ is almost non-existent on Twitter for most of the period and then shows huge peaks in usage, likely when it trends. Further analysis could establish why: correlating each peak day with key Twitter accounts or events would show whether they are driving the peaks. The peak on 30th Sept, for example, would appear to correlate with the US presidential debate.

‘Climate Emergency’, on the other hand, seems to be more consistent in terms of its frequency, but it also shows peaks similar to those in the hoax graph, suggesting a ‘Twitter battle’.

5: Investigate Associated Hashtags
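
A minimal sketch of one way to count the most common hashtags, assuming the tweets sit in a `text` column and extracting hashtags with a regular expression rather than relying on any particular rtweet column layout. The toy tweets are hypothetical:

```r
library(dplyr)

# Toy stand-in for tweet text (hypothetical tweets)
climate_hoax <- data.frame(
  text = c("first tweet #alpha #Beta", "second tweet #beta", "no hashtags here"),
  stringsAsFactors = FALSE
)

# Pull every "#word" token out of each tweet
hashtag_list <- regmatches(climate_hoax$text, gregexpr("#\\w+", climate_hoax$text))

# Lower-case so that #Beta and #beta are counted together, then keep the top ten
top_hashtags <- data.frame(
  hashtag = tolower(unlist(hashtag_list)),
  stringsAsFactors = FALSE
) %>%
  count(hashtag, sort = TRUE) %>%
  slice_max(order_by = n, n = 10)
```

The search term's own hashtag will usually dominate such a table, so it may be worth filtering it out before plotting.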

Key points on Hashtags:

The main takeaway is the relationship between each set of tweets and other issues:

- climate hoax seems to be connected with the US election
- climate emergency seems to be more connected with sustainable/green agendas

6: Key Word Analysis
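
The code for this step is not shown. A minimal sketch of a simple key word count, assuming a `text` column; the toy tweets and the very short stop word list are hypothetical, and a real analysis would want a fuller list:

```r
library(dplyr)

# Toy stand-in for tweet text (hypothetical tweets)
tweets <- data.frame(
  text = c("The climate is changing https://example.com", "Climate and the weather"),
  stringsAsFactors = FALSE
)

# Small illustrative stop word list (an assumption, not a standard list)
stop_words_small <- c("the", "is", "a", "and", "of", "to", "in")

clean_text <- tolower(tweets$text)
clean_text <- gsub("https?://\\S+", " ", clean_text)  # drop links
clean_text <- gsub("[^a-z ]", " ", clean_text)        # keep letters and spaces only

# Split into words, then drop empty strings and stop words
words <- unlist(strsplit(clean_text, " +"))
words <- words[nchar(words) > 0 & !(words %in% stop_words_small)]

# Frequency table, most common word first
keyword_counts <- data.frame(word = words, stringsAsFactors = FALSE) %>%
  count(word, sort = TRUE)
```

The resulting counts could feed a bar chart or a word cloud.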

Conclusion

Orange Trump says it all really.