An Empirical Analysis of the Kanye West Conjecture

In September 2009, Taylor Swift received an MTV Video Music Award for her music video “You Belong with Me”. In the middle of her acceptance speech, Kanye West got up on stage and interrupted her to exclaim that Beyoncé had “one of the best videos of all time!”. In February 2016, Kanye West released his album “The Life of Pablo”, in which he posits that his actions ultimately made Taylor Swift famous. The purpose of this analysis is to examine empirically the extent to which Kanye’s actions had a causal effect on Taylor Swift’s career, using Google Trends search interest as a proxy for fame.

Things to note:

library("gtrendsR")      # Retrieves Google Trends data
library("CausalImpact")  # Bayesian structural time-series causal inference
library("zoo")           # Time series objects indexed by date

# Reading in data
tswift <- gtrends(keyword = "Taylor Swift", time = "all")

kwest <- gtrends(keyword = "Kanye West", time = "all")

# Converting the Google Trends placeholder "<1" to "0" so the column can be parsed as numeric
tswift$interest_over_time$hits <- gsub("<1", "0", tswift$interest_over_time$hits, fixed = TRUE)

kwest$interest_over_time$hits <- gsub("<1", "0", kwest$interest_over_time$hits, fixed = TRUE)

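# Combine both series; CausalImpact treats the first column (tswift) as the response
# and any remaining columns (kwest) as covariates
data <- data.frame(tswift = as.numeric(tswift$interest_over_time$hits),
                   kwest = as.numeric(kwest$interest_over_time$hits))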

data <- zoo(data, tswift$interest_over_time$date)
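As a quick sanity check of the "<1" conversion used above, the following sketch applies the same substitution to a small made-up vector. The values in `hits_raw` are hypothetical and are not real Google Trends data; the point is only to show that the cleaned column parses as numeric without introducing `NA`s.

```r
# Hypothetical toy vector mimicking the mixed character values
# that a Google Trends "hits" column can contain
hits_raw <- c("12", "<1", "100", "<1", "7")

# Replace the "<1" placeholder with "0", then convert to numeric
hits_clean <- as.numeric(gsub("<1", "0", hits_raw, fixed = TRUE))

# No NAs should remain after the conversion
print(hits_clean)
print(anyNA(hits_clean))
```

Skipping the substitution and calling `as.numeric()` directly on such a column would turn every "<1" entry into `NA` with a coercion warning.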

# Pre- and post-intervention windows around the September 2009 interruption
pre.period <- as.Date(c("2004-01-01", "2009-09-01"))
post.period <- as.Date(c("2009-10-01", "2019-01-01"))
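To see how many observations fall into each window, the sketch below counts the months in each period. It assumes the series is monthly and starts in January 2004, which is what Google Trends typically returns for an all-time query; `check_dates`, `pre_check`, and `post_check` are illustrative names, not part of the original analysis.

```r
# Assumed monthly index spanning the same range as the analysis
check_dates <- seq(as.Date("2004-01-01"), as.Date("2019-01-01"), by = "month")

# Same boundaries as pre.period and post.period, under separate names
pre_check <- as.Date(c("2004-01-01", "2009-09-01"))
post_check <- as.Date(c("2009-10-01", "2019-01-01"))

# Count the monthly observations inside each window
n_pre <- sum(check_dates >= pre_check[1] & check_dates <= pre_check[2])
n_post <- sum(check_dates >= post_check[1] & check_dates <= post_check[2])

cat("Pre-period months: ", n_pre, "\n")
cat("Post-period months:", n_post, "\n")
```

Under these assumptions the two windows are contiguous and together cover every month in the index.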

impact <- CausalImpact(data, pre.period, post.period)

plot(impact)

summary(impact)
## Posterior inference {CausalImpact}
## 
##                          Average        Cumulative    
## Actual                   42             4717          
## Prediction (s.d.)        14 (2.2)       1548 (250.5)  
## 95% CI                   [9.4, 18]      [1049.9, 2051]
##                                                       
## Absolute effect (s.d.)   28 (2.2)       3169 (250.5)  
## 95% CI                   [24, 33]       [2666, 3667]  
##                                                       
## Relative effect (s.d.)   205% (16%)     205% (16%)    
## 95% CI                   [172%, 237%]   [172%, 237%]  
## 
## Posterior tail-area probability p:   0.001
## Posterior prob. of a causal effect:  99.8999%
## 
## For more details, type: summary(impact, "report")
summary(impact, "report")
## Analysis report {CausalImpact}
## 
## 
## During the post-intervention period, the response variable had an average value of approx. 42.12. By contrast, in the absence of an intervention, we would have expected an average response of 13.83. The 95% interval of this counterfactual prediction is [9.37, 18.31]. Subtracting this prediction from the observed response yields an estimate of the causal effect the intervention had on the response variable. This effect is 28.29 with a 95% interval of [23.81, 32.74]. For a discussion of the significance of this effect, see below.
## 
## Summing up the individual data points during the post-intervention period (which can only sometimes be meaningfully interpreted), the response variable had an overall value of 4.72K. By contrast, had the intervention not taken place, we would have expected a sum of 1.55K. The 95% interval of this prediction is [1.05K, 2.05K].
## 
## The above results are given in terms of absolute numbers. In relative terms, the response variable showed an increase of +205%. The 95% interval of this percentage is [+172%, +237%].
## 
## This means that the positive effect observed during the intervention period is statistically significant and unlikely to be due to random fluctuations. It should be noted, however, that the question of whether this increase also bears substantive significance can only be answered by comparing the absolute effect (28.29) to the original goal of the underlying intervention.
## 
## The probability of obtaining this effect by chance is very small (Bayesian one-sided tail-area probability p = 0.001). This means the causal effect can be considered statistically significant.
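The headline figures in the summary are related by simple arithmetic, which the sketch below reproduces from the rounded values printed in the report. This is only an approximate check: CausalImpact derives its estimates from posterior draws, so recomputing from rounded averages is not expected to match the package's internals exactly. The variable names are illustrative.

```r
# Rounded values copied from the report above
actual_avg <- 42.12      # observed post-period average
predicted_avg <- 13.83   # counterfactual post-period average

# Absolute effect: observed minus counterfactual
abs_effect <- actual_avg - predicted_avg

# Relative effect: absolute effect as a share of the counterfactual
rel_effect <- abs_effect / predicted_avg

# Same subtraction for the cumulative figures in the summary table
cum_effect <- 4717 - 1548

cat(sprintf("Absolute effect: %.2f\n", abs_effect))
cat(sprintf("Relative effect: %.0f%%\n", 100 * rel_effect))
cat(sprintf("Cumulative effect: %d\n", cum_effect))
```

The relative effect is large mainly because the counterfactual baseline is small: an absolute gap of about 28 points on the Google Trends scale is roughly twice the predicted average of about 14.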