11/4/2016

Introduction

The goal of this project is to create a web page presentation in R Markdown that features a plot created with Plotly. The page is hosted on RPubs. I thought it would be fun to recreate a spurious correlation plot from Tyler Vigen's website.

Data

The data come from the U.S. Office of Management and Budget and the Centers for Disease Control & Prevention. The plot pairs U.S. spending on science, space, and technology with suicides by hanging, strangulation, and suffocation for the years 1999 to 2009. The correlation coefficient between the two variables is a whopping 0.99!

# Yearly observations, 1999 to 2009
year <- 1999:2009
suicides <- c(5427, 5688, 6198, 6462, 6635, 7336, 7248, 7491, 8161, 8578, 9000)
spending <- c(18.079, 18.594, 19.753, 20.734, 20.831, 23.029, 23.597, 23.584,
              25.525, 27.731, 29.449)

# Combine the series into one data frame (no detour through a cbind() matrix)
dat <- data.frame(year, suicides, spending)

# Pearson correlation between the two series
cor(suicides, spending)
## [1] 0.9920817
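
To make the number above less of a black box, here is a minimal sketch of what `cor()` computes by default (Pearson's r), written out by hand. The names `dx`, `dy`, and `r_manual` are introduced here for illustration and are not part of the original analysis.

```r
# Same data as above, repeated so this snippet runs on its own.
suicides <- c(5427, 5688, 6198, 6462, 6635, 7336, 7248, 7491, 8161, 8578, 9000)
spending <- c(18.079, 18.594, 19.753, 20.734, 20.831, 23.029, 23.597, 23.584,
              25.525, 27.731, 29.449)

# Deviations of each series from its own mean.
dx <- suicides - mean(suicides)
dy <- spending - mean(spending)

# Pearson's r: summed co-variation divided by the product of the spreads.
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Should agree with the built-in cor() up to floating-point error.
all.equal(r_manual, cor(suicides, spending))
```

Because r is unitless and bounded between -1 and 1, it is better read as "0.99" than as a percentage.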

Does U.S. spending on science, space, and technology cause people to commit suicide?

[Plot: U.S. spending on science, space, and technology and suicides by hanging, strangulation, and suffocation]
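
The code behind the plot is hidden in the presentation, so the following is only a sketch of how a two-axis chart like this could be built with the `plotly` R package, not the original chunk. The trace names, axis titles, and the object name `p` are assumptions made for illustration.

```r
library(plotly)

# Same data as in the Data section, repeated so this snippet runs on its own.
dat <- data.frame(
  year     = 1999:2009,
  suicides = c(5427, 5688, 6198, 6462, 6635, 7336, 7248, 7491, 8161, 8578, 9000),
  spending = c(18.079, 18.594, 19.753, 20.734, 20.831, 23.029, 23.597, 23.584,
               25.525, 27.731, 29.449)
)

# Two line traces that share the x axis; the second one is bound to a
# secondary y axis ("y2") so that both series are readable on one chart.
p <- plot_ly(dat, x = ~year) %>%
  add_trace(y = ~suicides, name = "Suicides",
            type = "scatter", mode = "lines+markers") %>%
  add_trace(y = ~spending, name = "Spending", yaxis = "y2",
            type = "scatter", mode = "lines+markers") %>%
  layout(
    xaxis  = list(title = "Year"),
    yaxis  = list(title = "Suicides by hanging, strangulation, and suffocation"),
    # Overlay the second axis on the first and draw it on the right-hand side.
    yaxis2 = list(title = "Spending on science, space, and technology",
                  overlaying = "y", side = "right")
  )

p
```

Putting the series on separate axes is what makes the two lines appear to track each other so closely: each axis is scaled to its own series, so the visual match says nothing about the magnitudes involved.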

No, it doesn't.

There is no connection between spending on science, space, and technology and suicides by hanging, strangulation and suffocation, and no logical argument to think that there would be. So although this graph shows that the two things are correlated, it is an example of a spurious correlation.

Spurious correlations are the reason why having a consistent, logical theory is important. If you think one thing is being caused by another and have data to show the two are connected, to make a credible case you need to be able to explain why they are connected, and give a theory that makes testable predictions about the relationship between the two things. -John Aziz
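
One common way such a correlation can arise, offered here as an illustration rather than as a claim made by the sources, is that both series simply rise over the same period. The sketch below checks how strongly each series tracks the calendar year on its own; the names `r_year_suicides` and `r_year_spending` are introduced only for this example.

```r
# Same data as in the Data section, repeated so this snippet runs on its own.
year     <- 1999:2009
suicides <- c(5427, 5688, 6198, 6462, 6635, 7336, 7248, 7491, 8161, 8578, 9000)
spending <- c(18.079, 18.594, 19.753, 20.734, 20.831, 23.029, 23.597, 23.584,
              25.525, 27.731, 29.449)

# Correlation of each series with time alone.
r_year_suicides <- cor(year, suicides)
r_year_spending <- cor(year, spending)

r_year_suicides
r_year_spending
```

If both values come out close to 1, then any two quantities that grew steadily from 1999 to 2009 would correlate about as well with each other, which is why a high coefficient by itself is not evidence of a causal link.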