Assignment Description

You are an analyst for Initech Analytica (IA), a nefarious policy research company. IA has been hired by the Oat Milk Advocacy Group to create a misinformation campaign against the dairy industry. In the notes for this unit, I gave an example of lying with scatter plots using a spurious correlation between mozzarella and deaths by poisoning. Your assignment is to create a similar chart using a different variety of cheese (i.e., you may not use mozzarella).

Data

Load the ‘cheese’ and ‘injury mortality rates for the US’ data sets from Professor Suleiman’s website. Display an overview of each data set, cheese first and injury mortality second.

## # A tibble: 6 x 9
##    year cheddar mozzarella swiss  blue brick muenster neufchatel hispanic
##   <dbl>   <dbl>      <dbl> <dbl> <dbl> <dbl>    <dbl>      <dbl>    <dbl>
## 1  1995    9.04       7.89  1.09  0.16  0.04     0.41       2.04    NA   
## 2  1996    9.19       8.22  1.07  0.17  0.04     0.39       2.11     0.25
## 3  1997    9.51       8.16  0.99  0.18  0.03     0.37       2.25     0.25
## 4  1998    9.6        8.33  1.01 NA     0.03     0.34       2.2      0.27
## 5  1999   10.0        8.74  1.09 NA     0.03     0.28       2.26     0.3 
## 6  2000    9.87       9.05  1.02 NA     0.03     0.3        2.39     0.33
## # A tibble: 6 x 17
##    Year Sex   `Age group (yea… Race  `Injury mechani… `Injury intent` Deaths
##   <dbl> <chr> <chr>            <chr> <chr>            <chr>            <dbl>
## 1  2016 Both… All Ages         All … All Mechanisms   All Intentions  231991
## 2  2015 Both… All Ages         All … All Mechanisms   All Intentions  214008
## 3  2014 Both… All Ages         All … All Mechanisms   All Intentions  199752
## 4  2013 Both… All Ages         All … All Mechanisms   All Intentions  192945
## 5  2012 Both… All Ages         All … All Mechanisms   All Intentions  190385
## 6  2011 Both… All Ages         All … All Mechanisms   All Intentions  187464
## # … with 10 more variables: Population <dbl>, `Age Specific Rate` <dbl>, `Age
## #   Specific Rate Standard Error` <dbl>, `Age Specific Rate Lower Confidence
## #   Limit` <dbl>, `Age Specific Rate Upper Confidence Limit` <dbl>, `Age
## #   Adjusted Rate` <dbl>, `Age Adjusted Rate Standard Error` <dbl>, `Age
## #   Adjusted Rate Lower Confidence Limit` <dbl>, `Age Adjusted Rate Upper
## #   Confidence Limit` <dbl>, Unit <chr>

Simplify the deaths data set by (a) filtering to the rows that cover all injury mechanisms, all age groups, all races, and both sexes while naming a specific injury intent, (b) selecting only the year, injury intent, and death count, and (c) pivoting the resulting subset into a wide format, with one column per injury intent, to make it easier to set up a correlation matrix.
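
The simplification step described above can be sketched as follows. This is a minimal sketch, not the assignment's official solution: it assumes the dplyr and tidyr packages, and it runs on a toy stand-in for the deaths data. The short column names (`sex`, `age_group`, `race`, `mechanism`, `intent`, `deaths`), the category labels, and the counts are all made up for illustration; substitute the real names and labels from the overview.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the deaths data set. Column names, labels and counts are
# hypothetical; they only mimic the shape of the real data.
deaths <- tibble(
  year      = c(2015, 2015, 2015, 2016, 2016, 2016, 2016),
  sex       = c("Both sexes", "Both sexes", "Both sexes",
                "Both sexes", "Both sexes", "Both sexes", "Female"),
  age_group = "All Ages",
  race      = "All races",
  mechanism = "All Mechanisms",
  intent    = c("All Intentions", "Suicide", "Homicide",
                "All Intentions", "Suicide", "Homicide", "Suicide"),
  deaths    = c(30, 20, 10, 36, 24, 12, 5)
)

deaths_wide <- deaths %>%
  # (a) keep the all-inclusive rows, but drop the "All Intentions" aggregate
  filter(sex == "Both sexes", age_group == "All Ages", race == "All races",
         mechanism == "All Mechanisms", intent != "All Intentions") %>%
  # (b) keep only year, injury intent and death count
  select(year, intent, deaths) %>%
  # (c) one row per year, one column per injury intent
  pivot_wider(names_from = intent, values_from = deaths)

print(deaths_wide)
```

Dropping the aggregate intent row before pivoting keeps the wide table from carrying a total column that would correlate trivially with its own components.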

Create a new data set that relates muenster cheese to each injury intent by (a) selecting the muenster column (along with year) from the cheese data set, (b) joining the result with the simplified deaths data set by year, and (c) computing a correlation matrix. Display the correlation matrix.

##                          muenster Unintentional    Suicide    Homicide
## muenster                1.0000000    0.86434680  0.9555510 -0.25343850
## Unintentional           0.8643468    1.00000000  0.9304585  0.02860829
## Suicide                 0.9555510    0.93045853  1.0000000 -0.23414469
## Homicide               -0.2534385    0.02860829 -0.2341447  1.00000000
## Undetermined            0.2930508    0.65625952  0.4640975  0.28688435
## Legal intervention/war  0.8720061    0.83474497  0.8602972 -0.11764121
##                        Undetermined Legal intervention/war
## muenster                  0.2930508              0.8720061
## Unintentional             0.6562595              0.8347450
## Suicide                   0.4640975              0.8602972
## Homicide                  0.2868843             -0.1176412
## Undetermined              1.0000000              0.3248518
## Legal intervention/war    0.3248518              1.0000000
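
The join-and-correlate step can be sketched along these lines. It is a hedged illustration assuming dplyr: the muenster and swiss values for 1995 to 2000 are copied from the cheese overview, but the `Suicide` and `Homicide` counts are made up, so the printed coefficients will not match the matrix above. The choice of `use = "pairwise.complete.obs"` is also an assumption, added because some cheese columns contain `NA`.

```r
library(dplyr)

# Cheese values for 1995-2000, copied from the overview of the cheese data.
cheese <- tibble(
  year     = 1995:2000,
  muenster = c(0.41, 0.39, 0.37, 0.34, 0.28, 0.30),
  swiss    = c(1.09, 1.07, 0.99, 1.01, 1.09, 1.02)
)

# Hypothetical wide-format deaths table; these counts are invented.
deaths_wide <- tibble(
  year     = 1995:2000,
  Suicide  = c(60, 58, 55, 52, 45, 47),
  Homicide = c(20, 25, 19, 24, 21, 23)
)

cheese_deaths <- cheese %>%
  select(year, muenster) %>%            # (a) keep only the chosen cheese
  inner_join(deaths_wide, by = "year")  # (b) match cheese and deaths by year

cor_matrix <- cheese_deaths %>%
  select(-year) %>%                     # year is a key, not a variable
  cor(use = "pairwise.complete.obs")    # (c) pairwise Pearson correlations

print(round(cor_matrix, 3))
```

An inner join keeps only the years present in both tables, which matters here because the two overviews show different year ranges.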

Charts

From the correlation matrix above, three injury intents are highly correlated (coefficient > 0.8) with muenster cheese: suicide (0.956), legal intervention/war (0.872), and unintentional (0.864). Suicide will be used because it shows the highest correlation coefficient.
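
The same selection can be made programmatically rather than by eye. The sketch below uses base R only and retypes the muenster row of the correlation matrix by hand; the names `muenster_cor`, `strong`, and `best` are illustrative. In a real script the row would more likely be pulled from the matrix object itself.

```r
# muenster row of the correlation matrix, copied from the output above
# (the muenster-muenster entry of 1 is left out).
muenster_cor <- c(
  "Unintentional"          = 0.8643468,
  "Suicide"                = 0.9555510,
  "Homicide"               = -0.2534385,
  "Undetermined"           = 0.2930508,
  "Legal intervention/war" = 0.8720061
)

# Keep the intents above the 0.8 threshold, strongest first.
strong <- sort(muenster_cor[muenster_cor > 0.8], decreasing = TRUE)
print(round(strong, 3))

# The first entry is the intent to plot against muenster.
best <- names(strong)[1]
print(best)
```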

Display a scatter plot that shows the relationship between muenster cheese and suicide. Include a trend line and a misleading title.
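
One way to build such a chart is sketched below, assuming the ggplot2 package. The muenster values come from the cheese overview, while the suicide counts are invented placeholders, and the title and axis labels are stand-ins to be replaced with your own. The trend line is a simple least-squares fit, which is one reasonable choice rather than a requirement of the assignment.

```r
library(ggplot2)

# muenster values from the cheese overview; suicide counts are made up.
plot_data <- data.frame(
  muenster = c(0.41, 0.39, 0.37, 0.34, 0.28, 0.30),
  suicide  = c(60, 58, 55, 52, 45, 47)
)

p <- ggplot(plot_data, aes(x = muenster, y = suicide)) +
  geom_point() +
  # straight-line trend, drawn without the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    title = "Your misleading title here",
    x     = "Muenster cheese",
    y     = "Suicide deaths"
  )

print(p)
```

The chart misleads through framing alone: both series simply trend in the same direction over the same years, and the title implies a causal link the data cannot support.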