The data set is from a case-control study of smoking and Alzheimer’s disease. The data set has two variables of main interest: disease (the diagnosis) and smoking (the number of cigarettes per day).

library(tidyverse)

# Import data
data("alzheimer", package = "coin")

# create a table
tbl <- xtabs(~disease + smoking, alzheimer)
ftable(tbl)
##                 smoking None <10 10-20 >20
## disease                                   
## Alzheimer                126  15    30  27
## Other dementias           79   8    33  44
## Other diagnoses          104   5    47  20

# create a mosaic plot from the table
library(vcd)
mosaic(tbl, 
       shade = TRUE,
       legend = TRUE,
       labeling_args = list(set_varnames = c(disease = "")),
       set_labels = list(disease = c("Alzheimer", "Other\ndementias", "Other\ndiagnoses")))

Q1 Describe the largest group that has Alzheimer. Discuss it by number of cigarettes per day.

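Non-smokers (None) are the largest group with Alzheimer's: 126 of the 198 Alzheimer's cases smoke no cigarettes per day. The remaining cases fall in the 10-20 (30), >20 (27), and <10 (15) cigarettes-per-day groups. In this sample, an Alzheimer's patient is more likely to be a non-smoker than to fall in any of the smoking categories.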

Q2 Describe one group that has more cases than expected given independence (by chance). Discuss it by number of cigarettes per day.

One group with more cases than expected under independence is Other dementias among people who smoke more than 20 cigarettes a day (>20), with 44 observed cases.
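The "more than expected" comparison can be checked numerically. The sketch below is only an illustration: it retypes the counts from the ftable output above into a matrix (the name `counts` is introduced here, not taken from the assignment) and uses base R's `chisq.test()` to get the expected counts under independence and the Pearson residuals.

```r
# Counts copied by hand from the ftable output above (assumed to be transcribed correctly)
counts <- matrix(
  c(126, 15, 30, 27,
     79,  8, 33, 44,
    104,  5, 47, 20),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    disease = c("Alzheimer", "Other dementias", "Other diagnoses"),
    smoking = c("None", "<10", "10-20", ">20")
  )
)

# Chi-squared test of independence between disease and smoking
test <- chisq.test(counts)

# Expected counts if disease and smoking were independent
round(test$expected, 1)

# Pearson residuals: positive values mean more cases than expected
round(test$residuals, 2)
```

For the Other dementias / >20 cell, the expected count is about 27.7 (164 × 91 / 538), compared with 44 observed, so the residual for that cell is positive.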

Q3 Does smoking seem to matter in determining Alzheimer? Discuss your reason using the mosaic chart above.

No, I don't think it matters, because non-smokers are the largest group with Alzheimer's.
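One possible way to look at this question is to compare how the smoking categories are distributed within each diagnosis, since raw counts depend on group sizes. This is a minimal sketch using base R's `prop.table()`; the matrix `counts` is retyped from the ftable output above, and `row_share` is a name introduced here for illustration.

```r
# Counts copied by hand from the ftable output above
counts <- matrix(
  c(126, 15, 30, 27,
     79,  8, 33, 44,
    104,  5, 47, 20),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    disease = c("Alzheimer", "Other dementias", "Other diagnoses"),
    smoking = c("None", "<10", "10-20", ">20")
  )
)

# Share of each smoking category within each diagnosis (each row sums to 1)
row_share <- prop.table(counts, margin = 1)
round(row_share, 3)
```

If the rows look similar, smoking and diagnosis are close to independent in this sample; large differences between rows suggest an association.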

Q4 Create a correlation plot for RailTrail.

Hint: The RailTrail data set is from the mosaicData package.

data(RailTrail, package="mosaicData")

# select numeric variables
df <- dplyr::select_if(RailTrail, is.numeric)

# calculate the correlations
r <- cor(df, use="complete.obs")
round(r,2)
##            hightemp lowtemp avgtemp spring summer  fall cloudcover precip
## hightemp       1.00    0.66    0.92  -0.33   0.67 -0.40      -0.10   0.13
## lowtemp        0.66    1.00    0.90  -0.39   0.74 -0.41       0.37   0.37
## avgtemp        0.92    0.90    1.00  -0.39   0.77 -0.44       0.14   0.27
## spring        -0.33   -0.39   -0.39   1.00  -0.74 -0.47      -0.10  -0.25
## summer         0.67    0.74    0.77  -0.74   1.00 -0.24       0.17   0.34
## fall          -0.40   -0.41   -0.44  -0.47  -0.24  1.00      -0.08  -0.09
## cloudcover    -0.10    0.37    0.14  -0.10   0.17 -0.08       1.00   0.37
## precip         0.13    0.37    0.27  -0.25   0.34 -0.09       0.37   1.00
## volume         0.58    0.18    0.43  -0.04   0.23 -0.25      -0.37  -0.23
##            volume
## hightemp     0.58
## lowtemp      0.18
## avgtemp      0.43
## spring      -0.04
## summer       0.23
## fall        -0.25
## cloudcover  -0.37
## precip      -0.23
## volume       1.00
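
The code above computes the correlation matrix but does not draw it. One way to turn it into a plot, assuming the ggcorrplot package is installed (the assignment text here does not name a plotting package, so this choice is an assumption), is sketched below.

```r
library(ggcorrplot)  # assumed plotting package, not named in the assignment

data(RailTrail, package = "mosaicData")

# select numeric variables and compute the correlations, as above
df <- dplyr::select_if(RailTrail, is.numeric)
r <- cor(df, use = "complete.obs")

# lower triangle only, variables reordered by hierarchical clustering,
# with the correlation coefficients printed in the cells
p <- ggcorrplot(r, hc.order = TRUE, type = "lower", lab = TRUE)
p
```

Showing only the lower triangle avoids repeating each pair twice, and `lab = TRUE` makes the values readable without referring back to the printed matrix.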

Q5 What variables have positive correlation with the number of trail users (volume)?

The variables that have a positive correlation with the number of trail users (volume) are hightemp (0.58), avgtemp (0.43), summer (0.23), and lowtemp (0.18).
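
The same answer can be read off programmatically. The sketch below retypes the volume column from the printed correlation matrix above (the vector name `vol_cor` is introduced here for illustration) and keeps only the positive values, sorted from strongest to weakest.

```r
# Correlations with volume, copied from the printed matrix above
vol_cor <- c(hightemp = 0.58, lowtemp = 0.18, avgtemp = 0.43,
             spring = -0.04, summer = 0.23, fall = -0.25,
             cloudcover = -0.37, precip = -0.23)

# Keep the positive correlations and sort them in decreasing order
positive <- sort(vol_cor[vol_cor > 0], decreasing = TRUE)
positive
```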

Q8 Hide the messages, the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
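
As a rough sketch of the hint, the three options can be set once for the whole document with `knitr::opts_chunk$set()`, typically placed in a setup chunk near the top of the .Rmd file. The same options can also be set per chunk in the chunk header instead; which approach the assignment expects is not stated, so treat this as one possibility.

```r
# Set default chunk options for every chunk that follows
knitr::opts_chunk$set(
  message = FALSE,   # hide messages (for example, package start-up messages)
  echo    = FALSE,   # hide the code itself
  results = "hide"   # hide printed text results
)
```

Note that `results = "hide"` hides printed text output only; plots are controlled separately.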

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.