The data set is from a case-control study of smoking and Alzheimer’s disease. The data set has two variables of main interest: disease (the diagnosis) and smoking (the number of cigarettes smoked per day).

##                 smoking None <10 10-20 >20
## disease                                   
## Alzheimer                126  15    30  27
## Other dementias           79   8    33  44
## Other diagnoses          104   5    47  20

Q1 Describe the largest group that has Alzheimer's. Discuss it by number of cigarettes per day.

The largest group with Alzheimer's is the non-smokers: 126 of the 198 Alzheimer's cases smoke no cigarettes per day.

Q2 Describe one group that has more cases than expected given independence (by chance). Discuss it by number of cigarettes per day.

One group that has more cases than expected given independence is the other dementias group that smokes more than 20 cigarettes per day (44 observed cases).
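The comparison against independence can be checked by hand. The sketch below is not the code used for the page; it retypes the counts from the table above and computes the expected count for each cell as (row total × column total) / grand total. The object names `counts`, `expected`, and `pearson_resid` are illustrative.

```r
# Counts copied from the contingency table above
# (rows: disease, columns: cigarettes per day).
counts <- matrix(
  c(126, 15, 30, 27,
     79,  8, 33, 44,
    104,  5, 47, 20),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    disease = c("Alzheimer", "Other dementias", "Other diagnoses"),
    smoking = c("None", "<10", "10-20", ">20")
  )
)

# Expected count under independence:
# (row total * column total) / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
dimnames(expected) <- dimnames(counts)

# Pearson residuals: a positive value means more cases than
# independence predicts, a negative value means fewer.
pearson_resid <- (counts - expected) / sqrt(expected)

print(round(expected, 1))
print(round(pearson_resid, 2))
```

For the other dementias group smoking more than 20 cigarettes per day, the expected count is about 27.7 against 44 observed, so that cell sits above what independence predicts.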

Q3 Does smoking seem to matter in determining Alzheimer's? Discuss your reason using the mosaic chart above.

No, smoking does not seem to matter: among the people who have Alzheimer's, non-smokers are the largest group.

Q4 Create a correlation plot for RailTrail. Hint: The RailTrail data set is from the mosaicData package.

##            hightemp lowtemp avgtemp spring summer  fall cloudcover precip volume
## hightemp       1.00    0.66    0.92  -0.33   0.67 -0.40      -0.10   0.13   0.58
## lowtemp        0.66    1.00    0.90  -0.39   0.74 -0.41       0.37   0.37   0.18
## avgtemp        0.92    0.90    1.00  -0.39   0.77 -0.44       0.14   0.27   0.43
## spring        -0.33   -0.39   -0.39   1.00  -0.74 -0.47      -0.10  -0.25  -0.04
## summer         0.67    0.74    0.77  -0.74   1.00 -0.24       0.17   0.34   0.23
## fall          -0.40   -0.41   -0.44  -0.47  -0.24  1.00      -0.08  -0.09  -0.25
## cloudcover    -0.10    0.37    0.14  -0.10   0.17 -0.08       1.00   0.37  -0.37
## precip         0.13    0.37    0.27  -0.25   0.34 -0.09       0.37   1.00  -0.23
## volume         0.58    0.18    0.43  -0.04   0.23 -0.25      -0.37  -0.23   1.00

Q5 What variables have positive correlation with the number of trail users (volume)?

The variables that have a positive correlation with the number of trail users are hightemp (0.58), avgtemp (0.43), summer (0.23), and lowtemp (0.18).

Q6 What season seems to be most popular for trail users?

Summer is the most popular season for trail users: it is the only season with a positive correlation with volume (0.23, compared with -0.04 for spring and -0.25 for fall).

Q7 The correlation coefficient between hightemp and cloudcover is quite small. Would you be sure that the two variables are not related at all? Create a scatter plot. After examining the scatter plot, would you conclude that the two variables are not related at all? Hint: Discuss your reason by explaining your scatter plot.
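Why a scatter plot is worth checking can be illustrated with made-up data. The sketch below does not use RailTrail; it simulates a hypothetical pair of variables `x` and `y` with a strong curved relationship and shows that the correlation coefficient can still be close to zero, because it only measures linear association.

```r
# Simulated data only (not RailTrail): y depends strongly on x,
# but the relationship is curved rather than linear.
set.seed(1)
x <- seq(-3, 3, length.out = 200)
y <- x^2 + rnorm(200, sd = 0.3)

# The correlation coefficient measures linear association,
# so it comes out near zero here despite the clear pattern.
r <- cor(x, y)
print(round(r, 2))

# The scatter plot reveals the U-shaped relationship that the
# single number hides.
plot(x, y, main = "Near-zero correlation, clear relationship")

# For the assignment, the same check on the real data might look like
# (assuming the mosaicData package is installed):
#   data(RailTrail, package = "mosaicData")
#   plot(RailTrail$hightemp, RailTrail$cloudcover)
```

Whether hightemp and cloudcover show any such pattern has to be judged from their own scatter plot; the simulation only shows that a small coefficient alone does not settle the question.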

Q8 Hide the messages, the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
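One way to apply these options to every chunk at once is to set them as defaults in a setup chunk near the top of the document. This is a sketch that assumes the page is knit with knitr; the same three options can instead be written in the header of an individual chunk.

```r
# Set default chunk options for the whole document (assumes knitr).
library(knitr)

opts_chunk$set(
  message = FALSE,   # hide package start-up and other messages
  echo    = FALSE,   # hide the R code itself
  results = "hide"   # hide printed text results
)
```

Setting `results = "hide"` suppresses printed output such as tables but leaves plots visible, so charts still appear on the page.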

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.