RailTrail.hightemp and cloudcover is quite small. Would you be sure that the two variables are not related at all?
The data set is from a case-control study of smoking and Alzheimer’s disease. The data set has two variables of main interest:
smoking: a factor with four levels, “None”, “<10”, “10-20”, and “>20” (cigarettes per day)
disease: a factor with three levels, “Alzheimer”, “Other dementias”, and “Other diagnoses”
The largest group of people with other dementias is non-smokers. Both the graph and the data show that the largest number of people with other dementias were those in the “None” group, who do not smoke.
At first it is most surprising that the group of non-smokers has the most cases of disease, but on reading the data again it surprised me more that there were not more cases among smokers who smoke more than 20 cigarettes a day. In total, there were more cases of disease among smokers who smoked 10-20 cigarettes per day.
It does not seem to matter that the majority of those who develop other dementias are non-smokers.
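The comparison above rests on raw counts, which in a case-control study also depend on how many people fall into each group. A minimal sketch of how the counts and the within-diagnosis proportions could be tabulated is shown here. The function name smoking_by_disease and the toy data frame are assumptions made for illustration; the toy rows are made up and are not the study data.

```r
# Cross-tabulate smoking level against diagnosis.
# `df` is any data frame with factor columns `smoking` and `disease`.
smoking_by_disease <- function(df) {
  counts <- table(df$smoking, df$disease, dnn = c("smoking", "disease"))
  list(
    # Raw counts with row and column totals added.
    counts = addmargins(counts),
    # Share of each smoking level within each diagnosis (columns sum to 1).
    within_disease = prop.table(counts, margin = 2)
  )
}

# Tiny made-up rows, only to show the call; these are NOT the study data.
toy <- data.frame(
  smoking = factor(c("None", "None", "<10", ">20"),
                   levels = c("None", "<10", "10-20", ">20")),
  disease = factor(c("Alzheimer", "Other dementias",
                     "Other dementias", "Other diagnoses"),
                   levels = c("Alzheimer", "Other dementias", "Other diagnoses"))
)

res <- smoking_by_disease(toy)
print(res$counts)
print(round(res$within_disease, 2))
```

Looking at the proportions within each diagnosis, rather than only the largest count, makes it easier to judge whether non-smokers are over-represented in one diagnosis or are simply the largest group overall.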
RailTrail.
Hint: The RailTrail data set is from the mosaicData package.
The four variables that have a negative correlation with the number of trail users are spring, fall, cloud cover, and precipitation.
The season that seems to be the least popular for trail users is fall.
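The correlations behind these answers could be checked with a short sketch like the one below. It assumes the mosaicData package is installed and that the number of trail users is stored in the volume column; the object names numeric_cols and cor_with_volume are made up for this example.

```r
# Assumes the mosaicData package is installed.
data("RailTrail", package = "mosaicData")

# Keep only the numeric columns; non-numeric ones such as dayType are dropped.
numeric_cols <- RailTrail[sapply(RailTrail, is.numeric)]

# Correlation of every numeric variable with the daily number of users.
cor_with_volume <- cor(numeric_cols)[, "volume"]
cor_with_volume <- cor_with_volume[names(cor_with_volume) != "volume"]

# Sorted from most negative to most positive.
print(sort(cor_with_volume))

# Names of the variables whose correlation with volume is below zero.
print(names(cor_with_volume[cor_with_volume < 0]))

# Correlation between high temperature and cloud cover.
print(cor(RailTrail$hightemp, RailTrail$cloudcover))
```

Keep in mind that cor() measures only linear association by default, so a coefficient close to zero does not by itself show that two variables are unrelated.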
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
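As a rough sketch of what the hint points to, the three options can be set either on a single chunk or for the whole document. The chunk label load-packages is a made-up name, and the values shown are one possible choice rather than the required answer.

```r
# Per-chunk form: the options go in the chunk header, for example
#   {r load-packages, message = FALSE, echo = FALSE, results = "hide"}
#
# Document-wide form, usually placed in a setup chunk near the top:
knitr::opts_chunk$set(
  message = FALSE,   # drop messages, such as package start-up notes
  echo    = FALSE,   # do not show the R source code in the output
  results = "hide"   # do not show printed results
)
```

With message = FALSE, the notes that a package such as tidyverse prints when it is attached no longer appear in the knitted document.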