The data set is from a case-control study of smoking and Alzheimer’s disease. The data set has two variables of main interest:
- smoking: a factor with four levels, “None”, “<10”, “10-20”, and “>20” (cigarettes per day)
- disease: a factor with three levels, “Alzheimer”, “Other dementias”, and “Other diagnoses”

The largest group with Alzheimer’s is the group that does not smoke cigarettes (0 cigarettes a day).
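
A mosaic chart of the two variables can be sketched as follows. The source does not name the data set, so this sketch assumes it is the `alzheimer` data frame shipped with the coin package, which matches the variable description above. The table name `smoke_tab` is only illustrative.

```r
# Assumption: the data set is `alzheimer` from the coin package
# (the description above does not name it).
data("alzheimer", package = "coin")

# Cross-tabulate smoking level against diagnosis.
smoke_tab <- table(smoking = alzheimer$smoking, disease = alzheimer$disease)
smoke_tab

# Base-R mosaic plot: each tile's area is proportional to its cell count.
mosaicplot(smoke_tab, main = "Smoking by diagnosis", color = TRUE, las = 1)
```

Reading across the smoking levels, tiles of similar height for a given diagnosis suggest that diagnosis is spread similarly over the smoking groups.
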
The group with more cases is the dementia group; they smoke about 20 cigarettes a day on average.

## Q3 Does smoking seem to matter in determining Alzheimer? Discuss your reason using the mosaic chart above.

It does not look like smoking is connected to Alzheimer’s.

## Q4 Create correlation plot for RailTrail.

Hint: The RailTrail data set is from the mosaicData package.

## Q5 What variables have positive correlation with the number of trail users (volume)?

The variable with a positive correlation is hightemp.

## Q6 What season seems to be most popular for trail users?

Summer seems to be the most popular.

## Q7 The correlation coefficient between hightemp and cloudcover is quite small. Would you be sure that the two variables are not related at all? Create scatter plot. After examining the scatter plot, would you conclude that the two variables are not related at all?

`library(ggplot2)`
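
For Q7, the scatter plot might look like the sketch below. It assumes the ggplot2 and mosaicData packages are installed. The loess smoother and the object names `r_temp_cloud` and `p` are illustrative additions, not part of the original question.

```r
library(ggplot2)
library(mosaicData)  # provides the RailTrail data frame

# Pearson correlation only measures the strength of a *linear* relationship.
r_temp_cloud <- cor(RailTrail$hightemp, RailTrail$cloudcover)
r_temp_cloud

# Scatter plot with a loess smoother (an illustrative addition) to help
# reveal any non-linear pattern between the two variables.
p <- ggplot(RailTrail, aes(x = hightemp, y = cloudcover)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  labs(x = "High temperature", y = "Cloud cover")
p
```

A correlation near zero rules out a strong linear trend, not every kind of relationship, so the plot is worth reading alongside the coefficient.
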
Hint: Discuss your reason by explaining your scatter plot.

They do not appear to be related: the scatter plot shows no linear trend.

## Q8 Hide the messages, the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
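
As a sketch for Q8, the three options named in the hint can be set on a single chunk or for the whole document. The snippet below shows the per-chunk header as a comment and sets the same options globally with `knitr::opts_chunk$set()`. It assumes knitr is installed and that the call sits in a setup chunk near the top of the .Rmd file.

```r
# Per-chunk form, written in the chunk header of the .Rmd file:
#   {r, message=FALSE, echo=FALSE, results='hide'}

# Global form: apply the same options to every later chunk.
knitr::opts_chunk$set(
  message = FALSE,   # suppress package start-up and other messages
  echo    = FALSE,   # do not show the source code
  results = "hide"   # do not show printed text results
)
```

Note that `results` only controls printed text output; plots are still shown unless they are hidden separately.
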