Load necessary packages and the dataset in question:
library(ggplot2)
library(dplyr)
library(broom)
library(knitr)
load(url("http://www.openintro.org/stat/data/evals.RData"))
evals <- evals %>%
  select(score, ethnicity, gender, language, age, bty_avg, rank)

Exploratory data analysis: Create a visualization that shows the relationship between an instructor's age (age) and their teaching score (score).
Comment on this relationship.
# Add your code to create visualization below:
ggplot(data = evals, mapping = aes(x = age, y = score)) +
  geom_point() +
  labs(x = "Instructor Age", y = "Teaching Score") +
  geom_smooth(method = "lm", se = FALSE)

Comment here:
Pipe (%>%) the table into kable(digits=3) to get a cleanly formatted table with values rounded to 3 decimal places.

# Add your code to create regression table below:
lm(score ~ age, data = evals) %>%
  tidy() %>%
  kable(digits = 3)

| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 4.462 | 0.127 | 35.195 | 0.000 |
| age | -0.006 | 0.003 | -2.311 | 0.021 |
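
To make the table easier to interpret, the fitted line can be evaluated by hand. The sketch below assumes only the rounded coefficients printed above; `predict_score` is a hypothetical helper name, not part of any package, and because the inputs are rounded the results will differ slightly from what `predict()` on the full model would return.

```r
# Coefficients copied from the regression table above (rounded to 3 decimals)
intercept <- 4.462
slope_age <- -0.006

# Hypothetical helper: predicted teaching score for a given instructor age
predict_score <- function(age) intercept + slope_age * age

# Predicted scores for instructors aged 30, 40, and 50
predict_score(c(30, 40, 50))

# Change in predicted score associated with being 10 years older
predict_score(50) - predict_score(40)
```

Under these rounded values, each additional year of age is associated with a predicted score about 0.006 points lower, so a 10-year age gap corresponds to roughly 0.06 points.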
Answer the two other questions here:
Does there seem to be a systematic pattern in the lack-of-fit of the model? In other words, is there a pattern in the error between the fitted score \(\widehat{y}\) and the observed score \(y\)? Hint:
geom_hline(yintercept=0, col="blue") adds a blue horizontal line at \(y=0\).
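
As a reminder of what the residual measures, here is a minimal base R sketch. It is an illustration only: it uses R's built-in `cars` data as a stand-in (an assumption made so the snippet runs without downloading `evals`), and it checks that a residual is simply the observed value minus the fitted value.

```r
# Stand-in model on R's built-in `cars` data (assumption: used only so this
# sketch runs offline; the same idea applies to lm(score ~ age, data = evals))
fit <- lm(dist ~ speed, data = cars)

# A residual is the observed response minus the fitted value: y - y_hat
manual_resid <- cars$dist - fitted(fit)

# This matches what residuals() reports for the same model
all.equal(unname(manual_resid), unname(residuals(fit)))

# For a least-squares fit with an intercept, residuals average to
# (numerically) zero, which is why y = 0 is the natural reference line
mean(residuals(fit))
```

If the linear model fits well, the residuals should scatter around the horizontal line at 0 with no visible trend or funnel shape.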
# Add the code necessary to answer this question below:
point_by_point_info <- lm(score ~ age, data = evals) %>%
  augment() %>%
  select(score, age, .fitted, .resid)

ggplot(point_by_point_info, aes(x = age, y = .resid)) +
  geom_point() +
  geom_hline(yintercept = 0, col = "blue") +
  labs(x = "Age", y = "Residual")

Comment here:
Say a college administrator wants to model teaching scores using more than one predictor/explanatory variable: not just age, but the instructor’s gender as well. Create a visualization that summarizes this relationship and comment on the observed relationship.
# Add your code to create visualization below:
ggplot(data = evals, mapping = aes(x = age, y = score, color = gender)) +
  geom_point() +
  labs(x = "Instructor Age", y = "Teaching Score") +
  geom_smooth(method = "lm", se = FALSE)

Comment here:
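
When a grouping variable is mapped to color, geom_smooth(method = "lm") fits one line per group, so each group gets its own intercept and slope. The sketch below illustrates that this corresponds to a regression with an interaction term. It is not part of the assignment: it uses R's built-in `mtcars` data as a stand-in (an assumption so the snippet runs offline), with `wt` playing the role of age and the two-level factor `am` playing the role of gender.

```r
# Stand-in data: treat `am` (two levels, "0" and "1") as the grouping factor
d <- transform(mtcars, am = factor(am))

# Interaction model: separate intercept and slope for each group
fit_int <- lm(mpg ~ wt * am, data = d)

# Group-specific slopes implied by the interaction model
slope_group0 <- unname(coef(fit_int)["wt"])
slope_group1 <- unname(coef(fit_int)["wt"] + coef(fit_int)["wt:am1"])

# Slopes from fitting each group on its own (one line per group)
sep0 <- unname(coef(lm(mpg ~ wt, data = subset(d, am == "0")))["wt"])
sep1 <- unname(coef(lm(mpg ~ wt, data = subset(d, am == "1")))["wt"])

# The two approaches agree
c(all.equal(slope_group0, sep0), all.equal(slope_group1, sep1))
```

For the evals data, the analogous model would be lm(score ~ age * gender, data = evals); whether the two slopes differ meaningfully is what the plot above helps you judge.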