I have not been able to figure out yet how to keep the header sizing in play when I apply color to the text. Wrapping the heading in a raw HTML <p> tag applies the color, but the result renders at body-text size rather than at header size:

## <p style="color:blue"> **Grading the professor** </p>

Grading the professor

## Grading the professor

## <p style="color:blue"> Grading the professor </p>

Grading the professor

Inline color, on the other hand, does work with Pandoc's bracketed-span syntax:

#> Roses are [red and **bold**]{color="red"} and
#> violets are [blue]{color="blue"}.

Roses are red and bold and violets are blue.

# Roses are \textcolor{red}{red}, violets are \textcolor{blue}{blue}.

Roses are , violets are . (The \textcolor{} commands work only in LaTeX/PDF output, so this HTML render drops the colored words and leaves blanks.)

# Format-aware helper: a LaTeX command for PDF output, a <span> tag for
# HTML output, and plain text for anything else.
colorize <- function(x, color) {
  if (knitr::is_latex_output()) {
    sprintf("\\textcolor{%s}{%s}", color, x)
  } else if (knitr::is_html_output()) {
    sprintf("<span style='color: %s;'>%s</span>", color, x)
  } else x
}

We can then call it in an inline R expression, e.g. `` `r colorize("some words in red", "red")` ``, which will create some words in red (you may not see the red color if you are reading this printed in black and white).
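A minimal sketch of what the HTML branch of colorize() returns, written as a direct sprintf() call so it can be checked even outside a knitting session (where knitr::is_html_output() would be FALSE):

# The same string colorize() emits when knitting to HTML:
sprintf("<span style='color: %s;'>%s</span>", "red", "some words in red")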

## `r colorize("Grading the Professor", "blue")`

Grading the Professor

## **`r colorize("Grading the Professor", "blue")`**

Grading the Professor

## **`r colorize("Grading the Professor", "light blue")`**

Grading the Professor

# Now, I need to work on learning names for the color palette! (CSS wants "lightblue", with no space; an invalid color value is ignored, which is why the attempt above fell back to the default color.)


The data

download.file("http://www.openintro.org/stat/data/evals.RData", destfile = "evals.RData")
load("evals.RData")
head(evals)
##   score         rank    ethnicity gender language age cls_perc_eval
## 1   4.7 tenure track     minority female  english  36      55.81395
## 2   4.1 tenure track     minority female  english  36      68.80000
## 3   3.9 tenure track     minority female  english  36      60.80000
## 4   4.8 tenure track     minority female  english  36      62.60163
## 5   4.6      tenured not minority   male  english  59      85.00000
## 6   4.3      tenured not minority   male  english  59      87.50000
##   cls_did_eval cls_students cls_level cls_profs  cls_credits bty_f1lower
## 1           24           43     upper    single multi credit           5
## 2           86          125     upper    single multi credit           5
## 3           76          125     upper    single multi credit           5
## 4           77          123     upper    single multi credit           5
## 5           17           20     upper  multiple multi credit           4
## 6           35           40     upper  multiple multi credit           4
##   bty_f1upper bty_f2upper bty_m1lower bty_m1upper bty_m2upper bty_avg
## 1           7           6           2           4           6       5
## 2           7           6           2           4           6       5
## 3           7           6           2           4           6       5
## 4           7           6           2           4           6       5
## 5           4           2           2           3           3       3
## 6           4           2           2           3           3       3
##   pic_outfit pic_color
## 1 not formal     color
## 2 not formal     color
## 3 not formal     color
## 4 not formal     color
## 5 not formal     color
## 6 not formal     color
variable.names(evals)
##  [1] "score"         "rank"          "ethnicity"     "gender"       
##  [5] "language"      "age"           "cls_perc_eval" "cls_did_eval" 
##  [9] "cls_students"  "cls_level"     "cls_profs"     "cls_credits"  
## [13] "bty_f1lower"   "bty_f1upper"   "bty_f2upper"   "bty_m1lower"  
## [17] "bty_m1upper"   "bty_m2upper"   "bty_avg"       "pic_outfit"   
## [21] "pic_color"
| Variable | Description |
|---|---|
| `score` | Average professor evaluation score: (1) very unsatisfactory to (5) excellent. |
| `rank` | Rank of professor: teaching, tenure track, tenured. |
| `ethnicity` | Ethnicity of professor: not minority, minority. |
| `gender` | Gender of professor: female, male. |
| `language` | Language of school where professor received education: English or non-English. |
| `age` | Age of professor. |
| `cls_perc_eval` | Percent of students in class who completed evaluation. |
| `cls_did_eval` | Number of students in class who completed evaluation. |
| `cls_students` | Total number of students in class. |
| `cls_level` | Class level: lower, upper. |
| `cls_profs` | Number of professors teaching sections in course in sample: single, multiple. |
| `cls_credits` | Number of credits of class: one credit (lab, PE, etc.), multi credit. |
| `bty_f1lower` | Beauty rating of professor from lower-level female: (1) lowest to (10) highest. |
| `bty_f1upper` | Beauty rating of professor from upper-level female: (1) lowest to (10) highest. |
| `bty_f2upper` | Beauty rating of professor from second upper-level female: (1) lowest to (10) highest. |
| `bty_m1lower` | Beauty rating of professor from lower-level male: (1) lowest to (10) highest. |
| `bty_m1upper` | Beauty rating of professor from upper-level male: (1) lowest to (10) highest. |
| `bty_m2upper` | Beauty rating of professor from second upper-level male: (1) lowest to (10) highest. |
| `bty_avg` | Average beauty rating of professor. |
| `pic_outfit` | Outfit of professor in picture: not formal, formal. |
| `pic_color` | Color of professor’s picture: color, black & white. |
plot(evals)  # just experimenting; I appreciated the visual, so I left it in!

Exploring the data

Exercise 1

Is this an observational study or an experiment? The original research question posed in the paper is whether beauty leads directly to the differences in course evaluations. Given the study design, is it possible to answer this question as it is phrased? If not, rephrase the question.
Answer: This is an observational study; there were no randomly assigned treatments.
It is not possible to answer the question as phrased. The phrase “leads directly to” implies causation, and since this is an observational study we cannot draw conclusions about causality. The wording also lacks specificity. A possible rephrasing of the question: Is college students’ perception of their professors’ attractiveness significantly correlated with differences in their course evaluations?

Exercise 2

Describe the distribution of score. Is the distribution skewed? What does that tell you about how students rate courses? Is this what you expected to see? Why, or why not?
score <- evals$score
hist(score)

summary(score)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.300   3.800   4.300   4.175   4.600   5.000
boxplot(score)

qqnorm(evals$score)
qqline(evals$score)

Answer: As shown in the histogram and boxplot above, the distribution is left-skewed, which is what I expected to see. The mean score is 4.175; students generally review their professors favorably.

Exercise 3

Excluding score, select two other variables and describe their relationship using an appropriate visualization (scatterplot, side-by-side boxplots, or mosaic plot).
gender    <- evals$gender
age       <- evals$age
rank      <- evals$rank
ethnicity <- evals$ethnicity
bty_avg   <- evals$bty_avg
scatter.smooth(evals$bty_avg ~ evals$age)  # beauty rating vs. age, with a loess smoother

Answer: According to the scatterplot above, professors’ average beauty ratings exhibit an inverse relationship with age: as professors age, they are rated less attractive on average.
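A quick numeric check of the direction of that relationship (value not shown here; per the plot above it should come out negative):

cor(evals$age, evals$bty_avg)  # Pearson correlation between age and beauty rating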
plot(evals$bty_avg ~ gender)  # gender is categorical, so side-by-side boxplots
                              # are clearer than scatter.smooth(), which floods
                              # the console with loess warnings here

plot(evals$bty_avg ~ ethnicity)

Answer: Male and female professors appear to receive generally similar average attractiveness ratings, though the males’ are slightly lower. Minorities are represented in far smaller numbers than non-minorities, but accounting for the difference in group sizes, the attractiveness ratings appear similarly distributed.

Simple linear regression

plot(evals$score ~ evals$bty_avg)

nrow(evals)
## [1] 463
summary(evals)
##      score                 rank            ethnicity      gender   
##  Min.   :2.300   teaching    :102   minority    : 64   female:195  
##  1st Qu.:3.800   tenure track:108   not minority:399   male  :268  
##  Median :4.300   tenured     :253                                  
##  Mean   :4.175                                                     
##  3rd Qu.:4.600                                                     
##  Max.   :5.000                                                     
##         language        age        cls_perc_eval     cls_did_eval   
##  english    :435   Min.   :29.00   Min.   : 10.42   Min.   :  5.00  
##  non-english: 28   1st Qu.:42.00   1st Qu.: 62.70   1st Qu.: 15.00  
##                    Median :48.00   Median : 76.92   Median : 23.00  
##                    Mean   :48.37   Mean   : 74.43   Mean   : 36.62  
##                    3rd Qu.:57.00   3rd Qu.: 87.25   3rd Qu.: 40.00  
##                    Max.   :73.00   Max.   :100.00   Max.   :380.00  
##   cls_students    cls_level      cls_profs         cls_credits   bty_f1lower   
##  Min.   :  8.00   lower:157   multiple:306   multi credit:436   Min.   :1.000  
##  1st Qu.: 19.00   upper:306   single  :157   one credit  : 27   1st Qu.:2.000  
##  Median : 29.00                                                 Median :4.000  
##  Mean   : 55.18                                                 Mean   :3.963  
##  3rd Qu.: 60.00                                                 3rd Qu.:5.000  
##  Max.   :581.00                                                 Max.   :8.000  
##   bty_f1upper     bty_f2upper      bty_m1lower     bty_m1upper   
##  Min.   :1.000   Min.   : 1.000   Min.   :1.000   Min.   :1.000  
##  1st Qu.:4.000   1st Qu.: 4.000   1st Qu.:2.000   1st Qu.:3.000  
##  Median :5.000   Median : 5.000   Median :3.000   Median :4.000  
##  Mean   :5.019   Mean   : 5.214   Mean   :3.413   Mean   :4.147  
##  3rd Qu.:7.000   3rd Qu.: 6.000   3rd Qu.:5.000   3rd Qu.:5.000  
##  Max.   :9.000   Max.   :10.000   Max.   :7.000   Max.   :9.000  
##   bty_m2upper       bty_avg           pic_outfit        pic_color  
##  Min.   :1.000   Min.   :1.667   formal    : 77   black&white: 78  
##  1st Qu.:4.000   1st Qu.:3.167   not formal:386   color      :385  
##  Median :5.000   Median :4.333                                     
##  Mean   :4.752   Mean   :4.418                                     
##  3rd Qu.:6.000   3rd Qu.:5.500                                     
##  Max.   :9.000   Max.   :8.167
Answer: The data frame evals has 463 observations in total, and the summary shows no missing values (which would appear as NA), so the scatterplot should display 463 points. It shows far fewer.
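A quick way to make that discrepancy concrete is to count the distinct (bty_avg, score) pairs; a minimal check (output not shown here; per the observation above, the count is well under 463):

# Each distinct (bty_avg, score) pair can draw at most one visible point:
nrow(unique(evals[, c("bty_avg", "score")]))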

Exercise 4

Replot the scatterplot, but this time use the function jitter() on the y- or the x-coordinate. (Use ?jitter to learn more.) What was misleading about the initial scatterplot?
plot(jitter(evals$score) ~ evals$bty_avg)

plot(jitter(evals$score) ~ jitter(evals$bty_avg))

Answer: The initial scatterplot was misleading because observations with identical (bty_avg, score) values plot directly on top of one another, so a single visible point may represent many observations. That reduces visual noise and may make the big picture easier to read, but it obscures how the data are concentrated. jitter() adds a small amount of random noise to each value, so the overlapping points become visible.
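A minimal numeric sketch of what jitter() does (the exact noisy values vary from run to run):

head(evals$score)          # original scores: repeated, "tied" values
head(jitter(evals$score))  # the same scores with small uniform noise added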

Exercise 5

Let’s see if the apparent trend in the plot is something more than natural variation. Fit a linear model called m_bty to predict average professor score by average beauty rating and add the line to your plot using abline(m_bty). Write out the equation for the linear model and interpret the slope. Is average beauty score a statistically significant predictor? Does it appear to be a practically significant predictor?
m_bty<- lm(score ~ bty_avg, data = evals)
plot(jitter(evals$score) ~ evals$bty_avg)
abline(m_bty)

summary(m_bty)
## 
## Call:
## lm(formula = score ~ bty_avg, data = evals)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.9246 -0.3690  0.1420  0.3977  0.9309 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.88034    0.07614   50.96  < 2e-16 ***
## bty_avg      0.06664    0.01629    4.09 5.08e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5348 on 461 degrees of freedom
## Multiple R-squared:  0.03502,    Adjusted R-squared:  0.03293 
## F-statistic: 16.73 on 1 and 461 DF,  p-value: 5.083e-05
Answer: The slope of the m_bty line is 0.06664, consistent with the slightly increasing fitted line and indicating a positive association between evaluation score and beauty rating.

$$\hat{y} = b_0 + b_1 x$$

$$\hat{y} = 3.88034 + 0.06664 \times bty\_avg$$

The p-value of essentially zero confirms that average beauty rating is a statistically significant predictor. It is questionable as a practically significant one, however: the R-squared is only 0.035, so beauty rating explains only about 3.5% of the variation in score.
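The fitted coefficients can also be pulled straight from the model object (the values match the summary above):

coef(m_bty)     # intercept and bty_avg slope
confint(m_bty)  # 95% confidence intervals for both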

Exercise 6

Use residual plots to evaluate whether the conditions of least squares regression are reasonable. Provide plots and comments for each one (see the Simple Regression Lab for a reminder of how to make these).
#Histogram
hist(m_bty$residuals)

# normal probability plot of the residuals
qqnorm(m_bty$residuals)
qqline(m_bty$residuals)

plot(m_bty$residuals ~ evals$bty_avg)
abline(h = 0, lty = 5)  # adds a horizontal dashed line at y = 0

Answer: Least squares regression fits a line by making the sum of the squared errors as small as possible (the criterion is written out after the list below). It is a form of regression analysis that models the relationship between a dependent and an independent variable with a straight line, referred to as the “line of best fit”. For its estimates and p-values to be trustworthy, the following conditions should be reasonable:
1. Linearity: the data should show a linear trend. The residual plot against bty_avg above shows no obvious curvature, so the linearity condition is reasonable here.
2. Nearly normal residuals: the residuals should be nearly normal; when this condition fails, it is usually because of outliers or influential points. The histogram and normal probability plot above show residuals close to normal, so this condition is reasonable.
3. Constant variability: the variability of points around the least squares line should remain roughly constant. The residual plot above shows roughly constant spread, so this condition is reasonable.
4. Independent observations: we would have to verify that the study methodology produced independent observations, but for the purposes of this lab we assume it did.
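For reference, the least squares criterion described above, written out (same notation as the model equation in Exercise 5):

$$(b_0,\ b_1) \;=\; \underset{\beta_0,\ \beta_1}{\arg\min} \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2$$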

Multiple linear regression

plot(evals$bty_avg ~ evals$bty_f1lower)

cor(evals$bty_avg, evals$bty_f1lower)
## [1] 0.8439112
plot(evals[,13:19])

m_bty_gen <- lm(score ~ bty_avg + gender, data = evals)
summary(m_bty_gen)
## 
## Call:
## lm(formula = score ~ bty_avg + gender, data = evals)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8305 -0.3625  0.1055  0.4213  0.9314 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.74734    0.08466  44.266  < 2e-16 ***
## bty_avg      0.07416    0.01625   4.563 6.48e-06 ***
## gendermale   0.17239    0.05022   3.433 0.000652 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5287 on 460 degrees of freedom
## Multiple R-squared:  0.05912,    Adjusted R-squared:  0.05503 
## F-statistic: 14.45 on 2 and 460 DF,  p-value: 8.177e-07

Exercise 7

P-values and parameter estimates should only be trusted if the conditions for the regression are reasonable. Verify that the conditions for this model are reasonable using diagnostic plots.
# Normal Probability Plot
qqnorm(m_bty_gen$residuals)
qqline(m_bty_gen$residuals)

# residual plot against each predictor variable
plot(m_bty_gen$residuals ~ evals$bty_avg)
abline(h = 0, lty = 4)  # adds a horizontal dot-dash reference line at y = 0

plot(m_bty_gen$residuals ~ evals$gender)
abline(h = 0, lty = 4)  # adds a horizontal dot-dash reference line at y = 0

#Residual vs Fitted, Normal Probability Plot, Scale-Location, Residual vs Leverage
plot(m_bty_gen)

#Histogram
hist(m_bty_gen$residuals)

# Checking linearity
plot(jitter(evals$score) ~ evals$bty_avg)
abline(h = 4, lty = 4)  # adds a horizontal dot-dash reference line at y = 4

plot(evals$score ~ evals$gender)

par(mfrow=c(2, 1))  
boxplot(m_bty_gen$residuals~evals$bty_avg)
boxplot(m_bty_gen$residuals~evals$gender)

1. Linearity: the residual plots against each predictor above show no systematic pattern, so the linearity condition is reasonable.
2. Nearly normal residuals: the histogram and normal probability plot show the residuals are nearly normal, though with a slight left skew.
3. Constant variability: the residual plots show roughly constant spread around zero, so this condition is reasonable.
4. Independent observations: we would have to verify that the study methodology produced independent observations, but for the purposes of this lab we assume it did.

Exercise 8

Is bty_avg still a significant predictor of score? Has the addition of gender to the model changed the parameter estimate for bty_avg?
summary(m_bty_gen)
## 
## Call:
## lm(formula = score ~ bty_avg + gender, data = evals)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8305 -0.3625  0.1055  0.4213  0.9314 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.74734    0.08466  44.266  < 2e-16 ***
## bty_avg      0.07416    0.01625   4.563 6.48e-06 ***
## gendermale   0.17239    0.05022   3.433 0.000652 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5287 on 460 degrees of freedom
## Multiple R-squared:  0.05912,    Adjusted R-squared:  0.05503 
## F-statistic: 14.45 on 2 and 460 DF,  p-value: 8.177e-07
Answer: 1. The p-value is even closer to zero than before, so not only is bty_avg still a significant predictor of score, but adding gender increased its significance. 2. Yes, the parameter estimate changed from 0.06664 to 0.07416.
multiLines(m_bty_gen)  # from the openintro package: draws one fitted line per gender

Exercise 9

What is the equation of the line corresponding to males? (Hint: For males, the parameter estimate is multiplied by 1.) For two professors who received the same beauty rating, which gender tends to have the higher course evaluation score?
Answer:

$$\widehat{score}_{male} = 3.74734 + 0.07416 \times bty\_avg + 0.17239 \times 1$$

$$\widehat{score}_{female} = 3.74734 + 0.07416 \times bty\_avg$$

For two professors who received the same beauty rating, the male tends to have a course evaluation score 0.17239 higher than the female.
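The two gender-specific intercepts can be computed directly from the fitted coefficients (the slope on bty_avg is shared by both lines):

b <- coef(m_bty_gen)
c(female = unname(b["(Intercept)"]),
  male   = unname(b["(Intercept)"] + b["gendermale"]))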

Exercise 10

Create a new model called m_bty_rank with gender removed and rank added in. How does R appear to handle categorical variables that have more than two levels? Note that the rank variable has three levels: teaching, tenure track, tenured.
m_bty_rank <- lm(score ~ bty_avg + rank, data = evals)
summary(m_bty_rank)
## 
## Call:
## lm(formula = score ~ bty_avg + rank, data = evals)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8713 -0.3642  0.1489  0.4103  0.9525 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       3.98155    0.09078  43.860  < 2e-16 ***
## bty_avg           0.06783    0.01655   4.098 4.92e-05 ***
## ranktenure track -0.16070    0.07395  -2.173   0.0303 *  
## ranktenured      -0.12623    0.06266  -2.014   0.0445 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5328 on 459 degrees of freedom
## Multiple R-squared:  0.04652,    Adjusted R-squared:  0.04029 
## F-statistic: 7.465 on 3 and 459 DF,  p-value: 6.88e-05
Answer: R treats one level of a multi-level categorical variable as the reference (baseline) level: here “teaching” is absorbed into the intercept and does not appear in the output on its own, while an indicator (dummy) variable is created for each remaining level (ranktenure track and ranktenured). To make a different level the baseline, the factor can be releveled, as below.
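The indicator coding R builds behind the scenes can be inspected with model.matrix(); a quick sketch:

# One indicator column per non-reference level; the reference level
# ("teaching") corresponds to rows where both indicators are 0:
head(model.matrix(~ rank, data = evals))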
m_bty_rank_2 <- lm(score ~ bty_avg + relevel(rank, ref = "tenured"), data = evals)
summary(m_bty_rank_2)
## 
## Call:
## lm(formula = score ~ bty_avg + relevel(rank, ref = "tenured"), 
##     data = evals)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8713 -0.3642  0.1489  0.4103  0.9525 
## 
## Coefficients:
##                                            Estimate Std. Error t value Pr(>|t|)
## (Intercept)                                 3.85532    0.07693  50.117  < 2e-16
## bty_avg                                     0.06783    0.01655   4.098 4.92e-05
## relevel(rank, ref = "tenured")teaching      0.12623    0.06266   2.014   0.0445
## relevel(rank, ref = "tenured")tenure track -0.03448    0.06244  -0.552   0.5811
##                                               
## (Intercept)                                ***
## bty_avg                                    ***
## relevel(rank, ref = "tenured")teaching     *  
## relevel(rank, ref = "tenured")tenure track    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5328 on 459 degrees of freedom
## Multiple R-squared:  0.04652,    Adjusted R-squared:  0.04029 
## F-statistic: 7.465 on 3 and 459 DF,  p-value: 6.88e-05
Applying the relevel() function made “tenured” the reference level absorbed into the intercept, with indicators listed for teaching and tenure track instead. A three-level factor is always displayed this way: an intercept plus two indicators, never all three levels at once.

The search for the best model

Exercise 11

Which variable would you expect to have the highest p-value in this model? Why? Hint: Think about which variable you would expect to not have any association with the professor score.
Answer: I would expect the “number of credits” (cls_credits) to be the variable with the least association with the professors’ evaluation scores.
m_full <- lm(score ~ rank + ethnicity + gender + language + age + cls_perc_eval 
             + cls_students + cls_level + cls_profs + cls_credits + bty_avg 
             + pic_outfit + pic_color, data = evals)
sum_m_full <- summary(m_full)
sum_m_full
## 
## Call:
## lm(formula = score ~ rank + ethnicity + gender + language + age + 
##     cls_perc_eval + cls_students + cls_level + cls_profs + cls_credits + 
##     bty_avg + pic_outfit + pic_color, data = evals)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.77397 -0.32432  0.09067  0.35183  0.95036 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            4.0952141  0.2905277  14.096  < 2e-16 ***
## ranktenure track      -0.1475932  0.0820671  -1.798  0.07278 .  
## ranktenured           -0.0973378  0.0663296  -1.467  0.14295    
## ethnicitynot minority  0.1234929  0.0786273   1.571  0.11698    
## gendermale             0.2109481  0.0518230   4.071 5.54e-05 ***
## languagenon-english   -0.2298112  0.1113754  -2.063  0.03965 *  
## age                   -0.0090072  0.0031359  -2.872  0.00427 ** 
## cls_perc_eval          0.0053272  0.0015393   3.461  0.00059 ***
## cls_students           0.0004546  0.0003774   1.205  0.22896    
## cls_levelupper         0.0605140  0.0575617   1.051  0.29369    
## cls_profssingle       -0.0146619  0.0519885  -0.282  0.77806    
## cls_creditsone credit  0.5020432  0.1159388   4.330 1.84e-05 ***
## bty_avg                0.0400333  0.0175064   2.287  0.02267 *  
## pic_outfitnot formal  -0.1126817  0.0738800  -1.525  0.12792    
## pic_colorcolor        -0.2172630  0.0715021  -3.039  0.00252 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.498 on 448 degrees of freedom
## Multiple R-squared:  0.1871, Adjusted R-squared:  0.1617 
## F-statistic: 7.366 on 14 and 448 DF,  p-value: 6.552e-14

Exercise 12

Check your suspicions from the previous exercise. Include the model output in your response.
Answer: The number of credits (cls_credits) appears in the summary output as “cls_creditsone credit”. Its p-value is very close to zero, which indicates very strong significance, so my answer to Exercise 11 was incorrect.
The variable with the highest p-value, and hence the least significance, is the number of professors (cls_profssingle), with a p-value of 0.77806.

Exercise 13

Interpret the coefficient associated with the ethnicity variable.
Answer: Holding the other variables constant, evaluations for professors not recorded as part of a minority ethnicity tend to be 0.1234929 higher than evaluations for professors recorded as being of a minority ethnicity. (I would be curious how this would be affected by the ethnicity of the students completing the evaluations.)

Exercise 14

Drop the variable with the highest p-value and re-fit the model. Did the coefficients and significance of the other explanatory variables change? (One of the things that makes multiple regression interesting is that coefficient estimates depend on the other variables that are included in the model.) If not, what does this say about whether or not the dropped variable was collinear with the other explanatory variables?
m_full_1 <- lm(score ~ rank + ethnicity + gender + language + age + cls_perc_eval 
             + cls_students + cls_level + cls_credits + bty_avg 
             + pic_outfit + pic_color, data = evals)
summary(m_full_1)
## 
## Call:
## lm(formula = score ~ rank + ethnicity + gender + language + age + 
##     cls_perc_eval + cls_students + cls_level + cls_credits + 
##     bty_avg + pic_outfit + pic_color, data = evals)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7836 -0.3257  0.0859  0.3513  0.9551 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            4.0872523  0.2888562  14.150  < 2e-16 ***
## ranktenure track      -0.1476746  0.0819824  -1.801 0.072327 .  
## ranktenured           -0.0973829  0.0662614  -1.470 0.142349    
## ethnicitynot minority  0.1274458  0.0772887   1.649 0.099856 .  
## gendermale             0.2101231  0.0516873   4.065 5.66e-05 ***
## languagenon-english   -0.2282894  0.1111305  -2.054 0.040530 *  
## age                   -0.0089992  0.0031326  -2.873 0.004262 ** 
## cls_perc_eval          0.0052888  0.0015317   3.453 0.000607 ***
## cls_students           0.0004687  0.0003737   1.254 0.210384    
## cls_levelupper         0.0606374  0.0575010   1.055 0.292200    
## cls_creditsone credit  0.5061196  0.1149163   4.404 1.33e-05 ***
## bty_avg                0.0398629  0.0174780   2.281 0.023032 *  
## pic_outfitnot formal  -0.1083227  0.0721711  -1.501 0.134080    
## pic_colorcolor        -0.2190527  0.0711469  -3.079 0.002205 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4974 on 449 degrees of freedom
## Multiple R-squared:  0.187,  Adjusted R-squared:  0.1634 
## F-statistic: 7.943 on 13 and 449 DF,  p-value: 2.336e-14
Answer: The coefficients and significance levels changed slightly. Since the values changed, the dropped variable (cls_profs) was at least slightly collinear with the other explanatory variables; had nothing changed, it would have been essentially uncorrelated with them.
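A common numeric check on collinearity is the variance inflation factor. A minimal sketch, assuming the car package is installed (it is not otherwise used in this lab):

library(car)  # provides vif(); install.packages("car") if needed
vif(m_full)   # values well above ~5 suggest problematic collinearity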

Exercise 15

Using backward-selection and p-value as the selection criterion, determine the best model. You do not need to show all steps in your answer, just the output for the final model. Also, write out the linear model for predicting score based on the final model you settle on.
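The intermediate eliminations can be scripted with update(); a minimal sketch of the first step (dropping cls_profs, the highest p-value in the full model), to be repeated until every remaining predictor is significant:

# Step 1: drop the least significant predictor and re-fit;
# summary(m_step1) would then identify the next candidate to drop.
m_step1 <- update(m_full, . ~ . - cls_profs)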
m_full_best <- lm(score ~  ethnicity + gender + language + age + cls_perc_eval 
              + cls_credits + bty_avg 
              + pic_color, data = evals)
sum_m_full$coefficients  # the full model's coefficients again, for comparison with the final model below
##                            Estimate  Std. Error    t value     Pr(>|t|)
## (Intercept)            4.0952140795 0.290527656 14.0957805 1.319326e-37
## ranktenure track      -0.1475932457 0.082067087 -1.7984462 7.277935e-02
## ranktenured           -0.0973377624 0.066329583 -1.4674864 1.429455e-01
## ethnicitynot minority  0.1234929213 0.078627324  1.5706108 1.169791e-01
## gendermale             0.2109481296 0.051822962  4.0705533 5.544372e-05
## languagenon-english   -0.2298111901 0.111375416 -2.0633924 3.965088e-02
## age                   -0.0090071896 0.003135911 -2.8722720 4.268765e-03
## cls_perc_eval          0.0053272412 0.001539323  3.4607683 5.902546e-04
## cls_students           0.0004546339 0.000377388  1.2046856 2.289607e-01
## cls_levelupper         0.0605139602 0.057561665  1.0512893 2.936925e-01
## cls_profssingle       -0.0146619208 0.051988497 -0.2820224 7.780566e-01
## cls_creditsone credit  0.5020431770 0.115938766  4.3302443 1.839347e-05
## bty_avg                0.0400333017 0.017506416  2.2867788 2.267440e-02
## pic_outfitnot formal  -0.1126816871 0.073880036 -1.5251980 1.279153e-01
## pic_colorcolor        -0.2172629964 0.071502140 -3.0385523 2.516206e-03
summary(m_full_best)
## 
## Call:
## lm(formula = score ~ ethnicity + gender + language + age + cls_perc_eval + 
##     cls_credits + bty_avg + pic_color, data = evals)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.85320 -0.32394  0.09984  0.37930  0.93610 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            3.771922   0.232053  16.255  < 2e-16 ***
## ethnicitynot minority  0.167872   0.075275   2.230  0.02623 *  
## gendermale             0.207112   0.050135   4.131 4.30e-05 ***
## languagenon-english   -0.206178   0.103639  -1.989  0.04726 *  
## age                   -0.006046   0.002612  -2.315  0.02108 *  
## cls_perc_eval          0.004656   0.001435   3.244  0.00127 ** 
## cls_creditsone credit  0.505306   0.104119   4.853 1.67e-06 ***
## bty_avg                0.051069   0.016934   3.016  0.00271 ** 
## pic_colorcolor        -0.190579   0.067351  -2.830  0.00487 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4992 on 454 degrees of freedom
## Multiple R-squared:  0.1722, Adjusted R-squared:  0.1576 
## F-statistic:  11.8 on 8 and 454 DF,  p-value: 2.58e-15
Answer: The final model is:

$$\begin{aligned}
\widehat{score} = 3.771922 &+ 0.167872 \times ethnicity_{not\ minority} + 0.207112 \times gender_{male} \\
&- 0.206178 \times language_{non\text{-}english} - 0.006046 \times age \\
&+ 0.004656 \times cls\_perc\_eval + 0.505306 \times cls\_credits_{one\ credit} \\
&+ 0.051069 \times bty\_avg - 0.190579 \times pic\_color_{color}
\end{aligned}$$
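As a sanity check on the written equation, predict() reproduces the same arithmetic. A minimal sketch with an illustrative, made-up professor (not a row from the dataset):

# Hypothetical inputs, chosen only to exercise each term of the equation:
newprof <- data.frame(ethnicity = "not minority", gender = "male",
                      language = "english", age = 40, cls_perc_eval = 80,
                      cls_credits = "multi credit", bty_avg = 5,
                      pic_color = "color")
predict(m_full_best, newdata = newprof)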

Exercise 16

Verify that the conditions for this model are reasonable using diagnostic plots.
# Normal Probability Plot
qqnorm(m_full_best$residuals)
qqline(m_full_best$residuals)

# 4 plots: Residual vs Fitted, Normal Probability Plot, Scale-Location, Residual vs Leverage
plot(m_full_best)

#Histogram
hist(m_full_best$residuals) 

# Checking linearity
plot(jitter(evals$score) ~ evals$bty_avg)

par(mfrow=c(2, 2))  
boxplot(m_full_best$residuals~evals$ethnicity)
boxplot(m_full_best$residuals~evals$gender)
boxplot(m_full_best$residuals~evals$cls_credits)
boxplot(m_full_best$residuals~evals$language)

plot(jitter(evals$score) ~ evals$age)
abline(h=4)

plot(jitter(evals$score) ~ evals$cls_perc_eval)
abline(h=4)

par(mfrow = c(2, 2))
boxplot(m_full_best$residuals ~ evals$bty_avg)
boxplot(m_full_best$residuals ~ evals$age)
boxplot(m_full_best$residuals ~ evals$pic_color)
par(mfrow = c(1, 1))

plot(evals$score ~ evals$pic_color)

# Residuals against the percent of students who completed evaluations
boxplot(m_full_best$residuals ~ evals$cls_perc_eval)

Answer: The conditions for this model appear reasonable, as shown by the diagnostic plots above.
1. Linearity: the residual plots against each numeric predictor show no systematic pattern, so the linearity condition is reasonable.
2. Nearly normal residuals: the histogram and normal probability plot show the residuals are nearly normal, with a slight left skew.
3. Constant variability: the residual and scatter plots show roughly constant spread around the fitted line, so this condition is reasonable.
4. Independent observations: we would have to verify that the study methodology produced independent observations, but for the purposes of this lab we assume it did.

Exercise 17

The original paper describes how these data were gathered by taking a sample of professors from the University of Texas at Austin and including all courses that they have taught. Considering that each row represents a course, could this new information have an impact on any of the conditions of linear regression?
Answer: Independence of observations may be violated: if the students rating a professor have previously taken a class with that same professor, their ratings may be based on the earlier experience. In addition, because each row represents a course, the same professor appears in multiple rows, so those observations are not independent of one another.

Exercise 18

Based on your final model, describe the characteristics of a professor and course at University of Texas at Austin that would be associated with a high evaluation score.
m_full_best
## 
## Call:
## lm(formula = score ~ ethnicity + gender + language + age + cls_perc_eval + 
##     cls_credits + bty_avg + pic_color, data = evals)
## 
## Coefficients:
##           (Intercept)  ethnicitynot minority             gendermale  
##              3.771922               0.167872               0.207112  
##   languagenon-english                    age          cls_perc_eval  
##             -0.206178              -0.006046               0.004656  
## cls_creditsone credit                bty_avg         pic_colorcolor  
##              0.505306               0.051069              -0.190579
Answer: Based on the coefficients, the professor and course characteristics at the University of Texas at Austin associated with a high evaluation score would be: a young, non-minority male professor with a high average attractiveness rating and a black & white photo, who was educated at an English-speaking university, teaching a one-credit course with a high percentage of students completing the evaluation.

Exercise 19

Would you be comfortable generalizing your conclusions to apply to professors generally (at any university)? Why or why not?
Answer: No, I would not be comfortable generalizing these conclusions to professors at other universities. We would need to investigate the study design in detail to determine whether the observations are actually independent, what the sample size was, and over what time frame the evaluations were collected, including which semester they were taken in (would end-of-year or summer evaluations be higher than mid-year, winter evaluations?). The demographics of a student body vary greatly from region to region, and those demographics will influence which characteristics receive high scores, so they should be considered possible confounding variables. Because this appears to be an observational study, many explanations are consistent with these results; we would need much more information and further confirming studies before the conclusions could be generalized.