This is an observational study. In an experiment, the experimental group receives a treatment or a change in a particular factor while the control group does not. Here the researchers only analyzed data that already existed, so nothing was assigned to anyone. To turn this into an experiment, the researchers could randomly assign students to two groups: in one group all the teachers would be rated about the same in terms of beauty, and in the other the teachers would be rated differently on the same beauty scale. They could then compare the two groups to see whether the evaluation results varied.
score. Is the distribution skewed? What does that tell you about how students rate courses? Is this what you expected to see? Why, or why not?
The distribution is left-skewed. Most of the courses were rated higher than 3.5. Since this distribution is left-skewed, most students rated courses on the higher end. I expected this because students choose their major, which in turn means they basically choose the types of classes they want to take. They usually like most of their classes because the content of those classes relates to the career they are hoping to eventually have. If they mostly rated their courses low, then they might be in the wrong program for their interests.
## [1] 4.17473
## [1] 4.3
## [1] 2.3 5.0
score, select two other variables and describe their relationship with each other using an appropriate visualization.
The variables are Class Level and Class Percent Eval. The two variables did not appear to have any relationship. I expected the higher class levels to have higher percentages of students who responded, but that was not the case. The percentages varied greatly for both class levels, and those ranges were about the same, as you can see from the box plot below.
ggplot(evals, aes(cls_level, cls_perc_eval)) +
  geom_boxplot() +
  labs(title = "Percent of Students Who Completed the Evaluation by Class Level",
       x = "Class Level",
       y = "Class Percent Eval")
geom_jitter as your layer. What was misleading about the initial scatterplot?
The initial scatterplot hid how many courses share the same values. Overlapping points were drawn on top of each other, so the concentration of scores in each area was not visible. The jittered plot shows where the points are densest, and the relationship looks more linear than it did in the original scatterplot.
ggplot(data = evals, aes(x = bty_avg, y = score)) +
  geom_jitter() +
  labs(title = "Course Score by Beauty Average",
       x = "Beauty Average",
       y = "Score")
m_bty to predict average professor score by average beauty rating. Write out the equation for the linear model and interpret the slope. Is average beauty score a statistically significant predictor? Does it appear to be a practically significant predictor?
This is the linear model equation:
y = 0.06664x + 3.88034
y = score
x = beauty average
The slope means that for each one-point increase in average beauty rating, the predicted course score increases by about 0.067 points. Average beauty score is a statistically significant predictor: the p-value for bty_avg in the model summary is 5.08e-05, well below 0.05. It does not appear to be a strong practical predictor, though. The correlation is 0.1871424, which is weak, and the model explains only about 3.5% of the variation in score. It could still matter in practice when two courses would otherwise be rated the same and differ only in the beauty score of the two professors. Since only six people rated the professors, it is also hard to tell whether the beauty scores are accurate. Six people’s opinions do not speak for everyone.
The plots below show two ways to add a line to the plot.
m_bty <- lm(score ~ bty_avg, data = evals)
ggplot(data = evals, aes(x = bty_avg, y = score)) +
  geom_jitter() +
  geom_abline(intercept = as.numeric(m_bty$coefficients[1]),
              slope = as.numeric(m_bty$coefficients[2])) +
  labs(title = "Course Score by Beauty Average",
       x = "Beauty Average",
       y = "Score")
ggplot(data = evals, aes(x = bty_avg, y = score)) +
  geom_jitter() +
  geom_smooth(method = "lm") +
  labs(title = "Course Score by Beauty Average",
       x = "Beauty Average",
       y = "Score")
## # A tibble: 1 × 1
## `cor(bty_avg, score)`
## <dbl>
## 1 0.187
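As a quick consistency check, in a regression with a single predictor the multiple R-squared equals the square of the correlation. The sketch below uses only the two numbers reported in this lab (the correlation 0.1871424 and the Multiple R-squared of 0.03502 from the bty_avg-only model); the variable names are made up for illustration.

```r
# Correlation between bty_avg and score, as reported in this lab.
r <- 0.1871424

# In simple linear regression, R-squared is the squared correlation.
r_squared <- r^2

# Multiple R-squared printed by summary() for lm(score ~ bty_avg).
reported <- 0.03502

print(round(r_squared, 5))
print(abs(r_squared - reported) < 1e-4)
```

The two values agree to the printed precision, which is why a correlation of about 0.19 corresponds to a model that explains only about 3.5% of the variation in score.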
The residual plots and other plots below show that the variability is about equal through the entire plot. More values are under the line than above the line since more points fall below the horizontal line 0 in the residual plots, but overall the variability is random and evenly distributed. Also, the values seem about normally distributed. They are not exact, but they are close to falling on the line in the Q-Q Plot.
The conditions for this model are reasonable based on the plots below. The residuals are mostly consistent across the graph. The values on the lower end have a smaller range for residuals, but other than that it looks about consistent. Also, the Q-Q Plot shows the data is about normally distributed. Some values miss the straight line around 2 and 3, but it is still reasonable.
##
## Call:
## lm(formula = score ~ bty_avg + gender, data = evals)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8305 -0.3625 0.1055 0.4213 0.9314
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.74734 0.08466 44.266 < 2e-16 ***
## bty_avg 0.07416 0.01625 4.563 6.48e-06 ***
## gendermale 0.17239 0.05022 3.433 0.000652 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5287 on 460 degrees of freedom
## Multiple R-squared: 0.05912, Adjusted R-squared: 0.05503
## F-statistic: 14.45 on 2 and 460 DF, p-value: 8.177e-07
bty_avg still a significant predictor of score? Has the addition of gender to the model changed the parameter estimate for bty_avg?
bty_avg is still a significant predictor of score: its p-value is 6.48e-06 in the model with gender, compared with 5.08e-05 in the model without it. The addition of gender raised the multiple R-squared from 0.03502 to 0.05912, so the model explains slightly more of the variation in score, although the relationship is still weak. The slope of bty_avg also changed slightly after adding gender. It increased from 0.06664 to 0.07416, a difference of about 0.008, which is not much.
##
## Call:
## lm(formula = score ~ bty_avg, data = evals)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.9246 -0.3690 0.1420 0.3977 0.9309
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.88034 0.07614 50.96 < 2e-16 ***
## bty_avg 0.06664 0.01629 4.09 5.08e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5348 on 461 degrees of freedom
## Multiple R-squared: 0.03502, Adjusted R-squared: 0.03293
## F-statistic: 16.73 on 1 and 461 DF, p-value: 5.083e-05
##
## Call:
## lm(formula = score ~ bty_avg + gender, data = evals)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8305 -0.3625 0.1055 0.4213 0.9314
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.74734 0.08466 44.266 < 2e-16 ***
## bty_avg 0.07416 0.01625 4.563 6.48e-06 ***
## gendermale 0.17239 0.05022 3.433 0.000652 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5287 on 460 degrees of freedom
## Multiple R-squared: 0.05912, Adjusted R-squared: 0.05503
## F-statistic: 14.45 on 2 and 460 DF, p-value: 8.177e-07
The equation of the line for the black and white pictures:
y = 0.1049x + 3.7974
y = score
x = bty_avg
The equation of the line for the color pictures:
y = 0.04229x + 3.95826
y = score
x = bty_avg
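For the same beauty rating, professors with black and white pictures tended to have the higher course evaluation score. The black and white line has the steeper slope, so it lies above the color line for all but the lowest beauty ratings (below about 2.6).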
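The comparison between the two lines can be checked numerically. The sketch below is only an illustration: it hard-codes the intercepts and slopes from the two fitted equations above, and the beauty ratings in `bty` are arbitrary values chosen for the example, not data from the study.

```r
# Coefficients copied from the two fitted lines reported in this section.
b0_bw  <- 3.7974;  b1_bw  <- 0.1049    # black and white pictures
b0_col <- 3.95826; b1_col <- 0.04229   # color pictures

# Set b0_bw + b1_bw * x equal to b0_col + b1_col * x and solve for x.
crossover <- (b0_col - b0_bw) / (b1_bw - b1_col)

# Predicted scores from each line at a few illustrative beauty ratings.
bty <- c(2, 4, 6, 8)
pred <- data.frame(
  bty_avg     = bty,
  black_white = b0_bw + b1_bw * bty,
  color       = b0_col + b1_col * bty
)

print(round(crossover, 2))
print(pred)
```

Below the crossover point the color line is slightly higher. Above it the black and white line is higher, and the gap widens as the beauty rating increases.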
ggplot(data = evals, aes(x = bty_avg, y = score, color = pic_color)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
blWhite <-
  evals |>
  filter(pic_color == "black&white")
color <-
  evals |>
  filter(pic_color == "color")
lm(score ~ bty_avg, data = blWhite)
##
## Call:
## lm(formula = score ~ bty_avg, data = blWhite)
##
## Coefficients:
## (Intercept) bty_avg
## 3.7974 0.1049
##
## Call:
## lm(formula = score ~ bty_avg, data = color)
##
## Coefficients:
## (Intercept) bty_avg
## 3.95826 0.04229
m_bty_rank with gender removed and rank added in. How does R appear to handle categorical variables that have more than two levels? Note that the rank variable has three levels: teaching, tenure track, tenured.
R handled the categorical variable by splitting it into separate indicator variables, one for each level except the first. Each indicator is either 1 (present) or 0 (not present). Teaching, the level without its own coefficient, serves as the baseline. The new model is below:
y = 0.06783x - 0.16070z - 0.12623w + 3.98155
y = score
x = bty_avg
z = rank: tenure track
w = rank: tenured
##
## Call:
## lm(formula = score ~ bty_avg + rank, data = evals)
##
## Coefficients:
## (Intercept) bty_avg ranktenure track ranktenured
## 3.98155 0.06783 -0.16070 -0.12623
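To see where the ranktenure track and ranktenured columns come from, the sketch below builds the design matrix for a three-level factor with `model.matrix()`. The data frame `toy` is a made-up three-row example, not the evals data; only the level names are taken from the lab, and the level order is set explicitly so that teaching is the baseline.

```r
# Hypothetical three-row data frame; only the level names come from the lab.
rank_levels <- c("teaching", "tenure track", "tenured")
toy <- data.frame(rank = factor(rank_levels, levels = rank_levels))

# model.matrix() builds the design matrix that lm() would use for `~ rank`.
# With the default treatment contrasts, the first level gets no column.
X <- model.matrix(~ rank, data = toy)

print(colnames(X))
print(X)
```

Each row has a 1 in the intercept column, and at most one of the two indicator columns is 1. A teaching professor has 0 in both, so the intercept alone carries the baseline level.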
Before running the code, I predicted that the number of professors would have the highest p-value in the model. Some people like to have one professor. Some people like to have multiple professors. Thus, I assumed scores would be about evenly distributed across the options.
After running the code, I see that classes with one professor had the highest p-value, 0.77806. The second-highest p-value, 0.29369, belonged to upper-level classes. I guessed the number of professors correctly, but I did not consider class level.
m_full <- lm(score ~ rank + gender + ethnicity + language + age + cls_perc_eval
             + cls_students + cls_level + cls_profs + cls_credits + bty_avg
             + pic_outfit + pic_color, data = evals)
summary(m_full)
##
## Call:
## lm(formula = score ~ rank + gender + ethnicity + language + age +
## cls_perc_eval + cls_students + cls_level + cls_profs + cls_credits +
## bty_avg + pic_outfit + pic_color, data = evals)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.77397 -0.32432 0.09067 0.35183 0.95036
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.0952141 0.2905277 14.096 < 2e-16 ***
## ranktenure track -0.1475932 0.0820671 -1.798 0.07278 .
## ranktenured -0.0973378 0.0663296 -1.467 0.14295
## gendermale 0.2109481 0.0518230 4.071 5.54e-05 ***
## ethnicitynot minority 0.1234929 0.0786273 1.571 0.11698
## languagenon-english -0.2298112 0.1113754 -2.063 0.03965 *
## age -0.0090072 0.0031359 -2.872 0.00427 **
## cls_perc_eval 0.0053272 0.0015393 3.461 0.00059 ***
## cls_students 0.0004546 0.0003774 1.205 0.22896
## cls_levelupper 0.0605140 0.0575617 1.051 0.29369
## cls_profssingle -0.0146619 0.0519885 -0.282 0.77806
## cls_creditsone credit 0.5020432 0.1159388 4.330 1.84e-05 ***
## bty_avg 0.0400333 0.0175064 2.287 0.02267 *
## pic_outfitnot formal -0.1126817 0.0738800 -1.525 0.12792
## pic_colorcolor -0.2172630 0.0715021 -3.039 0.00252 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.498 on 448 degrees of freedom
## Multiple R-squared: 0.1871, Adjusted R-squared: 0.1617
## F-statistic: 7.366 on 14 and 448 DF, p-value: 6.552e-14
Term                    Estimate   Std. Error  t value  Pr(>|t|)
ethnicity not minority  0.1234929  0.0786273   1.571    0.11698
The coefficient for the ethnicity variable was 0.1234929. Based on the boxplot below, it appeared that minority was 0 and not minority was 1 (in terms of the value of the factor used for the equation). This means that, with all other variables held constant, a course taught by a professor who was not a minority is predicted to score 0.1234929 points higher than a course taught by a minority professor.
The coefficients and significance of the other variables barely changed. This suggests that the dropped variable, cls_profs, was not collinear with the other explanatory variables. If it had been collinear with one or more of them, removing it would have shifted their coefficients and standard errors much more than it did. The variable also contributed little to the prediction: its p-value was 0.77806, and the adjusted R-squared rose slightly from 0.1617 to 0.1634 after it was dropped.
m_almost_full <- lm(score ~ rank + gender + ethnicity + language + age +
                      cls_perc_eval + cls_students + cls_level + cls_credits +
                      bty_avg + pic_outfit + pic_color, data = evals)
summary(m_almost_full)
##
## Call:
## lm(formula = score ~ rank + gender + ethnicity + language + age +
## cls_perc_eval + cls_students + cls_level + cls_credits +
## bty_avg + pic_outfit + pic_color, data = evals)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7836 -0.3257 0.0859 0.3513 0.9551
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.0872523 0.2888562 14.150 < 2e-16 ***
## ranktenure track -0.1476746 0.0819824 -1.801 0.072327 .
## ranktenured -0.0973829 0.0662614 -1.470 0.142349
## gendermale 0.2101231 0.0516873 4.065 5.66e-05 ***
## ethnicitynot minority 0.1274458 0.0772887 1.649 0.099856 .
## languagenon-english -0.2282894 0.1111305 -2.054 0.040530 *
## age -0.0089992 0.0031326 -2.873 0.004262 **
## cls_perc_eval 0.0052888 0.0015317 3.453 0.000607 ***
## cls_students 0.0004687 0.0003737 1.254 0.210384
## cls_levelupper 0.0606374 0.0575010 1.055 0.292200
## cls_creditsone credit 0.5061196 0.1149163 4.404 1.33e-05 ***
## bty_avg 0.0398629 0.0174780 2.281 0.023032 *
## pic_outfitnot formal -0.1083227 0.0721711 -1.501 0.134080
## pic_colorcolor -0.2190527 0.0711469 -3.079 0.002205 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4974 on 449 degrees of freedom
## Multiple R-squared: 0.187, Adjusted R-squared: 0.1634
## F-statistic: 7.943 on 13 and 449 DF, p-value: 2.336e-14
##
## Call:
## lm(formula = score ~ rank + gender + ethnicity + language + age +
## cls_perc_eval + cls_students + cls_level + cls_profs + cls_credits +
## bty_avg + pic_outfit + pic_color, data = evals)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.77397 -0.32432 0.09067 0.35183 0.95036
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.0952141 0.2905277 14.096 < 2e-16 ***
## ranktenure track -0.1475932 0.0820671 -1.798 0.07278 .
## ranktenured -0.0973378 0.0663296 -1.467 0.14295
## gendermale 0.2109481 0.0518230 4.071 5.54e-05 ***
## ethnicitynot minority 0.1234929 0.0786273 1.571 0.11698
## languagenon-english -0.2298112 0.1113754 -2.063 0.03965 *
## age -0.0090072 0.0031359 -2.872 0.00427 **
## cls_perc_eval 0.0053272 0.0015393 3.461 0.00059 ***
## cls_students 0.0004546 0.0003774 1.205 0.22896
## cls_levelupper 0.0605140 0.0575617 1.051 0.29369
## cls_profssingle -0.0146619 0.0519885 -0.282 0.77806
## cls_creditsone credit 0.5020432 0.1159388 4.330 1.84e-05 ***
## bty_avg 0.0400333 0.0175064 2.287 0.02267 *
## pic_outfitnot formal -0.1126817 0.0738800 -1.525 0.12792
## pic_colorcolor -0.2172630 0.0715021 -3.039 0.00252 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.498 on 448 degrees of freedom
## Multiple R-squared: 0.1871, Adjusted R-squared: 0.1617
## F-statistic: 7.366 on 14 and 448 DF, p-value: 6.552e-14
Using backward selection and removing the variables with insignificant p-values, the final model used the following variables:
gender, ethnicity, language, age, cls_perc_eval, cls_credits, bty_avg, and pic_color
All the variables had a p-value less than 0.05. This is the final linear model:
y = 0.207112x + 0.167872z - 0.206178w - 0.006046v + 0.004656m + 0.505306n + 0.051069k - 0.190579j + 3.771922
y = score, x = gender male, z = ethnicity not minority, w = language not English, v = age, m = class percent eval, n = one credit class, k = beauty average, j = picture color color
m_almost_full4 <- lm(score ~ gender + ethnicity + language + age +
                       cls_perc_eval + cls_credits +
                       bty_avg + pic_color, data = evals)
summary(m_almost_full4)
##
## Call:
## lm(formula = score ~ gender + ethnicity + language + age + cls_perc_eval +
## cls_credits + bty_avg + pic_color, data = evals)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.85320 -0.32394 0.09984 0.37930 0.93610
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.771922 0.232053 16.255 < 2e-16 ***
## gendermale 0.207112 0.050135 4.131 4.30e-05 ***
## ethnicitynot minority 0.167872 0.075275 2.230 0.02623 *
## languagenon-english -0.206178 0.103639 -1.989 0.04726 *
## age -0.006046 0.002612 -2.315 0.02108 *
## cls_perc_eval 0.004656 0.001435 3.244 0.00127 **
## cls_creditsone credit 0.505306 0.104119 4.853 1.67e-06 ***
## bty_avg 0.051069 0.016934 3.016 0.00271 **
## pic_colorcolor -0.190579 0.067351 -2.830 0.00487 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4992 on 454 degrees of freedom
## Multiple R-squared: 0.1722, Adjusted R-squared: 0.1576
## F-statistic: 11.8 on 8 and 454 DF, p-value: 2.58e-15
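To illustrate how the final model turns into a prediction, the sketch below plugs one course into the coefficients printed in the summary above. The coefficient values are copied from that output; the course itself (a 40-year-old male professor who is not a minority, was educated in English, has a beauty average of 5 and a color picture, teaching a multi-credit course where 80% of students completed the evaluation) is a hypothetical example, not a row from the data.

```r
# Coefficients copied from summary(m_almost_full4) above.
coefs <- c(
  "(Intercept)"           =  3.771922,
  "gendermale"            =  0.207112,
  "ethnicitynot minority" =  0.167872,
  "languagenon-english"   = -0.206178,
  "age"                   = -0.006046,
  "cls_perc_eval"         =  0.004656,
  "cls_creditsone credit" =  0.505306,
  "bty_avg"               =  0.051069,
  "pic_colorcolor"        = -0.190579
)

# Hypothetical course (made-up values, for illustration only).
# Indicator variables are 1 when the named level applies and 0 otherwise.
x <- c(
  "(Intercept)" = 1, "gendermale" = 1, "ethnicitynot minority" = 1,
  "languagenon-english" = 0, "age" = 40, "cls_perc_eval" = 80,
  "cls_creditsone credit" = 0, "bty_avg" = 5, "pic_colorcolor" = 1
)

# The prediction is the sum of each coefficient times its value.
predicted_score <- sum(coefs * x[names(coefs)])
print(round(predicted_score, 3))
```

With the fitted model object available, `predict(m_almost_full4, newdata = ...)` would do the same arithmetic; the manual version is shown here only to make each term visible.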
The conditions for this model are reasonable based on the diagnostic plots. The residuals are mostly evenly distributed. The data is also about normally distributed from the Q-Q Plot. The Scale-Location graph is evenly distributed. The data may have some outliers, but for the most part, the conditions for this model appear reasonable.
Each row represents a course, not a professor or student. Some students might rate the same professor differently for different courses depending on their interest in the topic of the course. Also, some students might rate the same professor the same regardless of the topic of the course. This can affect the results, and it would not be known which courses were taught by the same professor. One condition of linear regression is independence, but the different courses and their scores would not be independent since one professor can teach multiple classes and each class can be taught by multiple professors.
Holding the other variables constant, a professor who is male, is not a minority, went to a college that taught in English, is younger than other professors, has a black and white school picture, and is rated higher on the beauty scale is predicted to have a higher evaluation score.
A course that has a higher percentage of responses by students and that is only one credit is also predicted to have a higher evaluation score.
I would not be comfortable generalizing my conclusions to apply to professors generally. This is a small sample from a specific part of the country. A larger sample taken from colleges in multiple regions around the country would be more representative of U.S. college students than this sample. Also, it would be a good idea to compare opinions of students from small, medium, and large schools because students likely have different preferences for courses.