We will continue working with the course evaluation data from the previous in-class assignment. If you need to reference that document, see https://rpubs.com/shawnsanto/ica-03-26-19.
We will analyze what goes into course evaluations and how certain variables affect the overall score.
To get started, load the packages tidyverse and broom. Install any missing packages with install.packages("package_name").
library(tidyverse)
library(broom)
The data were gathered from end of semester student evaluations for a large sample of professors from the University of Texas at Austin. In addition, six students rated the professors’ physical appearance. Each row in evals contains a different course and the columns represent variables about the courses and professors.
Use read_csv() to read in the data and save it as an object named evals. The data is available on Google Classroom.
evals <- read_csv("evals-mod.csv")
Variable | Description |
---|---|
score | Average professor evaluation score: (1) very unsatisfactory - (5) excellent |
rank | Rank of professor: teaching, tenure track, tenured |
ethnicity | Ethnicity of professor: not minority, minority |
gender | Gender of professor: female, male |
language | Language of school where professor received education: english or non-english |
age | Age of professor |
cls_perc_eval | Percent of students in class who completed evaluation |
cls_did_eval | Number of students in class who completed evaluation |
cls_students | Total number of students in class |
cls_level | Class level: lower, upper |
cls_profs | Number of professors teaching sections in course in sample: single, multiple |
cls_credits | Number of credits of class: one credit (lab, PE, etc.), multi credit |
bty_f1lower | Beauty rating of professor from lower level female: (1) lowest - (10) highest |
bty_f1upper | Beauty rating of professor from upper level female: (1) lowest - (10) highest |
bty_f2upper | Beauty rating of professor from second upper level female: (1) lowest - (10) highest |
bty_m1lower | Beauty rating of professor from lower level male: (1) lowest - (10) highest |
bty_m1upper | Beauty rating of professor from upper level male: (1) lowest - (10) highest |
bty_m2upper | Beauty rating of professor from second upper level male: (1) lowest - (10) highest |
Before you get started, add the avg_bty variable from last time.
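evals <- evals %>%
  rowwise() %>%
  # average the six ratings; bty_f1lower:bty_m2upper would build a numeric sequence instead
  mutate(avg_bty = mean(c(bty_f1lower, bty_f1upper, bty_f2upper,
                          bty_m1lower, bty_m1upper, bty_m2upper))) %>%
  ungroup()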
Fit a linear model with score as the response and language as a single predictor. Write out the model output.
What is the baseline level in Task 1? Interpret the meaning of coefficient \(b_1\).
Based on Task 1, what is the equation of the line for English-speaking professors? What about non-English-speaking professors?
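The sketch below illustrates how R encodes a two-level categorical predictor. It deliberately uses the built-in ToothGrowth data instead of evals so that it runs on its own; the object names fit_supp, b0, b1, and group_means are chosen only for this example.

```r
# Illustration on the built-in ToothGrowth data (not the evals data).
# supp is a factor with levels "OJ" and "VC"; R treats the first level as the baseline.
fit_supp <- lm(len ~ supp, data = ToothGrowth)

levels(ToothGrowth$supp)[1]   # baseline level
coef(fit_supp)                # intercept and the indicator coefficient suppVC

# The intercept is the mean response in the baseline group;
# intercept + indicator coefficient is the mean response in the other group.
b0 <- unname(coef(fit_supp)["(Intercept)"])
b1 <- unname(coef(fit_supp)["suppVC"])
group_means <- tapply(ToothGrowth$len, ToothGrowth$supp, mean)
group_means
```

With a single two-level predictor, each group's "line" is just a constant: b0 for the baseline group and b0 + b1 for the other group.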
Create a scatter plot of score versus rank with ggplot(). Use geom_jitter().
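As a rough sketch of why jitter helps when the x variable is categorical, the example below plots the built-in PlantGrowth data (an assumption made so the code runs without the evals file). The width value and axis labels are arbitrary choices for illustration.

```r
library(ggplot2)

# Illustration on the built-in PlantGrowth data (not the evals data).
# group is categorical, so without jitter all points at a level share one x position.
p <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_jitter(width = 0.2, height = 0) +   # spread points horizontally only
  labs(x = "Treatment group", y = "Dried plant weight")
p
```

Setting height = 0 keeps the response values exact, so only the horizontal position is perturbed.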
Fit a linear model with score as the response and rank as a single predictor. What is the baseline? Write out the model output.
Add a new variable to evals called rank_new where the baseline level is set to “tenured”. Hint: relevel()
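To show how relevel() behaves, here is a minimal sketch on the built-in PlantGrowth data rather than evals; the names plants and group_new are hypothetical and exist only for this illustration.

```r
# Illustration on the built-in PlantGrowth data (not the evals data).
plants <- PlantGrowth
levels(plants$group)       # "ctrl" "trt1" "trt2"; the baseline is "ctrl"

# relevel() works on factors; ref names the level to move to the front.
plants$group_new <- relevel(plants$group, ref = "trt2")
levels(plants$group_new)   # "trt2" "ctrl" "trt1"; the baseline is now "trt2"
```

Note that read_csv() reads text columns as character vectors, and relevel() only accepts factors, so a column read that way has to be converted with factor() first.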
Fit a linear model with score as the response and rank_new as a single predictor. Is the baseline now different from the baseline in Task 5?
Fit a linear model with score as the response and gender, rank, and avg_bty as predictors. Write out the model. Give an interpretation for the coefficient of avg_bty.
What are the \(R^2\) and adjusted \(R^2\) values from your model in Task 8?
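One way to pull these two quantities out of a fitted model is sketched below on the built-in mtcars data (used here only so the example is self-contained; the model and the names fit_cars, r2, and adj_r2 are illustrative).

```r
# Illustration on the built-in mtcars data (not the evals data).
fit_cars <- lm(mpg ~ wt + hp + factor(cyl), data = mtcars)

fit_summary <- summary(fit_cars)
r2     <- fit_summary$r.squared       # proportion of variance explained
adj_r2 <- fit_summary$adj.r.squared   # penalized for the number of predictors

c(r_squared = r2, adj_r_squared = adj_r2)
```

broom::glance() reports the same values in its r.squared and adj.r.squared columns, which may be more convenient in a tidyverse workflow.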
Fit a linear model with score as the response and only gender and avg_bty as predictors. How did the \(R^2\) and adjusted \(R^2\) values change compared to Task 9?
Fit a full model with score as the response and the following predictors: rank, ethnicity, gender, language, age, cls_perc_eval, cls_students, cls_level, cls_profs, cls_credits, and avg_bty.
Why did we not consider cls_did_eval and the individual beauty scores?
Use the fitted full model from Task 11 and backward selection to determine the “best” model. What are the \(R^2\) and adjusted \(R^2\) values from this “best” model?
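Backward selection can be carried out in several ways. The sketch below uses base R's step(), which removes terms according to AIC; this is an assumption about the selection criterion, and a course that selects on adjusted \(R^2\) or p-values would drop terms by hand instead. The mtcars model and the names full_cars and best_cars are purely illustrative.

```r
# Illustration on the built-in mtcars data (not the evals data).
full_cars <- lm(mpg ~ wt + hp + qsec + drat + disp, data = mtcars)

# step() removes one term at a time for as long as doing so lowers the AIC.
best_cars <- step(full_cars, direction = "backward", trace = 0)

formula(best_cars)                  # terms that survived the selection
summary(best_cars)$r.squared
summary(best_cars)$adj.r.squared
```

Setting trace = 0 suppresses the step-by-step log; leaving it at the default shows which term was dropped at each stage.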
Create a 95% prediction interval based on new predictor values of your choosing. Use your “best” model from Task 13.
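A minimal sketch of a prediction interval with predict() follows. It assumes a small model on the built-in mtcars data, and the new predictor values in new_car are arbitrary choices for the example.

```r
# Illustration on the built-in mtcars data (not the evals data).
fit_cars <- lm(mpg ~ wt + hp, data = mtcars)

# New predictor values chosen arbitrarily for the example;
# the column names must match the predictors used in the model.
new_car <- data.frame(wt = 3, hp = 120)

pred <- predict(fit_cars, newdata = new_car,
                interval = "prediction", level = 0.95)
pred   # columns: fit (point prediction), lwr, upr
```

Using interval = "confidence" instead would give an interval for the mean response at those predictor values, which is narrower than the interval for a single new observation.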
Create 95% confidence intervals for the coefficients of your “best” model from Task 13.
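Coefficient intervals can be obtained with base R's confint(), sketched here on an illustrative mtcars model (not the evals data; fit_cars and ci are names made up for the example).

```r
# Illustration on the built-in mtcars data (not the evals data).
fit_cars <- lm(mpg ~ wt + hp, data = mtcars)

# One row per coefficient; the columns are the lower and upper 95% limits.
ci <- confint(fit_cars, level = 0.95)
ci
```

In a broom workflow, tidy(fit_cars, conf.int = TRUE) returns the same limits as conf.low and conf.high columns next to the estimates.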
Can we use this model to make valid predictions about professors from any university?