We will continue working with the course evaluation data from the previous in-class assignment. If you need to refer to that document, see https://rpubs.com/shawnsanto/ica-03-26-19.
We will analyze what goes into course evaluations and how certain variables affect the overall score.
To get started, load the tidyverse and broom packages. Install any missing package with install.packages("package_name").
library(tidyverse)
library(broom)
The data were gathered from end of semester student evaluations for a large sample of professors from the University of Texas at Austin. In addition, six students rated the professors’ physical appearance. Each row in evals contains a different course and the columns represent variables about the courses and professors.
Use read_csv() to read in the data and save it as an object named evals. The data are available on Google Classroom.
evals <- read_csv("evals-mod.csv")
| Variable | Description |
|---|---|
| score | Average professor evaluation score: (1) very unsatisfactory - (5) excellent |
| rank | Rank of professor: teaching, tenure track, tenured |
| ethnicity | Ethnicity of professor: not minority, minority |
| gender | Gender of professor: female, male |
| language | Language of school where professor received education: english or non-english |
| age | Age of professor |
| cls_perc_eval | Percent of students in class who completed evaluation |
| cls_did_eval | Number of students in class who completed evaluation |
| cls_students | Total number of students in class |
| cls_level | Class level: lower, upper |
| cls_profs | Number of professors teaching sections in course in sample: single, multiple |
| cls_credits | Number of credits of class: one credit (lab, PE, etc.), multi credit |
| bty_f1lower | Beauty rating of professor from lower level female: (1) lowest - (10) highest |
| bty_f1upper | Beauty rating of professor from upper level female: (1) lowest - (10) highest |
| bty_f2upper | Beauty rating of professor from second upper level female: (1) lowest - (10) highest |
| bty_m1lower | Beauty rating of professor from lower level male: (1) lowest - (10) highest |
| bty_m1upper | Beauty rating of professor from upper level male: (1) lowest - (10) highest |
| bty_m2upper | Beauty rating of professor from second upper level male: (1) lowest - (10) highest |
Before you get started, add the avg_bty variable from last time.
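# Average the six beauty ratings within each row.
# c_across() collects the selected columns for the current row; a bare
# bty_f1lower:bty_m2upper would instead build a number sequence between two values.
evals <- evals %>%
  rowwise() %>%
  mutate(avg_bty = mean(c_across(bty_f1lower:bty_m2upper))) %>%
  ungroup()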
Fit a linear model with score as the response and language as a single predictor. Write out the model output.
What is the baseline level in Task 1? Interpret the meaning of coefficient \(b_1\).
Based on Task 1, what is the equation of the line for professors educated at English-language schools? What about professors educated at non-English-language schools?
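As a rough illustration of how a two-level categorical predictor enters a linear model, here is a minimal sketch on made-up numbers. The data frame `toy` and the object `m_lang` are hypothetical and are not the evals data; only base R is used so the block runs on its own.

```r
# Made-up toy data (NOT the real evals data): six courses, two language groups.
toy <- data.frame(
  score    = c(4.2, 4.0, 4.4, 3.8, 3.9, 3.7),
  language = c("english", "english", "english",
               "non-english", "non-english", "non-english")
)

# lm() turns a character predictor into a factor. The first level
# ("english" here) is the baseline and is absorbed into the intercept.
m_lang <- lm(score ~ language, data = toy)
b <- coef(m_lang)
b

# With one two-level predictor, the two "lines" are just the group means:
# intercept = baseline mean, intercept + slope = mean of the other group.
c(english = unname(b[1]), non_english = unname(b[1] + b[2]))
tapply(toy$score, toy$language, mean)
```

With broom loaded, `tidy(m_lang)` should show the same estimates in a data frame, which is convenient when writing out the model.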
Create a scatter plot of score versus rank with ggplot(). Use geom_jitter().
Fit a linear model with score as the response and rank as a single predictor. What is the baseline? Write out the model output.
Add a new variable to evals called rank_new where the baseline level is set to “tenured”. Hint: relevel()
Fit a linear model with score as the response and rank_new as a single predictor. Is the baseline now different from the baseline in Task 5?
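The effect of changing the baseline can be sketched on a small made-up data set. The data frame `toy` and its scores are hypothetical; the point is only that relevel() needs a factor and that the two fits describe the same group means in different ways.

```r
# Made-up toy data (NOT the real evals data): two courses per rank.
toy <- data.frame(
  score = c(4.5, 4.3, 4.1, 4.0, 3.9, 4.2),
  rank  = c("teaching", "teaching", "tenure track",
            "tenure track", "tenured", "tenured")
)

# relevel() works on factors, so convert the character column first.
toy$rank_new <- relevel(factor(toy$rank), ref = "tenured")

levels(factor(toy$rank))  # default ordering puts "teaching" first
levels(toy$rank_new)      # "tenured" is now the first (baseline) level

m_rank     <- lm(score ~ rank, data = toy)
m_rank_new <- lm(score ~ rank_new, data = toy)

# Each intercept is the mean score of that model's baseline group;
# the fitted values of the two models are identical.
coef(m_rank)
coef(m_rank_new)
```

Inside a dplyr pipeline the same step could be written as `mutate(rank_new = relevel(factor(rank), ref = "tenured"))`.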
Fit a linear model with score as the response and gender, rank, and avg_bty as predictors. Write out the model. Give an interpretation for the coefficient of avg_bty.
What are the \(R^2\) and adjusted \(R^2\) values from your model in Task 8?
Fit a linear model with score as the response and only gender and avg_bty as predictors. How did the \(R^2\) and adjusted \(R^2\) values change compared to Task 9?
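To see how \(R^2\) and adjusted \(R^2\) behave when a predictor is dropped, the following sketch compares two nested models on simulated data. The data frame `sim`, the helper `r2()`, and the coefficients used to simulate `score` are all assumptions made for illustration, not values from the evals data.

```r
set.seed(1)  # simulated data for illustration only, NOT the real evals data
n <- 40
sim <- data.frame(
  gender  = rep(c("female", "male"), length.out = n),
  rank    = rep(c("teaching", "tenure track", "tenured"), length.out = n),
  avg_bty = runif(n, min = 1, max = 10)
)
# Hypothetical data-generating process: rank has no real effect here.
sim$score <- 3.8 + 0.05 * sim$avg_bty + 0.2 * (sim$gender == "male") +
  rnorm(n, sd = 0.4)

m_big   <- lm(score ~ gender + rank + avg_bty, data = sim)
m_small <- lm(score ~ gender + avg_bty, data = sim)

# Small helper (hypothetical name) that pulls both statistics from summary().
r2 <- function(m) {
  s <- summary(m)
  c(r.squared = s$r.squared, adj.r.squared = s$adj.r.squared)
}

# R^2 can only go down (or stay equal) when predictors are removed;
# adjusted R^2 penalizes model size, so it may move in either direction.
rbind(big = r2(m_big), small = r2(m_small))
```

With broom loaded, `glance(m_big)` reports the same two quantities in its `r.squared` and `adj.r.squared` columns.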
Fit a full model with score as the response and predictors: rank, ethnicity, gender, language, age, cls_perc_eval, cls_students, cls_level, cls_profs, cls_credits, avg_bty.
Why did we not consider cls_did_eval and the individual beauty scores?
Use the fitted full model in Task 11 and backward selection to determine the “best” model. What are the \(R^2\) and adjusted \(R^2\) values from this “best” model?
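One possible way to carry out backward selection is base R's step(), sketched below on simulated data. This is an assumption about the intended procedure: step() drops terms according to AIC, so if the selection is meant to use adjusted \(R^2\) or p-values, the steps would be done by hand instead. The data frame `sim` and the pure-noise predictor `noise` are hypothetical.

```r
set.seed(2)  # simulated data for illustration only, NOT the real evals data
n <- 60
sim <- data.frame(
  avg_bty = runif(n, min = 1, max = 10),
  age     = round(runif(n, min = 30, max = 70)),
  gender  = rep(c("female", "male"), length.out = n),
  noise   = rnorm(n)  # hypothetical predictor unrelated to score
)
sim$score <- 3.5 + 0.08 * sim$avg_bty + 0.3 * (sim$gender == "male") +
  rnorm(n, sd = 0.3)

m_full <- lm(score ~ avg_bty + age + gender + noise, data = sim)

# Backward elimination by AIC: starting from the full model, repeatedly
# drop the term whose removal lowers AIC the most, until no drop helps.
m_best <- step(m_full, direction = "backward", trace = 0)

formula(m_best)
summary(m_best)$r.squared
summary(m_best)$adj.r.squared
```

Setting `trace = 1` prints each elimination step, which helps when explaining why a given predictor was removed.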
Create a 95% prediction interval based on new predictor values of your choosing. Use your “best” model from Task 13.
Create 95% confidence intervals for the coefficients of your “best” model from Task 13.
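The two kinds of intervals asked for above can be sketched with predict() and confint(). The model `m_best` below is a stand-in fitted to simulated data, and `new_prof` is a hypothetical professor; neither comes from the evals data.

```r
set.seed(3)  # simulated data for illustration only, NOT the real evals data
n <- 50
sim <- data.frame(
  avg_bty = runif(n, min = 1, max = 10),
  gender  = rep(c("female", "male"), length.out = n)
)
sim$score <- 3.6 + 0.07 * sim$avg_bty + 0.2 * (sim$gender == "male") +
  rnorm(n, sd = 0.3)

# Stand-in for a "best" model.
m_best <- lm(score ~ avg_bty + gender, data = sim)

# New predictor values: column names must match the model's predictors.
new_prof <- data.frame(avg_bty = 5, gender = "female")

# Prediction interval: range for one new professor's score.
pred_int <- predict(m_best, newdata = new_prof,
                    interval = "prediction", level = 0.95)
# Confidence interval for the mean score at these predictor values;
# it is narrower because it ignores individual-level noise.
conf_int <- predict(m_best, newdata = new_prof,
                    interval = "confidence", level = 0.95)
pred_int
conf_int

# 95% confidence intervals for the model coefficients.
coef_int <- confint(m_best, level = 0.95)
coef_int
```

With broom loaded, `tidy(m_best, conf.int = TRUE)` gives the coefficient intervals alongside the estimates in one data frame.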
Can we use this model to make valid predictions about professors from any university?