Data Analysis & Decision Making - Oral Presentation Report

Project Group 9: Akshat Shah, Madhuri Rupaakula, Namita Kadam, Shraddha Somani

2016-12-06

Objective

Impact of beauty on instructors’ teaching ratings.

Dataset: TeachingRatings (AER package)

Packages

require(AER)
## Loading required package: AER
## Warning: package 'AER' was built under R version 3.3.2
## Loading required package: car
## Warning: package 'car' was built under R version 3.3.2
## Loading required package: lmtest
## Warning: package 'lmtest' was built under R version 3.3.2
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.3.2
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: sandwich
## Warning: package 'sandwich' was built under R version 3.3.2
## Loading required package: survival
require(ggplot2)
## Loading required package: ggplot2
require(gridExtra)
## Loading required package: gridExtra
require(GGally)
## Loading required package: GGally
require(e1071)
## Loading required package: e1071
require(ellipse)
## Loading required package: ellipse
## 
## Attaching package: 'ellipse'
## The following object is masked from 'package:car':
## 
##     ellipse
require(car)
require(scatterplot3d)
## Loading required package: scatterplot3d
## Warning: package 'scatterplot3d' was built under R version 3.3.2
library(lmtest)
require(faraway)
## Loading required package: faraway
## 
## Attaching package: 'faraway'
## The following object is masked from 'package:GGally':
## 
##     happy
## The following object is masked from 'package:survival':
## 
##     rats
## The following objects are masked from 'package:car':
## 
##     logit, vif
data("TeachingRatings")

Introduction

The dataset contains course evaluations, course characteristics, and professor characteristics for 463 courses taught in the academic years 2000-2002 at the University of Texas at Austin. It is a data frame of 463 observations on 12 variables.

Structure of data

str(TeachingRatings)
## 'data.frame':    463 obs. of  12 variables:
##  $ minority   : Factor w/ 2 levels "no","yes": 2 1 1 1 1 1 1 1 1 1 ...
##  $ age        : int  36 59 51 40 31 62 33 51 33 47 ...
##  $ gender     : Factor w/ 2 levels "male","female": 2 1 1 2 2 1 2 2 2 1 ...
##  $ credits    : Factor w/ 2 levels "more","single": 1 1 1 1 1 1 1 1 1 1 ...
##  $ beauty     : num  0.29 -0.738 -0.572 -0.678 1.51 ...
##  $ eval       : num  4.3 4.5 3.7 4.3 4.4 4.2 4 3.4 4.5 3.9 ...
##  $ division   : Factor w/ 2 levels "upper","lower": 1 1 1 1 1 1 1 1 1 1 ...
##  $ native     : Factor w/ 2 levels "yes","no": 1 1 1 1 1 1 1 1 1 1 ...
##  $ tenure     : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 1 ...
##  $ students   : num  24 17 55 40 42 182 33 25 48 16 ...
##  $ allstudents: num  43 20 55 46 48 282 41 41 60 19 ...
##  $ prof       : Factor w/ 94 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
head(TeachingRatings)
##   minority age gender credits     beauty eval division native tenure
## 1      yes  36 female    more  0.2899157  4.3    upper    yes    yes
## 2       no  59   male    more -0.7377322  4.5    upper    yes    yes
## 3       no  51   male    more -0.5719836  3.7    upper    yes    yes
## 4       no  40 female    more -0.6779634  4.3    upper    yes    yes
## 5       no  31 female    more  1.5097940  4.4    upper    yes    yes
## 6       no  62   male    more  0.5885687  4.2    upper    yes    yes
##   students allstudents prof
## 1       24          43    1
## 2       17          20    2
## 3       55          55    3
## 4       40          46    4
## 5       42          48    5
## 6      182         282    6

Summary of data

summary(TeachingRatings)
##  minority       age           gender      credits        beauty          
##  no :399   Min.   :29.00   male  :268   more  :436   Min.   :-1.4504940  
##  yes: 64   1st Qu.:42.00   female:195   single: 27   1st Qu.:-0.6562689  
##            Median :48.00                             Median :-0.0680143  
##            Mean   :48.37                             Mean   : 0.0000001  
##            3rd Qu.:57.00                             3rd Qu.: 0.5456024  
##            Max.   :73.00                             Max.   : 1.9700230  
##                                                                          
##       eval        division   native    tenure       students     
##  Min.   :2.100   upper:306   yes:435   no :102   Min.   :  5.00  
##  1st Qu.:3.600   lower:157   no : 28   yes:361   1st Qu.: 15.00  
##  Median :4.000                                   Median : 23.00  
##  Mean   :3.998                                   Mean   : 36.62  
##  3rd Qu.:4.400                                   3rd Qu.: 40.00  
##  Max.   :5.000                                   Max.   :380.00  
##                                                                  
##   allstudents          prof    
##  Min.   :  8.00   34     : 13  
##  1st Qu.: 19.00   50     : 13  
##  Median : 29.00   82     : 11  
##  Mean   : 55.18   10     : 10  
##  3rd Qu.: 60.00   20     : 10  
##  Max.   :581.00   58     : 10  
##                   (Other):396

Analysis:

Graphical Representation of the Distribution of the ‘eval’ Variable

# I() sets a fixed fill colour; a bare "red" would be mapped as a data value
plot1 = qplot(eval, data = TeachingRatings, fill = I("red"), xlab = "Evaluation")
plot2 = qplot(eval, data = TeachingRatings, geom = "density", fill = I("red"))
plot3 = qplot(sample = eval, data = TeachingRatings)
grid.arrange(plot1, plot2, plot3, ncol = 3)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Analysis:

Overlay Plot

ggplot(TeachingRatings, aes(x = eval, y = ..density..)) + geom_histogram(fill = "cornsilk", colour = "grey60", size = .2) + geom_density()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Boxplot

boxplot(TeachingRatings$eval, ylab = "eval", main = "Box Plot")

Analysis:

Identify Skewness of Numeric Variables

Skewness is computed only for the numeric variables; it is not applicable to the factor variables.

skewness(TeachingRatings$eval)
## [1] -0.4643668
skewness(TeachingRatings$age)
## [1] 0.04835668
skewness(TeachingRatings$beauty)
## [1] 0.5124669
skewness(TeachingRatings$students)
## [1] 4.465658
skewness(TeachingRatings$allstudents)
## [1] 4.128874
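
As a rough illustration of what the skewness values above measure, the sketch below computes the moment-based skewness g1 = m3 / m2^(3/2) in base R. The helper name `skew_g1` and the toy vectors are hypothetical and not part of the report. `e1071::skewness` has a `type` argument selecting among estimators that differ from this one by sample-size factors, so its values may differ slightly.

```r
# Moment-based sample skewness: g1 = m3 / m2^(3/2),
# where m2 and m3 are the second and third central moments.
skew_g1 <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  m2 <- mean(d^2)
  m3 <- mean(d^3)
  m3 / m2^(3 / 2)
}

# Hypothetical values: a few small classes and one very large one
right_skewed <- c(5, 8, 10, 12, 15, 20, 40, 180)
# A perfectly symmetric vector for comparison
symmetric <- c(1, 2, 3, 4, 5)

skew_right <- skew_g1(right_skewed)  # positive: long right tail
skew_sym   <- skew_g1(symmetric)     # zero: no asymmetry
```

A positive value indicates a long right tail, as seen for students and allstudents, while a value near zero (as for age) indicates an approximately symmetric distribution.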

Analysis:

Graphical Representation of Distribution of Two or More Variables

Create subset of numeric variables

TeachingRatings_subset = subset(TeachingRatings, select = c(age, beauty, eval, students, allstudents))

Boxplot and stripchart

oldpar = par(mfrow = c(1,2))
boxplot(TeachingRatings_subset, main = "Boxplot of variables",col = (c("gold","darkgreen","red","blue","pink")))
stripchart(TeachingRatings_subset, vertical = TRUE, method = "jitter", col = "orange", pch = 1, main="Stripcharts of variables")
par(oldpar)  # restore the previous graphics layout

Analysis:

Boxplot between gender and eval

ggplot(aes(gender, eval), data = TeachingRatings) + geom_boxplot(aes(fill = gender))

Analysis:

Boxplot between minority and eval

ggplot(aes(minority, eval), data = TeachingRatings) + geom_boxplot(aes(fill = minority))

Analysis:

Boxplot of Minority with Nativity

df1 <- data.frame(Minority = TeachingRatings$minority, Nativity = TeachingRatings$native, o = TeachingRatings$eval)
df1$MinorityNativity <- interaction(df1$Minority, df1$Nativity)
ggplot(aes(y = o, x = MinorityNativity), data = df1) + 
  geom_boxplot(aes(fill = Minority)) + ggtitle("Interactive Model of Minority and Nativity") +
  labs(y = "Evaluation", x = "Minority and Nativity") 

Analysis:

Boxplot of Gender with Credits

df2 <- data.frame(Gender = TeachingRatings$gender, Credits = TeachingRatings$credits, o = TeachingRatings$eval)
df2$GenderCredits <- interaction(df2$Gender, df2$Credits)
ggplot(aes(y = o, x = GenderCredits), data = df2) + 
  geom_boxplot(aes(fill = Gender)) + ggtitle("Interactive Model of Gender and credits") +
  labs(y = "Evaluation", x = "Gender and credits")

Analysis:

Boxplot of Tenure with Credits

df3 <- data.frame(Tenure = TeachingRatings$tenure, Credits = TeachingRatings$credits, o = TeachingRatings$eval)
df3$TenureCredits <- interaction(df3$Tenure, df3$Credits)
ggplot(aes(y = o, x = TenureCredits), data = df3) + 
  geom_boxplot(aes(fill = Credits)) + ggtitle("Interactive Model of Tenure and Credits") +
  labs(y = "Evaluation", x = "Tenure and Credits")

Analysis:

Regression and Correlation

summary(TeachingRatings_subset)
##       age            beauty                eval          students     
##  Min.   :29.00   Min.   :-1.4504940   Min.   :2.100   Min.   :  5.00  
##  1st Qu.:42.00   1st Qu.:-0.6562689   1st Qu.:3.600   1st Qu.: 15.00  
##  Median :48.00   Median :-0.0680143   Median :4.000   Median : 23.00  
##  Mean   :48.37   Mean   : 0.0000001   Mean   :3.998   Mean   : 36.62  
##  3rd Qu.:57.00   3rd Qu.: 0.5456024   3rd Qu.:4.400   3rd Qu.: 40.00  
##  Max.   :73.00   Max.   : 1.9700230   Max.   :5.000   Max.   :380.00  
##   allstudents    
##  Min.   :  8.00  
##  1st Qu.: 19.00  
##  Median : 29.00  
##  Mean   : 55.18  
##  3rd Qu.: 60.00  
##  Max.   :581.00

Finding Z-Scores

TeachingRatings_r = data.frame(scale(TeachingRatings_subset))
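
For reference, here is a minimal sketch of what `scale()` does with its default arguments: each column is centred on its mean and divided by its sample standard deviation. The toy data frame reuses the age and students values from the six rows shown by `head()` above; the object names are illustrative only.

```r
# First six rows of age and students, as printed by head() earlier
toy <- data.frame(age      = c(36, 59, 51, 40, 31, 62),
                  students = c(24, 17, 55, 40, 42, 182))

# scale() centres each column on its mean and divides by its sample sd
z_scale <- as.data.frame(scale(toy))

# The same computation written out by hand
z_manual <- as.data.frame(lapply(toy, function(x) (x - mean(x)) / sd(x)))
```

After rescaling, every column has mean 0 and standard deviation 1, which is why the variables become comparable on a single axis.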

Summary table of Z-Scores

summary(TeachingRatings_r)
##       age               beauty              eval          
##  Min.   :-1.97547   Min.   :-1.83922   Min.   :-3.421139  
##  1st Qu.:-0.64931   1st Qu.:-0.83214   1st Qu.:-0.717781  
##  Median :-0.03724   Median :-0.08624   Median : 0.003114  
##  Mean   : 0.00000   Mean   : 0.00000   Mean   : 0.000000  
##  3rd Qu.: 0.88087   3rd Qu.: 0.69182   3rd Qu.: 0.724009  
##  Max.   : 2.51307   Max.   : 2.49798   Max.   : 1.805352  
##     students         allstudents      
##  Min.   :-0.70247   Min.   :-0.62842  
##  1st Qu.:-0.48034   1st Qu.:-0.48189  
##  Median :-0.30263   Median :-0.34869  
##  Mean   : 0.00000   Mean   : 0.00000  
##  3rd Qu.: 0.07499   3rd Qu.: 0.06424  
##  Max.   : 7.62744   Max.   : 7.00417

Analysis:

Boxplot and stripchart on the basis of Z-Score

oldpar = par(mfrow = c(1,2))
boxplot(TeachingRatings_r, main = "Boxplot of re-scaled variables",col = (c("gold","darkgreen","red","blue","pink")))
stripchart(TeachingRatings_r, vertical = TRUE, method = "jitter", col = (c("gold","darkgreen","red","blue","pink")), pch = 1, main = "Stripcharts of re-scaled variables")
par(oldpar)  # restore the previous graphics layout

Analysis:

Correlation matrix

cor(TeachingRatings_subset)
##                     age      beauty         eval    students  allstudents
## age          1.00000000 -0.29789253 -0.051696191 -0.03046108 -0.012626464
## beauty      -0.29789253  1.00000000  0.189039091  0.13064984  0.099601914
## eval        -0.05169619  0.18903909  1.000000000  0.03546667 -0.001229338
## students    -0.03046108  0.13064984  0.035466674  1.00000000  0.972056127
## allstudents -0.01262646  0.09960191 -0.001229338  0.97205613  1.000000000

Analysis:

Identify Predictors and Outliers

Generalized Pairs Plot

ggpairs(TeachingRatings, columns = c(1, 3, 7:9, 2, 5, 10, 6), mapping = aes(colour = gender))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Gender: Blue = Female, Red = Male

Analysis:

Linear Model

Fit a model for eval against all the other variables

fit = lm(eval ~ ., data = TeachingRatings) 
summary(fit)
## 
## Call:
## lm(formula = eval ~ ., data = TeachingRatings)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.46212 -0.19511  0.00898  0.18983  1.00008 
## 
## Coefficients: (6 not defined because of singularities)
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   13.1386717 36.0968676   0.364 0.716081    
## minorityyes   -1.6043290  4.5205210  -0.355 0.722870    
## age           -0.1020625  0.4188715  -0.244 0.807630    
## genderfemale   0.6301714  0.2747557   2.294 0.022383 *  
## creditssingle  0.3926076  0.1324904   2.963 0.003243 ** 
## beauty        -1.3043190  6.6622697  -0.196 0.844894    
## divisionlower -0.0584322  0.0733228  -0.797 0.426017    
## nativeno       0.0466904  3.1888844   0.015 0.988326    
## tenureyes     -3.7843180 14.4927882  -0.261 0.794149    
## students      -0.0003343  0.0024521  -0.136 0.891641    
## allstudents   -0.0029406  0.0014310  -2.055 0.040598 *  
## prof2         -0.6494872  1.8428949  -0.352 0.724721    
## prof3         -1.0498919  4.0645450  -0.258 0.796317    
## prof4         -2.6839369  9.3901062  -0.286 0.775173    
## prof5         -0.3245685  1.4626494  -0.222 0.824512    
## prof6          3.0135233  8.3048114   0.363 0.716916    
## prof7         -2.8180458  8.6458919  -0.326 0.744656    
## prof8         -0.9993021  2.0003539  -0.500 0.617684    
## prof9         -2.0263746  6.8098101  -0.298 0.766203    
## prof10        -2.9915172 12.8149360  -0.233 0.815551    
## prof11        -0.5674736  4.0907442  -0.139 0.889747    
## prof12        -5.2362883 21.0433760  -0.249 0.803630    
## prof13        -0.8893128  2.5757762  -0.345 0.730098    
## prof14        -1.0592936  5.9952185  -0.177 0.859850    
## prof15        -2.3315374  2.4798728  -0.940 0.347745    
## prof16        -0.3477871  3.1815474  -0.109 0.913014    
## prof17        -5.0550549 19.2072914  -0.263 0.792557    
## prof18        -3.9270391 16.3612888  -0.240 0.810449    
## prof19        -0.9322029  2.8259268  -0.330 0.741684    
## prof20        -4.3660730 12.0499005  -0.362 0.717313    
## prof21        -4.5256697 12.7460120  -0.355 0.722746    
## prof22        -3.1912762 13.8065872  -0.231 0.817334    
## prof23         0.3225347  0.6282508   0.513 0.607993    
## prof24         1.6081037  4.4724699   0.360 0.719387    
## prof25         0.1573405  4.4647874   0.035 0.971907    
## prof26         0.4967518  0.6232505   0.797 0.425949    
## prof27         1.8327860  2.9478535   0.622 0.534504    
## prof28         1.1144268  4.2096446   0.265 0.791366    
## prof29         0.9001250  2.8660077   0.314 0.753648    
## prof30        -2.9289498  1.7902831  -1.636 0.102696    
## prof31        -2.9770806  9.7805324  -0.304 0.761005    
## prof32        -1.9190175  8.0807807  -0.237 0.812418    
## prof33        -0.0535542  1.4464828  -0.037 0.970486    
## prof34         0.0492405  2.0160550   0.024 0.980528    
## prof35        -0.9337594  6.1894514  -0.151 0.880167    
## prof36         1.3231659  8.2582198   0.160 0.872793    
## prof37        -0.6304172  4.4340350  -0.142 0.887019    
## prof38         0.1984293  0.6124225   0.324 0.746118    
## prof39         0.7934674  0.9390916   0.845 0.398703    
## prof40        -0.8055888  0.4341491  -1.856 0.064322 .  
## prof41        -0.6729398  5.6028405  -0.120 0.904465    
## prof42        -0.7982813  5.1262982  -0.156 0.876338    
## prof43        -3.0353337  6.6798139  -0.454 0.649808    
## prof44         1.1035560  5.9280106   0.186 0.852423    
## prof45         1.7699295  5.5558923   0.319 0.750236    
## prof46        -0.3622876  1.2399309  -0.292 0.770312    
## prof47        -0.3107110  3.4392986  -0.090 0.928065    
## prof48        -1.0727251  0.3168721  -3.385 0.000788 ***
## prof49        -5.6826892 19.5574376  -0.291 0.771550    
## prof50        -0.6246916  4.2033557  -0.149 0.881938    
## prof51        -1.0313551  2.9927472  -0.345 0.730580    
## prof52         1.7411649  6.4869813   0.268 0.788536    
## prof53         0.0719133  1.4550675   0.049 0.960610    
## prof54        -4.4452413 17.0540241  -0.261 0.794504    
## prof55        -0.9945627  0.4318984  -2.303 0.021854 *  
## prof56        -2.0520824 10.3765111  -0.198 0.843341    
## prof57        -0.3872228  1.8890156  -0.205 0.837697    
## prof58        -2.0562549  7.3448334  -0.280 0.779667    
## prof59        -2.7999305 12.4156575  -0.226 0.821704    
## prof60        -1.2816455  1.0323594  -1.241 0.215229    
## prof61        -1.9744362 10.0178872  -0.197 0.843866    
## prof62        -0.1425869  1.0703896  -0.133 0.894100    
## prof63         0.4275225  5.3229625   0.080 0.936029    
## prof64         0.5248498  7.2513765   0.072 0.942340    
## prof65        -5.6579336 21.6004357  -0.262 0.793518    
## prof66        -0.4879983  1.9565841  -0.249 0.803181    
## prof67         1.5875309  8.7740774   0.181 0.856520    
## prof68        -2.4585656  5.9668559  -0.412 0.680554    
## prof69        -1.0555423  2.3890812  -0.442 0.658882    
## prof70        -1.1839153  6.8826579  -0.172 0.863522    
## prof71        -2.6727493 14.5557250  -0.184 0.854412    
## prof72         0.9727225  5.2238491   0.186 0.852385    
## prof73         3.4017003  5.6521512   0.602 0.547653    
## prof74        -2.0374718  9.4786592  -0.215 0.829924    
## prof75        -0.9431379  0.4579671  -2.059 0.040165 *  
## prof76        -0.4763093  0.3592227  -1.326 0.185687    
## prof77        -7.0507263 27.0450841  -0.261 0.794469    
## prof78        -0.8982174  3.7788599  -0.238 0.812251    
## prof79        -0.4938072  2.7256979  -0.181 0.856337    
## prof80         0.4676497  6.3317582   0.074 0.941164    
## prof81        -4.8609457 20.8182944  -0.233 0.815509    
## prof82        -0.3883115  2.8699617  -0.135 0.892448    
## prof83        -3.4104781  8.7702420  -0.389 0.697600    
## prof84         3.1205717 13.7035582   0.228 0.819992    
## prof85                NA         NA      NA       NA    
## prof86         1.6174389  7.7929949   0.208 0.835696    
## prof87        -0.3530191  2.9813985  -0.118 0.905810    
## prof88        -1.0902595  3.5360276  -0.308 0.758008    
## prof89         1.4308338  9.3300214   0.153 0.878201    
## prof90                NA         NA      NA       NA    
## prof91                NA         NA      NA       NA    
## prof92                NA         NA      NA       NA    
## prof93                NA         NA      NA       NA    
## prof94                NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3908 on 365 degrees of freedom
## Multiple R-squared:  0.6082, Adjusted R-squared:  0.504 
## F-statistic:  5.84 on 97 and 365 DF,  p-value: < 2.2e-16
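
One plausible reading of the "not defined because of singularities" note and the very large standard errors is that some predictors take a single value per professor, which makes them linearly dependent on the prof dummy variables. The sketch below reproduces that effect on a small simulated data set; all names and values in it are hypothetical.

```r
set.seed(42)
toy <- data.frame(
  prof = factor(rep(1:4, each = 5)),
  size = round(runif(20, 10, 100))    # varies from course to course
)
# An instructor-level covariate: exactly one value per professor
toy$age  <- c(36, 59, 51, 40)[as.integer(toy$prof)]
toy$eval <- 4 + 0.01 * toy$age - 0.002 * toy$size + rnorm(20, sd = 0.3)

fit_toy <- lm(eval ~ age + size + prof, data = toy)

# age is a linear combination of the intercept and the prof dummies,
# so lm() cannot estimate every coefficient and reports one as NA
coef(fit_toy)
alias(fit_toy)
```

In such a model the coefficients of the instructor-level variables are not separately identifiable from the professor effects, so their estimates and p-values should be read with caution.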

Analysis:

Fit Plot and Residual Plot

Fit plot for the variable ‘eval’

qplot(fitted.values(fit), eval, data = TeachingRatings) + geom_abline(intercept = 0, slope = 1, color = "green")

Analysis:

Residual plot for the variable ‘eval’

ggplot(fit, aes(.fitted, .resid)) + geom_point() + geom_hline(yintercept = 0, color = "red", linetype = "dashed") + ggtitle("Residual Plot")

Analysis:

Exploring model structure

cor(fit$resid, TeachingRatings$eval)
## [1] 0.6259743
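
For a least-squares fit with an intercept, the correlation between the residuals and the response equals sqrt(1 - R^2). With the R-squared of 0.6082 reported above, sqrt(1 - 0.6082) is about 0.626, which agrees with the value just printed. A minimal check on simulated data (hypothetical names and values):

```r
set.seed(5)
toy <- data.frame(x = rnorm(50))
toy$y <- 3 + 0.4 * toy$x + rnorm(50)

fit_toy <- lm(y ~ x, data = toy)

r_resid <- cor(resid(fit_toy), toy$y)      # residuals vs. response
r2      <- summary(fit_toy)$r.squared

# The two quantities below agree up to floating-point error
c(r_resid, sqrt(1 - r2))
```

A non-zero value here is therefore expected and does not by itself indicate a misspecified model; it reflects the share of variance the model leaves unexplained.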

Analysis:

plot1 = qplot(minority, fit$resid, geom = "boxplot", data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot2 = qplot(age, fit$resid, data = TeachingRatings) + 
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot3 = qplot(gender, fit$resid, geom = "boxplot", data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot4 = qplot(credits, fit$resid, geom = "boxplot", data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot5 = qplot(beauty, fit$resid, data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot6 = qplot(division, fit$resid, geom = "boxplot", data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot7 = qplot(native, fit$resid, geom = "boxplot", data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot8 = qplot(tenure, fit$resid, geom = "boxplot", data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot9 = qplot(students, fit$resid, data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot10 = qplot(allstudents, fit$resid, data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
plot11 = qplot(prof, fit$resid, data = TeachingRatings) +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed")
grid.arrange(plot1, plot3, plot4, plot6, plot7, plot8, nrow = 3)

grid.arrange(plot2, plot5, plot9, plot10, plot11, nrow = 3)

Analysis:

Normality of the Residual

mod = fortify(fit)
plot1 = qplot(.stdresid, data = mod, geom = "histogram")
plot2 = qplot(.stdresid, data = mod, geom = "density")
plot3 = qplot(sample = .stdresid, data = mod, geom = "qq") +geom_abline()
grid.arrange(plot1, plot2, plot3, nrow = 1)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 7 rows containing non-finite values (stat_bin).
## Warning: Removed 7 rows containing non-finite values (stat_density).
## Warning: Removed 7 rows containing non-finite values (stat_qq).

Analysis:

Comparing Models

fit.bg = lm(eval ~ ., data = TeachingRatings)
fit.sm = lm(eval ~ 1, data = TeachingRatings)
anova(fit.sm, fit.bg)
## Analysis of Variance Table
## 
## Model 1: eval ~ 1
## Model 2: eval ~ minority + age + gender + credits + beauty + division + 
##     native + tenure + students + allstudents + prof
##   Res.Df     RSS Df Sum of Sq      F    Pr(>F)    
## 1    462 142.239                                  
## 2    365  55.735 97    86.503 5.8401 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
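
The F statistic in the table can be reproduced from the residual sums of squares: F = ((RSS_small - RSS_big) / extra df) / (RSS_big / residual df of the big model), which here is (86.503 / 97) / (55.735 / 365), or about 5.84. The sketch below checks the same formula against `anova()` on simulated data; the variable names are illustrative only.

```r
set.seed(7)
n <- 60
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 1 + 0.5 * toy$x1 + rnorm(n)

small <- lm(y ~ 1, data = toy)        # intercept-only model
big   <- lm(y ~ x1 + x2, data = toy)  # nested larger model

rss_small <- sum(resid(small)^2)
rss_big   <- sum(resid(big)^2)
df_extra  <- df.residual(small) - df.residual(big)

# F test for nested models, computed by hand
f_manual <- ((rss_small - rss_big) / df_extra) / (rss_big / df.residual(big))
p_manual <- pf(f_manual, df_extra, df.residual(big), lower.tail = FALSE)

# The same quantities as reported by anova()
tab     <- anova(small, big)
f_anova <- tab$F[2]
p_anova <- tab$`Pr(>F)`[2]
```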

Analysis:

Hypothesis Testing

Fit the model for ‘eval’ against the other variables (minority is omitted from this formula)

fit_1 = lm(eval ~ age + gender + credits + beauty + division + native + tenure + students + allstudents + prof, data = TeachingRatings)
summary(fit_1)$r.squared
## [1] 0.6081562

Analysis:

Remove the factor variables credits and tenure

fit_2 = lm(eval ~ age + gender + beauty + division + native + students + allstudents + prof, data = TeachingRatings) 
summary(fit_1)$r.squared
## [1] 0.6081562
summary(fit_2)$r.squared
## [1] 0.5987293
anova(fit_1,fit_2)
## Analysis of Variance Table
## 
## Model 1: eval ~ age + gender + credits + beauty + division + native + 
##     tenure + students + allstudents + prof
## Model 2: eval ~ age + gender + beauty + division + native + students + 
##     allstudents + prof
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)   
## 1    365 55.735                                
## 2    366 57.076 -1   -1.3409 8.7811 0.003243 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Analysis:

Remove the factor variables division, native and gender

fit_3 = lm(eval ~ age + beauty + students + allstudents + prof, data = TeachingRatings) 
summary(fit_1)$r.squared
## [1] 0.6081562
summary(fit_3)$r.squared
## [1] 0.5985692
anova(fit_1,fit_3)
## Analysis of Variance Table
## 
## Model 1: eval ~ age + gender + credits + beauty + division + native + 
##     tenure + students + allstudents + prof
## Model 2: eval ~ age + beauty + students + allstudents + prof
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
## 1    365 55.735                              
## 2    367 57.099 -2   -1.3636 4.4651 0.01214 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Analysis:

Remove age

fit_4 = lm(eval ~ beauty + students + allstudents + prof, data = TeachingRatings) 
summary(fit_1)$r.squared
## [1] 0.6081562
summary(fit_4)$r.squared
## [1] 0.5985692
anova(fit_1,fit_4)
## Analysis of Variance Table
## 
## Model 1: eval ~ age + gender + credits + beauty + division + native + 
##     tenure + students + allstudents + prof
## Model 2: eval ~ beauty + students + allstudents + prof
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
## 1    365 55.735                              
## 2    367 57.099 -2   -1.3636 4.4651 0.01214 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Analysis:

Remove beauty

fit_5 = lm(eval ~ students + allstudents + prof, data = TeachingRatings) 
summary(fit_5)
## 
## Call:
## lm(formula = eval ~ students + allstudents + prof, data = TeachingRatings)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.51125 -0.19979  0.00942  0.19494  1.01597 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.3470122  0.2077959  20.920  < 2e-16 ***
## students    -0.0003496  0.0024708  -0.141 0.887564    
## allstudents -0.0031156  0.0014160  -2.200 0.028405 *  
## prof2       -0.6950659  0.3038517  -2.288 0.022734 *  
## prof3       -0.4785411  0.3421520  -1.399 0.162771    
## prof4       -0.2394047  0.2454749  -0.975 0.330068    
## prof5        0.1900744  0.2562636   0.742 0.458735    
## prof6        0.9739548  0.2642924   3.685 0.000263 ***
## prof7       -0.3770876  0.2672464  -1.411 0.159088    
## prof8       -0.2339032  0.2515044  -0.930 0.352974    
## prof9        0.0558806  0.2497346   0.224 0.823069    
## prof10       0.2778802  0.2387573   1.164 0.245237    
## prof11      -0.9869112  0.3044501  -3.242 0.001297 ** 
## prof12      -0.1285482  0.2681507  -0.479 0.631948    
## prof13      -0.4433878  0.2500451  -1.773 0.077020 .  
## prof14      -0.6628805  0.2820196  -2.350 0.019278 *  
## prof15      -1.1817290  0.2818103  -4.193 3.45e-05 ***
## prof16       0.1379471  0.2559060   0.539 0.590177    
## prof17      -0.0592205  0.2668526  -0.222 0.824498    
## prof18       0.0173654  0.2417590   0.072 0.942777    
## prof19       0.0844698  0.2414515   0.350 0.726658    
## prof20      -0.7962473  0.2395309  -3.324 0.000976 ***
## prof21      -0.6528306  0.2584028  -2.526 0.011943 *  
## prof22      -0.9593634  0.4434698  -2.163 0.031162 *  
## prof23      -0.0551294  0.2662047  -0.207 0.836051    
## prof24       0.2344740  0.2512613   0.933 0.351337    
## prof25       0.0150680  0.3440225   0.044 0.965088    
## prof26       0.0001329  0.2651912   0.001 0.999601    
## prof27      -0.2973520  0.2524220  -1.178 0.239562    
## prof28      -0.3895891  0.2838741  -1.372 0.170777    
## prof29      -0.2670415  0.2860347  -0.934 0.351125    
## prof30      -2.0061287  0.4444934  -4.513 8.61e-06 ***
## prof31      -0.5977400  0.2523059  -2.369 0.018348 *  
## prof32      -0.5009359  0.3063302  -1.635 0.102847    
## prof33       0.2204366  0.2684540   0.821 0.412103    
## prof34      -0.5905636  0.2305408  -2.562 0.010817 *  
## prof35      -0.0267562  0.3057718  -0.088 0.930319    
## prof36      -0.4844748  0.2812928  -1.722 0.085855 .  
## prof37      -0.6487074  0.2469256  -2.627 0.008972 ** 
## prof38      -0.1704852  0.3062421  -0.557 0.578071    
## prof39       0.1910362  0.2457589   0.777 0.437464    
## prof40      -0.7888025  0.4439598  -1.777 0.076439 .  
## prof41       0.3605282  0.2698872   1.336 0.182427    
## prof42       0.2576044  0.2810964   0.916 0.360045    
## prof43      -0.1880755  0.2826296  -0.665 0.506182    
## prof44      -0.2520091  0.3029762  -0.832 0.406075    
## prof45       0.2128984  0.2820087   0.755 0.450772    
## prof46      -0.4793289  0.3051750  -1.571 0.117121    
## prof47      -0.7319911  0.4428641  -1.653 0.099215 .  
## prof48      -0.8107118  0.3054797  -2.654 0.008303 ** 
## prof49      -0.4160068  0.2514346  -1.655 0.098874 .  
## prof50      -0.0535531  0.2285335  -0.234 0.814857    
## prof51       0.2954874  0.2605779   1.134 0.257547    
## prof52       0.1201735  0.2810010   0.428 0.669148    
## prof53       0.4346320  0.2483983   1.750 0.080999 .  
## prof54      -0.4025378  0.2560864  -1.572 0.116839    
## prof55      -0.9788460  0.3046807  -3.213 0.001431 ** 
## prof56       0.1080650  0.2734638   0.395 0.692946    
## prof57      -0.2060806  0.3421015  -0.602 0.547283    
## prof58      -0.1015104  0.2334153  -0.435 0.663897    
## prof59      -0.8962169  0.3451706  -2.596 0.009798 ** 
## prof60      -1.3107118  0.3054797  -4.291 2.28e-05 ***
## prof61      -0.0633354  0.4420009  -0.143 0.886138    
## prof62       0.0510323  0.4432009   0.115 0.908393    
## prof63      -0.2338352  0.3420928  -0.684 0.494695    
## prof64      -0.5219975  0.3025884  -1.725 0.085350 .  
## prof65      -0.1087745  0.2483484  -0.438 0.661650    
## prof66      -0.6154672  0.2605524  -2.362 0.018690 *  
## prof67      -0.3523139  0.3458348  -1.019 0.309000    
## prof68      -1.7154487  0.3057442  -5.611 3.97e-08 ***
## prof69      -1.1218336  0.4415050  -2.541 0.011467 *  
## prof70       0.2375440  0.2421761   0.981 0.327301    
## prof71       0.5111179  0.2380797   2.147 0.032461 *  
## prof72      -0.3034285  0.2562312  -1.184 0.237101    
## prof73       2.0999060  0.3895491   5.391 1.26e-07 ***
## prof74      -0.2122714  0.2792482  -0.760 0.447651    
## prof75      -0.5952872  0.3436035  -1.732 0.084028 .  
## prof76      -0.6660992  0.3453437  -1.929 0.054526 .  
## prof77      -0.2189106  0.2570192  -0.852 0.394920    
## prof78      -0.2702403  0.2821959  -0.958 0.338878    
## prof79      -0.3172439  0.3025911  -1.048 0.295133    
## prof80      -0.3822178  0.2812016  -1.359 0.174908    
## prof81       0.2721896  0.2834314   0.960 0.337518    
## prof82      -0.2484547  0.2363601  -1.051 0.293872    
## prof83      -0.1108817  0.2696040  -0.411 0.681110    
## prof84      -0.0200726  0.2691894  -0.075 0.940600    
## prof85       0.5027269  0.2466783   2.038 0.042268 *  
## prof86      -0.0096631  0.3051476  -0.032 0.974755    
## prof87       0.3079574  0.3426071   0.899 0.369314    
## prof88      -0.9037480  0.2525310  -3.579 0.000392 ***
## prof89      -0.4585025  0.3040943  -1.508 0.132475    
## prof90      -0.5058982  0.3446608  -1.468 0.143011    
## prof91       0.2832747  0.3031993   0.934 0.350771    
## prof92      -0.3617047  0.2521869  -1.434 0.152346    
## prof93       0.1133988  0.2555754   0.444 0.657521    
## prof94      -0.6598206  0.2799070  -2.357 0.018934 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3944 on 367 degrees of freedom
## Multiple R-squared:  0.5986, Adjusted R-squared:  0.4947 
## F-statistic:  5.76 on 95 and 367 DF,  p-value: < 2.2e-16
summary(fit_1)$r.squared
## [1] 0.6081562
summary(fit_5)$r.squared
## [1] 0.5985692
anova(fit_1,fit_5)
## Analysis of Variance Table
## 
## Model 1: eval ~ age + gender + credits + beauty + division + native + 
##     tenure + students + allstudents + prof
## Model 2: eval ~ students + allstudents + prof
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
## 1    365 55.735                              
## 2    367 57.099 -2   -1.3636 4.4651 0.01214 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Analysis:

Goodness of the Model

Log Transformation

fit_6 = lm(log(eval) ~ students + allstudents + prof, data = TeachingRatings) 

Sqrt Transformation

fit_7 = lm(sqrt(eval) ~ students + allstudents + prof, data = TeachingRatings) 

Fit Plot

p1 = qplot(fitted(fit_5), eval, data = TeachingRatings) + 
  geom_abline(intercept = 0, slope = 1, color = "red") 
p2 = qplot(fitted(fit_6), log(TeachingRatings$eval), data = TeachingRatings) + 
  geom_abline(intercept = 0, slope = 1, color = "red")
p3 = qplot(fitted(fit_7), sqrt(TeachingRatings$eval), data = TeachingRatings) + 
  geom_abline(intercept = 0, slope = 1, color = "red")
grid.arrange(p1,p2,p3)

Analysis:

Pearson Correlation

cor(fitted(fit_5), TeachingRatings$eval)
## [1] 0.7736726
cor(fitted(fit_6), log(TeachingRatings$eval))
## [1] 0.7725017
cor(fitted(fit_7), sqrt(TeachingRatings$eval))
## [1] 0.7732936

Analysis:

Diagnostic plots for fit_5 to assess goodness of fit

plot(fit_5)
## Warning: not plotting observations with leverage one:
##   22, 30, 40, 47, 61, 62, 69

## Warning: not plotting observations with leverage one:
##   22, 30, 40, 47, 61, 62, 69
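
A leverage of exactly one typically arises when a factor level is represented by a single observation, because the dummy variable for that level fits the observation perfectly. Since fit_5 includes prof as a factor, this is a plausible explanation for the warning, although the report itself does not confirm it. The toy data below is hypothetical and only illustrates the mechanism.

```r
# Hypothetical toy data: group "c" has a single observation
toy <- data.frame(
  y = c(1.0, 2.0, 3.0, 4.0, 5.5),
  g = factor(c("a", "a", "b", "b", "c"))
)
toy_fit <- lm(y ~ g, data = toy)

# Hat values: 0.5 for the two-member groups, 1 for the singleton
lev <- hatvalues(toy_fit)
lev

# Count observations per level to spot singletons
table(toy$g)
```

On the real data, table(TeachingRatings$prof) would show which instructors appear only once.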

Analysis:

Confidence Interval

CI for fit_5

95% confidence level (limits at 2.5% and 97.5%)

confint(fit_5)
##                    2.5 %        97.5 %
## (Intercept)  3.938392272  4.7556322269
## students    -0.005208219  0.0045090601
## allstudents -0.005900103 -0.0003311933
## prof2       -1.292574780 -0.0975569671
## prof3       -1.151365512  0.1942832214
## prof4       -0.722118553  0.2433090621
## prof5       -0.313855009  0.6940037589
## prof6        0.454237273  1.4936722355
## prof7       -0.902614016  0.1484387866
## prof8       -0.728473782  0.2606672956
## prof9       -0.435209791  0.5469709169
## prof10      -0.191623806  0.7473841102
## prof11      -1.585596789 -0.3882255219
## prof12      -0.655852936  0.3987565112
## prof13      -0.935088712  0.0483131025
## prof14      -1.217457614 -0.1083034012
## prof15      -1.735894624 -0.6275633061
## prof16      -0.365279028  0.6411732435
## prof17      -0.583972435  0.4655314552
## prof18      -0.458041232  0.4927721280
## prof19      -0.390332346  0.5592718836
## prof20      -1.267272617 -0.3252219843
## prof21      -1.160966427 -0.1446947504
## prof22      -1.831424201 -0.0873025796
## prof23      -0.578607391  0.4683485629
## prof24      -0.259618563  0.7285665576
## prof25      -0.661434734  0.6915707183
## prof26      -0.521352055  0.5216177705
## prof27      -0.793726930  0.1990229476
## prof28      -0.947812926  0.1686347371
## prof29      -0.829514175  0.2954311231
## prof30      -2.880202243 -1.1320551137
## prof31      -1.093886656 -0.1015932543
## prof32      -1.103318536  0.1014466734
## prof33      -0.307464498  0.7483376983
## prof34      -1.043910273 -0.1372168475
## prof35      -0.628040866  0.5745285115
## prof36      -1.037622814  0.0686731899
## prof37      -1.134274050 -0.1631408320
## prof38      -0.772694722  0.4317243358
## prof39      -0.292236165  0.6743085180
## prof40      -1.661826888  0.0842218061
## prof41      -0.170191082  0.8912475349
## prof42      -0.295157236  0.8103660835
## prof43      -0.743852223  0.3677011853
## prof44      -0.847796350  0.3437782411
## prof45      -0.341657231  0.7674540398
## prof46      -1.079439947  0.1207821850
## prof47      -1.602860863  0.1388785644
## prof48      -1.411422120 -0.2100015419
## prof49      -0.910440064  0.0784263678
## prof50      -0.502952586  0.3958463883
## prof51      -0.216925709  0.8079005266
## prof52      -0.432400500  0.6727475599
## prof53      -0.053830548  0.9230944566
## prof54      -0.906118538  0.1010430125
## prof55      -1.577985022 -0.3797069824
## prof56      -0.429687569  0.6458174831
## prof57      -0.878805621  0.4666444929
## prof58      -0.560509669  0.3574888309
## prof59      -1.574977277 -0.2174565662
## prof60      -1.911422120 -0.7100015419
## prof61      -0.932507547  0.8058366705
## prof62      -0.820499595  0.9225641797
## prof63      -0.906543092  0.4388727814
## prof64      -1.117022146  0.0730272254
## prof65      -0.597138993  0.3795899033
## prof66      -1.127830141 -0.1031043312
## prof67      -1.032380311  0.3277525683
## prof68      -2.316679111 -1.1142182108
## prof69      -1.990030671 -0.2536365172
## prof70      -0.238682863  0.7137709112
## prof71       0.042946332  0.9792895328
## prof72      -0.807294017  0.2004370483
## prof73       1.333877482  2.8659344460
## prof74      -0.761398799  0.3368559644
## prof75      -1.270966060  0.0803915655
## prof76      -1.345199919  0.0130014355
## prof77      -0.724325667  0.2865045152
## prof78      -0.825164060  0.2846834691
## prof79      -0.912273742  0.2777860277
## prof80      -0.935186406  0.1707507472
## prof81      -0.285163766  0.8295430113
## prof82      -0.713244705  0.2163353753
## prof83      -0.641044190  0.4192808356
## prof84      -0.549419752  0.5092745267
## prof85       0.017646603  0.9878072622
## prof86      -0.609720219  0.5903940337
## prof87      -0.365761910  0.9816767121
## prof88      -1.400337313 -0.4071586570
## prof89      -1.056488459  0.1394835396
## prof90      -1.183656118  0.1718597930
## prof91      -0.312951302  0.8795006273
## prof92      -0.857617317  0.1342080074
## prof93      -0.389177236  0.6159747881
## prof94      -1.210243315 -0.1093977970

Using Bonferroni Correction, 99%

confint(fit_5, level = 0.99)
##                    0.5 %        99.5 %
## (Intercept)  3.808968111  4.8850563878
## students    -0.006747119  0.0060479603
## allstudents -0.006782037  0.0005507404
## prof2       -1.481826635  0.0916948880
## prof3       -1.364472393  0.4073901026
## prof4       -0.875010807  0.3962013162
## prof5       -0.473466974  0.8536157241
## prof6        0.289624668  1.6582848403
## prof7       -1.069066508  0.3148912785
## prof8       -0.885121475  0.4173149890
## prof9       -0.590755189  0.7025163148
## prof10      -0.340332041  0.8960923456
## prof11      -1.775221355 -0.1986009562
## prof12      -0.822868684  0.5657722597
## prof13      -1.090827493  0.2040518838
## prof14      -1.393111475  0.0673504596
## prof15      -1.911418165 -0.4520397652
## prof16      -0.524668250  0.8005624655
## prof17      -0.750179629  0.6317386498
## prof18      -0.608619065  0.6433499607
## prof19      -0.540718692  0.7096582295
## prof20      -1.416462720 -0.1760318817
## prof21      -1.321910723  0.0162495453
## prof22      -2.107636194  0.1889094140
## prof23      -0.744411076  0.6341522476
## prof24      -0.416114864  0.8850628586
## prof25      -0.875706680  0.9058426641
## prof26      -0.686524467  0.6867901824
## prof27      -0.950946140  0.3562421573
## prof28      -1.124621831  0.3454436425
## prof29      -1.007668829  0.4735857769
## prof30      -3.157051746 -0.8552056109
## prof31      -1.251033575  0.0555536645
## prof32      -1.294114061  0.2922421984
## prof33      -0.474669139  0.9155423393
## prof34      -1.187500946  0.0063738262
## prof35      -0.818488643  0.7649762884
## prof36      -1.212824028  0.2438744036
## prof37      -1.288069885 -0.0093449964
## prof38      -0.963435428  0.6224650417
## prof39      -0.445305326  0.8273776793
## prof40      -1.938344067  0.3607389852
## prof41      -0.338288348  1.0593448012
## prof42      -0.470236081  0.9854449291
## prof43      -0.919886039  0.5437350005
## prof44      -1.036502911  0.5324848022
## prof45      -0.517304292  0.9431011000
## prof46      -1.269515997  0.3108582347
## prof47      -1.878695595  0.4147132962
## prof48      -1.601687964 -0.0197356974
## prof49      -1.067044263  0.2350305663
## prof50      -0.645293036  0.5381868384
## prof51      -0.379224767  0.9701995854
## prof52      -0.607419916  0.8477669766
## prof53      -0.208543614  1.0778075224
## prof54      -1.065620086  0.2605445612
## prof55      -1.767753191 -0.1899388136
## prof56      -0.600012499  0.8161424139
## prof57      -1.091881047  0.6797199193
## prof58      -0.705890698  0.5028698598
## prof59      -1.789964293 -0.0024695507
## prof60      -2.101687964 -0.5197356974
## prof61      -1.207804588  1.0811337118
## prof62      -1.096544060  1.1986086449
## prof63      -1.119613096  0.6519427853
## prof64      -1.305487162  0.2614922414
## prof65      -0.751821001  0.5342719119
## prof66      -1.290113296  0.0591788235
## prof67      -1.247781009  0.5431532661
## prof68      -2.507109708 -0.9237876133
## prof69      -2.265018886  0.0213516975
## prof70      -0.389520484  0.8646085320
## prof71      -0.105339899  1.1275757642
## prof72      -0.966885758  0.3600287896
## prof73       1.091249616  3.1085623127
## prof74      -0.935326542  0.5107837077
## prof75      -1.484977044  0.2944025492
## prof76      -1.560294726  0.2280962428
## prof77      -0.884408208  0.4465870556
## prof78      -1.000927720  0.4604471285
## prof79      -1.100740404  0.4662526904
## prof80      -1.110330789  0.3458951307
## prof81      -0.461696972  1.0060762172
## prof82      -0.860459879  0.3635505488
## prof83      -0.808965100  0.5872017453
## prof84      -0.717082405  0.6769371793
## prof85      -0.135995211  1.1414490763
## prof86      -0.799779185  0.7804529989
## prof87      -0.579152251  1.1950670534
## prof88      -1.557624427 -0.2498715429
## prof89      -1.245891426  0.3288865067
## prof90      -1.398325639  0.3865293135
## prof91      -0.501796805  1.0683461303
## prof92      -1.014690107  0.2912807979
## prof93      -0.548360542  0.7751580934
## prof94      -1.384581349  0.0649402373
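
A Bonferroni correction for m simultaneous intervals at overall level alpha uses an individual level of 1 - alpha/m. A fixed 99% level corresponds to m = 5 at alpha = 0.05; with the 96 coefficients of fit_5 the adjusted level would be higher. The sketch below illustrates the computation on the built-in mtcars data, which is a stand-in and not the report's model.

```r
# Stand-in model on a built-in dataset (not the report's fit_5)
demo_fit <- lm(mpg ~ wt + hp, data = mtcars)

alpha <- 0.05
m     <- length(coef(demo_fit))   # number of simultaneous intervals
level <- 1 - alpha / m            # Bonferroni-adjusted individual level

ci_plain <- confint(demo_fit)                 # unadjusted 95% intervals
ci_bonf  <- confint(demo_fit, level = level)  # Bonferroni-adjusted intervals

# Adjusted intervals are wider than the unadjusted ones
cbind(width_plain = ci_plain[, 2] - ci_plain[, 1],
      width_bonf  = ci_bonf[, 2]  - ci_bonf[, 1])
```

For the report's model the analogous call would be confint(fit_5, level = 1 - 0.05 / length(coef(fit_5))).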

Analysis:

Joint Confidence Region

Check the joint hypothesis that the coefficients of students and allstudents are both zero

plot(ellipse(fit_5, c("students", "allstudents")), 
     type = "l", 
     main = "Joint Confidence Region")
points(0,0)
points(coef(fit_5)["students"], coef(fit_5)["allstudents"], 
       pch=18)
# Base graphics use `col`, not `color`, for line colour
abline(v = confint(fit_5)["students",], lty = 2, col = 'red')
abline(h = confint(fit_5)["allstudents",], lty = 2, col = 'red')
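
Whether the origin lies inside a 95% joint confidence ellipse can also be checked numerically with a Wald-type F statistic. The following is a sketch on the built-in mtcars data with illustrative variable names; it is not the report's computation.

```r
# Stand-in model on a built-in dataset (not the report's fit_5)
demo_fit <- lm(mpg ~ wt + hp, data = mtcars)
idx <- c("wt", "hp")

b <- coef(demo_fit)[idx]          # estimates for the two coefficients
V <- vcov(demo_fit)[idx, idx]     # their estimated covariance matrix

# b' V^{-1} b / 2 follows F(2, n - p) under H0: both coefficients are zero
f_stat <- drop(t(b) %*% solve(V) %*% b) / length(idx)
f_crit <- qf(0.95, length(idx), df.residual(demo_fit))

# TRUE means the origin is inside the 95% joint region (H0 not rejected)
origin_inside <- f_stat <= f_crit
origin_inside
```

For the report's model, replacing demo_fit with fit_5 and idx with c("students", "allstudents") gives the corresponding check.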

Analysis:

Checking for Non-Constant Variance

mod <- fortify(fit_5)
p1 <- qplot(.fitted, .resid, data = mod) + 
  geom_hline(yintercept = 0, linetype = "dashed") + 
  labs(title = "Residuals vs Fitted", x = "Fitted", y = "Residuals") + 
  geom_smooth(color = "red", se = FALSE)
p2 <- qplot(.fitted, abs(.resid), data = mod) + 
  geom_hline(yintercept = 0, linetype = "dashed") + 
  labs(title = "Scale-Location", x = "Fitted", y = "|Residuals|") + 
  geom_smooth(method = "lm", color = "red", se = FALSE)
grid.arrange(p1, p2, nrow = 2)

Analysis:

An approximate test of non-constant error variance

summary(lm(abs(residuals(fit_5)) ~ fitted(fit_5)))
## 
## Call:
## lm(formula = abs(residuals(fit_5)) ~ fitted(fit_5))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.33889 -0.17483 -0.05049  0.12348  1.24902 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.44193    0.10138   4.359 1.61e-05 ***
## fitted(fit_5) -0.04480    0.02521  -1.777   0.0763 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2326 on 461 degrees of freedom
## Multiple R-squared:  0.006802,   Adjusted R-squared:  0.004648 
## F-statistic: 3.157 on 1 and 461 DF,  p-value: 0.07625

Analysis:

Breusch-Pagan test to formally check for the presence of heteroscedasticity

bptest(fit_5)
## 
##  studentized Breusch-Pagan test
## 
## data:  fit_5
## BP = 165.02, df = 95, p-value = 1.134e-05
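
The studentized Breusch-Pagan statistic can be understood as n times the R-squared of an auxiliary regression of the squared residuals on the model's regressors, compared against a chi-squared distribution. A minimal sketch on mtcars (a stand-in dataset) follows; it is meant to illustrate the idea, not to replace bptest().

```r
# Stand-in model on a built-in dataset (not the report's fit_5)
demo_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Auxiliary regression: squared residuals on the same regressors
d <- mtcars
d$res2 <- residuals(demo_fit)^2
aux <- lm(res2 ~ wt + hp, data = d)

n  <- nrow(d)
k  <- length(coef(aux)) - 1           # regressors excluding the intercept
bp <- n * summary(aux)$r.squared      # studentized (Koenker) statistic
p_value <- pchisq(bp, df = k, lower.tail = FALSE)

c(BP = bp, df = k, p = p_value)
```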

Analysis:

An F-test for non-constant error variance between two groups defined by a predictor

group <- TeachingRatings$students > 36
p1 <- qplot(students, .resid, data = mod, color = group)
p2 <- qplot(group, .resid, data = mod, geom = "boxplot")
grid.arrange(p1, p2, nrow = 2)

var.test(residuals(fit_5)[TeachingRatings$students > 36], residuals(fit_5)[TeachingRatings$students < 36])
## 
##  F test to compare two variances
## 
## data:  residuals(fit_5)[TeachingRatings$students > 36] and residuals(fit_5)[TeachingRatings$students < 36]
## F = 0.51042, num df = 126, denom df = 329, p-value = 1.973e-05
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.3850553 0.6896998
## sample estimates:
## ratio of variances 
##          0.5104208
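
Note that the split as written (> 36 versus < 36) leaves out any course with exactly 36 students; the degrees of freedom in the output account for 457 of the 463 observations. A complementary split avoids this. The sketch below shows the pattern on mtcars, with an illustrative cutoff that is not taken from the report.

```r
# Stand-in model on a built-in dataset (not the report's fit_5)
demo_fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(demo_fit)

# Complementary split: every observation falls in exactly one group
cutoff   <- median(mtcars$hp)     # illustrative threshold
in_upper <- mtcars$hp > cutoff

vt <- var.test(res[in_upper], res[!in_upper])

# The F statistic is the ratio of the two sample variances
c(F = unname(vt$statistic),
  ratio = var(res[in_upper]) / var(res[!in_upper]))
```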

Analysis:

Checking for Non-Normal Errors

Normal QQ-plots for detecting nonnormality

gs <- lm(sqrt(eval) ~ students + allstudents + prof, data = TeachingRatings)
modgg <- fortify(fit_5)
modgs <- fortify(gs)
p1 <- qplot(sample = scale(.resid), data = modgg) + 
  geom_abline(intercept = 0, slope = 1, color = "red") + 
  labs(title = "Untransformed y", y = "Residuals")

p2 <- qplot(sample = scale(.resid), data = modgs) + 
  geom_abline(intercept = 0, slope = 1, color = "red") + 
  labs(title = "Sqrt-Transformed y", y = "Residuals")
grid.arrange(p1, p2, nrow = 2)

Analysis:

Histograms, kernel density plots

p1 <- qplot(scale(.resid), data = modgg, geom = "blank") + 
  geom_line(aes(y = ..density.., colour = "Empirical"), stat = "density") + 
  stat_function(fun = dnorm, aes(colour = "Normal")) + 
  geom_histogram(aes(y = ..density..), alpha = 0.4) + 
  scale_colour_manual(name = "Density", values = c("red", "blue")) + 
  theme(legend.position = c(0.85, 0.85)) + labs(title = "Untransformed y", 
                                                y = "Residuals")
p2 <- qplot(scale(.resid), data = modgs, geom = "blank") + 
  geom_line(aes(y = ..density.., colour = "Empirical"), stat = "density") + 
  stat_function(fun = dnorm, aes(colour = "Normal")) + 
  geom_histogram(aes(y = ..density..), alpha = 0.4) + 
  scale_colour_manual(name = "Density", values = c("red", "blue")) + 
  theme(legend.position = c(0.85, 0.85)) + labs(title = "Sqrt-Transformed y", 
                                                y = "Residuals")
grid.arrange(p1, p2, nrow = 2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Analysis:

The Shapiro-Wilk test of normality

shapiro.test(residuals(fit_5))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(fit_5)
## W = 0.98411, p-value = 5.904e-05
shapiro.test(residuals(gs))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(gs)
## W = 0.97694, p-value = 1.058e-06

Analysis:

Box-Cox Power Transform

lambda <- powerTransform(fit_5)
lam <- lambda$lambda
glam <- lm(eval^lam ~ students + allstudents + prof, data = TeachingRatings)
modlam <- fortify(glam)

p1 <- qplot(sample = scale(.resid), data = modgs) + 
  geom_abline(intercept = 0, slope = 1, color = "red") + 
  labs(title = "Normal QQ-Plot", y = "Residuals Sqrt-transformed")

p2 <- qplot(sample = scale(.resid), data = modlam) + 
  geom_abline(intercept = 0, slope = 1, color = "red") + 
  labs(title = "Normal QQ-Plot", y = "Residuals Box-Cox-Transform")

grid.arrange(p1, p2, nrow = 1)

shapiro.test(residuals(glam))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(glam)
## W = 0.99193, p-value = 0.01298
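
The Box-Cox parameter is usually chosen by maximising a profile log-likelihood over lambda, which is, as far as the car documentation describes, what powerTransform() does for a linear model. The sketch below hand-rolls that search on mtcars as a stand-in; boxcox_loglik is a hypothetical helper written for illustration and not part of any package.

```r
# Profile log-likelihood of the Box-Cox parameter for a linear model
# (hypothetical helper; y must be strictly positive)
boxcox_loglik <- function(lambda, y, X) {
  n <- length(y)
  # Box-Cox transform; the log is the limiting case at lambda = 0
  z <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
  rss <- sum(lm.fit(X, z)$residuals^2)
  -n / 2 * log(rss / n) + (lambda - 1) * sum(log(y))
}

# Stand-in design matrix and response (not the report's model)
X <- model.matrix(mpg ~ wt + hp, data = mtcars)
y <- mtcars$mpg

# Grid search for the maximising lambda
grid <- seq(-2, 2, by = 0.01)
ll   <- sapply(grid, boxcox_loglik, y = y, X = X)
lambda_hat <- grid[which.max(ll)]
lambda_hat
```

For a nonzero lambda, fitting eval^lam as the report does differs from the scaled transform only by a linear rescaling of the response, so the shape of the residual distribution is unaffected.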

Analysis:

Checking Influential Outliers

Influence plot

influencePlot(glam)

##       StudRes       Hat      CookD
## 22        NaN 1.0000000        NaN
## 99  -3.534839 0.3340734 0.06331249
## 126 -3.666038 0.1429530 0.02258575
abc <- row.names(TeachingRatings)
halfnorm(lm.influence(glam)$hat, labs = abc, ylab = "Leverages")

cook <- cooks.distance(glam)
halfnorm(cook, 3, labs = abc, ylab = "Cook's distance")

Omnibus diagnostic plot function

oldpar = par(mfrow = c(2,2))
plot(glam, main = "TeachingRatings Data")
## Warning: not plotting observations with leverage one:
##   22, 30, 40, 47, 61, 62, 69

## Warning: not plotting observations with leverage one:
##   22, 30, 40, 47, 61, 62, 69

Analysis: