The Prestige data frame contains data on the prestige of Canadian occupations. It has 102 rows and 6 columns; each observation is an occupation.
Added-variable plots (also called partial-regression plots) for linear and generalized linear models graph the outcome against each predictor while holding the other predictors constant.
library(car)
reg1 <- lm(prestige ~ education + income + type, data = Prestige)
avPlots(reg1, pch=16, col="red", cex=0.7)
# Helps identify the effect (or influence) of an observation on the regression coefficient of each predictor variable
reg1 <- lm(prestige ~ education + income + type, data = Prestige)
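To make "holding the rest constant" concrete, the added-variable plot for education can be rebuilt by hand. This is only a sketch, not how avPlots() is implemented internally: the names dat, res_y, res_x and av_fit are made up for illustration, and rows with a missing type are dropped first so that every regression uses the same occupations.

```r
library(car)  # also loads carData, which provides Prestige

# Keep only the rows lm() would use (type has missing values)
dat <- na.omit(Prestige[, c("prestige", "education", "income", "type")])
reg1 <- lm(prestige ~ education + income + type, data = dat)

# Residuals of the outcome and of education, each regressed on the other predictors
res_y <- resid(lm(prestige ~ income + type, data = dat))
res_x <- resid(lm(education ~ income + type, data = dat))

# Plotting one set of residuals against the other gives the added-variable plot for education
plot(res_x, res_y, pch = 16, col = "red", cex = 0.7,
     xlab = "education | others", ylab = "prestige | others")
av_fit <- lm(res_y ~ res_x)
abline(av_fit)

# The slope of this line should match the education coefficient of the full model
all.equal(unname(coef(av_fit)["res_x"]), unname(coef(reg1)["education"]))
```

A point far from the rest along the horizontal axis of this plot has high leverage on the education coefficient specifically.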
influenceIndexPlot(reg1, pch=17)
Cook’s distance measures how much an observation influences the overall model or the predicted values.
Studentized residuals are the residuals divided by their estimated standard deviation, as a way to standardize them.
Bonferroni test: p-values used to identify outliers among the Studentized residuals.
Hat-values identify high-leverage observations (those with unusual values of the predictor variables, which can have a high impact on the fit).
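The same quantities can also be pulled out as numbers rather than read off the plot. The sketch below uses the base R functions cooks.distance(), rstudent() and hatvalues(), plus outlierTest() from car for the Bonferroni test; the data frame name diag1 is made up for illustration.

```r
library(car)

reg1 <- lm(prestige ~ education + income + type, data = Prestige)

# Three of the plotted quantities, one value per occupation used in the fit
diag1 <- data.frame(
  cook    = cooks.distance(reg1),  # influence on the fitted values overall
  studres = rstudent(reg1),        # Studentized residuals
  hat     = hatvalues(reg1)        # leverage
)

# Bonferroni-adjusted test for the largest absolute Studentized residual
outlierTest(reg1)

# Occupations with the largest Cook's distance
head(diag1[order(-diag1$cook), ], 3)
```

Sorting on studres or hat instead of cook gives the corresponding rankings for the other panels.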
NOTE: If an observation is both an outlier and influential (high leverage), it can change the fit of the linear model, and it is advisable to remove it. To remove one or more cases, type:
# reg1a <- update(reg1, subset = rownames(Prestige) != "general.managers")
# reg1b <- update(reg1, subset = !(rownames(Prestige) %in% c("general.managers", "medical.technicians")))
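Before dropping a case, it can help to see how much the estimates actually move. A minimal sketch, assuming reg1 is the model being updated and using compareCoefs() from car to print the two fits side by side:

```r
library(car)

reg1 <- lm(prestige ~ education + income + type, data = Prestige)

# Refit without one occupation; subset is evaluated against the model's data
reg1a <- update(reg1, subset = rownames(Prestige) != "general.managers")

# Coefficients and standard errors of both fits, side by side
compareCoefs(reg1, reg1a)

# Number of occupations used in each fit
c(full = nobs(reg1), reduced = nobs(reg1a))
```

If the coefficients barely change, the case is less of a concern than the diagnostics alone might suggest.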
Creates a bubble plot combining Studentized residuals, hat-values, and Cook’s distance (represented by the size of the circles).
library(car)
reg1 <- lm(prestige ~ education + income + type, data = Prestige)
influencePlot(reg1)
##                        StudRes        Hat       CookD
## general.managers    -1.3134574 0.33504477 0.172503975
## physicians          -0.3953204 0.22420309 0.009115491
## medical.technicians  2.8210910 0.06858836 0.109052582
## electronic.workers   2.2251940 0.02701237 0.026372394
reg1 <- lm(prestige ~ education + income + type, data = Prestige)
qqPlot(reg1)
## medical.technicians  electronic.workers
##                  31                  82
# Look at the tails: points should lie close to the line or within the confidence intervals.
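For readers who want to see what goes into such a plot, here is a rough base R version. It is a sketch under stated assumptions rather than a copy of qqPlot(): it compares the Studentized residuals with t quantiles using one fewer degree of freedom than the residual degrees of freedom, draws a plain y = x reference line, and omits the confidence envelope. The names rs, tq and top2 are made up for illustration.

```r
library(car)  # only needed here for the Prestige data (via carData)

reg1 <- lm(prestige ~ education + income + type, data = Prestige)

# Sorted Studentized residuals against theoretical t quantiles
rs <- sort(rstudent(reg1))
n  <- length(rs)
tq <- qt(ppoints(n), df = df.residual(reg1) - 1)  # assumed reference distribution

plot(tq, rs, pch = 16, cex = 0.7,
     xlab = "t quantiles", ylab = "Studentized residuals")
abline(0, 1)  # simple y = x reference line (an assumption, not qqPlot's default line)

# The two occupations with the largest absolute Studentized residuals
top2 <- names(sort(abs(rstudent(reg1)), decreasing = TRUE))[1:2]
top2
```

Points that drift away from the line at either end correspond to residuals that are larger than the reference distribution would predict.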