Regression Deletion Diagnostics

This suite of functions can be used to compute some of the regression (leave-one-out deletion) diagnostics for linear and generalized linear models discussed in Belsley, Kuh and Welsch (1980), Cook and Weisberg (1982), etc.

Details

The primary high-level function is influence.measures(), which produces an object of class "infl" whose tabular display shows the DFBETAS for each model variable, DFFITS, covariance ratios, Cook's distances and the diagonal elements of the hat matrix. Cases which are influential with respect to any of these measures are marked with an asterisk.
The functions dfbetas(), dffits(), covratio() and cooks.distance() provide direct access to the corresponding diagnostic quantities.
Functions rstandard() and rstudent() give the standardized and Studentized residuals respectively.
(These functions re-normalize the residuals to have unit variance, using an overall and leave-one-out measure of the error variance respectively.)
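The re-normalization described above can be checked by hand. The sketch below is illustrative only: `fit_demo` is a hypothetical stand-in for `Fit_4`, fitted to R's built-in `stackloss` data, and the hand-computed residuals are compared with the output of rstandard() and rstudent().

```r
# Hypothetical stand-in for Fit_4 (assumption: any ordinary lm fit will do)
fit_demo <- lm(stack.loss ~ ., data = stackloss)

e     <- residuals(fit_demo)                                   # raw residuals
h     <- hatvalues(fit_demo)                                   # leverages
s     <- sqrt(deviance(fit_demo) / df.residual(fit_demo))      # overall error SD
s_loo <- lm.influence(fit_demo, do.coef = FALSE)$sigma         # leave-one-out error SDs

r_std_manual  <- e / (s * sqrt(1 - h))       # standardized residuals
r_stud_manual <- e / (s_loo * sqrt(1 - h))   # Studentized residuals

all.equal(r_std_manual,  rstandard(fit_demo))
all.equal(r_stud_manual, rstudent(fit_demo))
```

The only difference between the two is the variance estimate: the overall one for rstandard(), the leave-one-out one for rstudent().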

R commands

The optional infl, res and sd arguments are there to encourage the use of these direct access functions in situations where, e.g., the underlying basic influence measures (from lm.influence() or the generic influence()) are already available.
Note that cases with weights == 0 are dropped from all these functions, but that if a linear model has been fitted with na.action = na.exclude, suitable values are filled in for the cases excluded during fitting.
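As a rough illustration of both points, the sketch below computes the basic influence measures once and passes them to several diagnostics through the infl argument, then refits with na.action = na.exclude. The names `fit_demo`, `infl_demo`, `dat_na` and `fit_na` are hypothetical, and the missing value is introduced artificially.

```r
# Hypothetical stand-in for Fit_4, using the built-in stackloss data
fit_demo <- lm(stack.loss ~ ., data = stackloss)

# Compute the basic influence measures once ...
infl_demo <- lm.influence(fit_demo, do.coef = FALSE)

# ... and reuse them instead of letting each function recompute them
cd  <- cooks.distance(fit_demo, infl = infl_demo)
dff <- dffits(fit_demo, infl = infl_demo)
cvr <- covratio(fit_demo, infl = infl_demo)

all.equal(cd, cooks.distance(fit_demo))  # same result as the default call

# With na.exclude, results are padded to the length of the original data
dat_na <- stackloss
dat_na$Air.Flow[3] <- NA                 # artificial missing value, for illustration
fit_na <- lm(stack.loss ~ ., data = dat_na, na.action = na.exclude)
length(rstandard(fit_na)) == nrow(dat_na)
```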

Implementation

## S3 method for class 'lm'
rstandard(Fit_4, 
infl = lm.influence(Fit_4, do.coef = FALSE),
sd = sqrt(deviance(Fit_4)/df.residual(Fit_4)))

#rstandard(Fit_4)

`rstudent()`

#rstudent(Fit_4)

## S3 method for class 'lm'
#rstudent(Fit_4, infl = lm.influence(Fit_4, do.coef = FALSE),
#res = infl$wt.res)


dffits(Fit_4)

##           1           2           3           4           5           6 
##  0.52971523 -0.05017807  0.36652235  0.32089734 -0.05655190  0.10479628 
##           7           8           9          10          11          12 
##  0.26450803 -0.56627494  0.15075718  0.02864957  0.12823528  0.72871457 
##          13          14          15          16          17          18 
## -0.19602704  0.17406599  0.94834090 -0.18018808  0.16248110 -0.38022994 
##          19          20          21          22          23          24 
## -0.34375517 -0.58567945 -0.02588694 -0.08480492 -0.29968252  0.25877488 
##          25          26          27          28          29          30 
##  0.11729707 -0.19188145 -0.34863248 -0.34332200 -0.38166182 -0.64420967
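DFFITS can be recovered from the Studentized residuals and the leverages. The sketch below assumes a hypothetical model `fit_demo` on the built-in `stackloss` data (not `Fit_4`), and the 2 * sqrt(p / n) cutoff is a conventional rule of thumb added here for illustration, not something stated above.

```r
# Hypothetical stand-in for Fit_4
fit_demo <- lm(stack.loss ~ ., data = stackloss)

h <- hatvalues(fit_demo)

# DFFITS = Studentized residual scaled by sqrt(h / (1 - h))
dffits_manual <- rstudent(fit_demo) * sqrt(h / (1 - h))
all.equal(dffits_manual, dffits(fit_demo))

# Rule-of-thumb screen (assumption: 2 * sqrt(p / n) as the cutoff)
n <- nobs(fit_demo)
p <- length(coef(fit_demo))
which(abs(dffits(fit_demo)) > 2 * sqrt(p / n))
```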

Influential Points in Regression

Sometimes in regression analysis, a few data points have a disproportionate effect on the fitted regression equation. This section shows how to identify those influential points.

DFBETAS

inflm.fit <- influence.measures(Fit_4)
which(apply(inflm.fit$is.inf, 1, any))

##  6 15 24 26 
##  6 15 24 26
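The output above shows each flagged case twice because which() returns a named integer vector: the first row holds the case names, the second their positions. To see which measure flagged each case, the is.inf matrix can be inspected directly. A minimal sketch, assuming a hypothetical model `fit_demo` on the built-in `stackloss` data rather than `Fit_4`:

```r
# Hypothetical stand-in for Fit_4
fit_demo <- lm(stack.loss ~ ., data = stackloss)

im_demo <- influence.measures(fit_demo)

# Rows flagged by at least one measure
flagged <- which(apply(im_demo$is.inf, 1, any))

# Logical matrix: one column per DFBETAS, plus dffit, cov.r, cook.d and hat
im_demo$is.inf[flagged, , drop = FALSE]

# Number of cases flagged by each measure
colSums(im_demo$is.inf)
```

summary(im_demo) prints only the flagged cases together with their diagnostic values.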

dfbeta(Fit_4)

##     (Intercept)      Acetic           H2S      Lactic
## 1    7.07731820 -0.68012625  0.0420550124 -2.19359377
## 2   -0.38764360  0.11032996  0.0202150540 -0.27859860
## 3    1.19605590 -0.47130371 -0.1602239216  1.97108492
## 4   -0.60698006 -0.30825961  0.0168790642  1.77930153
## 5   -0.83595216  0.09838964 -0.0083766644  0.20147832
## 6    0.11759125  0.11243109  0.0837737105 -0.80330668
## 7   -0.07205384  0.23381240  0.2459615860 -1.71692172
## 8    4.83294413 -0.52904953 -0.0674181836 -1.49515621
## 9    1.70869606 -0.29187091 -0.0680016586  0.33250675
## 10   0.05436601 -0.02673581 -0.0238338121  0.18218091
## 11  -0.77410871  0.06359132 -0.0656032228  0.68395577
## 12 -10.08450212  1.56441733 -0.1891603413  2.23346307
## 13  -3.03365018  0.46595867  0.0343966900  0.06437970
## 14   1.32398602 -0.10433473 -0.0288768367 -0.20427672
## 15 -11.11994787  2.79814047 -0.1268840267 -1.78794068
## 16   1.56449261 -0.26498213 -0.1151354613  0.28964504
## 17   2.32664393 -0.33710478 -0.0196510188 -0.11807832
## 18   0.19882307 -0.03715341  0.2918596950 -1.54856780
## 19  -2.78292195  0.93355752 -0.0009873396 -1.90407108
## 20  -3.63470859  1.69175579 -0.2757600708 -3.06238914
## 21  -0.47524232  0.08659421 -0.0120878916  0.03299263
## 22  -0.57516179  0.04292218  0.0079113411  0.10057971
## 23  -2.40345340  0.51056494 -0.3465413439  0.99119107
## 24   0.51352039 -0.49721920  0.1737491648  0.95109008
## 25  -0.23243413  0.15705824 -0.1079103901  0.10333016
## 26   2.37695270 -0.60880791  0.2042806670 -0.25751941
## 27   5.75579661 -1.19050910  0.1533294223 -0.29643604
## 28   0.01256323 -0.42767305  0.2416490573  0.32629777
## 29   0.93560400 -0.78927388 -0.1888564588  2.93631661
## 30   7.93824546 -2.50462227  0.3153678402  2.42252121

## S3 method for class 'lm'

dfbeta(Fit_4, 
infl = lm.influence(Fit_4, do.coef = TRUE))

##     (Intercept)      Acetic           H2S      Lactic
## 1    7.07731820 -0.68012625  0.0420550124 -2.19359377
## 2   -0.38764360  0.11032996  0.0202150540 -0.27859860
## 3    1.19605590 -0.47130371 -0.1602239216  1.97108492
## 4   -0.60698006 -0.30825961  0.0168790642  1.77930153
## 5   -0.83595216  0.09838964 -0.0083766644  0.20147832
## 6    0.11759125  0.11243109  0.0837737105 -0.80330668
## 7   -0.07205384  0.23381240  0.2459615860 -1.71692172
## 8    4.83294413 -0.52904953 -0.0674181836 -1.49515621
## 9    1.70869606 -0.29187091 -0.0680016586  0.33250675
## 10   0.05436601 -0.02673581 -0.0238338121  0.18218091
## 11  -0.77410871  0.06359132 -0.0656032228  0.68395577
## 12 -10.08450212  1.56441733 -0.1891603413  2.23346307
## 13  -3.03365018  0.46595867  0.0343966900  0.06437970
## 14   1.32398602 -0.10433473 -0.0288768367 -0.20427672
## 15 -11.11994787  2.79814047 -0.1268840267 -1.78794068
## 16   1.56449261 -0.26498213 -0.1151354613  0.28964504
## 17   2.32664393 -0.33710478 -0.0196510188 -0.11807832
## 18   0.19882307 -0.03715341  0.2918596950 -1.54856780
## 19  -2.78292195  0.93355752 -0.0009873396 -1.90407108
## 20  -3.63470859  1.69175579 -0.2757600708 -3.06238914
## 21  -0.47524232  0.08659421 -0.0120878916  0.03299263
## 22  -0.57516179  0.04292218  0.0079113411  0.10057971
## 23  -2.40345340  0.51056494 -0.3465413439  0.99119107
## 24   0.51352039 -0.49721920  0.1737491648  0.95109008
## 25  -0.23243413  0.15705824 -0.1079103901  0.10333016
## 26   2.37695270 -0.60880791  0.2042806670 -0.25751941
## 27   5.75579661 -1.19050910  0.1533294223 -0.29643604
## 28   0.01256323 -0.42767305  0.2416490573  0.32629777
## 29   0.93560400 -0.78927388 -0.1888564588  2.93631661
## 30   7.93824546 -2.50462227  0.3153678402  2.42252121

library(magrittr)  # provides the %>% pipe used below

dfbetas(Fit_4) %>% 
  head(4) %>% 
  round(2)

##   (Intercept) Acetic   H2S Lactic
## 1        0.36  -0.15  0.03  -0.26
## 2       -0.02   0.02  0.02  -0.03
## 3        0.06  -0.11 -0.13   0.23
## 4       -0.03  -0.07  0.01   0.21

DFBETAS

## S3 method for class 'lm'
dfbetas(Fit_4, 
infl = lm.influence(Fit_4, do.coef = TRUE)) %>% 
  head(4) %>% 
  round(2)

##   (Intercept) Acetic   H2S Lactic
## 1        0.36  -0.15  0.03  -0.26
## 2       -0.02   0.02  0.02  -0.03
## 3        0.06  -0.11 -0.13   0.23
## 4       -0.03  -0.07  0.01   0.21
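DFBETAS is DFBETA rescaled by an estimate of each coefficient's standard error that uses the leave-one-out error standard deviation. The sketch below reproduces that scaling by hand; `fit_demo` and `infl_demo` are hypothetical names, and the built-in `stackloss` data stands in for the data behind `Fit_4`.

```r
# Hypothetical stand-in for Fit_4
fit_demo  <- lm(stack.loss ~ ., data = stackloss)
infl_demo <- lm.influence(fit_demo, do.coef = TRUE)

# Diagonal of (X'X)^{-1}, taken from the model summary
xtx_inv_diag <- diag(summary(fit_demo)$cov.unscaled)

# Scale row i of DFBETA by sigma_(i) * sqrt(diag((X'X)^{-1}))
dfbetas_manual <- dfbeta(fit_demo, infl = infl_demo) /
  outer(infl_demo$sigma, sqrt(xtx_inv_diag))

all.equal(dfbetas_manual, dfbetas(fit_demo))
```

Because of this scaling, DFBETAS values are comparable across coefficients, whereas DFBETA values are in the units of each coefficient.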

COVRATIOS

covratio(Fit_4, 
infl = lm.influence(Fit_4, do.coef = FALSE),
res = weighted.residuals(Fit_4)) %>% 
  head(4) %>% 
  round(2)

##    1    2    3    4 
## 1.15 1.26 0.90 1.09
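The covariance ratio for case i compares the determinant of the coefficient covariance matrix with case i deleted to the determinant for the full fit. A minimal leave-one-out check for a single case, assuming an unweighted hypothetical model `fit_demo` on the built-in `stackloss` data (not `Fit_4`):

```r
# Hypothetical stand-in for Fit_4
fit_demo <- lm(stack.loss ~ ., data = stackloss)

i <- 1                                                  # case to delete
fit_drop <- lm(stack.loss ~ ., data = stackloss[-i, ])  # refit without case i

# Ratio of determinants of the estimated coefficient covariance matrices
covratio_manual <- det(vcov(fit_drop)) / det(vcov(fit_demo))

all.equal(covratio_manual, unname(covratio(fit_demo)[i]))
```

Values well above 1 suggest the case improves the precision of the estimates; values well below 1 suggest it degrades it.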

Arguments


  cooks.distance(Fit_4)
  ## S3 method for class 'lm'
  cooks.distance(Fit_4, 
    infl = lm.influence(Fit_4, do.coef = FALSE),
    res = weighted.residuals(Fit_4),
    sd = sqrt(deviance(Fit_4)/df.residual(Fit_4)),
    hat = infl$hat)
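The default arguments above map directly onto the usual formula for Cook's distance. The sketch below spells it out for a hypothetical model `fit_demo` on the built-in `stackloss` data (an assumption, not `Fit_4`).

```r
# Hypothetical stand-in for Fit_4
fit_demo <- lm(stack.loss ~ ., data = stackloss)

e <- residuals(fit_demo)                               # residuals (res)
h <- hatvalues(fit_demo)                               # leverages (hat)
s <- sqrt(deviance(fit_demo) / df.residual(fit_demo))  # error SD (sd)
p <- fit_demo$rank                                     # number of coefficients

# D_i = (e_i / (s * (1 - h_i)))^2 * h_i / p
cooks_manual <- (e / (s * (1 - h)))^2 * h / p

all.equal(cooks_manual, cooks.distance(fit_demo))
```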

hatvalues(Fit_4) %>% head(6) %>% round(2)

##    1    2    3    4    5    6 
## 0.18 0.08 0.06 0.09 0.13 0.23

## S3 method for class 'lm'

## hatvalues(Fit_4, 
## infl = lm.influence(Fit_4, do.coef = FALSE))
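The hat values are the diagonal of the projection matrix built from the design matrix. As an illustrative check, the sketch below forms that matrix explicitly for a hypothetical model `fit_demo` on the built-in `stackloss` data (not `Fit_4`). This is fine for small data, but hatvalues() is the practical choice.

```r
# Hypothetical stand-in for Fit_4
fit_demo <- lm(stack.loss ~ ., data = stackloss)

X <- model.matrix(fit_demo)                   # design matrix, including intercept
H <- X %*% solve(crossprod(X)) %*% t(X)       # hat matrix X (X'X)^{-1} X'

hat_manual <- unname(diag(H))

all.equal(hat_manual, unname(hatvalues(fit_demo)))

# The leverages sum to the number of estimated coefficients
sum(hat_manual)
```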

