We will use the following example model for demonstration purposes. The data come from the built-in mtcars data set.
library(car)
fit <- lm(mpg ~ disp + hp + wt + drat, data = mtcars)
In a linear regression model, the leverage score for the \(i\)-th data unit is defined as
\[ h_{ii} = (H)_{ii}, \]
the \(i\)-th diagonal element of the hat matrix
\[ H = X(X^{T} X)^{-1}X^{T}. \]
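As a sanity check, the hat matrix can be built directly from the model matrix of fit, and its diagonal matches the leverages returned by hatvalues() (a minimal sketch using base R):
X <- model.matrix(fit)                  # design matrix, including the intercept column
H <- X %*% solve(t(X) %*% X) %*% t(X)   # H = X (X'X)^{-1} X'
all.equal(unname(diag(H)), unname(hatvalues(fit)))  # the diagonal gives the leverages h_ii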
The leverage score is also known as the observation self-sensitivity or self-influence.
The leverage of an observation is based on how much the observation’s value on the predictor variable differs from the mean of the predictor variable. The greater an observation’s leverage, the more potential it has to be an influential observation.
For example, an observation with a value equal to the mean on the predictor variable has no influence on the slope of the regression line regardless of its value on the criterion variable. On the other hand, an observation that is extreme on the predictor variable has the potential to affect the slope greatly.
Generally, a point with leverage greater than \((2k+2)/n\) should be carefully examined, where k is the number of predictor variables and n is the number of observations.
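For the model above this cutoff is easy to compute from the fitted object (a sketch; here k = 4 predictors and n = 32 observations):
k <- length(coef(fit)) - 1                # number of predictor variables (excluding the intercept)
n <- nrow(mtcars)                         # number of observations
cutoff <- (2 * k + 2) / n                 # rule-of-thumb threshold
hatvalues(fit)[hatvalues(fit) > cutoff]   # observations worth a closer look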
Leverage points do not necessarily have a large effect on the outcome of fitting regression models.
Leverage points are observations made at extreme or outlying values of the independent variables, where the lack of neighboring observations means that the fitted regression model will pass close to that particular observation.
Because points farther out on X have more leverage, they also tend to lie closer to the regression line (or, more accurately, the regression line is fit so as to pass closer to them) than points near \(\bar{x}\). In other words, the residual standard deviation can differ at different points on X even if the error standard deviation is constant. To correct for this, residuals are often standardized so that they have constant variance (assuming, of course, that the underlying data-generating process is homoscedastic).
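In R these standardized (internally studentized) residuals are returned by rstandard(), which divides each raw residual by \(\hat{\sigma}\sqrt{1 - h_{ii}}\); a short sketch:
head(rstandard(fit))                     # standardized residuals: e_i / (sigma_hat * sqrt(1 - h_ii))
sigma_hat <- summary(fit)$sigma          # residual standard error
head(residuals(fit) / (sigma_hat * sqrt(1 - hatvalues(fit))))   # same quantity computed by hand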
For a simple regression with a single predictor, the leverage can even be computed by hand. The first step is to standardize the predictor variable so that it has a mean of 0 and a standard deviation of 1.
Then the leverage (h) is obtained by squaring the observation's value on the standardized predictor, adding 1, and dividing by the number of observations.
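A minimal sketch of that recipe, using a hypothetical one-predictor model (mpg ~ wt) introduced just for this illustration; the standard deviation is computed with an n divisor so the identity \(h = (1 + z^2)/n\) holds exactly:
fit1 <- lm(mpg ~ wt, data = mtcars)               # single-predictor model for illustration
x <- mtcars$wt
z <- (x - mean(x)) / sqrt(mean((x - mean(x))^2))  # standardize with an n divisor
h_manual <- (1 + z^2) / length(x)                 # the recipe described above
all.equal(unname(h_manual), unname(hatvalues(fit1)))  # matches the leverages exactly
For the multi-predictor model fit, the leverages are reported in the .hat column of broom::augment():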
broom::augment(fit)
## # A tibble: 32 x 12
##    .rownames    mpg  disp    hp    wt  drat .fitted .resid   .hat .sigma .cooksd
##    <chr>      <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>  <dbl>  <dbl>  <dbl>   <dbl>
##  1 Mazda RX4   21    160    110  2.62  3.9     23.7 -2.71  0.0460   2.60 1.10e-2
##  2 Mazda RX4~  21    160    110  2.88  3.9     22.8 -1.82  0.0499   2.63 5.43e-3
##  3 Datsun 710  22.8  108     93  2.32  3.85    25.1 -2.26  0.0674   2.61 1.17e-2
##  4 Hornet 4 ~  21.4  258    110  3.22  3.08    20.6  0.835 0.123    2.65 3.30e-3
##  5 Hornet Sp~  18.7  360    175  3.44  3.15    18.0  0.666 0.172    2.65 3.27e-3
##  6 Valiant     18.1  225    105  3.46  2.76    19.2 -1.10  0.201    2.64 1.12e-2
##  7 Duster 360  14.3  360    245  3.57  3.21    15.3 -0.953 0.145    2.64 5.30e-3
##  8 Merc 240D   24.4  147.    62  3.19  3.69    23.0  1.42  0.126    2.63 9.93e-3
##  9 Merc 230    22.8  141.    95  3.15  3.92    22.4  0.449 0.107    2.65 7.98e-4
## 10 Merc 280    19.2  168.   123  3.44  3.92    20.5 -1.27  0.129    2.64 8.08e-3
## # ... with 22 more rows, and 1 more variable: .std.resid <dbl>
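Because .hat holds the leverages, the (2k+2)/n rule of thumb from above can be applied directly to the augmented data (a sketch using dplyr, which is assumed to be installed):
library(dplyr)
broom::augment(fit) %>%
  filter(.hat > (2 * 4 + 2) / nrow(mtcars)) %>%   # k = 4 predictors, n = 32
  select(.rownames, .hat, .std.resid, .cooksd)    # keep the diagnostic columns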
The leveragePlots() function in the {car} R package displays a generalization of added-variable plots to multiple-df terms in a linear model, one plot per term.
# leverage plots
leveragePlots(fit, layout = c(2, 2))
The hat matrix, H, sometimes also called the influence matrix or projection matrix, maps the vector of observed values to the vector of fitted (predicted) values. It describes the influence each observed value has on each fitted value.
The diagonal elements of the hat matrix are the leverages, which describe the influence each observed value has on the fitted value for that same observation.
If the vector of observed values is denoted by \(y\) and the vector of fitted values by \(\hat{y}\), then
\[ \hat{y} = H y. \]
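This identity is easy to verify numerically for the model fitted above (a minimal sketch; H is rebuilt from the model matrix as before):
X <- model.matrix(fit)
H <- X %*% solve(t(X) %*% X) %*% t(X)
all.equal(as.vector(H %*% mtcars$mpg), unname(fitted(fit)))  # H maps y onto the fitted values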