Exercises of Chapter 3, Applied Econometrics with R

  1. This exercise is taken from Faraway (2005, p. 23). Generate some artificial data by
    x <- 1:20
    y <- x + rnorm(20)
    Fit a polynomial in x for predicting y. Compute \( \hat{\beta} \) in two ways—by lm() and by using the direct calculation \( \hat{\beta}= (X^{T}X)^{-1}X^{T}y \). At what degree of polynomial does the direct calculation method fail? (Note the need for the I() function in fitting the polynomial; e.g., in lm(y ~ x + I(x{^}2)).) (Lesson: Never use the textbook formula \( \hat{\beta}= (X^{T}X)^{-1}X^{T}y \) for computations!)

First let's generate x and y.

x <- 1:20
y <- x + rnorm(20)

Second we compute \( \hat{\beta} \) by lm for order 2.

lm2 <- lm(y ~ x + I(x^2))
beta_lm <- coef(lm2)

Thrid we compute \( \hat{\beta} \) by formula

X <- cbind(1, x, x^2)
X
##          x    
##  [1,] 1  1   1
##  [2,] 1  2   4
##  [3,] 1  3   9
##  [4,] 1  4  16
##  [5,] 1  5  25
##  [6,] 1  6  36
##  [7,] 1  7  49
##  [8,] 1  8  64
##  [9,] 1  9  81
## [10,] 1 10 100
## [11,] 1 11 121
## [12,] 1 12 144
## [13,] 1 13 169
## [14,] 1 14 196
## [15,] 1 15 225
## [16,] 1 16 256
## [17,] 1 17 289
## [18,] 1 18 324
## [19,] 1 19 361
## [20,] 1 20 400

Fourth compare the two parameters.

beta_direct <- solve(t(X) %*% X) %*% t(X) %*% y
print(cbind(beta_lm, beta_direct))
##              beta_lm         
## (Intercept) 0.368164 0.368164
## x           0.878272 0.878272
## I(x^2)      0.005028 0.005028