Take the following DGP:
\[ y_i=\alpha+\gamma x_{1i}+\gamma^2 x_{2i}+\varepsilon_i, \]
where \( x_{1i},x_{2i} \) and \( \varepsilon_i \) are i.i.d. normal random variables.
We demonstrate numerically that, given the NLS solution \( \hat{\gamma}_{NLS} \), the coefficient vector \( (\hat\gamma_{NLS},\hat\gamma_{NLS}^2) \) differs from the OLS solution of the problem
\[ y_i=\alpha+\theta_1 x_{1i}+\theta_2 x_{2i}+\varepsilon_i, \]
where the true DGP is the first model.
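To see why the two estimators need not coincide, compare their first-order conditions. After concentrating out \( \alpha \) (its condition, \( \sum_i\hat\varepsilon_i=0 \), is the same in both problems), NLS imposes a single orthogonality condition on the residuals, while OLS imposes two:
\[ \text{NLS:}\ \sum_i\left(x_{1i}+2\hat\gamma x_{2i}\right)\hat\varepsilon_i=0, \qquad \text{OLS:}\ \sum_i x_{1i}\hat\varepsilon_i=0 \ \text{and}\ \sum_i x_{2i}\hat\varepsilon_i=0. \]
In finite samples the single combined condition does not imply the two separate ones, so the solutions generally differ.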
We take \( x_{1i},x_{2i}\sim N(0,1) \), \( \varepsilon_i\sim N(0,1/4) \), \( \alpha=1 \) and \( \gamma=1 \).
We run 100 repetitions with sample size 1000 and examine the difference between the NLS and OLS solutions graphically. If the solutions were identical, the difference should be zero up to machine tolerance. All computations are in double precision, since we use R.
The starting values for NLS are 1.1 and 0.9, close to the true values.
We create a function generating the data for one repetition, and a function producing the difference between the OLS and NLS solutions. We also calculate the gradient of the NLS objective to make sure that the converged solution satisfies the first-order condition.
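For reference, write \( X=(1,x_1,x_2) \) for the \( n\times 3 \) design matrix of the linear model and \( \hat\varepsilon \) for the residual vector. The gradient of the NLS sum of squared residuals with respect to \( (\alpha,\gamma) \) is then
\[ \nabla\sum_i\hat\varepsilon_i^2=-2\,D X'\hat\varepsilon, \qquad D=\begin{pmatrix}1&0&0\\0&1&2\gamma\end{pmatrix}, \]
and the gradf function below returns \( D X'\hat\varepsilon \), which must vanish at the converged solution.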
gend <- function(n, alpha = 1, gamma = 1) {
    x1 <- rnorm(n)
    x2 <- rnorm(n)
    eps <- rnorm(n)/2  # N(0, 1/4) errors
    y <- alpha + gamma * x1 + gamma^2 * x2 + eps
    data.frame(y = y, x1 = x1, x2 = x2)
}
ondiff <- function(n, alpha = 1, gamma = 1) {
    dt <- gend(n, alpha, gamma)
    # Unrestricted OLS fit of y on x1 and x2
    ols <- coef(lsfit(dt[, c("x1", "x2")], dt$y, intercept = TRUE))
    # Restricted NLS fit imposing theta_2 = theta_1^2
    nlsm <- nls(y ~ a + g * x1 + g^2 * x2, data = dt,
                start = list(a = 1.1, g = 0.9))
    cnlsm <- coef(nlsm)
    # Gradient of the NLS sum of squared residuals at p = (a, g),
    # up to the constant factor -2
    gradf <- function(p, dt) {
        X <- as.matrix(cbind(1, dt[, c("x1", "x2")]))
        res <- dt$y - p[1] - p[2] * dt$x1 - p[2]^2 * dt$x2
        D <- rbind(c(1, 0, 0), c(0, 1, 2 * p[2]))
        D %*% crossprod(X, res)
    }
    list(ols = ols, nls = cnlsm, grad = gradf(cnlsm, dt))
}
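A quick sanity check on a single repetition (an illustrative snippet, not part of the original code): the gradient should be numerically zero, while the coefficient difference should be small but nonzero.
one <- ondiff(1000)
one$grad                   # first-order condition: ~0 at convergence
one$nls[2] - one$ols[2]    # gamma-hat minus OLS theta_1-hat: small, but not 0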
We generate data for 100 repetitions with sample size 1000, plot the \( L_1 \) norm of the gradient at the NLS solution, and draw scatter plots comparing the NLS and OLS solutions.
set.seed(13)
mc <- lapply(1:100, function(x) ondiff(1000))
# L_1 norm of the NLS gradient across repetitions: it should be zero
# up to machine tolerance if nls() converged to a first-order optimum
plot(sapply(mc, function(x) sum(abs(x$grad))), xlab = "Repetition",
     ylab = "value", main = "L_1 norm of gradient")
# NLS gamma-hat against OLS theta_1-hat
plot(sapply(mc, function(x) x$ols[2]), sapply(mc, function(x) x$nls[2]),
     xlab = "OLS theta_1", ylab = "NLS gamma",
     main = "NLS vs OLS for the first coefficient")
# NLS gamma-hat squared against OLS theta_2-hat
plot(sapply(mc, function(x) x$ols[3]), sapply(mc, function(x) x$nls[2]^2),
     xlab = "OLS theta_2", ylab = "NLS gamma^2",
     main = "NLS vs OLS for the second coefficient")
If the OLS and NLS solutions coincided algebraically, we would see no variation in the scatter plots: all points would lie exactly on a straight line.
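One can also quantify the discrepancy directly (a minimal sketch using the mc object above; the exact figures depend on the seed):
d1 <- sapply(mc, function(x) x$nls[2] - x$ols[2])    # gamma-hat vs theta_1-hat
d2 <- sapply(mc, function(x) x$nls[2]^2 - x$ols[3])  # gamma-hat^2 vs theta_2-hat
summary(abs(d1))
summary(abs(d2))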