Exercise 1

Find the positive root of the polynomial \( x^2-2 \), i.e. solve \( x^2-2=0 \) for \( x>0 \).

1a: Implement the bisection algorithm.

iterate.bisection <- function(m) {
    # m is the current bracketing interval c(lower, upper) with the root inside
    middle <- (m[1] + m[2])/2
    value <- middle^2 - 2
    # Keep the half of the interval that still brackets the root of x^2 - 2
    if (value > 0) {
        return(c(m[1], middle))
    } else {
        return(c(middle, m[2]))
    }
}

# Start from an interval that brackets sqrt(2) and bisect 25 times, printing
# the iteration number and error whenever the midpoint is within 10^-6 of sqrt(2)
margins <- c(0.1, 10)
for (i in 1:25) {
    margins <- iterate.bisection(margins)
    if (abs(mean(margins) - sqrt(2)) <= 10^-6) {
        print(i)
        print(mean(margins) - sqrt(2), digits = 20)
    }
}
## [1] 21
## [1] -2.4203234305630871859e-08
## [1] 23
## [1] 5.6588274888191847367e-07
## [1] 24
## [1] 2.7083975728814380091e-07
## [1] 25
## [1] 1.2331826160227876699e-07

So after 21 iterations the error is below \( 10^{-6} \), but the 22nd iteration pushes the midpoint outside that tolerance again; from the 23rd iteration onward the approximation stays within \( 10^{-6} \) of \( \sqrt{2} \).
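
This matches the theoretical bound: each bisection halves the bracketing interval, so after \( k \) iterations the midpoint is within \( (10-0.1)/2^{k+1} \) of the root. A minimal sketch of this check (the variable name width is ours, not part of the exercise):

# Guaranteed midpoint error after k bisections is at most width / 2^(k + 1);
# find the smallest k for which this bound drops below 10^-6
width <- 10 - 0.1
ceiling(log2(width/10^-6) - 1)

This evaluates to 23, consistent with the observation that the midpoint is guaranteed to stay within \( 10^{-6} \) from the 23rd iteration onward.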

1b: Implement Newton-Raphson.

We know that \( f(x) = x^2-2 \) and we want to calculate the positive \( x \) such that \( f(x) = 0 \). We start at \( x_0 = 10 \), and the Newton-Raphson update is \( x_{i+1} = x_i - f(x_i)/f'(x_i) \), which for this \( f \) becomes \[ x_{i+1} = x_i - \frac{x_i^2 - 2}{2 x_i}. \]

# One Newton-Raphson update for f(x) = x^2 - 2, using f'(x) = 2x
iterate.newton <- function(x) {
    return(x - (x^2 - 2)/(2 * x))
}

# Start at x0 = 10 and iterate 8 times, printing the iteration number
# and error whenever the iterate is within 10^-6 of sqrt(2)
x <- 10
for (i in 1:8) {
    x <- iterate.newton(x)
    if (abs(x - sqrt(2)) <= 10^-6) {
        print(i)
        print(x - sqrt(2), digits = 20)
    }
}
## [1] 6
## [1] 3.4429174178285393282e-08
## [1] 7
## [1] 2.2204460492503130808e-16
## [1] 8
## [1] 0

We see that for this method the error is already smaller than \( 10^{-6} \) after 6 iterations, and by the 8th iteration it is exactly zero in floating-point arithmetic.
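
This rapid convergence reflects Newton-Raphson's quadratic rate near a simple root: close to the root, the error is roughly squared at every step. A small sketch, reusing iterate.newton from above, that collects the error at each iteration if one wants to inspect this directly (the errors vector is introduced here for illustration):

# Track the absolute error of each Newton iterate, starting from x0 = 10
errors <- numeric(8)
x <- 10
for (i in 1:8) {
    x <- iterate.newton(x)
    errors[i] <- abs(x - sqrt(2))
}
# Under quadratic convergence, errors[i + 1] is roughly proportional
# to errors[i]^2 once the iterate is close to the root
errors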

Exercise 2

Generate the data using the code copied from the PDF:

set.seed(1244)
a <- 1
b <- 1
n <- 50
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = pnorm(a + b * x))
data <- data.frame(x = x, y = y)
probreg <- glm(y ~ x, family = binomial(link = "probit"), data = data)

2a: Write function for MLE using optim().

We work with the negative log-likelihood, for numerical stability and because optim minimizes by default:

likelihood <- function(param) {
    a <- param[1]
    b <- param[2]
    # Success probability under the probit model
    p <- pnorm(a + b * x)
    # Per-observation likelihood contributions (vectorized over the data)
    l <- y * p + (1 - y) * (1 - p)
    # Negative log-likelihood, to be minimized
    -sum(log(l))
}
fit.mle.optim <- function(start) {
    optim(start, likelihood)$par
}

2b: Compare to glm:

fit.mle.optim(c(0, 0))
## [1] 0.7564 1.0751
probreg$coefficients
## (Intercept)           x 
##      0.7565      1.0755

We see that the estimates are nearly identical at this precision, with small differences in the fourth decimal for both a and b, reflecting the different optimization routines and stopping criteria.
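
The remaining gap is largely down to the convergence tolerance of optim's default Nelder-Mead method. As a sketch (fit.mle.optim.tight is a name we introduce here; reltol and maxit are standard optim control parameters), tightening the tolerance should bring the estimates even closer to the glm values:

# Same negative log-likelihood, but with a tighter stopping criterion
# and more iterations allowed for Nelder-Mead
fit.mle.optim.tight <- function(start) {
    optim(start, likelihood,
          control = list(reltol = 1e-12, maxit = 5000))$par
}
fit.mle.optim.tight(c(0, 0))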

2c & 2d: Write function for MLE using nlm:

We can re-use the likelihood function from 2a, so:

fit.mle.nlm <- function(start) {
    # nlm minimizes the negative log-likelihood and can also return the Hessian
    nlm(likelihood, start, hessian = TRUE)
}
fit <- fit.mle.nlm(c(0, 0))
fit$estimate
## [1] 0.7565 1.0755
fit$hessian
##       [,1]  [,2]
## [1,] 21.52 -8.58
## [2,] -8.58 14.35

We use the Hessian to calculate the standard errors. The Hessian of the negative log-likelihood at the MLE (the observed information) approximates the inverse of the covariance matrix of the estimates, so we invert it:

cov <- solve(fit$hessian)
sqrt(diag(cov))
## [1] 0.2470 0.3025
summary(probreg)$coefficients[, 2]
## (Intercept)           x 
##      0.2416      0.2978

These are close to the standard errors reported by glm, but not identical, because the two fits use different numerical methods.
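
For a fuller comparison one can also set the two covariance matrices side by side; vcov() is the standard extractor for glm fits (a quick check, output not shown):

# Covariance matrix of the coefficient estimates as reported by glm
vcov(probreg)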

2e: Use Laplace link function

require("VGAM")
likelihood.2 <- function(param) {
    a <- param[1]
    b <- param[2]
    # Success probability with the Laplace CDF as the link
    p <- plaplace(a + b * x)
    # Per-observation likelihood contributions (vectorized over the data)
    l <- y * p + (1 - y) * (1 - p)
    # Negative log-likelihood
    -sum(log(l))
}
fit <- nlm(likelihood.2, c(0, 0), hessian = TRUE)
# Fit:
fit$estimate
## [1] 0.7827 1.2185
# Covariance matrix:
cov <- solve(fit$hessian)
cov
##         [,1]    [,2]
## [1,] 0.10421 0.07215
## [2,] 0.07215 0.18572
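
The standard errors under the Laplace link then follow from this covariance matrix exactly as in 2d (a short sketch; output not reproduced here):

# Standard errors: square roots of the diagonal of the covariance matrix
sqrt(diag(cov))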