library(ggplot2)
library(kableExtra)

Here we walk through an example of optim by using it to minimize the MSE (mean squared error) of a linear function. In all honesty this is overkill: R has much better built-in functions for this (namely lm, which is used to fit linear models), but it is probably the simplest demo that comes to mind.

First we create the data set we will use for this example:

# Set a random seed for reproducibility
set.seed(123)

# Create random data
x <- rnorm(500)
y <- rnorm(500) + 0.7 * x

# Combine x and y into a data frame
data <- data.frame(x, y)
ggplot(data = data) + 
  geom_point(aes(x, y))

# Manually create a function for the mean squared error (MSE)
# Args 
#   data: data.frame containing an x and a y column 
#   par: numeric vector of two parameters (intercept, slope) 
# Returns: numeric vector of length 1, the MSE between the predicted y and the observed y
objective_function <- function(data, par) {
  
  predicted_y <- par[1] + par[2] * data$x
  SE <- (predicted_y - data$y)^2
  MSE <- mean(SE)
  return(MSE)
  
}
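
Before handing the objective function to an optimizer, it can help to call it directly and confirm that it behaves as expected. The sketch below recreates the data and the function from above so it runs on its own, then evaluates the MSE at two parameter vectors. The names mse_true and mse_bad are made up for this illustration, and c(10, 10) is just an arbitrary poor guess.

```r
# Recreate the data and objective function so this block runs on its own
set.seed(123)
x <- rnorm(500)
y <- rnorm(500) + 0.7 * x
data <- data.frame(x, y)

objective_function <- function(data, par) {
  predicted_y <- par[1] + par[2] * data$x
  mean((predicted_y - data$y)^2)
}

# MSE at the values used to simulate the data (intercept 0, slope 0.7)
mse_true <- objective_function(data, par = c(0, 0.7))

# MSE at a deliberately poor guess (intercept 10, slope 10)
mse_bad <- objective_function(data, par = c(10, 10))

# The poor guess should give a much larger MSE
c(true = mse_true, bad = mse_bad)
```

A smaller value means a better fit, which is why optim is asked to minimize this function.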

Use the optim function to estimate values of our two parameters.

# Applying optim
optim_output <- optim(par = c(0, 1),
                      fn = objective_function,
                      data = data)

Question for Peter: what happens when you play around with the values of par? What if you set par = c(10, 10)? Try playing around with different values.
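
One way to explore this question is to loop over several starting values and collect the results side by side. This is only a sketch: the starting values in starts are arbitrary picks for illustration, and results is a made-up name. The data and objective function are recreated so the block runs on its own.

```r
# Recreate the data and objective function so this block runs on its own
set.seed(123)
x <- rnorm(500)
y <- rnorm(500) + 0.7 * x
data <- data.frame(x, y)

objective_function <- function(data, par) {
  predicted_y <- par[1] + par[2] * data$x
  mean((predicted_y - data$y)^2)
}

# Arbitrary starting values to try (assumption: any numeric pair would do)
starts <- list(c(0, 1), c(10, 10), c(-5, 3))

# Run optim once per starting value and keep the key pieces of each result
results <- t(sapply(starts, function(start) {
  out <- optim(par = start, fn = objective_function, data = data)
  c(start_intercept = start[1], start_slope = start[2],
    intercept = out$par[1], slope = out$par[2],
    mse = out$value, convergence = out$convergence)
}))

# One row per starting value
results
```

Comparing the intercept, slope, and mse columns across rows shows how sensitive the estimates are to where the search begins.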

optim_output
## $par
## [1] -0.000506763  0.645985025
## 
## $value
## [1] 1.01713
## 
## $counts
## function gradient 
##       51       NA 
## 
## $convergence
## [1] 0
## 
## $message
## NULL
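
The output is a list, so individual pieces can be pulled out by name. As a rough sketch: par holds the best parameter values found, value is the objective function evaluated at par, counts records how many times the function (and gradient, if one was supplied) was called, and a convergence code of 0 means optim reported successful completion. The names converged, estimates, and n_evals below are made up for this illustration.

```r
# Recreate the data, objective function, and optim call so this block runs on its own
set.seed(123)
x <- rnorm(500)
y <- rnorm(500) + 0.7 * x
data <- data.frame(x, y)

objective_function <- function(data, par) {
  predicted_y <- par[1] + par[2] * data$x
  mean((predicted_y - data$y)^2)
}

optim_output <- optim(par = c(0, 1), fn = objective_function, data = data)

# A convergence code of 0 means optim reported successful completion
converged <- optim_output$convergence == 0

# Label the estimates so it is clear which is which
estimates <- setNames(optim_output$par, c("intercept", "slope"))

# Number of times the objective function was evaluated
n_evals <- optim_output$counts[["function"]]

converged
estimates
n_evals
```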

Now estimate the same parameters using lm, for comparison:

fit <- lm(data = data, formula = y ~ x)
fit
## 
## Call:
## lm(formula = y ~ x, data = data)
## 
## Coefficients:
## (Intercept)            x  
##  -0.0004678    0.6460271
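
To make the comparison easier to read, the two sets of estimates can be put in one table. This is a minimal sketch that recreates everything above so it runs on its own. The comparison data frame and its column names are made up for this illustration.

```r
# Recreate the data, objective function, and both fits so this block runs on its own
set.seed(123)
x <- rnorm(500)
y <- rnorm(500) + 0.7 * x
data <- data.frame(x, y)

objective_function <- function(data, par) {
  predicted_y <- par[1] + par[2] * data$x
  mean((predicted_y - data$y)^2)
}

optim_output <- optim(par = c(0, 1), fn = objective_function, data = data)
fit <- lm(y ~ x, data = data)

# Put the optim and lm estimates side by side
comparison <- data.frame(
  parameter = c("intercept", "slope"),
  optim = optim_output$par,
  lm = unname(coef(fit))
)
comparison$abs_diff <- abs(comparison$optim - comparison$lm)
comparison

# The MSE reported by optim should be close to the mean squared residual from lm
c(optim = optim_output$value, lm = mean(residuals(fit)^2))
```

The small differences are expected: with no method specified, optim uses the iterative Nelder-Mead search and stops once it is within a tolerance, whereas lm solves the least squares problem directly.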

Questions for Peter