library(ggplot2)
library(kableExtra)
Here we walk through an example of optim, using it to minimize the
mean squared error (MSE) of a linear function. In all honesty this is
overkill: R has much better built-in functions for the job (namely lm,
which is used to fit linear models), but it is probably the simplest
demo that comes to mind.
First we create the data set we will use for this example:
# Set a random seed for reproducibility
set.seed(123)
# Create random data
x <- rnorm(500)
y <- rnorm(500) + 0.7 * x
# Combine x and y into a data frame
data <- data.frame(x, y)
# Scatter plot of y against x
ggplot(data = data) +
  geom_point(aes(x, y))
# Manually create the objective function: the mean squared error (MSE)
# Args
#   data: data.frame containing an x and a y column
#   par: numeric vector of two parameters (intercept, slope)
# Returns: numeric vector of length 1, the MSE between the predicted y and the observed y
objective_function <- function(data, par) {
  predicted_y <- par[1] + par[2] * data$x
  SE <- (predicted_y - data$y)^2
  MSE <- mean(SE)
  return(MSE)
}
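Before handing the function to an optimizer, it can help to check that it behaves sensibly. The following is a minimal sketch, not part of the original walkthrough: it regenerates the same simulated data so that it runs on its own, then compares the MSE at the slope used in the simulation (0.7, with an intercept of 0) against a deliberately poor guess. The names mse_true and mse_poor, and the poor guess c(5, -2), are hypothetical and chosen only for illustration.

```r
# Regenerate the example data so this sketch is self-contained
set.seed(123)
x <- rnorm(500)
y <- rnorm(500) + 0.7 * x
data <- data.frame(x, y)

# Same objective as above: mean squared error of a straight line
objective_function <- function(data, par) {
  predicted_y <- par[1] + par[2] * data$x
  mean((predicted_y - data$y)^2)
}

# MSE at the parameters used to simulate the data
mse_true <- objective_function(data, par = c(0, 0.7))

# MSE at an arbitrary, deliberately poor guess (illustrative values)
mse_poor <- objective_function(data, par = c(5, -2))

# The poor guess should give the larger error
mse_true < mse_poor
```

If the comparison came back FALSE, that would suggest a bug in the objective function rather than a problem with the optimizer.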
Use the optim function to estimate the values of our two
parameters, the intercept and the slope. The par argument gives the
starting values for the search.
# Applying optim
optim_output <- optim(par = c(0, 1),
                      fn = objective_function,
                      data = data)
Question for Peter: what happens when you play around with the values of par? What if you set par = c(10, 10)? Try playing around with different values.
optim_output
## $par
## [1] -0.000506763 0.645985025
##
## $value
## [1] 1.01713
##
## $counts
## function gradient
## 51 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
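The fields printed above can also be read programmatically. In the output, a convergence code of 0 means optim reported successful completion, and the gradient count is NA because the default Nelder-Mead method does not use a gradient. The sketch below reruns the optimization from a few starting values and tabulates the results. It is only an illustration: the starting values in starts and the column names of results are assumptions, and the data are regenerated so that the block runs on its own.

```r
# Regenerate the example data so this sketch is self-contained
set.seed(123)
x <- rnorm(500)
y <- rnorm(500) + 0.7 * x
data <- data.frame(x, y)

# Same objective as above: mean squared error of a straight line
objective_function <- function(data, par) {
  predicted_y <- par[1] + par[2] * data$x
  mean((predicted_y - data$y)^2)
}

# Hypothetical starting values, chosen only for illustration
starts <- list(c(0, 1), c(10, 10), c(-5, 3))

# Run optim from each start and collect one summary row per run
results <- do.call(rbind, lapply(starts, function(start) {
  out <- optim(par = start, fn = objective_function, data = data)
  data.frame(
    start_intercept = start[1],
    start_slope     = start[2],
    intercept       = out$par[1],
    slope           = out$par[2],
    mse             = out$value,
    convergence     = out$convergence,  # 0 means optim reported success
    fn_calls        = unname(out$counts["function"])
  )
}))
results
```

For a smooth, bowl-shaped objective like this one, the estimates should land in roughly the same place regardless of the start; what typically changes is the number of function calls needed to get there.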
Now estimate the same parameters using lm, for comparison:
fit <- lm(data = data, formula = y ~ x)
fit
##
## Call:
## lm(formula = y ~ x, data = data)
##
## Coefficients:
## (Intercept) x
## -0.0004678 0.6460271
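The two sets of estimates agree to about four decimal places. One way to make that comparison explicit is to put them side by side, as in the sketch below. This is an illustrative addition rather than part of the original analysis: the data and both fits are recreated so the block runs on its own, and the names comparison and abs_diff are hypothetical.

```r
# Regenerate the example data so this sketch is self-contained
set.seed(123)
x <- rnorm(500)
y <- rnorm(500) + 0.7 * x
data <- data.frame(x, y)

# Same objective as above: mean squared error of a straight line
objective_function <- function(data, par) {
  predicted_y <- par[1] + par[2] * data$x
  mean((predicted_y - data$y)^2)
}

# Fit the line both ways
optim_output <- optim(par = c(0, 1), fn = objective_function, data = data)
fit <- lm(y ~ x, data = data)

# Side-by-side table of the two sets of estimates
comparison <- data.frame(
  parameter = c("intercept", "slope"),
  optim     = optim_output$par,
  lm        = unname(coef(fit))
)
comparison$abs_diff <- abs(comparison$optim - comparison$lm)
comparison

# lm gives the exact least-squares solution, so its MSE is a lower bound
# on the value that optim reports
mean(residuals(fit)^2)
optim_output$value
```

The small differences arise because optim searches numerically and stops once its tolerance is met, whereas lm solves the least-squares problem directly.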
Questions for Peter
kable)?