Data 609 - Module 3 - Homework
Exercise 1
Write down Newton’s formula for finding the minimum of \(f(x) = \frac{(3x^{4} - 4x^{3})}{12}\) in the range of [-10,10]. Then, implement it in R.
Let’s first visualize the curve: the blue line is \(f(x)\) and the red line is \(f'(x)\).
f <- function(x){
  (3*x^4 - 4*x^3)/12
}
f_prime <- function(x){
  x^3 - x^2
}
curve(f, col = 'blue', xlim=c(-2,2), ylim=c(-0.2,0.5))
curve(f_prime, col = 'red', xlim=c(-2,2), ylim=c(-0.2,0.5),add=TRUE)
abline(h=0)
abline(v=0)
It looks like the minimum is at x = 1 and the value of f(x) there is between 0 and -0.1.
We know that \(f'(x)\) is:
\(f'(x) = x^{3} - x^{2}\)
And \(f''(x)\) is:
\(f''(x) = 3x^{2} - 2x\)
Using Newton’s formula we get:
\(x_{1} = x_{0} - \frac{(x_{0})^{3} - (x_{0})^{2}}{3(x_{0})^{2} - 2(x_{0})}\)
newton_step <- function(x){
  # One Newton update: x - f'(x)/f''(x)
  x - (x^3 - x^2)/(3*x^2 - 2*x)
}
# Run 10 Newton iterations from a starting value X_0
get_values <- function(X_0){
  # Store the sequence of x values
  X <- c()
  for(i in 1:10){
    if (i == 1){
      X[i] <- newton_step(X_0)
    } else {
      X[i] <- newton_step(X[i-1])
    }
  }
  print(X)
}
options(scipen = 999)
print("Starting at x=1.5 converges to 1.")
## [1] "Starting at x=1.5 converges to 1."
get_values(1.5)
## [1] 1.200000 1.050000 1.004348 1.000037 1.000000 1.000000 1.000000 1.000000
## [9] 1.000000 1.000000
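As a quick sanity check on the converged value (using f and f_prime defined at the start of this exercise):
f(1)          # -1/12, approximately -0.0833, the value at the minimum
f_prime(1)    # 0, so x = 1 is a critical point
3*1^2 - 2*1   # f''(1) = 1 > 0, so x = 1 is a minimum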
Exercise 2
Explore optimize() in R and try to solve the previous problem.
Here optimize() confirms that the minimum is at \(x = 1\) and that the value of \(f(x)\) there is \(-1/12 \approx -0.0833\).
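The call that produced the output below is not shown; it would look roughly like this (a sketch, assuming f is still the objective from Exercise 1 and using the search interval from the problem statement):
optimize(f, interval = c(-10, 10))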
## $minimum
## [1] 0.9999986
##
## $objective
## [1] -0.08333333
Exercise 3
Use any optimization algorithm to find the minimum of \(f(x,y) = (x - 1)^{2} + 100(y-x^{2})^{2}\) in the domain \(-10 \le x,y \le 10\). Discuss any issues concerning the optimization process.
I’m going to use Newton’s method for multivariate optimization. The drawbacks are that the final value depends heavily on the starting point \(x_0\) and that computing and inverting the Hessian at every step is expensive.
Here’s the equation:
\(x_{t+1} = x_{t} - H(x_{t})^{-1}\nabla f(x_{t})\)
We’ll start by expanding \(f(x,y)\) and computing its first and second partial derivatives:
\(f(x,y) = (x - 1)^{2} + 100(y-x^{2})^{2}\)
\(f(x,y) = (x^2 - 2x + 1) + 100(y-x^{2})(y-x^{2})\)
\(f(x,y) = (x^2 - 2x + 1) + 100(y^2-2x^{2}y + x^{4})\)
\(f(x,y) = (x^2 - 2x + 1) + 100y^2-200x^{2}y + 100x^{4}\)
\(f(x,y) = 100x^{4} + 100y^{2} + x^{2} - 200x^{2}y -2x + 1\)
The first partial derivatives are:
\(f_{x} = 400x^3 + 2x - 400xy - 2\)
\(f_{y} = 200y - 200x^2\)
And the second partial derivatives are:
\(f_{xx} = 1200x^2 + 2 - 400y\)
\(f_{yy} = 200\)
\(f_{xy} = -400x\)
\(f_{yx} = -400x\)
The Hessian matrix is: \[H =\left[\begin{array}{cc} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \\ \end{array}\right]\]
\[H(x,y) =\left[\begin{array}{cc} 1200x^2 + 2 - 400y & -400x \\ -400x & 200 \\ \end{array}\right]\]
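Before inverting the Hessian by hand, the derived gradient and Hessian can be checked numerically. This is a sketch I added using the numDeriv package (not part of the original solution); f_xy, grad_analytic, and hess_analytic are helper names of my own:
library(numDeriv)  # assumed to be installed; used only for this check

f_xy <- function(v) (v[1] - 1)^2 + 100*(v[2] - v[1]^2)^2

# Hand-derived gradient and Hessian from above
grad_analytic <- function(v) c(400*v[1]^3 + 2*v[1] - 400*v[1]*v[2] - 2,
                               200*v[2] - 200*v[1]^2)
hess_analytic <- function(v) matrix(c(1200*v[1]^2 + 2 - 400*v[2], -400*v[1],
                                      -400*v[1],                  200),
                                    nrow = 2, byrow = TRUE)

v0 <- c(-1.2, 1)                                  # arbitrary test point
max(abs(grad(f_xy, v0) - grad_analytic(v0)))      # should be near 0
max(abs(hessian(f_xy, v0) - hess_analytic(v0)))   # should be near 0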
\[\det H(x,y) = (1200x^2 + 2 - 400y)(200) - (-400x)(-400x)\]
\[= (240000x^2 + 400 - 80000y) - 160000x^2\]
\[= 80000x^2 - 80000y + 400\]
\[H(x,y)^{-1} = \frac{1}{80000x^2 - 80000y + 400}\left[\begin{array}{cc} 200 & 400x \\ 400x & 1200x^2 + 2 - 400y \\ \end{array}\right] = \left[\begin{array}{cc} \frac{200}{80000x^2 - 80000y + 400} & \frac{400x}{80000x^2 - 80000y + 400}\\ \frac{400x}{80000x^2 - 80000y + 400} & \frac{1200x^2 + 2 - 400y}{80000x^2 - 80000y + 400} \\ \end{array}\right]\]
\[\left[\begin{array}{c} x_k \\ y_k \\ \end{array}\right] = \left[\begin{array}{c} x_{k-1} \\ y_{k-1} \\ \end{array}\right] - \frac{1}{80000x^2 - 80000y + 400}\left[\begin{array}{cc} 200 & 400x \\ 400x & 1200x^2 + 2 - 400y \\ \end{array}\right] \left[\begin{array}{c} 400x^3 + 2x - 400xy - 2 \\ 200y - 200x^2 \\ \end{array}\right]\]
where \(x = x_{k-1}\) and \(y = y_{k-1}\) on the right-hand side. Multiplying out the first component, nearly everything cancels:
\[200(400x^3 + 2x - 400xy - 2) + 400x(200y - 200x^2) = 400x - 400,\]
so
\[x_k = x_{k-1} - \frac{400(x - 1)}{400(200x^2 - 200y + 1)} = x_{k-1} - \frac{x - 1}{200x^2 - 200y + 1}.\]
The second component simplifies to
\[y_k = y_{k-1} - \frac{400x(400x^3 + 2x - 400xy - 2) + (1200x^2 + 2 - 400y)(200y - 200x^2)}{80000x^2 - 80000y + 400} = y_{k-1} - \frac{(x^2 - 2x + y) - 200(x^2 - y)^2}{200x^2 - 200y + 1}.\]
I had issues simplifying this update by hand; the cancellations above are easy to get wrong. The next step would be to pick a starting point \((x_0, y_0)\), plug it into the update to get \((x_1, y_1)\), use that to compute \((x_2, y_2)\), and so on; the iterates should eventually converge to the minimum at \((1, 1)\), provided the Hessian stays invertible along the way (its determinant vanishes on the curve \(y = x^2 + 1/200\)).
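To check the iteration without relying on the hand algebra, here is a minimal R sketch of the Newton update \(x_{t+1} = x_t - H(x_t)^{-1}\nabla f(x_t)\) using the gradient and Hessian derived above (my own code, added for illustration; grad_f, hess_f, and newton_2d are names I chose):
# Gradient and Hessian of f(x,y) = (x-1)^2 + 100(y - x^2)^2, from the derivation above
grad_f <- function(v){
  c(400*v[1]^3 + 2*v[1] - 400*v[1]*v[2] - 2,
    200*v[2] - 200*v[1]^2)
}
hess_f <- function(v){
  matrix(c(1200*v[1]^2 + 2 - 400*v[2], -400*v[1],
           -400*v[1],                  200),
         nrow = 2, byrow = TRUE)
}

newton_2d <- function(v0, iters = 20){
  v <- v0
  for (k in 1:iters){
    v <- v - solve(hess_f(v), grad_f(v))  # x_{t+1} = x_t - H^{-1} grad f
  }
  v
}

newton_2d(c(-1.2, 1))  # from this start the iterates settle at (1, 1)
Because the steps are not damped, a starting point where the Hessian is nearly singular can throw an iterate far away from the current point, which is one of the issues with this approach on the given domain.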
Exercise 4
Explore the optimr package for R and try to solve the previous problem.
#install.packages("optimr")
library(optimr)
f <- function(x,y){
  (x - 1)^2 + 100*(y-x^2)^2
}
par <- -10:10
optimr(par, f, method="BFGS")
## Error in fn(par, ...) : argument "y" is missing, with no default
## $convergence
## [1] 9999
##
## $par
## [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
##
## $counts
## [1] NA NA
##
## $message
## [1] "optim() with bounds ONLY uses L-BFGS-B: optim method failure\n"