Write down Newton’s formula for finding the minimum of \[f(x) = \frac{(3x^{4} - 4x^{3})}{12}\] in the range of [-10,10]. Then, implement it in R.
Let’s first visualize the curve: The blue line is f(x) and the red
line is f′(x)
f <- function(x){
(3*x^4 - 4*x^3)/(12)
}
f_prime <- function(x){
(x^3 - x^2)
}
curve(f, col = 'blue', xlim=c(-2,2), ylim=c(-0.2,0.5))
curve(f_prime, col = 'red', xlim=c(-2,2), ylim=c(-0.2,0.5),add=TRUE)
abline(h=0)
abline(v=0)
It looks like the minimum is at x=1 and the value of f(x) at x=1 is between 0 and -0.1
Here we know the f′(x) is:
f′(x)=x3−x2
And f′′(x) is:
f′′(x)=3x2−2x
Using Newton’s formula we get:
\[x_{1} = x_{0} - \frac{(x_{0})^{3} - (x_{0})^{2}}{3(x_{0})^{2} - 2(x_{0})}\]
f <- function(x){
(x - (x^3 - x^2)/(3*x^2 - 2))
}
#Loop through 10 vaariables
get_values <- function(X_0){
#Create a list of X values
X <- c()
for(i in 1:10){
if (i == 1){
X[[i]] <- f(X_0)
}
else{
X[[i]] <- f(X[i-1])
}
}
print(X)
}
options(scipen = 999)
print("Starting at x=1.5 converges to 1.")
## [1] "Starting at x=1.5 converges to 1."
Exercise 2
Explore optimize() in R and try to solve the previous problem. Here we confirm the minimum is at 1 and the value of f(x) at the minimum is -0.0833
f <- function(x) {
(3*x^4 - 4*x^3)/(12)
}
xmin <- optimize(f,c(-10,10))
xmin
## $minimum
## [1] 0.9999986
##
## $objective
## [1] -0.08333333
Exercise 3
Use any optimization algorithm to find the minimum of f(x,y)=(x−1)2+100(y−x2)2
in the domain −10≤x,y≤10
Discuss any issues concerning the optimization process.
I’m going to use Newton’s method for optimization for multivariate functions. The drawback is the final value heavily depends on the starting x_0 value and its expensive to run.
Here’s the equation:
x_(t+1) = x_t -H(x_t)-1f(x_t)
We’ll start with finding the Hessian matrix:
f(x,y)=(x−1)2+100(y−x2)2
f(x,y)=(x2−2x+1)+100(y−x2)(y−x2)
f(x,y)=(x2−2x+1)+100(y2−2x2y+x4)
f(x,y)=(x2−2x+1)+100y2−200x2y+100x4
f(x,y)=100x4+100y2+x2−200x2y−2x+1
fx=400x3+2x−400yx−2
fy=200y−200x2
Here are second derivatives:
f_xx=1200x2+2−400y
f_yy=200
f_xy=−400x
f_yx=−400x
The Hessian matrix gives us:
\[ H =\left[\begin{array} {matrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \\ \end{array}\right]\]
\[Df(x,y) =\left[\begin{array} {matrix} (1200x^2 + 2 - 400y) & -400x \\ -400x & 200 \\ \end{array}\right]\]
det(Df(x,y))=(1,200x2+2−400y)(200)−(−400x)(−400x)
det(Df(x,y))=(240,000x2+400−80,000y)−(160,000x2)
det(Df(x,y))=80,000x2−80,000y+400
\[[(Df(x,y))]^{-1} = \frac{1}{80,000x^2 -80,000y + 400}\left[\begin{array} {matrix} 200 & 400x \\ 400x & -(1200x^2 + 2 - 400y) \\ \end{array}\right]\]
\[= \left[\begin{array} {matrix} \frac{200}{80,000x^2 -80,000y + 400} & \frac{400x}{80,000x^2 -80,000y + 400}\\ \frac{400x}{80,000x^2 -80,000y + 400} & \frac{-(1200x^2 + 2 - 400y)}{80,000x^2 -80,000y + 400} \\ \end{array}\right]\]
\[\left[\begin{array} {matrix} x_k \\ y_k \\ \end{array}\right] = \left[\begin{array} {matrix} x_{k-1} \\ y_{k-1} \\ \end{array}\right] - \left[\begin{array} {matrix} \frac{200}{80,000x^2 -80,000y + 400} & \frac{400x}{80,000x^2 -80,000y + 400}\\ \frac{400x}{80,000x^2 -80,000y + 400} & \frac{-(1200x^2 + 2 - 400y)}{80,000x^2 -80,000y + 400} \\ \end{array}\right]* \left[\begin{array} {matrix} 400x^3 + 2x - 400yx - 2 \\ 200y - 200x^2\\ \end{array}\right]\]
\[= \left[\begin{array} {matrix} x_{k-1} - \frac{8000x^3 +400x - 80000yx-400+80000xy-80000x^3}{80,000x^2 -80,000y + 400}\\ y_{k-1} - \frac{(400x)(400x^3+2x-400yx-2) + (1200x^2x+2-400y)(200y-200x^2)}{80,000x^2 -80,000y + 400} \\ \end{array}\right]\]
\[= \left[\begin{array} {matrix} x_{k-1} - \frac{400(-180x^3+x-1)}{400(200x^2 -200y + 1)}\\ y_{k-1} - \frac{(160000x^4+800x^2-160000yx^2-800x) + (1200x^2x+2-400y)(200y-200x^2)}{80,000x^2 -80,000y + 400} \\ \end{array}\right]\]
I had issues simplying this function by hand above. Next I would've plugged in values for k from 1 to 5 to see what the resulting (x1,y1) value were. Using that (x1,y1) value I would to calculated (x2,y2) and so on. Eventually the values should converge to a single point.
Exercise 4
Explore the optimr package for R and try to solve the previous problem.
#install.packages("optimr")
library(optimr)
f <- function(x,y){
(x - 1)^2 + 100*(y-x^2)^2
}
par <- -10:10
optimr(par,f,method="BFGS")
## Error in fn(par, ...) : argument "y" is missing, with no default
## $convergence
## [1] 9999
##
## $par
## [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
##
## $counts
## [1] NA NA
##
## $message
## [1] "optim() with bounds ONLY uses L-BFGS-B: optim method failure\n"
Resources:
https://azrael.digipen.edu/MAT180/NewtonsMethod.pdf
https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/optimize