Write down Newton’s formula for finding the minimum of \[f(x) = \frac{3x^{4} - 4x^{3}}{12}\] on the interval [-10, 10]. Then, implement it in R.

Let’s first visualize the curve: the blue line is f(x) and the red line is f′(x).

f <- function(x){
  (3*x^4 - 4*x^3)/12
}
# First derivative: f'(x) = (12*x^3 - 12*x^2)/12 = x^3 - x^2
f_prime <- function(x){
  x^3 - x^2
}
# Plot f (blue) and f' (red), with the axes drawn for reference
curve(f, col = 'blue', xlim = c(-2, 2), ylim = c(-0.2, 0.5))
curve(f_prime, col = 'red', xlim = c(-2, 2), ylim = c(-0.2, 0.5), add = TRUE)
abline(h = 0)
abline(v = 0)

It looks like the minimum is at x = 1, where f′(x) crosses zero, and the value of f(x) at x = 1 is between -0.1 and 0.
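For reference, the exact value at x = 1 follows directly from the definition of f:

\[f(1) = \frac{3(1)^4 - 4(1)^3}{12} = -\frac{1}{12} \approx -0.0833\]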

Here we know that f′(x) is:

\[f'(x) = x^3 - x^2\]

And f′′(x) is:

\[f''(x) = 3x^2 - 2x\]
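To minimize f, Newton’s method is applied to the equation f′(x) = 0, so the general update is:

\[x_{n+1} = x_{n} - \frac{f'(x_{n})}{f''(x_{n})}\]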

Substituting f′(x) and f′′(x) into this update gives:

\[x_{1} = x_{0} - \frac{(x_{0})^{3} - (x_{0})^{2}}{3(x_{0})^{2} - 2(x_{0})}\]

# Newton update for minimizing f: x_new = x - f'(x)/f''(x)
newton_step <- function(x){
  x - (x^3 - x^2)/(3*x^2 - 2*x)
}

# Run 10 Newton iterations from a starting value X_0
get_values <- function(X_0){
  # Store the sequence of X values
  X <- c()
  
  for(i in 1:10){
    if (i == 1){
      X[i] <- newton_step(X_0)
    } else {
      X[i] <- newton_step(X[i - 1])
    }
  }
  print(X)
}
options(scipen = 999) 

print("Starting at x=1.5 converges to 1.")
## [1] "Starting at x=1.5 converges to 1."

Exercise 2

Explore optimize() in R and try to solve the previous problem.

Here we confirm that the minimum is at x = 1 and that the value of f(x) at the minimum is about -0.0833, i.e. -1/12.

f <- function(x) {
  (3*x^4 - 4*x^3)/(12)
}
xmin <- optimize(f,c(-10,10))
xmin
## $minimum
## [1] 0.9999986
## 
## $objective
## [1] -0.08333333
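optimize() also accepts a tol argument; tightening it from the default should push the reported minimizer even closer to the exact value x = 1. A quick sketch (output not shown):

# same f as above, with a tighter convergence tolerance than the default
optimize(f, interval = c(-10, 10), tol = 1e-10)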

Exercise 3

Use any optimization algorithm to find the minimum of

\[f(x,y) = (x - 1)^2 + 100(y - x^2)^2\]

in the domain −10 ≤ x, y ≤ 10.

Discuss any issues concerning the optimization process.

I’m going to use Newton’s method for multivariate optimization. The drawbacks are that the final value depends heavily on the starting point (x_0, y_0) and that each iteration is expensive, since it requires building and inverting the Hessian.

Here’s the update equation:

\[x_{t+1} = x_{t} - H(x_{t})^{-1}\nabla f(x_{t})\]

We’ll start by expanding f(x,y) and computing its partial derivatives, which give the Hessian matrix:

\[f(x,y) = (x - 1)^2 + 100(y - x^2)^2\]

\[f(x,y) = (x^2 - 2x + 1) + 100(y - x^2)(y - x^2)\]

\[f(x,y) = (x^2 - 2x + 1) + 100(y^2 - 2x^2y + x^4)\]

\[f(x,y) = (x^2 - 2x + 1) + 100y^2 - 200x^2y + 100x^4\]

\[f(x,y) = 100x^4 + 100y^2 + x^2 - 200x^2y - 2x + 1\]

\[f_x = 400x^3 + 2x - 400xy - 2\]

\[f_y = 200y - 200x^2\]
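Setting both partial derivatives to zero confirms that (1, 1) is a critical point, and it turns out to be the minimum:

\[f_x(1,1) = 400 + 2 - 400 - 2 = 0, \qquad f_y(1,1) = 200 - 200 = 0\]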

Here are the second derivatives:

\[f_{xx} = 1200x^2 + 2 - 400y\]

\[f_{yy} = 200\]

\[f_{xy} = f_{yx} = -400x\]
The Hessian matrix gives us:

\[H = \left[\begin{array}{cc} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{array}\right]\]

\[H(x,y) = \left[\begin{array}{cc} 1200x^2 + 2 - 400y & -400x \\ -400x & 200 \end{array}\right]\]
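As a sanity check on these entries, the Hessian can also be computed numerically. This is a small sketch assuming the numDeriv package is installed; the test point (2, 3) is arbitrary:

library(numDeriv)
f_vec <- function(p) (p[1] - 1)^2 + 100*(p[2] - p[1]^2)^2
# numerical Hessian at (2, 3), to compare with the analytic entries:
# f_xx = 1200*2^2 + 2 - 400*3 = 3602, f_xy = f_yx = -800, f_yy = 200
hessian(f_vec, c(2, 3))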

\[\det(H(x,y)) = (1200x^2 + 2 - 400y)(200) - (-400x)(-400x)\]

\[\det(H(x,y)) = (240,000x^2 + 400 - 80,000y) - 160,000x^2\]

\[\det(H(x,y)) = 80,000x^2 - 80,000y + 400\]

\[[H(x,y)]^{-1} = \frac{1}{80,000x^2 - 80,000y + 400}\left[\begin{array}{cc} 200 & 400x \\ 400x & 1200x^2 + 2 - 400y \end{array}\right]\]

\[= \left[\begin{array}{cc} \frac{200}{80,000x^2 - 80,000y + 400} & \frac{400x}{80,000x^2 - 80,000y + 400} \\ \frac{400x}{80,000x^2 - 80,000y + 400} & \frac{1200x^2 + 2 - 400y}{80,000x^2 - 80,000y + 400} \end{array}\right]\]

\[\left[\begin{array}{c} x_k \\ y_k \end{array}\right] = \left[\begin{array}{c} x_{k-1} \\ y_{k-1} \end{array}\right] - \left[\begin{array}{cc} \frac{200}{80,000x^2 - 80,000y + 400} & \frac{400x}{80,000x^2 - 80,000y + 400} \\ \frac{400x}{80,000x^2 - 80,000y + 400} & \frac{1200x^2 + 2 - 400y}{80,000x^2 - 80,000y + 400} \end{array}\right] \left[\begin{array}{c} 400x^3 + 2x - 400xy - 2 \\ 200y - 200x^2 \end{array}\right]\]

where the x and y on the right-hand side are evaluated at (x_{k-1}, y_{k-1}).

\[= \left[\begin{array}{c} x_{k-1} - \frac{80,000x^3 + 400x - 80,000xy - 400 + 80,000xy - 80,000x^3}{80,000x^2 - 80,000y + 400} \\ y_{k-1} - \frac{(400x)(400x^3 + 2x - 400xy - 2) + (1200x^2 + 2 - 400y)(200y - 200x^2)}{80,000x^2 - 80,000y + 400} \end{array}\right]\]

\[= \left[\begin{array}{c} x_{k-1} - \frac{x - 1}{200x^2 - 200y + 1} \\ y_{k-1} - \frac{(160,000x^4 + 800x^2 - 160,000x^2y - 800x) + (1200x^2 + 2 - 400y)(200y - 200x^2)}{80,000x^2 - 80,000y + 400} \end{array}\right]\]

I had issues simplifying this expression by hand; the x-component reduces cleanly, but the y-component stays messy. Next, I would have plugged in a starting value (x_0, y_0) and run the update for k = 1 to 5: compute (x_1, y_1), use it to calculate (x_2, y_2), and so on. Eventually the values should converge to a single point.
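Rather than iterating by hand, the same update can be run numerically. Below is a minimal sketch in R using the gradient and Hessian derived above; the names (grad_f, hess_f, newton_2d), the iteration count, and the starting point are my own choices for illustration, and a fuller version would add a convergence test and guard against a singular Hessian.

grad_f <- function(x, y){
  c(400*x^3 + 2*x - 400*x*y - 2,
    200*y - 200*x^2)
}
hess_f <- function(x, y){
  matrix(c(1200*x^2 + 2 - 400*y, -400*x,
           -400*x,               200),
         nrow = 2, byrow = TRUE)
}
newton_2d <- function(x0, y0, n_iter = 20){
  p <- c(x0, y0)
  for(i in 1:n_iter){
    # Newton step: p <- p - H(p)^{-1} %*% grad f(p), computed with solve()
    p <- p - solve(hess_f(p[1], p[2]), grad_f(p[1], p[2]))
  }
  p
}
newton_2d(2, 2)  # iterates should approach the known minimum at (1, 1)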

Exercise 4

Explore the optimr package for R and try to solve the previous problem.

#install.packages("optimr")
library(optimr)
f <- function(x,y){
  (x - 1)^2 + 100*(y-x^2)^2
}

par <- -10:10
optimr(par,f,method="BFGS")
## Error in fn(par, ...) : argument "y" is missing, with no default
## $convergence
## [1] 9999
## 
## $par
##  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## 
## $counts
## [1] NA NA
## 
## $message
## [1] "optim() with bounds ONLY uses L-BFGS-B:  optim method failure\n"

Resources:

https://azrael.digipen.edu/MAT180/NewtonsMethod.pdf

https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/optimize

https://rpubs.com/aaronsc32/newton-raphson-method