Newton's Method

As an illustration, we will optimize the following function: \[ f(x_1, x_2)=x_1-x_2+2{x_1}^2+2x_1x_2+{x_2}^2 \] with the initial value \[ x_{(1)}=\biggl[\begin{matrix}0\\0\end{matrix}\biggr] \]
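
Since \(f\) is quadratic, its gradient and Hessian are available in closed form, which will be useful for checking the numerical results below: \[ \nabla f(x_1,x_2)=\biggl[\begin{matrix}1+4x_1+2x_2\\-1+2x_1+2x_2\end{matrix}\biggr], \qquad \nabla^2 f(x_1,x_2)=\biggl[\begin{matrix}4&2\\2&2\end{matrix}\biggr] \]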

As a first step, we store this function in an R object, as in the following program.

# the function

f <- function(x) {
  x1 <- x[1]; x2 <- x[2]
  x1 - x2 + 2*x1^2 + 2*x1*x2 + x2^2
}

# the initial value
x0<-c(0,0)
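
As a quick check, the stored function returns \(f(0,0)=0\) at the starting point:

f(x0)
## [1] 0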

Next, we write a program that implements an optimization function based on Newton's method, as shown below.

library(numDeriv)
library(pracma)
## 
## Attaching package: 'pracma'
## The following objects are masked from 'package:numDeriv':
## 
##     grad, hessian, jacobian
newton_optimize <- function(f, x0, tol = 1e-6, max_iter = 100) {
  iter <- 0
  while (iter < max_iter) {
    # Estimate the gradient using the grad function
    df <- grad(f, x0)
    
    # Estimate the Hessian matrix using the hessian function
    ddf <- hessian(f, x0)
    
    # Calculate the update step
    dx <- solve(ddf, -df)
    
    # Update the guess
    x1 <- x0 + dx
    
    # Check for convergence
    if (abs(f(x1) - f(x0)) < tol) {
      return(list(x=x1,iter=iter))
    }
    
    # Update the iteration counter and the guess
    iter <- iter + 1
    x0 <- x1
  }
  
  # If the algorithm does not converge within the maximum number of iterations, return NULL
  return(NULL)
}
newton_optimize(f, x0)
## $x
## [1] -1.0  1.5
## 
## $iter
## [1] 1
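
This matches a hand computation: at \(x_{(1)}=(0,0)^\top\) the gradient is \((1,-1)^\top\), so a single Newton step gives \[ x_{(2)}=x_{(1)}-\bigl[\nabla^2 f\bigr]^{-1}\nabla f(x_{(1)})=\biggl[\begin{matrix}0\\0\end{matrix}\biggr]-\frac{1}{4}\biggl[\begin{matrix}2&-2\\-2&4\end{matrix}\biggr]\biggl[\begin{matrix}1\\-1\end{matrix}\biggr]=\biggl[\begin{matrix}-1\\1.5\end{matrix}\biggr], \] with \(f(x_{(2)})=-1.25\). Because \(f\) is quadratic, Newton's method reaches the minimizer in one step.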

Marquardt's Method

Marquardt's method can be carried out using the following function.
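
Compared with Newton's method, the update step in the code below adds a damping term \(\mu I\) to the Hessian before solving for the step, \[ \Delta x = -\bigl(\nabla^2 f(x)+\mu I\bigr)^{-1}\nabla f(x), \] where \(\mu\) is taken proportional to the largest diagonal entry of the Hessian and is decreased when a step reduces \(f\) and increased otherwise.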

library(numDeriv)
library(pracma)

marquardt_optimize <- function(f, x0, tol = 1e-6, max_iter = 100, lambda = 1) {
  iter <- 0
  while (iter < max_iter) {
    # Estimate the gradient using the grad function
    df <- grad(f, x0)
    
    # Estimate the Hessian matrix using the hessian function
    ddf <- hessian(f, x0)
    
    # Calculate the damping parameter
    mu <- lambda * max(diag(ddf))
    
    # Calculate the update step
    dx <- solve(ddf + mu * diag(length(x0)), -df)
    
    # Update the guess
    x1 <- x0 + dx
    
    # Check for convergence
    if (abs(f(x1) - f(x0)) < tol) {
      return(list(x=x1,iter=iter))
    }
    
    # Update the iteration counter and the guess
    iter <- iter + 1
    
    # Update the damping parameter
    if (f(x1) < f(x0)) {
      lambda <- lambda / 10
    } else {
      lambda <- lambda * 10
    }
    
    x0 <- x1
  }
  
  # If the algorithm does not converge within the maximum number of iterations, return NULL
  return(NULL)
}
marquardt_optimize(f,x0)
## $x
## [1] -1.0  1.5
## 
## $iter
## [1] 4

Quasi-Newton Methods

The Davidon-Fletcher-Powell (DFP) Method
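
The DFP method maintains an approximation \(B\) of the inverse Hessian, which is updated at every iteration. With \(s=x_{(k+1)}-x_{(k)}\) and \(y=\nabla f(x_{(k+1)})-\nabla f(x_{(k)})\), the update used in the code below is \[ B_{(k+1)}=B_{(k)}+\frac{s\,s^\top}{s^\top y}-\frac{B_{(k)}\,y\,y^\top B_{(k)}}{y^\top B_{(k)}\,y}. \]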

dfp_optimize <- function(f, x0, B0, tol = 1e-6, max_iter = 100) {
  # Initialize variables
  iter <- 0
  n <- length(x0)
  x <- x0
  B <- B0
  
  # Loop until convergence or maximum number of iterations is reached
  while (iter < max_iter) {
    # Calculate the gradient using numerical differentiation
    g <- grad(f, x)
    
    # Calculate the search direction
    d <- -B %*% g
    
    # Perform line search to find optimal step size
    alpha <- optimize(function(alpha) f(x + alpha*d), interval = c(0, 1))$minimum
    x_new <- x + alpha * d
    
    # Calculate the new gradient using numerical differentiation
    g_new <- grad(f, x_new)
    
    # Update the inverse-Hessian approximation using the DFP formula
    s <- alpha * d
    y <- g_new - g
    B <- B + ((s %*% t(s)) / as.numeric(t(s) %*% y)) - ((B %*% y %*% t(y) %*% B) / as.numeric(t(y) %*% B %*% y))
    
    # Check for convergence
    if (abs(f(x_new) - f(x)) < tol) {
      return(list(x=x_new, iter=iter))
    }
    
    # Update the iteration counter and the guess
    iter <- iter + 1
    x <- x_new
  }
  
  # If the algorithm does not converge within the maximum number of iterations, return NULL
  return(NULL)
}

In quasi-Newton optimization, we need to specify an initial value for the inverse-Hessian approximation; for example, we use the following initial value.

\[ B_{(1)}=\biggl[\begin{matrix}1&0\\0&1\end{matrix}\biggr] \]

B0<-diag(2)
dfp_optimize(f, x0, B0)
## $x
##      [,1]
## [1,] -1.0
## [2,]  1.5
## 
## $iter
## [1] 2

The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method

optim(par=x0, fn=f, method="BFGS")
## $par
## [1] -1.0  1.5
## 
## $value
## [1] -1.25
## 
## $counts
## function gradient 
##        5        4 
## 
## $convergence
## [1] 0
## 
## $message
## NULL
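
The call above uses the BFGS implementation built into base R's optim(). For comparison with dfp_optimize() shown earlier, a minimal hand-rolled sketch is given below; it keeps the same structure and line search, replacing the DFP formula with the BFGS update of the inverse-Hessian approximation. This is an illustration only, not the algorithm implemented inside optim().

bfgs_optimize <- function(f, x0, B0, tol = 1e-6, max_iter = 100) {
  x <- x0
  B <- B0                     # approximation of the inverse Hessian
  Id <- diag(length(x0))
  iter <- 0
  while (iter < max_iter) {
    # Gradient at the current point (numerical differentiation)
    g <- grad(f, x)
    
    # Search direction and step size (same line search idea as dfp_optimize)
    d <- as.vector(-B %*% g)
    alpha <- optimize(function(a) f(x + a * d), interval = c(0, 1))$minimum
    x_new <- x + alpha * d
    g_new <- grad(f, x_new)
    
    # BFGS update of the inverse-Hessian approximation
    s <- x_new - x
    y <- g_new - g
    rho <- 1 / as.numeric(t(y) %*% s)
    B <- (Id - rho * s %*% t(y)) %*% B %*% (Id - rho * y %*% t(s)) + rho * s %*% t(s)
    
    # Check for convergence
    if (abs(f(x_new) - f(x)) < tol) {
      return(list(x = x_new, iter = iter))
    }
    iter <- iter + 1
    x <- x_new
  }
  NULL
}

Calling bfgs_optimize(f, x0, diag(2)) should again converge to approximately \((-1, 1.5)\), matching the results above.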

The Gradient Descent Method
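
Gradient descent repeatedly moves in the direction of the negative gradient with a fixed step size (learning rate) \(\eta\), \[ x_{(k+1)}=x_{(k)}-\eta\,\nabla f(x_{(k)}), \] which is exactly the update implemented in the function below.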

# define gradient descent function
gradient_descent <- function(obj_func, init_x, learning_rate = 0.01, maxIter = 1e5, tol = 1e-6) {
  
  # initialize x with initial values
  x <- init_x
  gradx <- grad(obj_func, x)
  iter <- 0
  
  # loop through iterations
  while (iter < maxIter && norm(as.matrix(gradx)) > tol) {
    # calculate gradient at current x
    gradx <- grad(obj_func, x)
    # update x using gradient descent formula
    x <- x - learning_rate * gradx
    iter <- iter + 1
  }
  # return final x
  return(list(x = x, iter = iter))
}
gradient_descent(obj_func = f, init_x = x0, learning_rate = 0.01)
## $x
## [1] -0.9999995  1.4999992
## 
## $iter
## [1] 1886

Exercises

  1. Find \(x\) that minimizes the function \(f(x)=x^2\), using the initial value \(x_{(1)}=5\). Carry out the computation by hand, then compare the result with the output of the R functions.

  2. Find \(x_1\) and \(x_2\) that minimize the function \(f(x_1, x_2)=100{(x_2-{x_1}^2)}^2+{(1-x_1)}^2\) with the initial value \(x_{(1)}=\biggl[\begin{matrix}-1.2\\1\end{matrix}\biggr]\). Compare the results with a manual computation (two iterations are sufficient); a coding starting point for this function is sketched after this list.
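
As a starting point for Exercise 2, the objective (the Rosenbrock function) can be coded in the same style as f above; the names fr and xr0 below are only illustrative choices, and the manual iterations and comparison are left as the exercise.

# Rosenbrock function for Exercise 2 (illustrative starting point)
fr <- function(x) {
  x1 <- x[1]; x2 <- x[2]
  100 * (x2 - x1^2)^2 + (1 - x1)^2
}

# initial value
xr0 <- c(-1.2, 1)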
