There are two first partial derivatives \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\), thus the stationary conditions are: \(\frac{\partial f}{\partial x} = 2x + y = 0\) and \(\frac{\partial f}{\partial y} = x + 2y = 0\).
From \(2x + y = 0\), we have \(y = -2x\), and from \(x + 2y = 0\), we have \(x = -2y\).
Substituting \(-2x\) for \(y\) in the second condition gives \(x = -2(-2x)\), or \(x = 4x\), which is satisfied only by \(x = 0\).
Conversely, substituting \(-2y\) for \(x\) in the first condition gives \(y = -2(-2y)\), or \(y = 4y\), which again is satisfied only by \(y = 0\).
Equivalently, solving each equation for the other variable gives \(x = -\frac{y}{2}\) and \(y = -\frac{x}{2}\), and once more the only value that satisfies both conditions is \(0\).
So, for \(f(x,y) = x^2 + xy + y^2\), its gradient \(\nabla f = (2x + y; x + 2y) = 0\) gives us \((x_*, y_*) = (0, 0)\).
There are four second order derivatives \(f_{xx} = \frac{\partial^2 f}{\partial x^2} = 2\), \(f_{yy} = \frac{\partial^2 f}{\partial y^2} = 2\), \(f_{xy} = \frac{\partial^2 f}{\partial x \partial y} = 1\) and \(f_{yx} = \frac{\partial^2 f}{\partial y \partial x} = 1\) which we use to form the Hessian matrix:
\[ H = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\]
The determinant of \(H\) is \(2 \times 2 - 1 \times 1 = 3 > 0\), and since \(f_{xx} = 2 > 0\), \(H\) is positive definite. Therefore \((x_*, y_*) = (0,0)\) is a minimum with \(f_{min} = 0\).
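As a quick numerical check (a small sketch, not part of the original solution), both eigenvalues of \(H\) can be confirmed to be positive in R:

H <- matrix(c(2, 1, 1, 2), nrow = 2)  # the Hessian found above
eigen(H)$values                       # 3 and 1, both positive, so H is positive definite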
f <- function(x){x[1]^2 + x[1]*x[2] + x[2]^2}  # f(x, y) = x^2 + xy + y^2
optim(c(0,0), f, hessian = TRUE)               # numerical check of the analytic result
## $par
## [1] 0 0
##
## $value
## [1] 0
##
## $counts
## function gradient
## 95 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $hessian
## [,1] [,2]
## [1,] 2 1
## [2,] 1 2
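Since the starting point c(0,0) above is already the analytic minimizer, a slightly more convincing sketch (output not reproduced here) is to start the search elsewhere:

optim(c(3, -4), f, hessian = TRUE)$par  # should return a point very close to c(0, 0)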
To minimize \(f(x) = x^4\) subject to \(g(x) = x^2 \ge 1\), we first rewrite the constraint as \(g(x) = x^2 - 1 \ge 0\).
Then, using the penalty method we can transform the constrained problem to an unconstrained one:
\[\begin{align} \Pi &= f(x) + \mu[g(x)]^2 \\ &= x^4 + \mu[x^2 - 1]^2 \end{align}\]
Setting \(\Pi'(x) = 0\) we get:
\[\begin{align} \Pi'(x) &= 4x^3 + 4\mu x(x^2 - 1) \\ &= 4x\left(x^2 + \mu(x^2 - 1)\right) \\ &= 4x\left(x^2 + \mu x^2 - \mu\right) \\ &= 4x\left(x^2(\mu + 1) - \mu\right) = 0 \end{align}\]
Therefore either \(\quad x=0 \quad \text{or} \quad x^2(\mu+1)-\mu =0\).
Solving for \(x^2(\mu+1)-\mu =0\) we get:
\[\begin{align} x^2 (\mu +1) -\mu &= 0\\ x^2(\mu+1) &= \mu\\ \frac{x^2\left(\mu+1\right)}{\mu+1} &= \frac{\mu}{\mu+1};\quad \:\mu\ne \:-1\\ x &= \pm\sqrt{\frac{\mu}{\mu+1}};\quad \:\mu\ne \:-1 \end{align}\]
So:
\[ x_* = 0,\quad x_* = \sqrt{\frac{\mu}{\mu+1}},\quad x_* = -\sqrt{\frac{\mu}{\mu+1}}; \quad \quad \mu\ne -1 \]
However, \(x_* = 0\) is not a feasible solution since it does not satisfy the inequality constraint \(x^2 \ge 1\). The remaining stationary points depend on the penalty parameter \(\mu\), which highlights a difficulty of the penalty method: the choice of \(\mu\). As \(\mu \to 0\), \(x_*\) approaches \(0\); as \(\mu \to \infty\), \(x_*\) approaches \(\pm 1\), the boundary of the feasible region. For any finite \(\mu\), however, \(x_*^2 = \frac{\mu}{\mu+1} < 1\), so the penalty solution never exactly meets the inequality constraint; it only approaches the true constrained minimizers \(x = \pm 1\) as \(\mu\) grows.
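A minimal numerical sketch of this behaviour (the helper pen and the call to optimize below are illustrative, not part of the original solution): restricting to \(x \ge 0\) by symmetry, the minimizer of \(\Pi\) moves toward \(1\) as \(\mu\) grows.

pen <- function(x, mu) x^4 + mu * (x^2 - 1)^2          # penalised objective
sapply(c(1, 10, 1000), function(mu)
  optimize(pen, interval = c(0, 2), mu = mu)$minimum)   # approx sqrt(mu / (mu + 1)), tending to 1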
First we need to transform the constrained problem into an unconstrained one via a Lagrange multiplier:
\[\Phi = x^2 + 2xy + y^2 + \lambda(x^2 - y - 2)\]
The stationary conditions become:
\[ \frac{\partial \Phi}{\partial x} = 2x+2y+2\lambda x = 0,\quad \frac{\partial \Phi}{\partial y} = 2x+2y-\lambda = 0,\quad \frac{\partial \Phi}{\partial \lambda} = x^2-y-2 = 0\quad \]
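These partial derivatives can be double-checked symbolically in R with the base function D() (a sketch; Phi below is simply \(\Phi\) written out by hand):

Phi <- expression(x^2 + 2*x*y + y^2 + lambda*(x^2 - y - 2))
D(Phi, "x")        # equals 2x + 2y + 2*lambda*x
D(Phi, "y")        # equals 2x + 2y - lambda
D(Phi, "lambda")   # equals x^2 - y - 2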
The second condition, \(2x+2y-\lambda = 0\), gives us \(\lambda = 2(x+y)\).
The third condition, \(x^2-y-2 = 0\), gives us \(y = x^2-2\).
Substituting these into the first condition gives us:
\[ \begin{align} 2x + 2y + 2[2(x+y)]x &= 0 \\ 2x + 2y + 4x^2 + 4xy &= 0 \\ 2x^2 + x + 2xy + y &= 0 \quad \text{(dividing by 2)} \\ 2x^2 + x + 2x[x^2-2] + [x^2-2] &= 0 \\ 2x^2 + x + 2x^3 - 4x + x^2 - 2 &= 0 \\ 2x^3 + 3x^2 - 3x - 2 &= 0 \\ (x-1)(x+2)(2x+1) &= 0 \end{align} \]
Therefore:
\[x_* = 1,\quad x_* = -2,\quad x_* = -\frac{1}{2}\]
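These roots can be verified numerically with polyroot(), which takes the polynomial coefficients in increasing order (a quick check, not part of the original solution):

polyroot(c(-2, -3, 3, 2))  # roots of 2x^3 + 3x^2 - 3x - 2: approximately 1, -0.5 and -2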
The constraint, \(y = x^2 - 2\), gives the corresponding \(y\)-values: at \(x_* = 1\), \(y = -1\); at \(x_* = -2\), \(y = 2\); and at \(x_* = -\frac{1}{2}\), \(y = -\frac{7}{4}\). So we get the following three stationary points:
\[(1,-1),\quad (-2,2),\quad (-\frac{1}{2}, -\frac{7}{4})\]
Because \(f(x,y) = x^2 + 2xy + y^2 = (x+y)^2\), \(f\) evaluates to \(0\) at both \((1,-1)\) and \((-2,2)\), while \(f(-\frac{1}{2}, -\frac{7}{4}) = (-\frac{9}{4})^2 = \frac{81}{16}\), so that stationary point is not a minimum. Therefore there are two optimal points, \((1, -1)\) and \((-2, 2)\), both with \(f_{min}=0\).
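A quick numerical check of the objective at the three stationary points (the helper f2 below is just \(f\) written as a function of two arguments; a sketch, not from the original):

f2 <- function(x, y) x^2 + 2*x*y + y^2
mapply(f2, c(1, -2, -1/2), c(-1, 2, -7/4))  # 0, 0 and 5.0625 (= 81/16)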
We can also solve the problem numerically with solnp() from the Rsolnp package:

library(Rsolnp)                                     # provides solnp()
fn1 <- function(x){x[1]^2 + 2*x[1]*x[2] + x[2]^2}   # objective f(x, y)
eqn1 <- function(x){x[2] - x[1]^2}                  # equality constraint: solnp enforces eqn1(x) == eqB, i.e. y - x^2 = -2
x0 <- c(0, 0)                                       # starting point
solnp(x0, fun = fn1, eqfun = eqn1, eqB = c(-2))
##
## Iter: 1 fn: 0.0000001876 Pars: 2.25137 -2.25180
## Iter: 2 fn: 1.243e-14 Pars: 1.28457 -1.28457
## Iter: 3 fn: 3.775e-15 Pars: 1.02269 -1.02269
## Iter: 4 fn: 2.442e-15 Pars: 1.00017 -1.00017
## Iter: 5 fn: 2.22e-15 Pars: 1.00000 -1.00000
## Iter: 6 fn: 3.109e-15 Pars: 1.00000 -1.00000
## solnp--> Completed in 6 iterations
## $pars
## [1] 1 -1
##
## $convergence
## [1] 0
##
## $values
## [1] 0.0000000000000000 0.0000001876313318 0.0000000000000124 0.0000000000000038
## [5] 0.0000000000000024 0.0000000000000022 0.0000000000000031
##
## $lagrange
## [,1]
## [1,] -0.00000000037
##
## $hessian
## [,1] [,2]
## [1,] 768981 768965
## [2,] 768965 768974
##
## $ineqx0
## NULL
##
## $nfuneval
## [1] 280
##
## $outer.iter
## [1] 6
##
## $elapsed
## Time difference of 0.056 secs
##
## $vscale
## [1] 0.000000010 0.000000029 0.999999993 1.000000042
x1 <- c(-10, -10)  # a different starting point
solnp(x1, fun = fn1, eqfun = eqn1, eqB = c(-2))
##
## Iter: 1 fn: 0.1256 Pars: -5.38707 5.74141
## Iter: 2 fn: 0.02541 Pars: -3.19004 3.34946
## Iter: 3 fn: 0.002953 Pars: -2.27333 2.32767
## Iter: 4 fn: 0.00005874 Pars: -2.02323 2.03089
## Iter: 5 fn: 0.00000001338 Pars: -2.00022 2.00033
## Iter: 6 fn: 8.882e-16 Pars: -2.00000 2.00000
## Iter: 7 fn: 0 Pars: -2.00000 2.00000
## solnp--> Completed in 7 iterations
## $pars
## [1] -2 2
##
## $convergence
## [1] 0
##
## $values
## [1] 400.00000000000000000 0.12555604531528530 0.02541361540434650
## [4] 0.00295265989395865 0.00005873562926784 0.00000001338122946
## [7] 0.00000000000000089 0.00000000000000000
##
## $lagrange
## [,1]
## [1,] -0.000000012
##
## $hessian
## [,1] [,2]
## [1,] 4.0 2.5
## [2,] 2.5 2.0
##
## $ineqx0
## NULL
##
## $nfuneval
## [1] 219
##
## $outer.iter
## [1] 7
##
## $elapsed
## Time difference of 0.045 secs
##
## $vscale
## [1] 0.000000010 0.000000046 2.000000027 2.000000062
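Both runs agree with the analytic solution: depending on the starting point, solnp converges to either \((1,-1)\) or \((-2,2)\). As a final sketch (reusing fn1 from above), both minimizers give the same objective value:

fn1(c(1, -1))  # 0
fn1(c(-2, 2))  # 0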