Jinkun Xiao
March 26th
The optimization problem: given a function \( f(\cdot) \), which value of \( x \) makes \( f(x) \) as large or as small as possible? Depending on whether \( x \) is a single value or a vector, the procedure is univariate or multivariate optimization.
In your calculus course, you learned how to do this by solving an equation involving the first derivative.
You then used the second derivative to test whether you found a maximum or minimum.
Most optimization problems are too complicated to be solved this way.
Computational methods or algorithms for optimization fall into two classes: linear and nonlinear.
Linear optimization normally involves constraints; when the objective function and the constraints are all linear, the problem is referred to as linear programming. Nonlinear optimization can be constrained or unconstrained.
Unconstrained optimization of a smooth function can be done using gradient and Hessian methods (steepest descent or Newton-Raphson).
Non-smooth functions require different approaches.
Constrained optimization is often performed by applying a penalty to violation of the constraint.
Knowing how to do minimization is sufficient.
If we want to maximize \( f(x) \), we simply change the sign and minimize \( -f(x) \).
The golden section search method is used to find the minimizer of a single-variable function which has a single minimum on a given interval \( [a, b] \).
The function does not need to be smooth.
Consider minimizing the function \[ f(x)=x^2+x-2\sqrt{x} \] on the interval \( [0,2] \)
We can write an R function to evaluate f(x) as follows:
f <- function(x) {
x^2 + x - 2*sqrt(x)
}
We can use the curve() function to verify that there is a single minimum in \( [0, 2] \):
curve(f, from = 0, to = 2)
The curve is displayed on the next slide, where we can see that the minimizer is located near \( x = 0.3 \).
The golden section search method is an iterative method:
Start with the interval \( [a, b] \), known to contain the minimizer.
Repeatedly shrink the interval, finding smaller and smaller intervals \( [a', b'] \) which still contain the minimizer.
Stop when \( b'-a' \) is small enough, i.e. when the interval length is less than a pre-set tolerance.
When the search stops, the midpoint of the final interval will serve as a good approximation to the true minimizer, with a maximum error of \( (b'-a')/2 \).
The shrinkage step begins by evaluating the function at two points \( x_1 < x_2 \) in the interior of the interval \( [a, b] \).
Because we have assumed that there is a unique minimum, we know that if \( f(x_1) > f(x_2) \), then the minimum must lie to the right of \( x_1 \), i.e. in the interval
\[ [a',b']=[x_1,b] \]
If \( f(x_1) < f(x_2) \), the minimum must lie in
\[ [a',b']=[a,x_2] \]
The choice of the points between a and b makes use of properties of the golden ratio \( \phi=(\sqrt{5}+1)/2 \).
We make use of the fact that \( 1/\phi=\phi-1 \) and \( 1/\phi^2=1-1/\phi \) in the following.
We locate the interior points at \( x_1=b-(b-a)/\phi \) and \( x_2=a+(b-a)/\phi \)
The reason for this choice is as follows.
After one iteration of the search, it is possible that we will discard a and replace it with \( a' = x_1 \).
Then the new value to use as \( x_1 \) will be
\[ x_1'=b-(b-a')/\phi \]
\[ =b-(b-x_1)/\phi \]
\[ =b-(b-a)/\phi^2 \]
\[ =a+(b-a)/\phi \]
\[ =x_2 \]
In other words, we can re-use a point we already have.
We do not need a new calculation to find it, and we do not need a new evaluation of \( f(x_1') \): we can re-use \( f(x_2) \).
Similarly, if we update to \( b' = x_2 \), then \( x'_2 = x_1 \), and we can re-use that point.
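A quick numeric check of this re-use property, using the interval \( [0, 2] \) from our example (any interval would do):
phi <- (sqrt(5) + 1) / 2
a <- 0; b <- 2
x1 <- b - (b - a) / phi
x2 <- a + (b - a) / phi
a <- x1              # shrink to [a', b'] = [x1, b]
b - (b - a) / phi    # the new x1; equals the old x2 (about 1.236068)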
golden <- function (f, a, b, tol = 0.0000001)
{
ratio <- 2 / (sqrt(5) + 1)
x1 <- b - ratio * (b - a)
x2 <- a + ratio * (b - a)
f1 <- f(x1)
f2 <- f(x2)
while(abs(b - a) > tol) {
if (f2 > f1) {
b <- x2
x2 <- x1
f2 <- f1
x1 <- b - ratio * (b - a)
f1 <- f(x1)
} else {
a <- x1
x1 <- x2
f1 <- f2
x2 <- a + ratio * (b - a)
f2 <- f(x2)
}
}
return((a + b) / 2)
}
We test and see that golden() works, at least on one function.
golden(f, 0, 2)
[1] 0.3478104
Limitations:
The Golden Section method will work on any kind of function as long as the starting interval contains a single local minimum.
If the function is known to be smooth, other methods can be used which will converge to the solution faster.
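For example, golden() also works on a non-smooth but unimodal function such as \( |x - 0.5| \) (a made-up test case):
golden(function(x) abs(x - 0.5), 0, 2)
The result should be very close to 0.5.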
The method of steepest descent works on functions which have a single derivative. It is used most often in problems involving more than one variable.
A simple example of a function of 2 variables to be minimized is
\[ f(x)=f(x_1,x_2)=\frac{(2-x_1)^2}{2x_2^2}+\frac{(3-x_1)^2}{2x_2^2}+\log(x_2) \]
Note that \( x_2 \) should be positive, so we might need to protect against negative values of \( x_2 \).
The essential idea of steepest descent is that the function decreases most quickly in the direction of the negative gradient. The method starts at an initial guess \( x \).
The next guess is made by moving in the direction of the negative gradient.
The location of the minimum along this line can then be found by using a one-dimensional search algorithm such as golden section search.
The nth update is then
\[ x_n=x_{n-1}-\alpha f'(x_{n-1}) \]
where \( \alpha \) is chosen to minimize the one-dimensional function:
\[ g(\alpha)=f(x_{n-1}-\alpha f'(x_{n-1})) \]
In order to use golden section, we need to assume that \( \alpha \) is in an interval. We take the interval to be \( [0, h] \) where \( h \) is a value that we need to choose.
steepestdescent <- function(f, fprime, start, h,
tol=1e-7, maxiter=100) {
x <- start
g <- function(alpha) {
f(x - alpha*fpx)
}
niter <- 0
while(niter < maxiter & sum(abs(fprime(x))) > tol) {
fpx <- fprime(x)
alpha <- golden(g, 0, h)
x <- x - alpha*fpx
niter <- niter + 1
}
if (niter == maxiter) {
print("Warning: Maximum number of iterations reached")
}
c("Minimizer" = x)
}
This time, we need functions for both the function and the gradient:
f1 <- function(x) {
(2-x[1])^2/(2*x[2]^2) +
(3-x[1])^2/(2*x[2]^2) + log(x[2])
}
f1prime <- function(x) {
c(-(2-x[1])/x[2]^2 - (3-x[1])/x[2]^2,
-(2-x[1])^2/x[2]^3 -
(3-x[1])^2/x[2]^3 + 1/x[2])
}
Let's try a starting value of \( x = (.1, .1) \).
steepestdescent(f1, f1prime, start=c(.1,.1), h=.1)
[1] "Warning: Maximum number of iterations reached"
Minimizer1 Minimizer2
2.4992967 0.7123675
We haven't converged yet.
One possibility is to run the procedure again, using the most recent result as our starting guess:
steepestdescent(f1, f1prime,
start=c(2.4992967, 0.7123675), h=.1)
Minimizer1 Minimizer2
2.5000000 0.7071068
steepestdescent(f1, f1prime, start=c(.1, .1), h=.1, maxiter=125)
[1] "Warning: Maximum number of iterations reached"
Minimizer1 Minimizer2
2.5000000 0.7071068
If the function to be minimized has two continuous derivatives and we know how to evaluate them, we can make use of this information to give a faster algorithm than the golden section search.
We want to find a minimizer \( x^* \) of the function \( f(x) \) in the interval \( [a, b] \). Provided the minimizer is not at \( a \) or \( b \), \( x^* \) will satisfy \( f'(x^*) = 0 \).
Other solutions of \( f'(x^*) = 0 \) are maximizers and points of inflection. One sufficient condition to guarantee that our solution is a minimum is to check that \( f''(x^*) > 0 \).
If we have a guess \( x_0 \) at a minimizer, we use the fact that \( f''(x) \) is the slope of \( f'(x) \) and approximate \( f'(x) \) using a Taylor series approximation:
\[ f'(x)=f'(x_0)+(x-x_0)f''(x_0) \]
Finding a zero of the right hand side should give us an approximate solution to \( f'(x^*) = 0 \).
Start with an initial guess \( x_0 \), and compute an improved guess using the solution
\[ x_1=x_0-\frac{f'(x_0)}{f''(x_0)} \]
Then use \( x_1 \) in place of \( x_0 \), to obtain a new update \( x_2 \).
Continue with iterations of the form
\[ x_{n+1}=x_n-\frac{f'(x_n)}{f''(x_n)} \]
This iteration stops when \( f'(x_n) \) is close enough to 0. Usually, we set a tolerance \( \epsilon \) and stop when \( |f'(x_n)|<\epsilon \)
In actual practice, implementation of Newton-Raphson can be tricky. We may have \( f''(x_n) = 0 \), in which case the function looks locally like a straight line, with no solution to the Taylor series approximation to \( f'(x^*) = 0 \). In this case a simple strategy is to move a small step in the direction which decreases the function value, based only on \( f'(x_n) \).
In cases where \( x_n \) is far from the true minimizer, the Taylor approximation may be inaccurate, and \( f(x_{n+1}) \) may even be larger than \( f(x_n) \). When this happens, replace \( x_{n+1} \) with \( (x_{n+1}+x_n)/2 \) (or another value between \( x_n \) and \( x_{n+1} \)), since a smaller step-size may produce better results.
Newton <- function(fprime, f2prime, x, tol=1e-7, maxit=10) {
niter <- 0
while(niter < maxit) {
fpx <- fprime(x)
f2px <- f2prime(x)
x <- x - fpx/f2px
if (abs(fpx) < tol) break
niter <- niter+1
}
if (niter == maxit) {
print("Warning: Iteration limit exceeded in Newton")
}
x
}
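For example, we can apply Newton() to the one-variable function \( f(x)=x^2+x-2\sqrt{x} \) used earlier, with the derivatives \( f'(x)=2x+1-1/\sqrt{x} \) and \( f''(x)=2+x^{-3/2}/2 \) worked out by hand:
fp <- function(x) 2*x + 1 - 1/sqrt(x)     # f'(x)
fpp <- function(x) 2 + 0.5*x^(-1.5)       # f''(x)
Newton(fp, fpp, x = 1)
The result should agree with golden(f, 0, 2) from before, about 0.3478.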
Newton-Raphson can be used in place of the golden search in the steepest descent algorithm.
At each step we minimize \( g(\alpha)=f(x-\alpha f'(x)) \) using the Newton-Raphson method.
steepestdescentNewton <- function(f, fprime, f2prime,
start, alpha0, tol=1e-7, maxiter=100) {
gprime <- function(alpha) -fpx%*%fprime(x-alpha*fpx)
g2prime <- function(alpha) fpx%*%f2prime(x-alpha*fpx)%*%fpx
x <- start; niter <- 0
while(niter < maxiter & sum(abs(fprime(x))) > tol) {
fpx <- fprime(x)
alpha <- Newton(fprime=gprime, f2prime=g2prime, x=alpha0)
x <- x - alpha*fpx
niter <- niter + 1
}
if (niter == maxiter) {
print("Warning: Maximum number of iterations reached")
}
c("Minimizer" = x)
}
We need the second derivative of f. This is a \( 2 \times 2 \) matrix:
f2prime <- function(x) {
matrix(c(2/x[2]^2, 2*(2-x[1])/x[2]^3 +
2*(3-x[1])/x[2]^3,
2*(2-x[1])/x[2]^3 +2*(3-x[1])/x[2]^3,
3*(2-x[1])^2/x[2]^4 + 3*(3-x[1])^2/x[2]^4 - 1/x[2]^2),
nrow=2)
}
What happens if we start far from the solution?
steepestdescentNewton(f1, f1prime, f2prime, c(1.1,.5), .01)
[1] "Warning: Iteration limit exceeded in Newton"
[1] "Warning: Iteration limit exceeded in Newton"
[1] "Warning: Iteration limit exceeded in Newton"
Minimizer1 Minimizer2
2.5000000 -0.7071068
It could be worse:
steepestdescentNewton(f1, f1prime, f2prime, c(1.1,.5), .1)
[1] "Warning: Iteration limit exceeded in Newton"
Minimizer1 Minimizer2
-8.640211e+12 -2.573190e+13
What happens if we try starting a bit closer to the solution?
steepestdescentNewton(f1, f1prime, f2prime, c(1.5,.5), .1)
[1] "Warning: Iteration limit exceeded in Newton"
[1] "Warning: Iteration limit exceeded in Newton"
[1] "Warning: Iteration limit exceeded in Newton"
Minimizer1 Minimizer2
2.5000000 0.7071068
In the previous sections, we have talked about two different methods for optimizing a function of one variable.
However, when a function depends on multiple inputs, opti- mization becomes much harder.
It is hard even to visualize the function once it depends on more than two inputs.
The Nelder-Mead simplex algorithm is one method for optimization of a function of several variables.
In p dimensions, it starts with \( p + 1 \) points \( x_1,\cdots,x_{p+1} \), arranged so that when considered as vertices of a p-dimensional solid (a simplex), they enclose a non-zero volume.
For example, in two dimensions the three points would not be allowed to all lie on one line so they would form a triangle, and in three dimensions the four points would form a proper tetrahedron.
The points are labelled in order from smallest to largest values of \( f(x_i) \), so that \( f(x_1)\leq f(x_2)\leq \cdots \leq f(x_{p+1}) \).
The idea is that to minimize \( f(x) \), we would like to drop \( x_{p+1} \) and replace it with a point that gives a smaller value.
We do this by calculating several proposed points \( z_i \) from the existing points.
There are four kinds of proposals, illustrated in the next figure in two dimensions.
The first three refer to the midpoint of \( x_1,\cdots,x_p \), which we calculate as \( x_{mid} = (x_1 +\cdots +x_p)/p \).
Reflection: reflect \( x_{p+1} \) through \( x_{mid} \) to \( z_1 \).
Reflection and Expansion: reflect \( x_{p+1} \) through \( x_{mid} \), and double its distance, giving \( z_2 \).
Contraction 1: contract \( x_{p+1} \) halfway towards \( x_{mid} \) to give \( z_3 \).
Contraction 2: contract all points halfway towards \( x_1 \), giving \( z_4,\cdots, z_{p+3} \).
We consider each of these choices of simplex in order, based on the values of \( f(z_i) \).
Initialization:
Place the initial points in a matrix x,
so that point i is in x[i,]
For i in 1:(p + 1) calculate f(x[i,])
Relabel the points so that
f(x[1,]) <= f(x[2,]) <= ... <= f(x[p + 1,])
Calculate the midpoint xmid =
(x[1,] + x[2,] + ... + x[p,]) / p
Trials:
Calculate z1 by reflection: z1 <- xmid - (x[p + 1,] - xmid)
If f(z1) < f(x[1,]) { # Region A
Calculate z2 by reflection and expansion:
z2 <- xmid - 2 * (x[p + 1,] - xmid)
If f(z2) < f(z1) return(z2)
else return(z1)
} else {
If f(z1) < f(x[p,]) return(z1) # Region B
If f(z1) < f(x[p + 1,]) {
Swap z1 with x[p + 1,] # Region C
}
}
At this point we know f(z1) is in region D.
Try contraction 1, giving z3.
If f(z3) < f(x[p + 1,]) return(z3) # Region A, B, or C
At this point nothing has worked, so we use
contraction 2 to move everything towards x[1,]
Minimize the function
\[ f(x,y)=\frac{(x-y)^2+(x-2)^2+(y-3)^4}{10} \]
using the Nelder-Mead algorithm.
f <- function(x, y) {
((x - y)^2 + (x - 2)^2 + (y - 3)^4) / 10
}
We start by drawing a contour plot of the function, in order to get approximate starting values.
After some experimentation, we obtain the following plot
x <- seq(0, 5, len=20)
y <- seq(0, 5, len=20)
z <- outer(x, y, f)
contour(x, y, z)
We implemented the Nelder-Mead update algorithm in an R function with header neldermead(x, f), where x is our matrix in the pseudocode, and f is the function.
The output of neldermead(x, f) is an updated copy of the matrix x.
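That implementation is not reproduced here, but a minimal sketch of one update step, following the pseudocode above, could look like the function below (the name nm.update is ours). It assumes the objective takes a single coordinate vector, so the two-argument f of this example would be wrapped as function(v) f(v[1], v[2]).
nm.update <- function(x, f) {
  p <- ncol(x)
  fx <- apply(x, 1, f)
  x <- x[order(fx), , drop = FALSE]          # relabel so f(x[1,]) <= ... <= f(x[p+1,])
  fx <- sort(fx)
  xmid <- colMeans(x[1:p, , drop = FALSE])   # midpoint of the p best points
  z1 <- xmid - (x[p + 1, ] - xmid)           # reflection
  fz1 <- f(z1)
  if (fz1 < fx[1]) {                         # region A: try reflection and expansion
    z2 <- xmid - 2 * (x[p + 1, ] - xmid)
    x[p + 1, ] <- if (f(z2) < fz1) z2 else z1
    return(x)
  }
  if (fz1 < fx[p]) {                         # region B: accept the reflection
    x[p + 1, ] <- z1
    return(x)
  }
  if (fz1 < fx[p + 1]) x[p + 1, ] <- z1      # region C: swap z1 with x[p+1,]
  z3 <- (x[p + 1, ] + xmid) / 2              # contraction 1
  if (f(z3) < f(x[p + 1, ])) {
    x[p + 1, ] <- z3
  } else {                                   # contraction 2: shrink towards x[1,]
    for (i in 2:(p + 1)) x[i, ] <- (x[i, ] + x[1, ]) / 2
  }
  x
}
Repeatedly applying an update like this, and redrawing the simplex with polygon() after each step, would produce a trace of accepted moves like the one shown below.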
f <- function(x, y) ((x-y)^2 + (x-2)^2 + (y-3)^4)/10
x <- seq(-3,5,len=20)
y <- seq(0,5,len=20)
z <- outer(x, y, f)
contour(x,y,z)
x <- matrix(c(0, 0, 2, 0, 2, 0), 3, 2)
polygon(x)
1 :Accepted reflection, f(z1)= 3.3
2 :Swap z1 and x3
Accepted contraction 1, f(z3)= 3.25
3 :Accepted reflection and expansion, f(z2)= 0.31875
4 :Accepted reflection, f(z1)= 0.21875
5 :Accepted contraction 1, f(z3)= 0.21875
6 :Accepted contraction 1, f(z3)= 0.1
7 :Accepted contraction 1, f(z3)= 0.04963379
8 :Accepted contraction 1, f(z3)= 0.03874979
9 :Swap z1 and x3
Accepted contraction 1, f(z3)= 0.02552485
library("nloptr")
psf <- function(x) (x[1] + 10*x[2])^2 + 5*(x[3] - x[4])^2 +
(x[2] - 2*x[3])^4 + 10*(x[1] - x[4])^4
x0 <- c(3, -1, 0, 1)
neldermead(x0, psf)
$par
[1] -4.085274e-11 4.085274e-12 -3.522447e-11 -3.522447e-11
$value
[1] 3.721119e-41
$iter
[1] 1000
$convergence
[1] 5
$message
[1] "NLOPT_MAXEVAL_REACHED: Optimization stopped because maxeval (above) was reached."
There are several general purpose optimization functions in R.
For one-dimensional optimization, the optimize() function performs a variation on the golden section search we described earlier.
There are also multi-dimensional optimizers.
The first of these is the optim() function. optim() is a general purpose wrapper for several different optimization methods, including Nelder-Mead, variations on Newton-Raphson, and others that we haven't discussed.
optim(par, fn, ...)
The par parameter to optim() gives starting values for the parameters.
Besides telling optim() where to begin, these indicate how many parameters will vary in its calls to fn, the second parameter. fn is an R function which evaluates the function to be minimized.
Its first argument should be a vector of the same length as par; optim() will call it repeatedly, varying the value of this parameter, in order to find the minimum.
It should return a scalar value.
The optim() function has a number of optional parameters described on its help page.
Besides those, the optional parameters in the ... list could include additional parameters to pass to fn.
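For example, a minimal call applying optim() (whose default method is Nelder-Mead) to the two-variable function used earlier, with the suggested protection against non-positive values of \( x_2 \) added (the guarded function name f1pos is made up here):
f1pos <- function(x) {
  if (x[2] <= 0) return(Inf)   # protect against non-positive x2
  (2 - x[1])^2/(2*x[2]^2) + (3 - x[1])^2/(2*x[2]^2) + log(x[2])
}
result <- optim(par = c(1, 1), fn = f1pos)
result$par
The answer should be close to the minimizer (2.5, 0.7071) found earlier.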
There are other functions in R for general function optimization: nlm() and nlminb().
In most cases optim() is preferred because it offers more flexibility, but there may be instances where one of the others performs better.
The constrOptim() function is aimed at cases where there are linear inequalities expressing constraints on the parameters.
We often need to minimize (or maximize) a function subject to constraints.
When the function is linear and the constraints can be expressed as linear equations or inequalities, the problem is called a linear programming problem.
The so-called standard form for the minimization problem in linear programming is
\[ \min_{x_1,x_2,...,x_k} C(x)=c_1x_1+\cdots+c_kx_k \]
subject to the constraints
\[ a_{11}x_1 +\cdots+a_{1k}x_k \geq b_1 \]
\[ a_{21}x_1 +\cdots+a_{2k}x_k \geq b_2 \]
\[ \vdots \]
\[ a_{m1}x_1 +\cdots+a_{mk}x_k \geq b_m \]
The idea is to find values of the decision variables \( x_1, x_2,\cdots, x_k \) which minimize the objective function \( C(x) \), subject to the constraints and nonnegativity conditions.
A company has developed two procedures for reducing sulfur dioxide and carbon dioxide emissions from its factory.
The first reduces equal amounts of each gas at a per unit cost of $5. The second reduces the same amount of sulfur dioxide as the first method, but reduces twice as much carbon dioxide gas; the per unit cost of this method is $8.
The company is required to reduce sulfur dioxide emissions by 2 million units and carbon dioxide emissions by 3 million units. What combination of the two emission procedures will meet this requirement at minimum cost?
Let \( x_1 \) denote the amount of the first procedure to be used, and let \( x_2 \) denote the amount of the second procedure to be used. For convenience, we will let these amounts be expressed in millions of units.
Then the cost (in millions of dollars) can be expressed as \[ C = 5x_1 + 8x_2. \]
Since both methods reduce sulfur dioxide emissions at the same rate, the number of units of sulfur dioxide reduced will then be \( x_1 + x_2 \).
Noting the requirement to reduce the sulfur dioxide amount by 2 million units, we have the constraint \[ x_1 + x_2 \geq 2. \] The carbon dioxide reduction requirement is 3 million units, and the second method reduces carbon dioxide twice as fast as the first method, so we have the second constraint
\[ x_1 + 2x_2 \geq 3. \]
Finally, note that \( x_1 \) and \( x_2 \) must be nonnegative, since we cannot use negative amounts of either procedure.
Thus, we obtain the linear programming problem:
\[ \min C = 5x_1 + 8x_2 \]
\[ x_1 + x_2 \geq 2 \]
\[ x_1 + 2x_2 \geq 3 \]
\[ x_1, x_2 \geq 0. \]
These relations are graphed in the figure on the next slide. The region shaded in grey is the feasible region; this is the set of all possible \( (x_1, x_2) \) combinations which satisfy the constraints. The unshaded area contains those combinations of values where the constraints are violated.
The gradient of the function C(x) is \( (5, 8) \), so this vector gives the direction of most rapid increase for that function.
The level sets or contours of this function are perpendicular to this vector.
One of the level sets is indicated as a dashed line in the earlier figure.
The solution of the minimization problem lies at the point where a contour first touches the feasible region as the value of \( C \) is increased.
If this happens at a single point, we have a unique minimizer.
In this example, this intersection is located at the point \( (1, 1) \).
It can be shown that the only possible minimizers for such linear programming problems must be at the intersections of the constraint boundaries, as in the earlier example.
The points of intersection of the constraints are called basic solutions. If these intersection points lie in the feasible region, they are called basic feasible solutions.
If there is at least one basic feasible solution, then one of them will be an optimal solution.
The point (1, 1) is the optimal solution in the example.
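As a quick arithmetic check, we can evaluate the cost at the three basic feasible solutions of this example, the corners \( (0, 2) \), \( (1, 1) \) and \( (3, 0) \) of the feasible region:
cost <- function(x) 5*x[1] + 8*x[2]
cost(c(0, 2))   # 16
cost(c(1, 1))   # 13  <- the smallest, so (1, 1) is optimal
cost(c(3, 0))   # 15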
There is more than one linear programming function available in R, but we believe the lp() function in the lpSolve package may be the most stable version currently available.
It is based on the revised simplex method; this method intelligently tests a number of extreme points of the feasible region to see whether they are optimal.
The lp() function has a number of parameters; the ones used below are objective.in (the coefficients of the objective function), const.mat (the matrix of constraint coefficients), const.dir (the directions of the constraint inequalities) and const.rhs (the right-hand sides of the constraints).
To solve the pollution example, type
library(lpSolve)
eg.lp <- lp(objective.in=c(5, 8), const.mat=
matrix(c(1, 1, 1, 2), nrow=2),
const.rhs=c(2, 3), const.dir=c(">=", ">="))
eg.lp
Success: the objective function is 13
eg.lp$solution
[1] 1 1
The lp() function can handle maximization problems with the use of the direction="max" parameter. Furthermore, the const.dir parameter allows for different types of inequalities.
We will solve the problem:
\[ \max C = 5x_1 + 8x_2 \]
subject to the constraints
\[ x_1 + x_2 \leq 2 \]
\[ x_1 + 2x_2 = 3 \]
\[ x_1, x_2 \geq 0. \]
In R, this can be coded as
eg.lp <- lp(objective.in=c(5, 8),
const.mat=matrix(c(1, 1, 1, 2), nrow=2),
const.rhs=c(2, 3),
const.dir=c("<=", "="), direction="max")
eg.lp$solution
[1] 1 1
The solution is (1, 1), giving a maximum value of 13.
It sometimes happens that there are multiple solutions for a linear programming problem.
A slight modification of the pollution emission example is \[ \min C = 4x_1 + 8x_2 \] subject to the constraints \[ x_1 + x_2 \geq 2 \] \[ x_1 + 2x_2 \geq 3 \] and \[ x_1, x_2 \geq 0. \] This problem has a solution at \( (1, 1) \) as well as at \( (3, 0) \). All points on the line joining these two points are solutions as well. The figure on the next slide shows this graphically.
The lp() function does not alert the user to the existence of multiple minima.
In fact, the output from this function for the modified pollution emission example is the solution \( x_1 = 3, x_2 = 0 \).
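The corresponding call might look like this; only the objective coefficients change from the earlier example, and the solution reported above is \( (3, 0) \):
mult.lp <- lp(objective.in=c(4, 8),
    const.mat=matrix(c(1, 1, 1, 2), nrow=2),
    const.rhs=c(2, 3), const.dir=c(">=", ">="))
mult.lp$solution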
For a problem with m decision variables, degeneracy arises when more than m constraint boundaries intersect at a single point.
This situation is quite rare, but it has potential to cause difficulties for the simplex method, so it is important to be aware of this condition.
In very rare circumstances, degeneracy can prevent the method from converging to the optimal solution; most of the time, however, there is little to worry about.
The following problem has a point of degeneracy which is not at the optimum; however, the lp() function works. \[ \min C = 3x_1 + x_2 \] subject to the constraints \[ x_1 + x_2 \geq 2 \] \[ x_1 + 2x_2 \geq 3 \] \[ x_1 + 3x_2 \geq 4 \] \[ 4x_1 + x_2 \geq 4 \]
and \[ x_1, x_2 \geq 0. \]
This problem can be solved easily:
degen.lp <- lp(objective.in=c(3, 1),
const.mat=matrix(c(1, 1, 1, 4, 1, 2, 3, 1), nrow=4),
const.rhs=c(2, 3, 4, 4), const.dir=rep(">=", 4))
degen.lp
Success: the objective function is 3.333333
degen.lp$solution
[1] 0.6666667 1.3333333
Infeasibility is a more common problem.
When the constraints cannot simultaneously be satisfied there is no feasible region.
Then no feasible solution exists.
\[ \min C = 5x_1 + 8x_2 \]
subject to the constraints \[ x_1 + x_2 \geq 2 \] \[ x_1 + x_2 \leq 1 \] and \[ x_1, x_2 \geq 0. \] It is obvious that the constraints cannot simultaneously be satisfied.
Here is the output from the lp() function:
eg.lp <- lp(objective.in=c(5, 8),
const.mat=matrix(c(1, 1, 1, 1), nrow=2),
const.rhs=c(2, 1), const.dir=c(">=", "<="))
eg.lp
Error: no feasible solution found
In rare instances, the constraints and objective function give rise to an unbounded solution.
A trivial example of unboundedness arises when solving \[ \max C = 5x_1 + 8x_2 \] subject to the constraints
\[ x_1 + x_2 \geq 2 \] \[ x_1 + 2x_2 \geq 3 \] and \[ x_1, x_2 \geq 0. \]
The feasible region for this problem was plotted earlier.
However, instead of trying to minimize the objective function, we are now maximizing, so we follow the direction of increasing the objective function.
We can make the objective function as large as we wish, by taking \( x_1 \) and \( x_2 \) arbitrarily large.
Here is what happens when lp() is applied to this problem:
eg.lp <- lp(objective.in=c(5, 8),
const.mat=matrix(c(1, 1, 1, 2), nrow=2),
const.rhs=c(2, 3), const.dir=c(">=", ">="),
direction="max")
eg.lp
Error: status 3
The condition of unboundedness will most often arise when constraints and/or the objective function have not been formulated correctly.
Sometimes a decision variable is not restricted to be nonnegative. The lp() function is not set up to handle this case directly. However, a simple device gets around this difficulty.
If \( x \) is unrestricted in sign, then \( x \) can be written as \( x_1 - x_2 \), where \( x_1 \geq 0 \) and \( x_2 \geq 0 \). This means that every unrestricted variable in a linear programming problem can be replaced by the difference of two nonnegative variables.
We will solve the problem:
\[ \min C = x_1 + 10x_2 \]
\[ x_1 + x_2 \geq 2 \]
\[ x_1 - x_2 \leq 3 \]
\[ x_1 \geq 0. \]
Noting that \( x_2 \) is unrestricted in sign, we set \( x_2 = x_3 - x_4 \) for nonnegative \( x_3 \) and \( x_4 \). Plugging these new variables into the problem gives
\[ \min C = x_1 + 10x_3 - 10x_4 \]
subject to the constraints
\[ x_1 + x_3 - x_4 \geq 2 \] \[ x_1 - x_3 + x_4 \leq 3 \]
\[ x_1 \geq 0, x_3 \geq 0, x_4 \geq 0. \]
Converting this to R code, we have
unres.lp <- lp(objective.in=c(1, 10, -10),
const.mat=matrix(c(1, 1, 1, -1, -1, 1), nrow=2),
const.rhs=c(2, 3), const.dir=c(">=", "<="))
unres.lp
Success: the objective function is -2.5
unres.lp$solution
[1] 2.5 0.0 0.5
Decision variables are often restricted to be integers. For example, we might want to minimize the cost of shipping a product by using 1, 2 or 3 different trucks. It is not possible to use a fractional number of trucks, so the number of trucks must be integer-valued.
Problems involving integer-valued decision variables are called integer programming problems. Simple rounding of a non-integer solution to the nearest integer is not good practice; the result of such rounding can be a solution which is quite far from the optimal solution.
The lp() function has a facility to handle integer valued variables using a technique called the branch and bound algorithm. The int.vec argument can be used to indicate which variables have integer values.
Find nonnegative \( x_1, x_2, x_3 \) and \( x_4 \) to minimize \[ C(x) = 2x_1 + 3x_2 + 4x_3 - x_4 \] subject to the constraints \[ x_1 + 2x_2 \geq 9 \] \[ 3x_2 + x_3 \geq 9 \] and \[ x_2 + x_4 \leq 10. \] Furthermore, \( x_2 \) and \( x_4 \) can only take integer values.
integ.lp <- lp(objective.in=c(2, 3, 4, -1),
const.mat=matrix(c(1, 0, 0, 2, 3, 1, 0,
1, 0, 0, 0, 1), nrow=3),
const.dir=c(">=", ">=", "<="),
const.rhs=c(9, 9, 10),
int.vec=c(2, 4))
integ.lp
Success: the objective function is 8
integ.lp$solution
[1] 1 4 0 6
Here is what happens when the integer variables are ignored:
wrong.lp <- lp(objective.in=c(2, 3, 4, -1),
const.mat=matrix(c(1, 0, 0, 2, 3, 1, 0,
1, 0, 0, 0, 1), nrow=3), const.dir=
c(">=", ">=", "<="), const.rhs=c(9, 9, 10))
wrong.lp
Success: the objective function is 8
wrong.lp$solution
[1] 0.0 4.5 0.0 5.5
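This relaxed solution also illustrates the earlier warning about rounding: rounding it to whole numbers either violates a constraint or gives a worse value than the integer solution above. For instance,
round(wrong.lp$solution)   # 0 4 0 6  (R rounds 4.5 down to 4 and 5.5 up to 6)
violates the first constraint, since \( 0 + 2 \times 4 = 8 < 9 \).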
The lp() function provides an interface to code written in C. There is another function in the linprog package called solveLP() which is written entirely in R; this latter function solves large problems much more slowly than the lp() function, but it provides more detailed output. We note also the function simplex() in the boot package.
It should also be noted that, for very large problems, the simplex method might not converge quickly enough; better procedures, called interior point methods, have been discovered recently, and are implemented in other programming languages, but not yet in R.