Jinkun Xiao
March 26th
The optimization problem: given a function \( f(\cdot) \), which value of \( x \) makes \( f(x) \) as large or as small as possible? Depending on whether \( x \) is a single value or a vector, the procedure is univariate or multivariate optimization.
In your calculus course, you learned how to do this by solving an equation involving the first derivative.
You then used the second derivative to test whether you found a maximum or minimum.
Most optimization problems are too complicated to be solved this way.
Computational methods or algorithms for optimization fall into two classes: linear and nonlinear.
Linear optimization normally involves constraints; when the objective function and the constraints are all linear, the problem is referred to as linear programming. Nonlinear optimization can be constrained or unconstrained.
Unconstrained optimization of a smooth function can be done using gradient and Hessian methods (steepest descent or Newton-Raphson).
Non-smooth functions require different approaches.
Constrained optimization is often performed by applying a penalty to violation of the constraint.
Knowing how to do minimization is sufficient.
If we want to maximize \( f(x) \), we simply change the sign and minimize \( -f(x) \).
The golden section search method is used to find the minimizer of a single-variable function which has a single minimum on a given interval \( [a, b] \).
The function does not need to be smooth.
Consider minimizing the function \[ f(x)=x^2+x-2\sqrt{x} \] on the interval \( [0,2] \)
We can write an R function to evaluate f(x) as follows:
f <- function(x) {
x^2 + x - 2*sqrt(x)
}
We can use the curve() function to verify that there is a single minimum in \( [0, 2] \):
curve(f, from = 0, to = 2)
The curve is displayed on the next slide, where we can see that the minimizer is located near \( x = 0.3 \).
The golden section search method is an iterative method:
Start with the interval \( [a, b] \), known to contain the minimizer.
Repeatedly shrink the interval, finding smaller and smaller intervals \( [a', b'] \) which still contain the minimizer.
Stop when \( b'-a' \) is small enough, i.e. when the interval length is less than a pre-set tolerance.
When the search stops, the midpoint of the final interval will serve as a good approximation to the true minimizer, with a maximum error of \( (b'-a')/2 \).
The shrinkage step begins by evaluating the function at two points \( x_1 < x_2 \) in the interior of the interval \( [a, b] \).
Because we have assumed that there is a unique minimum, we know that if \( f(x_1) > f(x_2) \), then the minimum must lie to the right of \( x_1 \), i.e. in the interval
\[ [a',b']=[x_1,b] \]
If \( f(x_1) < f(x_2) \), the minimum must lie in
\[ [a',b']=[a,x_2] \]
The choice of the points between a and b makes use of properties of the golden ratio \( \phi=(\sqrt{5}+1)/2 \).
We make use of the fact that \( 1/\phi=\phi-1 \) and \( 1/\phi^2=1-1/\phi \) in the following.
We locate the interior points at \( x_1=b-(b-a)/\phi \) and \( x_2=a+(b-a)/\phi \)
The reason for this choice is as follows.
After one iteration of the search, it is possible that we will discard a and replace it with \( a' = x_1 \).
Then the new value to use as \( x_1 \) will be
\[ x_1'=b-(b-a')/\phi \]
\[ =b-(b-x_1)/\phi \]
\[ =b-(b-a)/\phi^2 \]
\[ =a+(b-a)/\phi \]
\[ =x_2 \]
In other words, we can re-use a point we already have.
We do not need a new calculation to find it, and we do not need a new evaluation of \( f(x_1') \): we can re-use \( f(x_2) \).
Similarly, if we update to \( b' = x_2 \), then \( x'_2 = x_1 \), and we can re-use that point.
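A quick numeric check of this re-use property, using the interval \( [0, 2] \) from our example (any interval would do):
phi <- (sqrt(5) + 1) / 2
a <- 0; b <- 2
x1 <- b - (b - a) / phi
x2 <- a + (b - a) / phi
a <- x1              # shrink to [a', b'] = [x1, b]
b - (b - a) / phi    # the new x1; equals the old x2 (about 1.236068)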
golden <- function (f, a, b, tol = 0.0000001)
{
ratio <- 2 / (sqrt(5) + 1)
x1 <- b - ratio * (b - a)
x2 <- a + ratio * (b - a)
f1 <- f(x1)
f2 <- f(x2)
while(abs(b - a) > tol) {
if (f2 > f1) {
b <- x2
x2 <- x1
f2 <- f1
x1 <- b - ratio * (b - a)
f1 <- f(x1)
} else {
a <- x1
x1 <- x2
f1 <- f2
x2 <- a + ratio * (b - a)
f2 <- f(x2)
}
}
return((a + b) / 2)
}
We test and see that golden() works, at least on one function.
golden(f, 0, 2)
[1] 0.3478104
Limitations:
The Golden Section method will work on any kind of function as long as the starting interval contains a single local minimum.
If the function is known to be smooth, other methods can be used which will converge to the solution faster.
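For example, golden() also works on a non-smooth but unimodal function such as \( |x - 0.5| \) (a made-up test case):
golden(function(x) abs(x - 0.5), 0, 2)
The result should be very close to 0.5.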
The method of steepest descent works on functions which have a single derivative. It is used most often in problems involving more than one variable.
A simple example of a function of 2 variables to be minimized is
\[ f(x)=f(x_1,x_2)=\frac{(2-x_1)^2}{2x_2^2}+\frac{(3-x_1)^2}{2x_2^2}+\log(x_2) \]
Note that \( x_2 \) should be positive, so we might need to protect against negative values of \( x_2 \).
The essential idea of steepest descent is that the function decreases most quickly in the direction of the negative gradient. The method starts at an initial guess \( x \).
The next guess is made by moving in the direction of the negative gradient.
The location of the minimum along this line can then be found by using a one-dimensional search algorithm such as golden section search.
The nth update is then
\[ x_n=x_{n-1}-\alpha f'(x_{n-1}) \]
where \( \alpha \) is chosen to minimize the one-dimensional function:
\[ g(\alpha)=f(x_{n-1}-\alpha f'(x_{n-1})) \]
In order to use golden section, we need to assume that \( \alpha \) is in an interval. We take the interval to be \( [0, h] \) where \( h \) is a value that we need to choose.
steepestdescent <- function(f, fprime, start, h,
tol=1e-7, maxiter=100) {
x <- start
g <- function(alpha) {
f(x - alpha*fpx)
}
niter <- 0
while(niter < maxiter & sum(abs(fprime(x))) > tol) {
fpx <- fprime(x)
alpha <- golden(g, 0, h)
x <- x - alpha*fpx
niter <- niter + 1
}
if (niter == maxiter) {
print("Warning: Maximum number of iterations reached")
}
c("Minimizer" = x)
}
This time, we need functions for both the function and the gradient:
f1 <- function(x) {
(2-x[1])^2/(2*x[2]^2) +
(3-x[1])^2/(2*x[2]^2) + log(x[2])
}
f1prime <- function(x) {
c(-(2-x[1])/x[2]^2 - (3-x[1])/x[2]^2,
-(2-x[1])^2/x[2]^3 -
(3-x[1])^2/x[2]^3 + 1/x[2])
}
Let's try a starting value of \( x = (.1, .1) \).
steepestdescent(f1, f1prime, start=c(.1,.1), h=.1)
[1] "Warning: Maximum number of iterations reached"
Minimizer1 Minimizer2
2.4992967 0.7123675
We haven't converged yet.
One possibility is to run the procedure again, using the most recent result as our starting guess:
steepestdescent(f1, f1prime,
start=c(2.4992967, 0.7123675), h=.1)
Minimizer1 Minimizer2
2.5000000 0.7071068
steepestdescent(f1, f1prime, start=c(.1, .1), h=.1, maxiter=125)
[1] "Warning: Maximum number of iterations reached"
Minimizer1 Minimizer2
2.5000000 0.7071068
If the function to be minimized has two continuous derivatives and we know how to evaluate them, we can make use of this information to give a faster algorithm than the golden section search.
We want to find a minimizer \( x^* \) of the function \( f(x) \) in the interval \( [a, b] \). Provided the minimizer is not at \( a \) or \( b \), \( x^* \) will satisfy \( f'(x^*) = 0 \).
Other solutions of \( f'(x^*) = 0 \) are maximizers and points of inflection. One sufficient condition to guarantee that our solution is a minimum is to check that \( f''(x^*) > 0 \).
If we have a guess \( x_0 \) at a minimizer, we use the fact that \( f''(x) \) is the slope of \( f'(x) \) and approximate \( f'(x) \) using a Taylor series approximation:
\[ f'(x)=f'(x_0)+(x-x_0)f''(x_0) \]
Finding a zero of the right hand side should give us an approximate solution to \( f'(x^*) = 0 \).
Start with an initial guess \( x_0 \), and compute an improved guess using the solution
\[ x_1=x_0-\frac{f'(x_0)}{f''(x_0)} \]
Then use \( x_1 \) in place of \( x_0 \), to obtain a new update \( x_2 \).
Continue with iterations of the form
\[ x_{n+1}=x_n-\frac{f'(x_n)}{f''(x_n)} \]
This iteration stops when \( f'(x_n) \) is close enough to 0. Usually, we set a tolerance \( \epsilon \) and stop when \( |f'(x_n)|<\epsilon \)
In actual practice, implementation of Newton-Raphson can be tricky. We may have \( f''(x_n) = 0 \), in which case the function looks locally like a straight line, with no solution to the Taylor series approximation to \( f'(x^*) = 0 \). In this case a simple strategy is to move a small step in the direction which decreases the function value, based only on \( f'(x_n) \).
In cases where \( x_n \) is far from the true minimizer, the Taylor approximation may be inaccurate, and \( f(x_{n+1}) \) may even be larger than \( f(x_n) \). When this happens, replace \( x_{n+1} \) with \( (x_{n+1}+x_n)/2 \) (or another value between \( x_n \) and \( x_{n+1} \)), since a smaller step-size may produce better results.
Newton <- function(fprime, f2prime, x, tol=1e-7, maxit=10) {
niter <- 0
while(niter < maxit) {
fpx <- fprime(x)
f2px <- f2prime(x)
x <- x - fpx/f2px
if (abs(fpx) < tol) break
niter <- niter+1
}
if (niter == maxit) {
print("Warning: Iteration limit exceeded in Newton")
}
x
}
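For example, we can apply Newton() to the one-variable function \( f(x)=x^2+x-2\sqrt{x} \) used earlier, with the derivatives \( f'(x)=2x+1-1/\sqrt{x} \) and \( f''(x)=2+x^{-3/2}/2 \) worked out by hand:
fp <- function(x) 2*x + 1 - 1/sqrt(x)     # f'(x)
fpp <- function(x) 2 + 0.5*x^(-1.5)       # f''(x)
Newton(fp, fpp, x = 1)
The result should agree with golden(f, 0, 2) from before, about 0.3478.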
Newton-Raphson can be used in place of the golden search in the steepest descent algorithm.
At each step we minimize \( g(\alpha)=f(x-\alpha f'(x)) \) using the Newton-Raphson method.
steepestdescentNewton <- function(f, fprime, f2prime,
start, alpha0, tol=1e-7, maxiter=100) {
gprime <- function(alpha) -fpx%*%fprime(x-alpha*fpx)
g2prime <- function(alpha) fpx%*%f2prime(x-alpha*fpx)%*%fpx
x <- start; niter <- 0
while(niter < maxiter & sum(abs(fprime(x))) > tol) {
fpx <- fprime(x)
alpha <- Newton(fprime=gprime, f2prime=g2prime, x=alpha0)
x <- x - alpha*fpx
niter <- niter + 1
}
if (niter == maxiter) {
print("Warning: Maximum number of iterations reached")
}
c("Minimizer" = x)
}
We need the second derivative of f. This is a \( 2 \times 2 \) matrix:
f2prime <- function(x) {
matrix(c(2/x[2]^2, 2*(2-x[1])/x[2]^3 +
2*(3-x[1])/x[2]^3,
2*(2-x[1])/x[2]^3 +2*(3-x[1])/x[2]^3,
3*(2-x[1])^2/x[2]^4 + 3*(3-x[1])^2/x[2]^4 - 1/x[2]^2),
nrow=2)
}
What happens if we start far from the solution?
steepestdescentNewton(f1, f1prime, f2prime, c(1.1,.5), .01)
[1] "Warning: Iteration limit exceeded in Newton"
[1] "Warning: Iteration limit exceeded in Newton"
[1] "Warning: Iteration limit exceeded in Newton"
Minimizer1 Minimizer2
2.5000000 -0.7071068
It could be worse:
steepestdescentNewton(f1, f1prime, f2prime, c(1.1,.5), .1)
[1] "Warning: Iteration limit exceeded in Newton"
Minimizer1 Minimizer2
-8.640211e+12 -2.573190e+13
What happens if we try starting a bit closer to the solution?
steepestdescentNewton(f1, f1prime, f2prime, c(1.5,.5), .1)
[1] "Warning: Iteration limit exceeded in Newton"
[1] "Warning: Iteration limit exceeded in Newton"
[1] "Warning: Iteration limit exceeded in Newton"
Minimizer1 Minimizer2
2.5000000 0.7071068
In the previous sections, we have talked about two different methods for optimizing a function of one variable.
However, when a function depends on multiple inputs, opti- mization becomes much harder.
It is hard even to visualize the function once it depends on more than two inputs.
The Nelder-Mead simplex algorithm is one method for optimization of a function of several variables.
In p dimensions, it starts with \( p + 1 \) points \( x_1,\cdots,x_{p+1} \), arranged so that when considered as vertices of a p-dimensional solid (a simplex), they enclose a non-zero volume.
For example, in two dimensions the three points would not be allowed to all lie on one line so they would form a triangle, and in three dimensions the four points would form a proper tetrahedron.
The points are labelled in order from smallest to largest values of \( f(x_i) \), so that \( f(x_1)\leq f(x_2)\leq \cdots \leq f(x_{p+1}) \).
The idea is that to minimize \( f(x) \), we would like to drop \( x_{p+1} \) and replace it with a point that gives a smaller value.
We do this by calculating several proposed points \( z_i \) from the existing points.
There are four kinds of proposals, illustrated in the next figure in two dimensions.
The first three refer to the midpoint of \( x_1,\cdots,x_p \), which we calculate as \( x_{mid} = (x_1 +\cdots +x_p)/p \).
Reflection: reflect \( x_{p+1} \) through \( x_{mid} \) to \( z_1 \).
Reflection and Expansion: reflect \( x_{p+1} \) through \( x_{mid} \), and double its distance, giving \( z_2 \).
Contraction 1: contract \( x_{p+1} \) halfway towards \( x_{mid} \) to give \( z_3 \).
Contraction 2: contract all points halfway towards \( x_1 \), giving \( z_4,\cdots, z_{p+3} \).
We consider each of these choices of simplex in order, based on the values of \( f(z_i) \).
Initialization:
Place the initial points in a matrix x,
so that point i is in x[i,]
For i in 1:(p + 1) calculate f(x[i,])
Relabel the points so that
f(x[1,]) <= f(x[2,]) <= ... <= f(x[p + 1,])
Calculate the midpoint xmid =
(x[1,] + x[2,] + ... + x[p,]) / p
Trials:
Calculate z1 by reflection: z1 <- xmid - (x[p + 1,] - xmid)
If f(z1) < f(x[1,]) { # Region A
Calculate z2 by reflection and expansion:
z2 <- xmid - 2 * (x[p + 1,] - xmid)
If f(z2) < f(z1) return(z2)
else return(z1)
} else {
If f(z1) < f(x[p,]) return(z1) # Region B
If f(z1) < f(x[p + 1,]) {
Swap z1 with x[p + 1,] # Region C
}
}
At this point we know f(z1) is in region D.
Try contraction 1, giving z3.
If f(z3) < f(x[p + 1,]) return(z3) # Region A, B, or C
At this point nothing has worked, so we use
contraction 2 to move everything towards x[1,]
Minimize the function
\[ f(x,y)=\frac{(x-y)^2+(x-2)^2+(y-3)^4}{10} \]
using the Nelder-Mead algorithm.
f <- function(x, y) {
((x - y)^2 + (x - 2)^2 + (y - 3)^4) / 10
}
We start by drawing a contour plot of the function, in order to get approximate starting values.
After some experimentation, we obtain the following plot
x <- seq(0, 5, len=20)
y <- seq(0, 5, len=20)
z <- outer(x, y, f)
contour(x, y, z)
We implemented the Nelder-Mead update algorithm in an R function with header neldermead(x, f), where x is our matrix in the pseudocode, and f is the function.
The output of neldermead(x, f) is an updated copy of the matrix x.
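That implementation is not reproduced here, but a minimal sketch of one update step, following the pseudocode above, could look like the function below (the name nm.update is ours). It assumes the objective takes a single coordinate vector, so the two-argument f of this example would be wrapped as function(v) f(v[1], v[2]).
nm.update <- function(x, f) {
  p <- ncol(x)
  fx <- apply(x, 1, f)
  x <- x[order(fx), , drop = FALSE]          # relabel so f(x[1,]) <= ... <= f(x[p+1,])
  fx <- sort(fx)
  xmid <- colMeans(x[1:p, , drop = FALSE])   # midpoint of the p best points
  z1 <- xmid - (x[p + 1, ] - xmid)           # reflection
  fz1 <- f(z1)
  if (fz1 < fx[1]) {                         # region A: try reflection and expansion
    z2 <- xmid - 2 * (x[p + 1, ] - xmid)
    x[p + 1, ] <- if (f(z2) < fz1) z2 else z1
    return(x)
  }
  if (fz1 < fx[p]) {                         # region B: accept the reflection
    x[p + 1, ] <- z1
    return(x)
  }
  if (fz1 < fx[p + 1]) x[p + 1, ] <- z1      # region C: swap z1 with x[p+1,]
  z3 <- (x[p + 1, ] + xmid) / 2              # contraction 1
  if (f(z3) < f(x[p + 1, ])) {
    x[p + 1, ] <- z3
  } else {                                   # contraction 2: shrink towards x[1,]
    for (i in 2:(p + 1)) x[i, ] <- (x[i, ] + x[1, ]) / 2
  }
  x
}
Repeatedly applying an update like this, and redrawing the simplex with polygon() after each step, would produce a trace of accepted moves like the one shown below.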
f <- function(x, y) ((x-y)^2 + (x-2)^2 + (y-3)^4)/10
x <- seq(-3,5,len=20)
y <- seq(0,5,len=20)
z <- outer(x, y, f)
contour(x,y,z)
x <- matrix(c(0, 0, 2, 0, 2, 0), 3, 2)
polygon(x)
1 :Accepted reflection, f(z1)= 3.3
2 :Swap z1 and x3
Accepted contraction 1, f(z3)= 3.25
3 :Accepted reflection and expansion, f(z2)= 0.31875
4 :Accepted reflection, f(z1)= 0.21875
5 :Accepted contraction 1, f(z3)= 0.21875
6 :Accepted contraction 1, f(z3)= 0.1
7 :Accepted contraction 1, f(z3)= 0.04963379
8 :Accepted contraction 1, f(z3)= 0.03874979
9 :Swap z1 and x3
Accepted contraction 1, f(z3)= 0.02552485
library("nloptr")
psf <- function(x) (x[1] + 10*x[2])^2 + 5*(x[3] - x[4])^2 +
(x[2] - 2*x[3])^4 + 10*(x[1] - x[4])^4
x0 <- c(3, -1, 0, 1)
neldermead(x0, psf)
$par
[1] -4.085274e-11 4.085274e-12 -3.522447e-11 -3.522447e-11
$value
[1] 3.721119e-41
$iter
[1] 1000
$convergence
[1] 5
$message
[1] "NLOPT_MAXEVAL_REACHED: Optimization stopped because maxeval (above) was reached."
There are several general purpose optimization functions in R.
For one-dimensional optimization, the optimize() function performs a variation on the golden section search we described earlier.
There are also multi-dimensional optimizers.
The first of these is the optim() function. optim() is a general purpose wrapper for several different optimization methods, including Nelder-Mead, variations on Newton-Raphson, and others that we haven't discussed.
optim(par, fn, ...)
The par parameter to optim() gives starting values for the parameters.
Besides telling optim() where to begin, these indicate how many parameters will vary in its calls to fn, the second parameter. fn is an R function which evaluates the function to be minimized.
Its first argument should be a vector of the same length as par; optim() will call it repeatedly, varying the value of this parameter, in order to find the minimum.
It should return a scalar value.
The optim() function has a number of optional parameters described on its help page.
Besides those, the optional parameters in the ... list could include additional parameters to pass to fn.
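For example, a minimal call applying optim() (whose default method is Nelder-Mead) to the two-variable function used earlier, with the suggested protection against non-positive values of \( x_2 \) added (the guarded function name f1pos is made up here):
f1pos <- function(x) {
  if (x[2] <= 0) return(Inf)   # protect against non-positive x2
  (2 - x[1])^2/(2*x[2]^2) + (3 - x[1])^2/(2*x[2]^2) + log(x[2])
}
result <- optim(par = c(1, 1), fn = f1pos)
result$par
The answer should be close to the minimizer (2.5, 0.7071) found earlier.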
There are other functions in R for general function optimization: nlm() and nlminb().
In most cases optim() is preferred because it offers more flexibility, but there may be instances where one of the others performs better.
The constrOptim() function is aimed at cases where there are linear inequalities expressing constraints on the parameters.
We often need to minimize (or maximize) a function subject to constraints.
When the function is linear and the constraints can be expressed as linear equations or inequalities, the problem is called a linear programming problem.
The so-called standard form for the minimization problem in linear programming is
\[ \min_{x_1,x_2,...,x_k} C(x)=c_1x_1+\cdots+c_kx_k \]
subject to the constraints
\[ a_{11}x_1 +\cdots+a_{1k}x_k \geq b_1 \]
\[ a_{21}x_1 +\cdots+a_{2k}x_k \geq b_2 \]
\[ \vdots \]
\[ a_{m1}x_1 +\cdots+a_{mk}x_k \geq b_m \]
The idea is to find values of the decision variables \( x_1, x_2,\cdots, x_k \) which minimize the objective function \( C(x) \), subject to the constraints and nonnegativity conditions.
A company has developed two procedures for reducing sulfur dioxide and carbon dioxide emissions from its factory.
The first reduces equal amounts of each gas at a per unit cost of $5. The second reduces the same amount of sulfur dioxide as the first method, but reduces twice as much carbon dioxide gas; the per unit cost of this method is $8.
The company is required to reduce sulfur dioxide emissions by 2 million units and carbon dioxide emissions by 3 million units. What combination of the two emission procedures will meet this requirement at minimum cost?
Let \( x_1 \) denote the amount of the first procedure to be used, and let \( x_2 \) denote the amount of the second procedure to be used. For convenience, we will let these amounts be expressed in millions of units.
Then the cost (in millions of dollars) can be expressed as \[ C = 5x_1 + 8x_2. \]
Since both methods reduce sulfur dioxide emissions at the same rate, the number of units of sulfur dioxide reduced will then be \( x_1 + x_2 \).
Noting the requirement to reduce the sulfur dioxide amount by 2 million units, we have the constraint \[ x_1 + x_2 \geq 2. \] The carbon dioxide reduction requirement is 3 million units, and the second method reduces carbon dioxide twice as fast as the first method, so we have the second constraint
\[ x_1 + 2x_2 \geq 3. \]
Finally, note that \( x_1 \) and \( x_2 \) must be nonnegative, since we cannot use negative amounts of either procedure.
Thus, we obtain the linear programming problem:
\[ \min C = 5x_1 + 8x_2 \]
\[ x_1 + x_2 \geq 2 \]
\[ x_1 + 2x_2 \geq 3 \]
\[ x_1, x_2 \geq 0. \]
These relations are graphed in the figure on the next slide. The region shaded in grey is the feasible region; this is the set of all possible \( (x_1, x_2) \) combinations which satisfy the constraints. The unshaded area contains those combinations of values where the constraints are violated.
The gradient of the function C(x) is \( (5, 8) \), so this vector gives the direction of most rapid increase for that function.
The level sets or contours of this function are perpendicular to this vector.
One of the level sets is indicated as a dashed line in the earlier figure.
The solution of the minimization problem lies at the point where a contour first touches the feasible region as the value of \( C \) is increased.
If this happens at a single point, we have a unique minimizer.
In this example, this intersection is located at the point \( (1, 1) \).
It can be shown that the only possible minimizers for such linear programming problems must be at the intersections of the constraint boundaries, as in the earlier example.
The points of intersection of the constraints are called basic solutions. If these intersection points lie in the feasible region, they are called basic feasible solutions.
If there is at least one basic feasible solution, then one of them will be an optimal solution.
The point (1, 1) is the optimal solution in the example.
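As a quick arithmetic check, we can evaluate the cost at the three basic feasible solutions of this example, the corners \( (0, 2) \), \( (1, 1) \) and \( (3, 0) \) of the feasible region:
cost <- function(x) 5*x[1] + 8*x[2]
cost(c(0, 2))   # 16
cost(c(1, 1))   # 13  <- the smallest, so (1, 1) is optimal
cost(c(3, 0))   # 15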
There is more than one linear programming function available in R, but we believe the lp() function in the lpSolve package may be the most stable version currently available.
It is based on the revised simplex method; this method intelligently tests a number of extreme points of the feasible region to see whether they are optimal.
The lp() function has a number of parameters; the ones used below are objective.in (the coefficients of the objective function), const.mat (the matrix of constraint coefficients), const.dir (the directions of the constraint inequalities) and const.rhs (the right-hand sides of the constraints).
To solve the pollution example, type
library(lpSolve)
eg.lp <- lp(objective.in=c(5, 8), const.mat=
matrix(c(1, 1, 1, 2), nrow=2),
const.rhs=c(2, 3), const.dir=c(">=", ">="))
eg.lp
Success: the objective function is 13
eg.lp$solution
[1] 1 1
The lp() function can handle maximization problems with the use of the direction="max" parameter. Furthermore, the const.dir parameter allows for different types of inequalities.
We will solve the problem:
\[ \max C = 5x_1 + 8x_2 \]
subject to the constraints
\[ x_1 + x_2 \leq 2 \]
\[ x_1 + 2x_2 = 3 \]
\[ x_1, x_2 \geq 0. \]
In R, this can be coded as
eg.lp <- lp(objective.in=c(5, 8),
const.mat=matrix(c(1, 1, 1, 2), nrow=2),
const.rhs=c(2, 3),
const.dir=c("<=", "="), direction="max")
eg.lp$solution
[1] 1 1
The solution is (1, 1), giving a maximum value of 13.
It sometimes happens that there are multiple solutions for a linear programming problem.
A slight modification of the pollution emission example is \[ \min C = 4x_1 + 8x_2 \] subject to the constraints \[ x_1 + x_2 \geq 2 \] \[ x_1 + 2x_2 \geq 3 \] and \[ x_1, x_2 \geq 0. \] This problem has a solution at \( (1, 1) \) as well as at \( (3, 0) \). All points on the line joining these two points are solutions as well. The figure on the next slide shows this graphically.
The lp() function does not alert the user to the existence of multiple minima.
In fact, the output from this function for the modified pollution emission example is the solution \( x_1 = 3, x_2 = 0 \).
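The corresponding call might look like this; only the objective coefficients change from the earlier example, and the solution reported above is \( (3, 0) \):
mult.lp <- lp(objective.in=c(4, 8),
    const.mat=matrix(c(1, 1, 1, 2), nrow=2),
    const.rhs=c(2, 3), const.dir=c(">=", ">="))
mult.lp$solution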
For a problem with m decision variables, degeneracy arises when more than m constraint boundaries intersect at a single point.
This situation is quite rare, but it has potential to cause difficulties for the simplex method, so it is important to be aware of this condition.
In very rare circumstances, degeneracy can prevent the method from converging to the optimal solution; most of the time, however, there is little to worry about.
The following problem has a point of degeneracy which is not at the optimum; however, the lp() function works. \[ \min C = 3x_1 + x_2 \] subject to the constraints \[ x_1 + x_2 \geq 2 \] \[ x_1 + 2x_2 \geq 3 \] \[ x_1 + 3x_2 \geq 4 \] \[ 4x_1 + x_2 \geq 4 \]
and \[ x_1, x_2 \geq 0. \]
This problem can be solved easily:
degen.lp <- lp(objective.in=c(3, 1),
const.mat=matrix(c(1, 1, 1, 4, 1, 2, 3, 1), nrow=4),
const.rhs=c(2, 3, 4, 4), const.dir=rep(">=", 4))
degen.lp
Success: the objective function is 3.333333
degen.lp$solution
[1] 0.6666667 1.3333333
Infeasibility is a more common problem.
When the constraints cannot simultaneously be satisfied there is no feasible region.
Then no feasible solution exists.
\[ \min C = 5x_1 + 8x_2 \]
subject to the constraints \[ x_1 + x_2 \geq 2 \] \[ x_1 + x_2 \leq 1 \] and \[ x_1, x_2 \geq 0. \] It is obvious that the constraints cannot simultaneously be satisfied.
Here is the output from the lp() function:
eg.lp <- lp(objective.in=c(5, 8),
const.mat=matrix(c(1, 1, 1, 1), nrow=2),
const.rhs=c(2, 1), const.dir=c(">=", "<="))
eg.lp
Error: no feasible solution found
In rare instances, the constraints and objective function give rise to an unbounded solution.
A trivial example of unboundedness arises when solving \[ \max C = 5x_1 + 8x_2 \] subject to the constraints
\[ x_1 + x_2 \geq 2 \] \[ x_1 + 2x_2 \geq 3 \] and \[ x_1, x_2 \geq 0. \]
The feasible region for this problem was plotted earlier.
However, instead of trying to minimize the objective function, we are now maximizing, so we follow the direction of increasing the objective function.
We can make the objective function as large as we wish, by taking \( x_1 \) and \( x_2 \) arbitrarily large.
Here is what happens when lp() is applied to this problem:
eg.lp <- lp(objective.in=c(5, 8),
const.mat=matrix(c(1, 1, 1, 2), nrow=2),
const.rhs=c(2, 3), const.dir=c(">=", ">="),
direction="max")
eg.lp
Error: status 3
The condition of unboundedness will most often arise when constraints and/or the objective function have not been formulated correctly.
Sometimes a decision variable is not restricted to be nonnegative. The lp() function is not set up to handle this case directly. However, a simple device gets around this difficulty.
If \( x \) is unrestricted in sign, then \( x \) can be written as \( x_1 - x_2 \), where \( x_1 \geq 0 \) and \( x_2 \geq 0 \). This means that every unrestricted variable in a linear programming problem can be replaced by the difference of two nonnegative variables.
We will solve the problem:
\[ \min C = x_1 + 10x_2 \]
\[ x_1 + x_2 \geq 2 \]
\[ x_1 - x_2 \leq 3 \]
\[ x_1 \geq 0. \]
Noting that \( x_2 \) is unrestricted in sign, we set \( x_2 = x_3 - x_4 \) for nonnegative \( x_3 \) and \( x_4 \). Plugging these new variables into the problem gives
\[ \min C = x_1 + 10x_3 - 10x_4 \]
subject to the constraints
\[ x_1 + x_3 - x_4 \geq 2 \] \[ x_1 - x_3 + x_4 \leq 3 \]
\[ x_1 \geq 0, x_3 \geq 0, x_4 \geq 0. \]
Converting this to R code, we have
unres.lp <- lp(objective.in=c(1, 10, -10),
const.mat=matrix(c(1, 1, 1, -1, -1, 1), nrow=2),
const.rhs=c(2, 3), const.dir=c(">=", "<="))
unres.lp
Success: the objective function is -2.5
unres.lp$solution
[1] 2.5 0.0 0.5
Decision variables are often restricted to be integers. For example, we might want to minimize the cost of shipping a product by using 1, 2 or 3 different trucks. It is not possible to use a fractional number of trucks, so the number of trucks must be integer-valued.
Problems involving integer-valued decision variables are called integer programming problems. Simple rounding of a non-integer solution to the nearest integer is not good practice; the result of such rounding can be a solution which is quite far from the optimal solution.
The lp() function has a facility to handle integer valued variables using a technique called the branch and bound algorithm. The int.vec argument can be used to indicate which variables have integer values.
Find nonnegative \( x_1, x_2, x_3 \) and \( x_4 \) to minimize \[ C(x) = 2x_1 + 3x_2 + 4x_3 - x_4 \] subject to the constraints \[ x_1 + 2x_2 \geq 9 \] \[ 3x_2 + x_3 \geq 9 \] and \[ x_2 + x_4 \leq 10. \] Furthermore, \( x_2 \) and \( x_4 \) can only take integer values.
integ.lp <- lp(objective.in=c(2, 3, 4, -1),
const.mat=matrix(c(1, 0, 0, 2, 3, 1, 0,
1, 0, 0, 0, 1), nrow=3),
const.dir=c(">=", ">=", "<="),
const.rhs=c(9, 9, 10),
int.vec=c(2, 4))
integ.lp
Success: the objective function is 8
integ.lp$solution
[1] 1 4 0 6
Here is what happens when the integer variables are ignored:
wrong.lp <- lp(objective.in=c(2, 3, 4, -1),
const.mat=matrix(c(1, 0, 0, 2, 3, 1, 0,
1, 0, 0, 0, 1), nrow=3), const.dir=
c(">=", ">=", "<="), const.rhs=c(9, 9, 10))
wrong.lp
Success: the objective function is 8
wrong.lp$solution
[1] 0.0 4.5 0.0 5.5
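This relaxed solution also illustrates the earlier warning about rounding: rounding it to whole numbers either violates a constraint or gives a worse value than the integer solution above. For instance,
round(wrong.lp$solution)   # 0 4 0 6  (R rounds 4.5 down to 4 and 5.5 up to 6)
violates the first constraint, since \( 0 + 2 \times 4 = 8 < 9 \).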
The lp() function provides an interface to code written in C. There is another function in the linprog package called solveLP() which is written entirely in R; this latter function solves large problems much more slowly than the lp() function, but it provides more detailed output. We note also the function simplex() in the boot package.
It should also be noted that, for very large problems, the simplex method might not converge quickly enough; better procedures, called interior point methods, have been discovered recently, and are implemented in other programming languages, but not yet in R.