The bisection method is another approach to finding the root of a continuous function \(f(x)\) on an interval \([a,b]\). The method takes advantage of a corollary of the intermediate value theorem called Bolzano’s theorem which states that if the values of \(f(a)\) and \(f(b)\) have opposite signs, the interval must contain at least one root. The iteration steps of the bisection method are relatively straightforward, however; convergence towards a solution is slow compared to other root-finding methods.
Let’s say we want to solve the equation \(f(x) = 0\) and there are two poinst whuch are \(a\) dan \(b\), so there will be \(f(a)\) and \(f(b)\) which have opposite sighns. Since, we know about intermediate value theorem that explain \(f\) must have at least one rood in the interval \([a,b]\) as lonf as \(f\) is continuous on this interval. Then we use \(c=(a+b)/2\) or \(m = (a+b)/2\) to divides the interval in two. Hence, there are two possibilities either \(f(a)\) dan \(f(c)\) have opposite signs or \(f(c)\) and \(f(b)\) have opposite signs. The bisection algorithm is then applied recursively to the sub-interval where the sign change occurs. The mid-point of subintervals are annotated in the graph by both texts and blue straight lines, and the end points are denoted in dashed red lines. The root of each iteration is also plotted in the right margin of the graph.
The number of iteration n needed to converge towards a root withing the imposed tolerance ε is calculated as
Way to get this formula:
Every iteration the algorithm generates a series of intervals \([a_n, b_n]\) which is \(b_n\)-\(a_n\) = \((b-a)\over2^n\) = \(|c-x|\)
The tolerance \(ε\) is the absolute value of the difference between the actual root of the function x and approximation \(c\) : \(ε\) = \(|c-x|\).
From the algorithm above: \(|c-x|\) \(≤ε\)
Then we get: \((b-a)\over2^n\) \(≤ε\)
Then we obtain: \(2^n\) \(≥\) \((b−a)\overε\)
Then we apply the logarithm function to the both sides: \(log(2^n)\) \(≥\) \(log (b-a)\over ε\)
So, we get: \(n * log(2) ≥\) \(log (b-a)\over ε\) $ n≥$ $ (log (b-a)/ε)/log(2)$
Bisection algorithm is used to estimate the smallest value of \(ω\) which applied for the matrix A.
For example, f(x) function is given . The bisection algorthm works as follows:
if \(f(c)\) has the same sign as \(f(a)\) we replace \(a\) with \(c\) and we keep the same value for \(b\) if \(f(c)\) has the same sign as \(f(b)\), we replace \(b\) with \(c\) and we keep the same value for \(a\) 6. we go back to step 2. and recalculate \(c\) with the new value of \(a\) or \(b\)
The algorithm ends when the values of \(f(c)\) is less than a defined tolerance (e.g. 0.001). In this case we say that c is close enough to be the root of the function for which \(f(c)\) ~= 0.
The bisection method also work as follow:
Start
Define function f(x)
Input
If f(a)*f(b) > 0 print “Incorrect initial guesses” goto 3 End If
Do c = (a+b)/2
If f(a)*f(c) < 0 b = c Else a = c End If
while (fabs(f(c)) > e) // fabs -> returns absolute value
Print root as c
Stop
Suppose we want to find the root of the function \(x^3−2x−5\). It is often valuable to plot the function to find an interval containing the root.
func <- function(x) {
x^3 - 2 * x - 5
}
curve(func, xlim=c(-3,3), col='blue', lwd=1.5, lty=2)
abline(h=0)
abline(v=0)As we can see from the visualization above, we know that the root appears to lie close in 2. Next, we will use the interval [2,3] in Bisection Method.
NLRoot packagesOn of the most usefull package for the Bisection Method in R is the NLRoot package, here you will apply BFfzero() to find the solution, as you can see below:
library(NLRoot) # loading the package
BFfzero(func, 2, 3) # to find the solution (just as simple like this)## [1] 1
## [1] 2.094553
## [1] 1.262094e-05
## [1] "finding root is successful"
The output of the function shows the root is located at \(x=2.094553\), which matches the results found in previous examples on other root finding methods such as the Newton-Raphson method and Secant method.
We can also write a function that implements the bisection method using the iteration steps detailed at the beginning of the post.
Bakti_Bisection <- function(f, a, b, n = 1000, tol = 1e-7) {
# If the signs of the function at the evaluated points, a and b, stop the function and return message.
if (!(f(a) < 0) && (f(b) > 0)) {
stop('signs of f(a) and f(b) differ')
} else if ((f(a) > 0) && (f(b) < 0)) {
stop('signs of f(a) and f(b) differ')
}
for (i in 1:n) {
c <- (a + b) / 2 # Calculate midpoint
# If the function equals 0 at the midpoint or the midpoint is below the desired tolerance, stop the
# function and return the root.
if ((f(c) == 0) || ((b - a) / 2) < tol) {
return(c)
}
# If another iteration is required,
# check the signs of the function at the points c and a and reassign
# a or b accordingly as the midpoint to be used in the next iteration.
ifelse(sign(f(c)) == sign(f(a)),
a <- c,
b <- c)
}
# If the max number of iterations is reached and no root has been found,
# return message and end function.
print('Too many iterations')
}Now, we can apply our function find the root of the function \(x^3−2x−5\) as we have done above by using NLRoot packages.
## [1] 2.094552
## Warning: package 'animation' was built under R version 3.6.3
ani.options(nmax = 30)
#default example
f = function(x) x^3−2*x−5
xx= bisection.method(f, c (2,3), tol = 1e-7 )The Newton-Raphson method is a method for approximating the roots of polynomial equations of any order. In fact the method works for any equation, polynomial or not, as long as the function is differentiable in a desired interval. ’s often become increasingly better approximations of the function’s root. Newton’s method usually use for solving equations is another numerical method for solving an equation \(f(x)=0\). It is based on the geometry of a curve, using the tangent lines to a curve. As such, it requires calculus, in particular differentiation. Roughly, the idea of Newton’s method is as follows.
So, let \(f(x)\) be a differentiable function. Slect a point \(x_0\) based on a first approximation to the root. To approximate we can calulate using this formula: \(x_{(n+1)}\) = \(n_n\) - \(f(x_n)\over f'(x_n)\)
Newton’s method is a way to find a solution to the equation to as many decimal places as you want. It is what is called an "iterative procedure,’’ meaning that it can be repeated again and again to get an answer of greater and greater accuracy. Iterative procedures like Newton’s method are well suited to programming for a computer.The Newton algorithm consists in replacing the function to be minimized to get the new point. In the quasi Newton algorithm, the second order terms in the expansion are replaced by some approximations.
Suppose we want to find the root of the function \(f(x)=x^2−9 \text{ for, } x>0\). We assume that \(f\) is smooth (differentiable). Our goal is to find a root of \(f\) i.e a \(x\) value for which \(f(x)=0\). We illustrate the method with a simple example for which you know the root.
# first generate the sequence
x.seq <- seq(from = 0.1, to = 10, length.out = 101)
f <- sapply(x.seq, function(x) x^2 - 9)## Warning: package 'ggplot2' was built under R version 3.6.3
# put the x and y values in a data frame
plot_data <- data.frame("x" = x.seq, "f" = f)
head(plot_data)## x f
## 1 0.100 -8.990000
## 2 0.199 -8.960399
## 3 0.298 -8.911196
## 4 0.397 -8.842391
## 5 0.496 -8.753984
## 6 0.595 -8.645975
# create the base layer of ggplot with geom_Line
plot_f <- ggplot(plot_data, aes(x = x, y = f)) +
geom_line(color = "red", size = 1.0) +
labs(x = "x", y = "f(x) = x^2 - 9")
# beautify : Increase font size in axes
plot_f + theme(axis.text.x =
element_text(face = "bold",size = 12),
axis.text.y =
element_text(face = "bold", size = 12),
axis.line = element_line(colour = "black",
size = 1, linetype = "solid")) By considering the visualization above, it’s looks like the function is 0 around 3. How do we know that exactly correct, let’s try zoom the visualization.
plot_f +
geom_hline(yintercept = 0.00, color = "green", size = 1) +
coord_cartesian(xlim = c(2,4), ylim = c(-1, 3))Next, let us visualize this to find the root of our function \(f(x)=x^2−9.\) First we need to create a function that returns the tangent line to the function f at some point a.
# Let us create our tangent line function at some point a
f_tangent <- function(x, f, df, a){
# We need the function and its derivative df
# a is the point where we create the tangent to the function
# returns the tangent line
df(a)*(x - a) + f(a)
}Then, plot the tangent line to the initial value \(f\) at \(x(0)=4\) which will be our first guess of the root.
# add the tangent line at 4 to our data frame
plot_data$f.tan <- sapply(plot_data$x, f_tangent, f = function(x) x^2 - 9, df = function(x) 2*x, a = 4)
# create the base layer of ggplot with geom_Line
plot_tangent_line <- ggplot(plot_data, aes(x = x)) +
geom_line(aes(y = f), color = "red", size = 0.95) +
geom_line(aes(y = f.tan), color = "blue", size = 0.95) +
labs(x = "x", y = "f(x) and tangent line at 4")
plot_tangent_lineNext, let us zoom in on \(2≤x≤4\) and try to see where the tangent line is equal to 0.
# zoom in between 3 and 4
plot_tangent_line +
# add a horizontal line at 0
geom_hline(yintercept = 0.00, color = "green", size = 1) +
coord_cartesian(xlim = c(2,4), ylim = c(-5, 10))By considering the visualization above, looks like the tangent line is 0 around 3.1. This will be our updated guess at the root. Understand the rationale behind this. If f is approximately equal to its tangent line at 4, then the root of \(f\) is approximately equal to the root of the tangent line. Can you see this in the plot above?
The exact value when the tangent line is 0 is given by
\[ x^{(1)}=4−{f(4) \over f′(4)} =3.125\]
Then, add the tangent line at 3.125 to our data frame
plot_data$f.tan <- sapply(plot_data$x, f_tangent, f = function(x) x^2 - 9, df = function(x) 2*x, a = 3.125)
# create the base layer of ggplot with geom_Line
plot_tangent_line <- ggplot(plot_data, aes(x = x)) +
geom_line(aes(y = f), color = "red", size = 0.95) +
geom_line(aes(y = f.tan), color = "blue", size = 0.95) +
labs(x = "x", y = "f(x) and tangent line at 4")
plot_tangent_lineNow let us zoom in on around 3 and try to see where the tangent line is equal to 0.
# zoom in between 2.9 and 3.1
plot_tangent_line +
# add a horizontal line at 0
geom_hline(yintercept = 0.00, color = "green", size = 1) +
coord_cartesian(xlim = c(2.9,3.1), ylim = c(-2.5, 2.5))Looks like the tangent line is 0 around 3. In just two steps we are very close. This will be our updated guess at the root. The exact value when the tangent line is 0 is given by
\[ x^{(2)}=x^{(1)}−{f(x^{(1)}) \over f′(x^{(1)})}\] \[ x^{(2)}=3.125−{f(3.125) \over f′(3.125)} = 3.0025\]
We can continue until our guesses keep getting smaller and smaller.
NLRoot packagesThe same packages as we use to the Bisection Method, this packages also available for Newton’s method.
library(NLRoot)
# first argument is the function whose root we need to find
# second argument is the the derivative of the function whose
# root we need to find.
# third argument is an initial guess
NIMfzero(function(x) x^2-9, function(x) 2*x, x0 = 4)## [1] 3
## [1] 1.081801e-12
## [1] "finding root is successful"
In this lesson we will construct two functions that you will need for this course and begin to explore the R programming language. Run through the Chapter 1 of the book which talks about model building for a one-variable optimization problem. Example 1.1 (Meerschaert) A pig weighing 200 lbs gains 5 lbs per day and costs 45 cents per day to keep (feed and board). The market price for pigs is 65 cents per pound, but is falling 1 cent per day. When should the pig be sold?
\(t\) = time (days)
\(w\) = weight of the pig(lbs)
\(p\) = price for pigs ($/lb)
\(C\) = cost of keeping pig t days ($)
\(R\) = revenue obtained by selling pig ($)
\(P\) = profit from selling pig ($)
Assumptions:
\(w=200+5t\)
\(p=0.65−0.01t\)
\(C=0.45t\)
\(R=p⋅w\)
\(P=R−C\)
\(t>=0\)
Objective: Maximize profits from the sell of the pig Let x be the time to sell the pig and develop the profit function using x as the independent variable. In the case of an optimization, the variable on which you will make a decision on is generally called a decision variable. There can be multiple decision variables in a problem.
Given the description and mathematical explanation of the variables, lets write out the profit function:
Lets use R to “see” what the profit function looks like over the next 20 days.
profit = function (x){
return((0.65-0.01*x)*(200+5*x)-.45*x)
}
x = seq(0,20,1)
plot(x,profit(x),type="o")## [1] 133.2
As you can see, there is a good and bad day to sell the pig if our assumptions remain the same over the next 20 days. Although we could use this graph in order to select the “winner”, we want to be able to use this tool (R programming language) to assist us in ensuring that we have all of the information needed to make a decision. That means that we need to get our hands dirty and figure out how to use our tools (in this case, R).
We will create two functions that will be able to solve for an optimal point in R using mathematics / programming that we already know how to do. From calculus, you should remember the definition of the derivative and why tweaking it a little can get us a better estimation of the derivative of a function:
Using this simple formula, let’s build a derivative of the profit function that we defined above. Notice how our function fprime \((f,x)\) will work for ANY function that we assign to it.
Although we could use the graph above to find the solution for this particular problem, it is more appropriate to use math to make our solution automatic. There are many root-finding algorithms out there and feel free to use another, but we will use a simple algorithm known as the bisection method. This method only requires that you select two points that have opposite signs: if \(f(a)=−\) then \(f(b)=+\).
The simple idea of the bisection method is that you start with an interval \([a,b]\) that has a positive and negative value for the function \(f(a)\) and \(f(b)\) - it does not matter which one is positive. You cut the interval in half and proceed with the interval that keeps one side positive and the other negative. In Figure 1, we start on the interval \([2,9]\) which works because \(f(2)\) is positive and \(f(9)\) is negative. The first step cuts the interval in half and the decision must be made to continue with \([2,5.5]\) or \([5.5,9]\). It should be pretty obvous from the picture that \([2,5.5]\) is the right answer (we want to keep a root in the interval), but from an algorithm perspective, we simply test \(f(5.5)\) and find out that it is negative which causes us to replace 9 (\(f[9]\) was negative) with 5.5. We would continue this algorithm until we are satisfied that the width of the interval is small enough to make a conclusion about the value of the root.
bisection = function(f,a,b,tol=0.0001){
if (f(a)*f(b) > 0){
return ("Boundary Conditions Not Met")
}
else{
middle = a
while (abs(f(middle))>tol){
middle = (a+b)/2
if (f(middle)*f(a)>0) (a = middle)
else (b = middle)
x=middle
y=f(middle)
## if you want to "see" what happens at every step, take off the # of the next line ##
#cat(sprintf("x-Val: %.4f ; f(x-val): %.4f\n",x,y))
}
return (middle)
}
}
#Example of a finding a root of a function
f = function (x){x^3-5} #sinlge line function definition
zero=bisection (f,-10,20)
x=seq(-1,2,0.01)
plot(x,f(x),"l")
abline(h=0,col="red")
abline(v=zero,col="blue")## [1] 1.709967
Use the tools at your disposal to solve the problem. Be able to make several alternatives to the proposed model and investigate the relationship between multiple variables. When do we sell the pig?
## [1] 8.000488
As you can see, we got approximately the same answer as the book. The difference is that our solution uses an estimation of the derivative instead of the closed-form solution as well as a solver for finding the day \(x\) where the profit is maximized (profit function derivative = 0). In the case of most problems, you will use numerical methods (like we did) because real world problems do not ususally have a easy answer. To become a good problem solver, you are going to need to invest in your coding abilities because you will usually need to develop some original code to help you solve a particular problem.
Let’s go back and examine more closely one of our original assumptions that may or may not hold to see how our solution is afftected by a different assumption. Does the optimal day to sell change if we assume some different price of the pig. Here we let the pig price drop to recreate the numbers in the book, but we are able to do so much more which will be demonstrated in the next analysis.
#Price falling per day = p
p = seq(0.008,0.012,0.001)
ans = array(0,length(p))
for (i in 1:length(p)){
profit = function (x){
return((0.65-p[i]*x)*(200+5*x)-.45*x)
}
dProfit = function(x){fprime(profit,x,)}
ans[i] = bisection(dProfit,0,20,0.0001)
}
print(ans)## [1] 15.000000 11.110840 8.000488 5.454102 3.332520
Since we are also concerned about the growth rate of the pig, let’s conduct see how the days to sell are affected by the growth rate. The weight (\(w\)) is calculated using the assumption that the pig will grow 5lbs per day so that: \(w=200+gt\) where \(g=5\). Let’s see what happens if \(g\) is between 3 and 7. Remember the bisection method needs two boundary points that have the opposite values and if the growth rate is too small, think about what that does to the profit. Experiment with large and small growth rates and the profit function to figure it out for yourself if you don’t see it. We needed to make the starting point of the bisection method to the left (negative number) in order for it to find the zero for 3lbs and 3.5lbs per day.
#Growth rate of the pig = g
g = seq(1,12,0.5)
ans = array(0,length(g))
for (i in 1:length(g)){
profit = function (x){
return((0.65-0.01*x)*(200+g[i]*x)-.45*x)
}
dProfit = function(x){fprime(profit,x,)}
ans[i] = bisection(dProfit,-100,50,0.0001)
}
print(ans)## [1] -90.002441 -49.169922 -28.749084 -16.499329 -8.332825 -2.500916
## [7] 1.875305 5.278778 8.000183 10.227203 12.083435 13.653564
## [13] 14.999390 16.166687 17.187500 18.088150 18.889236 19.605637
## [19] 20.249939 20.833588 21.363449 21.847534 22.291565
plot(g,ans,"o",xlab="g(growth/day)",ylab="x(Days to Sell)")
title("Sensitivity of Growth Rate of Pig")