This project originated in an Elementary Numerical Methods course at The Ohio State University in 2007, initially implemented in Java. The work was later translated to R to enhance its statistical applications. It focuses on fundamental numerical computation methods for solving linear systems, root-finding algorithms, and numerical integration.

The original text, code, and methodology have been refined and enhanced using AI tools to improve readability, efficiency, and accuracy.


1 Numerical Analysis

Numerical analysis is a field of mathematics and computer science that focuses on developing algorithms for obtaining accurate numerical approximations to problems involving continuous variables, particularly those with no tractable algebraic solution. The primary goal is to design and analyze techniques that provide efficient and reliable approximate solutions to complex problems.


1.1 Types of Numerical Methods

1.1.1 Direct Methods

These algorithms compute the solution in a finite number of steps and would yield the exact result if performed with infinite precision arithmetic.

1.1.2 Iterative Methods

Unlike direct methods, iterative approaches do not necessarily terminate in a finite number of steps. Instead:

  • They generate successive approximations from an initial guess.
  • They converge toward the exact solution in the limit.
  • A convergence criterion (often based on residual error) determines when the approximation is sufficiently accurate.

By balancing computational efficiency with precision, numerical analysis enables the practical solution of analytically intractable problems.
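To make the distinction concrete, here is a minimal sketch (illustrative only, not part of the original course code) that solves the same \(2 \times 2\) linear system both ways: solve() as a direct method, and a hand-rolled Jacobi loop as an iterative method.

A <- matrix(c(4, 1, 2, 5), nrow = 2)  # diagonally dominant, so Jacobi converges
b <- c(1, 2)

# Direct method: a fixed number of steps (Gaussian elimination inside solve)
solve(A, b)                           # exact up to floating point: (1/18, 7/18)

# Iterative method: Jacobi iteration refines an initial guess
x <- c(0, 0)
for (k in 1:100) {
  x_new <- (b - (A - diag(diag(A))) %*% x) / diag(A)
  if (max(abs(x_new - x)) < 1e-10) break   # convergence criterion
  x <- x_new
}
as.vector(x)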


2 Locating Roots of Equations

A root-finding algorithm is a numerical method for identifying a value \(x\) such that \(f(x) = 0\) for a given function \(f\). Such an \(x\) is called a root (or zero) of \(f\).

Key Properties of Root-Finding Methods

  1. Iterative Approach: Most methods generate a sequence of approximations \(\{x_n\}\) that converge to the root as \(n \to \infty\).

  2. Initial Guess: Requires a starting value \(x_0\) near the suspected root.

  3. Approximate Solutions: The iteration terminates once the result meets a predefined tolerance, yielding an approximation rather than an exact solution.


Convergence Criteria

The algorithm stops when one (or a combination) of the following is satisfied:

  • Absolute Error: \(\left|x_{n+1} - x_n\right| < \epsilon\)

  • Relative Error: \(\displaystyle{\frac{\left|x_{n+1} - x_n\right|}{\left|x_n\right|} < \epsilon}\)

  • Residual Threshold: \(\left|f(x_n)\right| < \epsilon\)

where \(\epsilon\) is a user-defined tolerance.
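These criteria translate directly into code. The helper below is a hypothetical name used only for exposition, combining all three tests:

# Illustrative stopping-rule helper (hypothetical, for exposition only)
has_converged <- function(x_new, x_old, fx, eps = 1e-8) {
  abs(x_new - x_old) < eps ||                 # absolute error
    abs(x_new - x_old) < eps * abs(x_old) ||  # relative error
    abs(fx) < eps                             # residual threshold
}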

Common Challenges

  • Convergence Dependence: Speed and success often depend on the choice of \(x_0\).
  • Multiple Roots: Some roots may be missed if the initial guess is poorly chosen.
  • Non-Convergence: Not all functions guarantee convergence for arbitrary \(x_0\).

Locating Roots of Equations in Statistics

In statistics, many important problems reduce to solving equations where the solution is the root of some function \(f(x)\).
Since closed-form solutions are often unavailable, numerical root-finding methods (like bisection, Newton–Raphson, or fixed-point iteration) are widely used.

Common Applications

  • Maximum Likelihood Estimation (MLE):
    The estimator \(\hat\theta\) is found by solving
    \[ \frac{d}{d\theta} \ell(\theta) = 0, \]
    where \(\ell(\theta)\) is the log-likelihood. Root-finding locates the parameter value where the score function equals zero.

  • Hypothesis Testing:

    Test statistics sometimes require solving nonlinear equations to find critical values (e.g., likelihood ratio tests).

  • Confidence Intervals:
    Intervals based on inversion of test statistics involve solving
    \[ T(\theta) = c, \]
    where \(T(\theta)\) is a test statistic and \(c\) a cutoff.

  • Method of Moments:
    Equating sample moments to population moments produces equations of the form
    \[ m_k(\theta) - \hat m_k = 0, \]
    whose roots give parameter estimates.

  • Distribution Quantiles:
    Finding quantiles \(q\) requires solving
    \[ F(q) = p, \]
    where \(F(x)\) is the cumulative distribution function (CDF). Root-finding locates \(q\).


👉 In short: Locating roots of equations is central in statistics, especially for estimation, inference, and distributional calculations where explicit formulas are not available.
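As a concrete illustration of the quantile case, base R's uniroot() can recover a normal quantile directly from the CDF equation \(F(q) - p = 0\):

# Find the 0.975 quantile of the standard normal by root-finding
p <- 0.975
q <- uniroot(function(q) pnorm(q) - p, interval = c(-10, 10))$root
q          # ~1.959964
qnorm(p)   # closed-form check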

**Helper Function: Visualize a Function and Its Roots**

#' Visualize a Function and Its Roots
#'
#' Plots a function over a specified interval and marks its roots on the x-axis.
#'
#' @param f The function to be plotted (must accept vector inputs).
#' @param a Lower bound of the plotting range.
#' @param b Upper bound of the plotting range.
#' @param root Optional vector of root locations to mark.
#' @param title Plot title (optional).
#'
#' @return None (produces a plot)
#' @export
#'
#' @examples
#' f <- function(x) x^2 - 4
#' plot_root(f, a = -3, b = 3, root = c(-2, 2), title = "Quadratic Function")

plot_root <- function(f, a, b, root = NULL, title = NULL) {
  if (!is.function(f)) stop("f must be a function")
  if (!is.numeric(a) || !is.numeric(b) || length(a) != 1 || length(b) != 1) stop("'a' and 'b' must be numeric scalars")
  if (a >= b) stop("'a' must be less than 'b'")
  
  x <- seq(a, b, length.out = 100)
  y <- f(x)
  
  y_range <- range(y, na.rm = TRUE)
  y_pad <- max(0.1 * diff(y_range), 1e-6)
  y_limits <- c(y_range[1] - y_pad, y_range[2] + y_pad)
  
  plot(x, y,
       type = "l",
       lwd = 3,
       col = "tomato",
       xlim = c(a, b),
       ylim = y_limits,
       xlab = "x",
       ylab = "f(x)",
       main = title,
       col.main = "tomato",
       axes = FALSE,
       panel.first = {
         grid(col = "gray90")
         axis(1, pos = 0)
         axis(2, pos = 0)
       })
  
  if (!is.null(root) && length(root) > 0) {
    points(root, rep(0, length(root)),
           col = "white",
           bg = "tomato",
           pch = 21,
           cex = 1)
    
    axis(1, at = round(root, 2), 
         pos = (ceiling(max(f(x))) - floor(min(f(x)))) / 10,
         tick = FALSE,
         col.axis = "darkred")
  }
}


# save to the working directory getwd(), ls()
dump("plot_root", file = "plot_root.R")

Plot the roots of \(x^2 - 1\) on \([-2, 2]\)

Solution:

\[ \begin{aligned} x^2 - 1 &= 0 \\ x^2 &= 1 \\ x &= \pm 1 \end{aligned} \]

Thus, the roots are at \(x = -1\) and \(x = 1\).

On the interval \([-2,2]\), the parabola \(y = x^2 - 1\) intersects the \(x\)-axis at these two points.

plot_root(f = function(x){x^2 - 1}, 
          a = -2, b = 2, root = c(-1, 1), 
          title = expression(paste("y = " * x^2 - 1)))


2.1 The Bisection Method

The bisection method (also known as the interval halving method, binary search method, or dichotomy method) is a robust root-finding algorithm that systematically narrows down an interval containing a root of a continuous function.

Key Properties

  • Bracketing Method: Requires two initial points \(a\) and \(b\) such that \(f(a)\) and \(f(b)\) have opposite signs (i.e., \(f(a) \cdot f(b) < 0\)), guaranteeing a root exists in \([a,b]\) by the Intermediate Value Theorem.

  • Linear Convergence: The error decreases by approximately half each iteration, with convergence rate \(O(1/2^n)\).

  • Absolute Error Bound: After \(n\) iterations, the maximum error is \(\frac{b-a}{2^n}\) (see the sketch below).
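Because this bound is explicit, the number of iterations needed for a given tolerance can be computed in advance. A minimal sketch (the helper name bisection_steps is ours):

# Solve (b - a) / 2^n < eps for n
bisection_steps <- function(a, b, eps) ceiling(log2((b - a) / eps))
bisection_steps(a = 0, b = 1, eps = 1e-8)   # 27 iterations suffice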


Algorithm Steps:

  1. Initialization: Choose \(a_0\), \(b_0\) with \(f(a_0)f(b_0) < 0\)

  2. Iteration: For each step \(n\):

     • Compute midpoint \(c_n = \frac{a_n + b_n}{2}\)
     • Evaluate \(f(c_n)\)
     • Update interval:
       • If \(f(a_n)f(c_n) < 0\), set \(b_{n+1} = c_n\)
       • Else, set \(a_{n+1} = c_n\)

  3. Termination: Stop when \(\left|b_n - a_n\right| < \epsilon\) (desired tolerance)

Advantages & Limitations

| Advantages | Disadvantages |
|---|---|
| ✓ Guaranteed convergence for continuous functions | ✗ Slow convergence (linear rate) |
| ✓ Simple to implement | ✗ Requires bracketing interval |
| ✓ Error bound known exactly | ✗ Doesn’t generalize to higher dimensions |

Practical Considerations

  • Often used to generate initial guesses for faster methods (e.g., Newton-Raphson)

  • Optimal for functions where evaluations are computationally expensive

  • Convergence criteria typically combine absolute and relative error thresholds: \[ \text{Stop if } \left|b_n - a_n\right| < \epsilon_{\text{abs}}+\epsilon_{\text{rel}}\left|c_n\right| \]


As described in (Kincaid & Cheney, 2008, p. 76):

#' Bisection Method for Root Finding (Modified for Plotting)
#'
#' Finds a root of a continuous function within a given interval, with optional plotting.
#'
#' @param f The function to find roots for.
#' @param a Lower bound of the interval.
#' @param b Upper bound of the interval.
#' @param tol Tolerance for convergence (default = 1e-8).
#' @param max_iter Maximum number of iterations (default = 1000).
#' @param plot Whether to generate a plot (default = FALSE).
#' @param title Plot title (optional).
#' @return A list with the approximate root, the function value there, the iteration count, and a convergence flag.
#' @export
#'
#' @examples
#' bisection(cos, 0, 2, plot = TRUE, title = "cos(x)")

bisection <- function(f, a, b, tol = 1e-8, max_iter = 1000, plot = FALSE, title = NULL, ...) {
  
  if (f(a) * f(b) >= 0) {
    stop("Function must have opposite signs at endpoints (f(a) * f(b) < 0)")
  }
  
  # Keep the original endpoints for plotting; a and b shrink during iteration
  a0 <- a
  b0 <- b
  
  iterations <- 0
  root <- NULL
  converged <- FALSE
  
  while (iterations < max_iter) {
    iterations <- iterations + 1
    c <- (a + b) / 2
    
    fc <- f(c)
    
    if (abs(fc) < tol || (b - a)/2 < tol) {
      root <- c
      converged <- TRUE
      break
    }
    
    if (f(a) * fc < 0) {
      b <- c
    } else {
      a <- c
    }
  }
  
  if (plot && converged) {
    plot_root(f, a0, b0, root, title, ...)
  }
  
  list(root = root, value = if (!is.null(root)) f(root) else NULL, iterations = iterations, converged = converged)
}

# save to the working directory getwd(), ls()
dump("bisection", file = "bisection.R")

Example Applications:

  1. The root of the \(\mathrm{cosine}\) function on \(\left[\frac{-\pi}{3}, \frac{4\pi}{3}\right]\)
# Example usage with cos(x)
bisection(f = function(x){return(cos(x))}, a = -pi/3, b = 4*pi/3, plot = TRUE, title = expression(paste("y = " * cos(x))))

## $root
## [1] 1.570796
## 
## $value
## [1] 6.123234e-17
## 
## $iterations
## [1] 1
## 
## $converged
## [1] TRUE

  2. Roots of Quadratic Functions and the Bisection Method

A general quadratic function: \[ f(x) = ax^2 + bx + c \quad (a \neq 0) \]

has roots given by the quadratic formula: \[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]

Three possible cases:

  1. Two distinct real roots: When discriminant \(D = b^2 - 4ac > 0\)

  2. One real double root: When \(D = 0\) (vertex touches x-axis)

  3. Complex conjugate roots: When \(D < 0\)


Bisection Method Behavior with Quadratics

Key Characteristics:

  • Single-Root Finding:
    • The bisection method can only find one root per execution.
    • Which root is found depends entirely on the initial bracketing interval \([a,b]\).

  • Interval Selection Rules (for two real roots at \(r_1 < r_2\)):
    • To find \(r_1\): choose \(a < r_1\) and \(b \in (r_1,r_2)\).
    • To find \(r_2\): choose \(a \in (r_1,r_2)\) and \(b > r_2\).
    • Both roots cannot be found in a single run.

  • Special Cases:
    • Double root: the method fails unless \(a\) or \(b\) equals the root, since \(f(x)\) does not change sign (demonstrated in the sketch below).
    • No real roots: the method errors out, since no bracketing interval exists.
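The double-root failure mode is easy to demonstrate with the bisection() implementation above:

# f(x) = (x - 1)^2 has a double root at x = 1 but never changes sign
f_double <- function(x) (x - 1)^2
tryCatch(bisection(f_double, 0, 2), error = function(e) conditionMessage(e))
## [1] "Function must have opposite signs at endpoints (f(a) * f(b) < 0)"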


Practical Example

Consider the quadratic function

\[ f(x) = x^2 - 3x + 2. \]

Factoring gives \[ f(x) = (x-1)(x-2), \]

so the roots are \[ x = 1 \quad \text{and} \quad x = 2. \]

f <- function(x) {x^2 - 3*x + 2}
# Find left root (x=1):
bisection(f, 0, 1.5)  # Returns ≈1
## $root
## [1] 1
## 
## $value
## [1] -7.450581e-09
## 
## $iterations
## [1] 26
## 
## $converged
## [1] TRUE
# Find right root (x=2):
bisection(f, 1.5, 3)  # Returns ≈2
## $root
## [1] 2
## 
## $value
## [1] -7.450581e-09
## 
## $iterations
## [1] 26
## 
## $converged
## [1] TRUE
# Define the quadratic function
f <- function(x) x^2 - 3*x + 2

# Find both roots using bisection
root1 <- bisection(f, 0, 1.5, tol = 1e-8)$root
root2 <- bisection(f, 1.5, 3, tol = 1e-8)$root

# Plot the function with both roots marked
plot_root(f, 
          a = -1,  # Adjusted for better visualization
          b = 4,     # Extended to see full behavior
          root = c(root1, root2),  # Using found roots, not hardcoded values
          title = "Quadratic Function f(x) = x² - 3x + 2")

# Verify the roots
cat(sprintf("Found roots at x = %.6f and x = %.6f\n", root1, root2))
## Found roots at x = 1.000000 and x = 2.000000

  3. Root of the logarithmic function: \(\displaystyle f(x)=\ln\!\left(\frac{1+x}{\,1-x^2\,}\right)\)

To find the roots, solve \(f(x)=0\): \[ \ln\!\left(\frac{1+x}{1-x^2}\right)=0 \;\Longleftrightarrow\; \frac{1+x}{1-x^2}=1 \;\Longleftrightarrow\; 1+x = 1 - x^2 \;\Longleftrightarrow\; x^2 + x = 0 \;\Longleftrightarrow\; x(x+1)=0. \]

Candidate solutions: \(x=0\) or \(x=-1\).

Domain check: The argument of the log must be positive and \(1-x^2\neq 0\).
At \(x=0\): \(\frac{1+0}{1-0}=1>0\) ✓ (valid).
At \(x=-1\): numerator \(1+(-1)=0\Rightarrow \ln(0)\) undefined ✗ (invalid).

Conclusion: The only root is \[ \boxed{x=0}. \]

f <- function(x) log((1 + x)/(1 - x^2))
bisection(f, a = -0.99, b = 0.99, tol = 1e-10, plot = FALSE)
## $root
## [1] 0
## 
## $value
## [1] 0
## 
## $iterations
## [1] 1
## 
## $converged
## [1] TRUE

  4. Board in hall problem

Taken from (Kincaid & Cheney, 2008, p. 89):

In a building, two intersecting halls with widths \(\omega_1 = 9\) feet and \(\omega_2 = 7\) feet meet at an angle \(\alpha = 125^{\circ}\), as shown:

[Figure: Board in hall problem]

Assuming a two-dimensional situation, what is the longest board that can negotiate the turn? Ignore the thickness of the board. The relationship between the angles \(\beta, \gamma\) and the length of the board \(\mathscr{l} = \mathscr{l}_1 + \mathscr{l}_2\) is

\[ \mathscr{l}_1 = \omega_1 \cdot \mathrm{csc}(\beta) \]

and

\[ \mathscr{l}_2 = \omega_2 \cdot \mathrm{csc}(\gamma) \]

where

\[ \beta = \pi - \alpha - \gamma \]

thus

\[ \mathscr{l} = \omega_1 \cdot \mathrm{csc}(\pi - \alpha - \gamma) + \omega_2 \cdot \mathrm{csc}(\gamma). \]

The maximum length of the board that can make the turn is found by minimizing \(\mathscr{l}\) as a function of \(\gamma\) . Taking the derivative and setting \(\frac{d\mathscr{l}}{d\gamma} = 0\), we obtain

\[ \omega_1 \cdot \mathrm{cot}(\pi - \alpha - \gamma) \cdot \mathrm{csc}(\pi - \alpha - \gamma) - \omega_2 \cdot \mathrm{cot}(\gamma) \cdot \mathrm{csc}(\gamma) = 0 \]

Substitute in the known values and numerically solve the nonlinear equation.

Solution:

\[ 9 \cdot \mathrm{cot}(\pi - \frac{125^{\circ}\cdot \pi}{180^{\circ}} - \gamma) \cdot \mathrm{csc}(\pi - \frac{125^{\circ}\cdot \pi}{180^{\circ}} - \gamma) - 7 \cdot \mathrm{cot}(\gamma) \cdot \mathrm{csc}(\gamma) = 0 \]

Approximate the value of angle \(\gamma\)

gamma <- bisection(function(gamma) {
  9 * (1/tan(pi - (125*pi/180) - gamma)) * (1/sin(pi - (125*pi/180) - gamma)) -
    7 * (1/tan(gamma)) * (1/sin(gamma))
}, a = 0, b = pi/4)

gamma
## $root
## [1] 0.4511959
## 
## $value
## [1] 5.061549e-07
## 
## $iterations
## [1] 27
## 
## $converged
## [1] TRUE

then substitute it in the length equations, \(\mathscr{l}_1\) and \(\mathscr{l}_2\)

# l2
(l2 <- 7 * 1/sin(gamma$root))
## [1] 16.0535
# l1
(l1 <- 9 * 1/sin(pi - (125*pi/180) - gamma$root))
## [1] 18.47772
# total length l
(l <- l1 + l2)
## [1] 34.53122

2.2 Method of False Position (Regula Falsi)

The False Position Method, also known as Regula Falsi, is a root-finding algorithm that improves upon the bisection method. Instead of halving the interval, it uses linear interpolation (a secant line) between the endpoints to estimate the root. This often leads to faster convergence while still ensuring the root remains bracketed between the endpoints.

Mathematical Formulation

Given a continuous function \(f(x)\) with \(f(a)\) and \(f(b)\) having opposite signs, the next estimate \(c\) is computed as:

\[ c = b - \frac{f(b)(b - a)}{f(b) - f(a)} \]

This formula corresponds to the x-intercept of the secant line connecting the points \((a,f(a))\) and \((b,f(b))\).


R Implementation

As described in (Burden & Faires, 2011, p. 73):

#' Regula Falsi (False Position) Method for Root Finding
#'
#' Finds a root of a continuous function within a specified interval using the false position method.
#' Uses plot_root() for visualization if plot=TRUE.
#'
#' @param f The function for which the root is sought (must be continuous)
#' @param a Lower bound of the interval
#' @param b Upper bound of the interval
#' @param e Tolerance for convergence (default = 1e-6)
#' @param NMAX Maximum number of iterations (default = 101)
#' @param plot Logical indicating whether to plot the results using plot_root (default = FALSE)
#' @param title Optional title for the plot
#'
#' @return The approximate root of the function in the given interval
#' @export
#'
#' @examples
#' f <- function(x) x^2 - 4
#' regula.falsi(f, a = 1, b = 3, plot = TRUE, title = "Finding sqrt(4)")

regula.falsi <- function(f, a, b, e = 1e-6, NMAX = 101, plot = FALSE, title = NULL) {
  # Input validation
  if (a >= b) stop("'a' must be less than 'b'")
  if (f(a) * f(b) >= 0) stop("Function must have opposite signs at endpoints")
  
  from <- a
  to <- b
  
  # Initialize variables
  fa <- f(a)
  fb <- f(b)
  root_history <- numeric(NMAX)
  
  # Main iteration loop
  for (i in 1:NMAX) {
    # Calculate new point using false position formula
    p <- b - fb * (b - a) / (fb - fa)
    root_history[i] <- p
    fp <- f(p)
    
    # Check for convergence
    if (abs(fp) < e) {
      if (plot) {
        # Use plot_root function for visualization
        plot_root(f, from, to, root = p, title = title)
        
        # Add iteration points in a different color
        points(root_history[1:i], rep(0, i), 
               col = "darkorange", pch = 1, lwd = 1)
      }
      return(p)
    }
    
    # Update interval
    if (fp * fb < 0) {
      a <- b
      fa <- fb
    }
    b <- p
    fb <- fp
  }
  
  warning("Maximum iterations reached without convergence.")
  return(NA)
}

# Save function to file
dump("regula.falsi", file = "regula.falsi.R")

The root of the natural logarithm \(\ln(x)\) on \(\left[\frac{\pi}{16}, \; \frac{31\pi}{16}\right]\)

regula.falsi(f = function(x){log(x)}, a = pi/16, b = 31*pi/16, plot = TRUE,
             title  = expression(paste("y = " * log(x))))

## [1] 1.000001

2.3 Newton–Raphson Method

The Newton–Raphson method is an efficient iterative algorithm for finding increasingly accurate approximations to the roots of a real-valued function.
By exploiting derivative information, it achieves quadratic convergence near the root (provided the initial guess is sufficiently close and \(f'(x) \neq 0\)).


Formula

Given a function \(f(x)\) and its derivative \(f'(x)\), the iteration is defined as:

\[ x_{n+1} \;=\; x_n - \frac{f(x_n)}{f'(x_n)}. \]

Starting from an initial guess \(x_0\), this process generates a sequence \(\{x_n\}\) that converges to a root of \(f(x)=0\).


Stopping Criteria

The iteration continues until one of the following conditions is met:

  • \(|f(x_n)| < \epsilon\) (the function value is sufficiently small), or
  • \(|x_{n+1} - x_n| < \epsilon\) (successive approximations are sufficiently close).

Here, \(\epsilon > 0\) is a chosen tolerance.


Notes

  • Convergence is fast (quadratic) near the root.
  • A poor initial guess may lead to divergence or convergence to the wrong root.
  • If \(f'(x)\) is very small or zero, the method may fail.

R Implementation

#' Newton-Raphson Method (Simplified Robust Version)
#' Finds a root of a function using Newton-Raphson iteration.
#' The derivative is obtained symbolically via the Deriv package.
#'
#' @param f Function whose root is to be found
#' @param x0 Initial guess (default = 1)
#' @param tol Tolerance for convergence (default = 1e-8)
#' @param max_iter Maximum number of iterations (default = 100)
#'
#' @return A list: root (or NA), number of iterations, and convergence status
#' @export
#'
#' @examples
#' f <- function(x) ((x - 2)*x + 1)*x - 3
#' newton_raphson(f, x0 = 1)

newton_raphson <- function(f, x0 = 1, tol = 1e-8, max_iter = 100) {
  # Install Deriv only if missing (used for symbolic differentiation)
  if (!requireNamespace("Deriv", quietly = TRUE)) {
    install.packages("Deriv")
  }
  
  # Precompute the derivative function once
  f_prime <- Deriv::Deriv(f)
  
  x <- x0
  iterations <- 0
  converged <- FALSE
  
  for (i in 1:max_iter) {
    iterations <- i
    fx <- tryCatch(f(x), error = function(e) NA)
    dfx <- tryCatch(f_prime(x), error = function(e) NA)
    
    if (any(is.na(c(fx, dfx)))) {
      warning("Function or derivative evaluation failed at x = ", x)
      break
    }
    
    if (abs(dfx) < .Machine$double.eps) {
      warning("Derivative too small at x = ", x)
      break
    }
    
    x_new <- x - fx / dfx
    
    if (abs(fx) < tol || abs(x_new - x) < tol) {
      converged <- TRUE
      x <- x_new
      break
    }
    
    x <- x_new
  }
  
  list(
    root = if (converged) x else NA,
    iterations = iterations,
    converged = converged
  )
}


# save to the working directory getwd(), ls()
dump("newton_raphson", file = "newton_raphson.R")

The root of \(x^3 - 2x^2 + x - 3\) on \([1,3]\)

We define \[ f(x) = x^3 - 2x^2 + x - 3, \] with derivative \[ f'(x) = 3x^2 - 4x + 1. \]

Since \(f(1) = -3 < 0\) and \(f(3) = 9 > 0\), the Intermediate Value Theorem guarantees a root in \((1,3)\).

Applying Newton’s method with iteration \[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^3 - 2x_n^2 + x_n - 3}{3x_n^2 - 4x_n + 1}, \] starting at \(x_0 = 2\), yields convergence to \[ x^\ast \approx 2.1746. \]

Thus, the real root of \(f(x)=0\) is \[ \boxed{2.1746 \; (\text{approx.})} \] within the interval \((1,3)\).

# Example usage with the cubic from above, f(x) = x^3 - 2x^2 + x - 3 (Horner form)
f <- function(x) ((x - 2)*x + 1)*x - 3
f_prime <- function(x) (3*x - 4)*x + 1  # for reference; newton_raphson derives f' itself

# Find root 
newton_raphson(f, x0 = 2)
## $root
## [1] 2.174559
## 
## $iterations
## [1] 5
## 
## $converged
## [1] TRUE

Computing Square Roots

Compute the square root of \(\tfrac{1}{2}\) by using Newton’s method.
Note that any positive real number has two square roots, one positive and one negative.


Solution:

A square root of \(x\) is a number \(r\) such that \(r^2 = x\).
For \(x = \tfrac{1}{2}\), we want \[ r = \sqrt{\tfrac{1}{2}}. \]

This is equivalent to solving \[ r^2 - \tfrac{1}{2} = 0. \]

Define the function \[ f(r) = r^2 - \tfrac{1}{2}. \]

Its derivative is \[ f'(r) = 2r. \]

Newton’s iteration is therefore \[ r_{n+1} = r_n - \frac{f(r_n)}{f'(r_n)} = r_n - \frac{r_n^2 - \tfrac{1}{2}}{2r_n} = \frac{1}{2}\left(r_n + \frac{1/2}{r_n}\right). \]

Starting with an initial guess \(r_0 > 0\), this sequence converges rapidly to \[ r = \sqrt{\tfrac{1}{2}} \approx 0.70710678. \]

(The negative root \(-\sqrt{\tfrac{1}{2}}\) can be obtained by starting with a negative initial guess.)
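Before calling the general routine below, the derived iteration can be run directly as a short loop (a minimal sketch):

r <- 1                          # initial guess r_0 > 0
for (n in 1:6) {
  r <- 0.5 * (r + 0.5 / r)      # r_{n+1} = (r_n + (1/2)/r_n) / 2
}
r            # ~0.7071068
sqrt(0.5)    # base R check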


R Implementation

f <- function(r) r^2 - (1/2)

(root1 <- newton_raphson(f, x0 = 1))
## $root
## [1] 0.7071068
## 
## $iterations
## [1] 5
## 
## $converged
## [1] TRUE
(root2 <- newton_raphson(f, x0 = -1))
## $root
## [1] -0.7071068
## 
## $iterations
## [1] 5
## 
## $converged
## [1] TRUE
# Extract just the root values
plot_root(
  function(r) r^2 - 1/2,
  a = -1.5,
  b = 1.5,
  root = c(root1$root, root2$root),
  title = expression(paste("y = " * r^2 - frac(1,2)))
)


2.4 Secant Method

The Secant Method is a numerical technique for finding the roots of a real-valued function. It is a variation of the Newton-Raphson method but avoids explicit differentiation. Instead of requiring the derivative, the Secant Method approximates it using the slope of a secant line through two prior points:

\[ x_{n+1} = x_n - f(x_n) \cdot \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} \] This method is especially useful when:

  • The derivative \(f′(x)\) is difficult or impossible to compute.

  • An analytical derivative is unavailable, but the function can be evaluated.

Compared to Newton-Raphson:

  • No derivative needed

  • May converge more slowly

  • May not converge if the function behaves poorly near the root


R Implementation

As described in (Kincaid & Cheney, 2008, p. 122):

#' Secant Method for Root Finding
#'
#' Finds a root of a continuous function within a given interval using the secant method,
#' which approximates the derivative using finite differences.
#'
#' @param f A continuous function for which the root is sought.
#' @param a Initial first guess for the root.
#' @param b Initial second guess for the root.
#' @param nmax Maximum number of iterations (default = 100).
#' @param tol Tolerance for convergence (default = 1e-6).
#' @param plot Logical. If `TRUE`, plots the function and root (default = `FALSE`).
#' @param title Optional title for the plot.
#'
#' @return The approximate root if convergence is achieved; otherwise, `NA`.
#' @export
#'
#' @examples
#' f <- function(x) x^3 - x - 2
#' root <- secant(f, a = 1, b = 2, plot = TRUE, title = "Root of x^3 - x - 2")
#' print(root)
secant <- function(f, a, b, nmax = 100, tol = 1e-6, plot = FALSE, title = NULL) {
  # Load the plotting helper only when needed and not already defined
  if (plot && !exists("plot_root")) source("plot_root.R")
  
  from <- a
  to <- b
  
  fa <- f(a)
  fb <- f(b)
  
  # Ensure 'a' is the better initial guess (smaller |f(x)|)
  if (abs(fa) > abs(fb)) {
    temp <- a; a <- b; b <- temp
    temp <- fa; fa <- fb; fb <- temp
  }
  
  for (n in 1:nmax) {
    # Secant update: x_{n+1} = x_n - f(x_n) * (x_n - x_{n-1}) / (f(x_n) - f(x_{n-1}))
    dx <- (b - a) / (fb - fa) * fa
    a_new <- a - dx
    
    # Check for convergence
    if (abs(dx) < tol) {
      if (plot) {
        plot_root(f, from, to, root = a_new, title = title)
      }
      return(a_new)
    }
    
    # Update points for next iteration
    b <- a
    fb <- fa
    a <- a_new
    fa <- f(a)
  }
  
  warning("Maximum iterations reached without convergence.")
  return(NA)  # Return NA if no convergence
}
# save to the working directory getwd(), ls()
dump("secant", file = "secant.R")

Solve \(x^3 - x - 1 = 0\)

secant(f = function(x){x^(3) - x - 1}, a = -1.5, b = 1.5, plot = TRUE, title = expression(paste("y = " * x^3 - x - 1)))

## [1] 1.324718

The arc-length of catenary by root finding

Taken from (Kincaid & Cheney, 2008, p. 76):

An electric power cable is suspended (at points of equal height) from two towers that are 100 meters apart. The cable is allowed to dip 10 meters in the middle. How long is the cable?

Solution:

[Figure: Catenary problem]

Numerically solve for lambda: It is known that the curve assumed by a suspended cable is a catenary. When the y-axis passes through the lowest point, we can assume an equation of the form

\[ y(x) = \lambda \cdot \cosh(x/\lambda). \]

The conditions of the problem are that \(y(50) = y(0) + 10\). Hence, we obtain

\[ \begin{aligned} \lambda \cdot \cosh(\frac{50}{\lambda}) &= \lambda + 10 \\ \\ \lambda \cdot \cosh(\frac{50}{\lambda}) - \lambda - 10 &= 0 \end{aligned} \]

(lambda <- secant(function(lambda){lambda*cosh(50/lambda) - lambda - 10}, a = 100, b = 200))
## [1] 126.6324

Substitute \(\lambda\) into the arc-length function of the catenary, which for \(y(x) = \lambda \cosh(x/\lambda)\) measured from the lowest point is

\[ s(x) = \lambda \cdot \sinh(x/\lambda) \]

arc.length.catenary <- function(x) {return(lambda*sinh(x/lambda))}

2 * arc.length.catenary(50)
## [1] 102.6187

2.5 Points of Intersection of Two Graphs of Functions

Taken from (Kincaid & Cheney, 2008, p. 85):

Approximate where the graphs of \(y = 3x\) and \(y = e^x\) intersect.

Solution:

The graphs intersect when \[ 3x = e^x \quad \Longleftrightarrow \quad 3x - e^x = 0. \]

This equation has no closed-form solution, but numerical methods can be used to approximate the points of intersection.


R Implementation

f <- function(x){3*x - exp(x)}
(root1 <- regula.falsi(f, a = 0, b = 1))
## [1] 0.6190616
(root2 <- regula.falsi(f, a = 1, b = 2))
## [1] 1.512134
plot_root(f, a = 0, b = 2, root = c(root1, root2), title = expression(paste("y = ", 3*x - e^x)))


Find the intersections of
\[ y = e^x \quad \text{and} \quad y = 3x. \]

To find intersection points, solve \[ e^x = 3x. \]

This transcendental equation has no closed-form solution, but can be solved numerically.

Define \[ h(x) = e^x - 3x. \]

  • At \(x=0\): \(h(0) = 1 > 0\)
  • At \(x=1\): \(h(1) = e - 3 \approx -0.282 < 0\) ⇒ a root lies in \((0,1)\).
  • At \(x=2\): \(h(2) = e^2 - 6 \approx 1.389 > 0\) ⇒ another root lies in \((1,2)\).

Numerical approximation gives: \[ x_1 \approx 0.6191, \quad y_1 \approx 1.857 \\ x_2 \approx 1.5129, \quad y_2 \approx 4.539 \]

Thus, the curves intersect at approximately \[ (0.6191,\; 1.857) \quad \text{and} \quad (1.5129,\; 4.539). \]

# x continuum
x <- seq(0.3, 1.8, by = .1)

# plot e^x
plot(x, exp(x), 
     type = "l",
     lwd = 5,
     ylab = "f(x)",
     col = "tomato",
     xlim = c(0, 2),
     ylim = c(0, 7))

# plot 3x
lines(x, 3*x,
      lwd = 5,
      col = "dodgerblue")

# tracing lines
segments(x0 = c(root1, root2), y0 = c(3*root1, exp(root2)),
         x1 = c(root1, root2), y1 = -.8,
         lty = 3,
         lwd = .5)

# arrows
if (!require("shape")) 
  install.packages("shape", dependencies = TRUE)
library(shape)

Arrowhead(x0 = c(root1, root2), y0 = -.25,
          angle = 270,
          npoint = 25,
          arr.lwd = .2,
          arr.length = .5,
          arr.type = "curved",
          arr.adj = 1,
          lty = 1)

# coordinate point
points(x = c(root1, root2), y = c(3*root1, exp(root2)),
       col = "white",
       bg = "black",
       pch = 21,
       cex = .7)

# plot title, built from three overlaid layers so each part gets its own color;
# identical expressions (with phantom() swapped) and a common cex.main keep them aligned
title(expression(e^x * phantom(" vs. ") * phantom(3*x)),
      cex.main = 2,
      col.main = "tomato")
title(expression(phantom(e^x) * " vs. " * phantom(3*x)),
      cex.main = 2,
      col.main = "black")
title(expression(phantom(e^x) * phantom(" vs. ") * 3*x),
      cex.main = 2,
      col.main = "dodgerblue")


2.6 Fixed-Point Iteration

Fixed-point iteration is a fundamental numerical method for solving equations of the form

\[ x = g(x). \]

A fixed point is a value of \(x\) where the function \(g(x)\) intersects the line \(y = x\).
This method is especially useful in root-finding problems of the type \(f(x) = 0\), since such equations can often be reformulated into a fixed-point form \(x = g(x)\).


Algorithm

Input: function \(g(x)\), initial guess \(x_0\), tolerance \(\epsilon\), maximum iterations \(N\)

Set \(k = 0\) and \(x = x_0\).

While \(k < N\):
    \(x_{\text{new}} = g(x)\)
    If \(|x_{\text{new}} - x| < \epsilon\): return \(x_{\text{new}}\)   (converged)
    \(x = x_{\text{new}}\)
    \(k = k + 1\)

Return "did not converge within \(N\) iterations"


Convergence Conditions

  • If \(g\) is continuous on an interval \([a,b]\) and \(g(x) \in [a,b]\) for all \(x \in [a,b]\), then \(g\) has at least one fixed point in \([a,b]\).
  • If \(|g'(x)| < 1\) for all \(x\) near the fixed point, the iteration will converge to that fixed point (local convergence); a numeric check is sketched below.
  • If \(|g'(x)| > 1\), the iteration will diverge.
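When \(g'\) is not available in closed form, the contraction condition can be checked numerically. A quick finite-difference sketch for \(g(x) = \cos(x)\):

g  <- function(x) cos(x)
dg <- function(x, h = 1e-6) (g(x + h) - g(x - h)) / (2 * h)  # central difference
abs(dg(0.739))   # ~0.67 < 1, so the iteration contracts near the fixed point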

Example 1: Solve \(f(x) = \cos(x) - x = 0\) using fixed-point iteration

We rewrite the equation as \[ x = \cos(x). \]

Starting from \(x_0 = 0.5\), we generate the sequence \[ x_{1} = \cos(0.5), \quad x_{2} = \cos(x_1), \quad x_{3} = \cos(x_2), \;\dots \]

This sequence converges to the fixed point \[ x^\ast \approx 0.7391, \] which is the unique solution to \(\cos(x) = x\).


Example 2: Solve \(f(x) = x^3 - x - 1 = 0\) on \((1, 2)\)

  • \(f(1) = -1 < 0\), \(f(2) = 5 > 0\) ⇒ by the Intermediate Value Theorem, a root lies in \((1,2)\).
  • \(f'(x) = 3x^2 - 1 > 0\) for \(x \in (1,2)\) ⇒ \(f\) is strictly increasing there, so the root is unique.

\[ x \approx \boxed{1.324717957}, \]

the unique real solution (also known as the plastic constant, satisfying \(x^3 = x + 1\)).


Step 1: Rearrange into fixed-point form \[ \begin{aligned} x^3 - x - 1 &= 0 \\ x^3 &= x + 1 \quad &\text{(add } x+1 \text{ to both sides)} \\ x &= (x+1)^{\tfrac{1}{3}} \quad &\text{(take the cube root)} \end{aligned} \]

Thus, we define \[ g(x) = \sqrt[3]{x + 1}. \]


Step 2: Convergence check \[ g'(x) = \tfrac{1}{3}(x+1)^{-2/3}. \]

For \(x \in [1, 2]\), \[ |g'(x)| \approx 0.2 < 1 \quad \Rightarrow \quad \text{convergence is guaranteed}. \]


Step 3: Iterative method. The fixed-point iteration is \[ x_{n+1} = g(x_n), \quad n = 0,1,2,\dots \]

Starting with \(x_0 = 1.5\), the sequence converges to \[ x^\ast \approx 1.324717957, \] the unique root in \((1,2)\).


R Implementation

#' Fixed-Point Iteration with Optional Plots (iter/epsilon/cobweb/spiral)
#'
#' Iteratively solves equations of the form \eqn{x = g(x)} using the fixed-point method,
#' optionally visualizing the iteration history.
#'
#' @param g A function \code{g(x)} for which a fixed point is sought.
#' @param x0 Numeric scalar. Initial guess (default = 0).
#' @param tol Numeric scalar. Convergence tolerance (absolute + relative) (default = 1e-6).
#' @param max_iter Integer. Maximum number of iterations (default = 100).
#' @param plot Logical. If \code{TRUE}, produce a plot (default = \code{FALSE}).
#' @param plot_type Character. One of \code{"iter"}, \code{"epsilon"}, \code{"cobweb"}, or \code{"spiral"}.
#'   If \code{plot = TRUE} and \code{plot_type} is missing, defaults to \code{"iter"}.
#' @param alpha Relaxation parameter (0 < \code{alpha} ≤ 1). \code{alpha} < 1 applies damping.
#'   Default \code{1}.
#' @param x_range Optional numeric length-2 \code{c(xmin, xmax)} for the cobweb/spiral plot axes.
#'   If \code{NULL}, inferred from the iteration history (with padding).
#' @param curve_samples Integer number of points to sample for drawing \eqn{y = g(x)}
#'   in the cobweb plot (default = 400).
#'
#' @return A list with components:
#' \describe{
#'   \item{value}{Approximate fixed point if converged; otherwise \code{NA}.}
#'   \item{converged}{Logical; \code{TRUE} if the tolerance was met.}
#'   \item{iterations}{Number of iterations performed.}
#'   \item{history}{Data frame with columns \code{iter}, \code{x}, and \code{epsilon}
#'     (step-to-step error; \code{NA} at iter 0).}
#' }
#'
#' @details
#' Local convergence is ensured if \eqn{|g'(x^*)| < 1} at the fixed point \eqn{x^*}.
#' The stopping rule is \eqn{|x_{k+1}-x_k| < \text{tol}\,(1+\max(1,|x_{k+1}|))}.
#'
#' @examples
#' # 1) Simple contraction: x = cos(x)  (cobweb)
#' g1 <- function(x) cos(x)
#' res1 <- fixed_point(g1, x0 = 0.5, plot = TRUE, plot_type = "cobweb")
#' res1$value
#'
#' # 2) x^3 - x - 1 = 0  =>  x = (x + 1)^(1/3)  (spiral)
#' g2 <- function(x) (x + 1)^(1/3)
#' res2 <- fixed_point(g2, x0 = 1.5, plot = TRUE, plot_type = "spiral")
#'
#' # 3) Error decay view
#' res3 <- fixed_point(g1, x0 = 0.5, plot = TRUE, plot_type = "epsilon")
#'
#' @export
fixed_point <- function(g,
                        x0 = 0,
                        tol = 1e-6,
                        max_iter = 100,
                        plot = FALSE,
                        plot_type = c("iter", "epsilon", "cobweb", "spiral"),
                        alpha = 1,
                        x_range = NULL,
                        curve_samples = 400) {

  # ---- Argument checks ----
  stopifnot(is.function(g), is.numeric(x0), length(x0) == 1, alpha > 0, alpha <= 1)
  if (plot && missing(plot_type)) plot_type <- "iter"
  plot_type <- match.arg(plot_type)

  # ---- Storage for history (pre-allocated for speed) ----
  x <- x0
  hist_iter <- integer(max_iter + 1)      # iteration indices 0..k
  hist_x    <- numeric(max_iter + 1)      # x_k values
  hist_eps  <- numeric(max_iter + 1)      # |x_{k}-x_{k-1}| errors
  hist_iter[1] <- 0
  hist_x[1]    <- x0
  hist_eps[1]  <- NA_real_

  converged <- FALSE
  k <- 0

  # ---- Main iteration loop ----
  for (k in 1:max_iter) {
    gx <- g(x)                             # evaluate g at current x
    if (!is.finite(gx)) {                  # guard against NA/Inf
      warning("Non-finite value returned by g at iter = ", k, ".")
      break
    }
    x_new <- x + alpha * (gx - x)          # relaxed update (alpha=1 is standard)
    err   <- abs(x_new - x)                # step-to-step error

    # record in history (k becomes the new row)
    hist_iter[k + 1] <- k
    hist_x[k + 1]    <- x_new
    hist_eps[k + 1]  <- err

    # absolute + relative stopping rule
    if (err < tol + tol * max(1, abs(x_new))) {
      converged <- TRUE
      x <- x_new
      break
    }
    x <- x_new
  }

  # ---- Trim history to actual length used ----
  history <- data.frame(
    iter    = hist_iter[1:(k + 1)],
    x       = hist_x[1:(k + 1)],
    ε       = hist_eps[1:(k + 1)]
  )

  # ---- Optional plotting ----
  if (plot) {
    if (!requireNamespace("ggplot2", quietly = TRUE)) {
      warning("`ggplot2` not found. Plot disabled.")
    } else {
      # -----------------------
      # Plot 1: x_k by iteration
      # Plot 2: epsilon_k (log scale) by iteration
      # -----------------------
      if (plot_type %in% c("iter", "epsilon")) {
        p <- ggplot2::ggplot(history, ggplot2::aes(x = iter))
        if (plot_type == "iter") {
          # Thin lines, red points for iterates
          p <- p +
            ggplot2::geom_line(ggplot2::aes(y = x), linewidth = 0.4, alpha = 0.8) +
            ggplot2::geom_point(ggplot2::aes(y = x), size = 1.6, color = "red") +
            ggplot2::labs(title = "Fixed-Point Iterates", x = "Iteration", y = "x_k")
        } else {
          # Thin lines, log-scale y for error decay
          p <- p +
            ggplot2::geom_line(ggplot2::aes(y = ε), linewidth = 0.4, alpha = 0.8) +
            ggplot2::geom_point(ggplot2::aes(y = ε), size = 1.6, color = "red") +
            ggplot2::scale_y_log10() +
            ggplot2::labs(title = "Fixed-Point Error (ε)", x = "Iteration", y = "epsilon_k")
        }
        p <- p + ggplot2::theme_minimal(base_size = 12)
        print(p)

      # -----------------------
      # Plot 3: Cobweb diagram
      # -----------------------
      } else if (plot_type == "cobweb") {
        xs  <- history$x
        gxs <- vapply(xs, g, numeric(1))

        # Axis range inferred from iterates & g(iterates) with a small padding
        if (is.null(x_range)) {
          rng  <- range(c(xs, gxs), finite = TRUE)
          pad  <- diff(rng) * 0.15 + 1e-9
          xlim <- c(rng[1] - pad, rng[2] + pad)
        } else {
          xlim <- x_range
        }

        # Smooth curve y = g(x) to visualize the function
        grid <- seq(xlim[1], xlim[2], length.out = curve_samples)
        curve_df <- data.frame(x = grid, y = vapply(grid, g, numeric(1)))

        # Build the “staircase” path:
        # (x_k, x_k) -> (x_k, g(x_k)) -> (x_{k+1}, x_{k+1})
        path <- data.frame(
          x = c(rbind(xs[-length(xs)], xs[-length(xs)], xs[-1])),
          y = c(rbind(xs[-length(xs)], gxs[-length(xs)], xs[-1])),
          step = rep(1:(length(xs)-1), each = 3)
        )

        # Compose the cobweb plot with thin, blue tracing lines and clear annotations
        p <- ggplot2::ggplot() +
          # Diagonal y = x
          ggplot2::geom_abline(slope = 1, intercept = 0, linewidth = 0.8, color = "black", alpha = 0.6) +
          ggplot2::annotate("text", x = xlim[1], y = xlim[1]*1.05, label = "y = x",
                            hjust = -0.1, vjust = -0.5, size = 3) +
          # Function curve y = g(x)
          ggplot2::geom_line(data = curve_df, ggplot2::aes(x = x, y = y),
                             linewidth = 0.9, color = "darkred") +
          ggplot2::annotate("text", x = xlim[2], y = tail(curve_df$y, 1)*1.05, label = "y = g(x)",
                            hjust = 1.1, vjust = -0.5, color = "darkred", size = 3) +
          # Staircase tracing path (thin blue)
          ggplot2::geom_path(data = path,
                             ggplot2::aes(x = x, y = y, group = step),
                             arrow = ggplot2::arrow(length = grid::unit(3, "pt")),
                             linewidth = 0.4, color = "blue", alpha = 0.7) +
          # Iterates on diagonal (red points)
          ggplot2::geom_point(data = data.frame(x = xs, y = xs),
                              ggplot2::aes(x = x, y = y), size = 1.6, color = "red") +
          ggplot2::coord_cartesian(xlim = xlim, ylim = xlim, expand = FALSE) +
          ggplot2::labs(title = "Cobweb Plot for Fixed-Point Iteration", x = "x", y = "y") +
          ggplot2::theme_minimal(base_size = 12)

        # Mark final iterate as approximate fixed point (green)
        fp <- tail(xs, 1)
        if (is.finite(fp)) {
          p <- p +
            ggplot2::annotate("point", x = fp, y = fp, size = 3, color = "darkgreen") +
            ggplot2::annotate("text", x = fp*1.02, y = fp,
                              label = sprintf("  Fixed Point ≈ %.6f", fp),
                              hjust = 0, vjust = -0.6, size = 3, color = "darkgreen")
        }
        print(p)

      # -----------------------
      # Plot 4: “Spiral” state-space view (x_k vs x_{k+1})
      # -----------------------
      } else if (plot_type == "spiral") {
        xs <- history$x
        if (length(xs) < 2L) {
          warning("Not enough iterates to plot a spiral.")
        } else {
          # Build time-ordered pairs (x_k, x_{k+1})
          phase <- data.frame(
            k  = 0:(length(xs)-2),
            x  = xs[-length(xs)],
            y  = xs[-1]
          )
          fp <- tail(xs, 1)

          # Axis range with padding
          if (is.null(x_range)) {
            rng  <- range(c(phase$x, phase$y), finite = TRUE)
            pad  <- diff(rng) * 0.15 + 1e-9
            xlim <- c(rng[1] - pad, rng[2] + pad)
          } else {
            xlim <- x_range
          }

          # Thin, semi-transparent blue path; red points; annotated diagonal
          p <- ggplot2::ggplot() +
            ggplot2::geom_abline(slope = 1, intercept = 0, linewidth = 0.8, color = "black", alpha = 0.6) +
            ggplot2::annotate("text", x = xlim[1], y = xlim[1]*1.05, label = "y = x",
                              hjust = -0.1, vjust = -0.5, size = 3) +
            ggplot2::geom_path(
              data = phase,
              ggplot2::aes(x = x, y = y),
              arrow = ggplot2::arrow(length = grid::unit(3, "pt")),
              linewidth = 0.4, color = "blue", alpha = 0.7
            ) +
            ggplot2::geom_point(
              data = phase,
              ggplot2::aes(x = x, y = y),
              size = 1.6, color = "red"
            ) +
            ggplot2::coord_cartesian(xlim = xlim, ylim = xlim, expand = FALSE) +
            ggplot2::labs(
              title = "Spiral Plot of Iterates",
              x = expression(x[k]),
              y = expression(x[k+1])
            ) +
            ggplot2::theme_minimal(base_size = 12)

          if (is.finite(fp)) {
            p <- p +
              ggplot2::annotate("point", x = fp, y = fp, size = 3, color = "darkgreen") +
              ggplot2::annotate("text", x = fp*1.02, y = fp,
                                label = sprintf("  Fixed Point ≈ %.6f", fp),
                                hjust = 0, vjust = -0.6, size = 3, color = "darkgreen")
          }
          print(p)
        }
      }
    }
  }

  # ---- Convergence message if not met ----
  if (!converged) warning("Did not converge in ", max_iter, " iterations.")

  # ---- Return result with full history ----
  list(
    value = if (converged) x else NA_real_,
    converged = converged,
    iterations = k,
    history = history
  )
}

R Implementation

# g(x) = cos(x): show cobweb
g <- function(x) cos(x)
res <- fixed_point(g, x0 = 0.5, plot = TRUE, plot_type = "cobweb")

# show epsilon decay
res <- fixed_point(g, x0 = 0.5, plot = TRUE, plot_type = "epsilon")


Example: Plot of
\[ f(x) = \sqrt[3]{\,x + 1\,} \]

This function represents a cube-root curve shifted one unit to the left.
The graph passes through the point \((-1, 0)\) and increases monotonically across all real \(x\).

g <- function(x) (x + 1)^(1/3)
root <- fixed_point(g, x0 = 1.5, plot = TRUE)

print(root)  # Output: ~1.324717 (the real root)
## $value
## [1] 1.324718
## 
## $converged
## [1] TRUE
## 
## $iterations
## [1] 8
## 
## $history
##   iter        x            ε
## 1    0 1.500000           NA
## 2    1 1.357209 1.427912e-01
## 3    2 1.330861 2.634785e-02
## 4    3 1.325884 4.977185e-03
## 5    4 1.324939 9.444108e-04
## 6    5 1.324760 1.793521e-04
## 7    6 1.324726 3.406607e-05
## 8    7 1.324719 6.470693e-06
## 9    8 1.324718 1.229085e-06

R Implementation

# x continuum
x <- seq(-1, 1.5, by = .01)

root <- secant(f = function(x){x^(3) - x - 1}, a = -1.5, b = 1.5)

# plot y = x
plot(x, x,
     lwd = 1,
     type = "l",
     col = "black",
     xlab = "",
     ylab = "",
     asp = 1,
     axes = FALSE,
     xlim = c(-1.2, 1.5),
     ylim = c(0, 1.5))

# plot function
lines(x, (x + 1)^(1/3), 
      type = "l",
      lwd = 5,
      col = "tomato") 

# tracing lines
segments(x0 = root, y0 = root,
         x1 = root, y1 = 0,
         lty = 3,
         lwd = .5)

# arrows
Arrowhead(x0 = root, y0 = 0,
          angle = 270,
          npoint = 25,
          arr.lwd = .2,
          arr.length = .5,
          arr.type = "curved",
          arr.adj = 1,
          lty = 1)

# coordinate point
points(x = root, y = root,
       col = "white",
       bg = "tomato",
       pch = 21,
       cex = .7)

# plot title
title(main = expression(paste("g(x) = " * sqrt(x+1, 3))),
      cex.main = 1.5,
      col.main = "tomato")

# axes
axis(1, pos = 0)
axis(2, pos = 0)

axis(1, at = round(root, 2),
     pos = 0,
     col = "dodgerblue", 
     lwd.tick = 2,
     col.axis = "dodgerblue",
     font = 2)


Example: Find the point of intersection of
\[ y = \sqrt[3]{\,x+1\,} \quad \text{and} \quad y = x. \]

To find the intersection, solve: \[ x = \sqrt[3]{\,x+1\,}. \]

Cubing both sides: \[ x^3 = x + 1 \;\;\;\Rightarrow\;\;\; x^3 - x - 1 = 0. \]

The real solution to this cubic is \[ x \approx 1.3247, \quad y \approx 1.3247. \]

Thus, the curves intersect at the point \[ (1.3247,\;1.3247). \]

plot_root(f = function(x){(x + 1)^(1/3) - x}, a = -1, b = 1.5, root = root, title = expression(paste("y = ", sqrt(x+1, 3), " - x")))


3 Numerical Integration

Numerical integration is a set of techniques used to approximate the value of a definite integral when an exact solution is difficult or impossible to obtain analytically.
Common methods include the trapezoidal rule, Simpson’s rule, and adaptive quadrature.

Use in Probability Theory and Statistics

In probability and statistics, many key quantities are defined as integrals that often do not have closed-form solutions. Numerical integration is applied to approximate these values:

  • Cumulative Distribution Functions (CDFs):
    For a random variable with probability density function \(f(x)\), the CDF is
    \[ F(x) = \int_{-\infty}^x f(t)\,dt. \]
    When \(f(x)\) involves complex functions (e.g., normal distribution), numerical integration helps evaluate \(F(x)\).

  • Probabilities:
    Computing probabilities over intervals often requires evaluating
    \[ P(a \leq X \leq b) = \int_a^b f(x)\,dx. \]

  • Expected Values and Moments:
    The expectation of a continuous random variable is
    \[ \mathbb{E}[X] = \int_{-\infty}^\infty x f(x)\,dx, \]
    which may need numerical approximation if no closed form exists.

  • Likelihood Functions in Inference:
    Some likelihoods involve integrals over latent variables or nuisance parameters; numerical integration makes estimation feasible.


👉 In short: Numerical integration is a crucial computational tool in probability and statistics for evaluating probabilities, expectations, and distribution functions when analytic integration is not possible.
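Base R's adaptive quadrature routine integrate() illustrates the probability case directly:

# P(-1 <= Z <= 1) for a standard normal, by numerical integration
integrate(dnorm, lower = -1, upper = 1)$value   # ~0.6826895
pnorm(1) - pnorm(-1)                            # closed-form check via the CDF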

Helper Function: Plotting the Areas under Curves

#' Plot a Function and Shade the Integral Area
#'
#' This function plots a continuous function \( f \) over a specified range and
#' shades the area under the curve between points \( a \) and \( b \).
#'
#' @param f A function of one variable to plot.
#' @param a Lower limit of integration (start of shaded area).
#' @param b Upper limit of integration (end of shaded area).
#' @param from Numeric, start of the x-axis plotting range. Defaults to \code{a}.
#' @param to Numeric, end of the x-axis plotting range. Defaults to \code{b}.
#' @param title Optional character string for the plot title.
#'
#' @return No return value. The function produces a plot as a side effect.
#'
#' @examples
#' f <- function(x) sin(x)
#' plot_integral(f, 0, pi, from = 0, to = 2*pi, title = "Integral of sin(x)")
#'
#' @export
plot_integral <- function(f, a, b, from = a, to = b, title = NULL) {
  x <- seq(from, to, length.out = 100)
  y <- f(x)
  
  # function plot
  plot(x, y,
       xlim = c(from, to),
       ylim = c(ifelse(min(y) < 0, min(y), 0), max(y)),
       xlab = "x",
       ylab = "y",
       main = title,
       col.main = "tomato",
       type = "l",
       lwd = 3,
       col = "tomato")
  
  # area shading
  x_area <- seq(a, b, length.out = 100)
  y_area <- f(x_area) 
  polygon(c(x_area, b, a, a), c(y_area, 0, 0, f(a)),
          border = adjustcolor("dodgerblue", alpha.f = 0.3),
          col = adjustcolor("dodgerblue", alpha.f = 0.3))
} # end plot_integral

# save to the working directory getwd(), ls()
dump("plot_integral", file = "plot_integral.R")

Plot the Area Under \(e^{\cos(x)}\) Between \(x = 0\) and \(x = \pi\)

The function is
\[ y = e^{\cos(x)}, \quad x \in [0,\pi]. \]

To visualize the definite integral \[ \int_{0}^{\pi} e^{\cos(x)} \, dx, \] we plot the curve \(y = e^{\cos(x)}\) and shade the region under it between \(x=0\) and \(x=\pi\).

plot_integral(f = function(x) exp(cos(x)), 
              a = 0, b = pi, from = -3*pi/2, to = 3*pi/2,
              title = expression(paste("y = " * e^{cos(x)})))


3.1 Riemann Sums

Riemann sums are a fundamental technique to approximate the area under a curve \(y=f(x)\) over an interval \([a,b]\). The idea is to partition the interval into \(n\) subintervals, then approximate the area under the curve by summing the areas of simple shapes—most commonly rectangles—over each subinterval.

Mathematical Definition

Given a partition

\[ a = x_0 < x_1 < \cdots < x_n = b, \]

the width of the \(i\)-th subinterval is

\[ \Delta x_i = x_i - x_{i-1}. \]

We choose a sample point \(c_i \in [x_{i-1}, x_i]\) and approximate the area over that subinterval by the rectangle with height \(f(c_i)\):

\[ S_n = \sum_{i=1}^n f(c_i) \Delta x_i. \]

As \(n \to \infty\) and the partition gets finer, \(S_n\) approaches the definite integral

\[ \int_a^b f(x) \, dx. \]


Types of Riemann Sums

  • Left Riemann sum: Use \(c_i = x_{i-1}\) (left endpoint of each subinterval)

  • Right Riemann sum: Use \(c_i = x_i\) (right endpoint)

  • Midpoint Riemann sum: Use

\[ c_i = \frac{x_{i-1} + x_i}{2} \]
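The three variants above differ only in the choice of sample point. A compact sketch (the function name riemann is ours) implements all of them on a uniform partition:

riemann <- function(f, a, b, n = 1e4, type = c("left", "right", "midpoint")) {
  type <- match.arg(type)
  h <- (b - a) / n
  x <- switch(type,
              left     = a + (0:(n - 1)) * h,    # left endpoints
              right    = a + (1:n) * h,          # right endpoints
              midpoint = a + ((1:n) - 0.5) * h)  # midpoints
  h * sum(f(x))
}

riemann(function(x) exp(cos(x)), 0, pi, type = "midpoint")   # ~3.977463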


R Implementation

As described in (Kincaid & Cheney, 2008, p. 185):

#' Approximate definite integral using the Trapezoidal Rule
#'
#' @param f Function to integrate
#' @param a Lower limit of integration
#' @param b Upper limit of integration
#' @param n Number of subintervals (default 1e5)
#' @return Numeric approximation of the integral
#' @export
trapezoidal <- function(f, a, b, n = 1e5) {
  h <- (b - a) / n
  x <- seq(a, b, length.out = n + 1)
  y <- f(x)
  
  # Trapezoidal sum formula
  integral <- (h / 2) * (y[1] + 2 * sum(y[2:n]) + y[n + 1])
  
  return(integral)
}

# Save to working directory
dump("trapezoidal", file = "trapezoidal.R")

Example: Approximate the area under \(y = e^{\cos x}\) on \([0,\pi]\).

\[ \text{Area} \;\approx\; \int_{0}^{\pi} e^{\cos x}\,dx. \]

This integral has a closed form via the modified Bessel function \(I_0\): \[ \int_{0}^{2\pi} e^{\cos x}\,dx = 2\pi I_0(1) \;\;\Rightarrow\;\; \int_{0}^{\pi} e^{\cos x}\,dx = \pi I_0(1). \]

Numerically, \[ I_0(1) \approx 1.266065877 \quad\Longrightarrow\quad \int_{0}^{\pi} e^{\cos x}\,dx \approx \pi \cdot 1.266065877 \approx \boxed{3.97746}. \]
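The closed form can be checked directly in R, which provides the modified Bessel function as besselI():

pi * besselI(1, nu = 0)
## [1] 3.977463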

trapezoidal(f = function(x) exp(cos(x)), a = 0, b = pi)
## [1] 3.977463

R Implementation

#' Plot function and trapezoidal approximation
#'
#' @param f Function to plot and approximate
#' @param a Lower bound of interval
#' @param b Upper bound of interval
#' @param n Number of subintervals (default 10 for clearer visualization)
#' @param main Optional plot title
#' @export
plot_trapezoidal <- function(f, a, b, n = 10, main = NULL) {
  x <- seq(a, b, length.out = 1000)
  y <- f(x)
  
  # Plot the function curve
  plot(x, y, type = "l", lwd = 2, col = "blue",
       xlab = "x", ylab = "f(x)", main = main)
  
  # Partition points for trapezoids
  xs <- seq(a, b, length.out = n + 1)
  ys <- f(xs)
  
  # Draw trapezoids
  for (i in 1:n) {
    polygon(
      x = c(xs[i], xs[i], xs[i + 1], xs[i + 1]),
      y = c(0, ys[i], ys[i + 1], 0),
      col = adjustcolor("orange", alpha.f = 0.5),
      border = "red"
    )
  }
  
  # Overlay points on partition
  points(xs, ys, pch = 19, col = "red")
}

# Save to working directory
dump("plot_trapezoidal", file = "plot_trapezoidal.R")
plot_trapezoidal(function(x) exp(cos(x)), 0, pi, n = 8, main = "Trapezoidal Approximation")


Example: Approximate the area under the curve \(y = e^{-x^2}\) on the interval \([0,1]\).

\[ \text{Area} \;\approx\; \int_{0}^{1} e^{-x^2}\,dx. \]

This integral has no elementary antiderivative. It relates to the error function: \[ \int_{0}^{1} e^{-x^2}\,dx \;=\; \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(1) \;\approx\; 0.746824. \]

trapezoidal(f = function(x) 1/exp(x^2), a = 0, b = 1)
## [1] 0.7468241
plot_trapezoidal(f = function(x) 1/exp(x^2), 
                 a = 0, b = 1, n = 4,
                 main = expression(paste("y = " * frac(1, e^{x^2}))))


3.2 The Error Function

The error function, denoted by \(\mathrm{erf}(x)\), arises in probability, statistics, and partial differential equations. It is defined as

\[ \mathrm{erf}(x) \;=\; \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt. \]

This function is closely related to the normal distribution:
it represents (after scaling) the probability that a normally distributed random variable with mean \(0\) and variance \(\tfrac{1}{2}\) falls between \(0\) and \(x\).

In statistics, \(\mathrm{erf}(x)\) is used in expressing the cumulative distribution function (CDF) of the standard normal distribution:

\[ \Phi(z) = \frac{1}{2}\Big[1 + \mathrm{erf}\!\Big(\tfrac{z}{\sqrt{2}}\Big)\Big], \]

where \(\Phi(z)\) is the standard normal CDF.

# Integrand for erf(x): (2/sqrt(pi)) * exp(-x^2)
f <- function(x) (2/sqrt(pi)) * exp(-x^2)

# Numeric approximation on [0, 1]; trapezoidal() uses n = 1e5 subintervals by default
trap_val <- trapezoidal(f, a = 0, b = 1)

# Plots with correct math label
plot_integral(
  f = f, a = 0, b = 1, from = -2, to = 2,
  title = expression(y == (2/sqrt(pi)) * e^{-x^2})
)

plot_trapezoidal(
  f = f, a = 0, b = 1, n = 8,
  main = expression(y == (2/sqrt(pi)) * e^{-x^2})
)

# Exact value via relationship to the standard normal CDF
erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1
exact_val <- erf(1)

# Compare
c(trapezoidal = trap_val, exact = exact_val, abs_error = abs(trap_val - exact_val))
##  trapezoidal        exact    abs_error 
## 8.427008e-01 8.427008e-01 6.924017e-12

Arc Length of a Catenary by Numerical Integration

The arc length of a catenary curve defined by

\[ y = a \cosh\!\left(\tfrac{x}{a}\right) \]

between \(x = x_1\) and \(x = x_2\) is given by

\[ L = \int_{x_1}^{x_2} \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\, dx. \]

Since \[ \frac{dy}{dx} = \sinh\!\left(\tfrac{x}{a}\right), \] the integrand becomes \[ \sqrt{1 + \sinh^2\!\left(\tfrac{x}{a}\right)} \;=\; \cosh\!\left(\tfrac{x}{a}\right). \]

Thus, \[ L = \int_{x_1}^{x_2} \cosh\!\left(\tfrac{x}{a}\right) dx \;=\; a\left[\sinh\!\left(\tfrac{x_2}{a}\right) - \sinh\!\left(\tfrac{x_1}{a}\right)\right], \]

which can be evaluated analytically, but may also be approximated numerically when exact evaluation is difficult for more general curves.

Taken from (Kincaid & Cheney, 2008, p. 189):

Verify the numerical approximation given for the arc length in the catenary example above.

Solution:

From calculus, the length of a curve is

\[ \displaystyle{\int\limits_{a}^{b} \sqrt{1+ \left[ f'(x)\right]^2} \; dx} \]

where \(f\) is a function whose graph is the curve on the interval \(a \leq x \leq b\).

The catenary is

\[ f(\theta) = \lambda \, \mathrm{cosh} \left(\frac{\theta}{\lambda} \right) \]

and its derivative is

\[ \frac{d}{d\theta} \; f(\theta) = \mathrm{sinh} \left(\frac{\theta}{\lambda} \right) \]

Substituting this into the arc-length formula we get

\[ \displaystyle{\int\limits_{a}^{b} \sqrt{1+ \left[ \mathrm{sinh} \left(\frac{\theta}{\lambda} \right)\right]^2} \; d\theta} \]

# Note: lambda (the catenary parameter) must be defined before this call.
trapezoidal(function(theta){sqrt(1 + (sinh(theta/lambda))^2)}, a = -50, b = 50)
## [1] 102.6187
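Since the integrand simplifies to \(\cosh(\theta/\lambda)\), the exact length \(\int_{-50}^{50} \cosh(\theta/\lambda)\, d\theta = 2\lambda \sinh(50/\lambda)\) is available for a cross-check. A minimal sanity check, assuming the same lambda used in the call above is still defined:

# Closed-form arc length over [-50, 50]: 2 * lambda * sinh(50 / lambda).
# Assumes lambda (the catenary parameter used above) is already defined.
2 * lambda * sinh(50 / lambda)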

3.3 Romberg Algorithm

The Romberg algorithm applies Richardson extrapolation to the trapezoidal rule to improve the accuracy of numerical integration. Richardson extrapolation is a technique for accelerating the convergence of a sequence, allowing for more accurate approximations.

Romberg integration constructs a triangular array of estimates for a definite integral. The method proceeds as follows:

  • The first column consists of estimates using the composite trapezoidal rule, where the step size is halved with each row:

\[ R(k,1) = T(h_k), \qquad \text{where } h_k = \frac{b-a}{2^{\,k-1}}. \]

  • Each subsequent column is computed using Richardson extrapolation, which eliminates lower-order error terms:

\[ R(k,j) \;=\; R(k,j-1) \;+\; \frac{1}{4^{\,j-1}-1}\,\Big[R(k,j-1) - R(k-1,j-1)\Big], \] where:

  • \(R(k,j)\) is the Romberg estimate at row \(k\), column \(j\),

  • \(T(h_k)\) is the trapezoidal approximation with step size \(h_k\).

The array is built iteratively until a desired level of convergence is achieved or a maximum depth is reached.
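As a quick illustration of the recurrence, take \(f(x) = 4/(1+x^2)\) on \([0,1]\) (revisited in the example below). The first two trapezoidal estimates are

\[ R(1,1) = \tfrac{1}{2}\big[f(0) + f(1)\big] = 3, \qquad R(2,1) = \tfrac{1}{2}R(1,1) + \tfrac{1}{2}f\!\left(\tfrac{1}{2}\right) = 3.1, \]

and the first extrapolation gives

\[ R(2,2) = R(2,1) + \frac{R(2,1) - R(1,1)}{4^{1} - 1} = 3.1 + \frac{0.1}{3} \approx 3.1333. \]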


Advantages:

  • Converges more rapidly than the trapezoidal or Simpson’s rule for sufficiently smooth functions.

  • Produces highly accurate integral approximations using fewer function evaluations.



R Implementation

As described in (Kincaid & Cheney, 2008, p. 206):

#' Romberg Integration
#'
#' Numerically approximates the definite integral of a function using the Romberg method.
#' This algorithm applies Richardson extrapolation to improve accuracy over the trapezoidal rule.
#'
#' @param f Function to integrate. Should be a function of one variable (e.g., \code{function(x) sin(x)}).
#' @param a Lower limit of integration.
#' @param b Upper limit of integration.
#' @param n Number of rows in the Romberg integration table (default = 5).
#'
#' @return A matrix containing the Romberg table of estimates. The most accurate estimate
#'         is in the bottom-right corner \code{[n, n]}.
#'
#' @examples
#' f <- function(x) exp(-x^2)
#' Romberg(f, 0, 1)
#'
#' # Compare with built-in integration
#' integrate(f, 0, 1)
#'
#' @export
Romberg <- function(f, a, b, n = 5) {
  r <- matrix(NA, n, n)
  h <- b - a
  r[1, 1] <- (h / 2) * (f(a) + f(b))
  
  for (i in 2:n) {
    h <- h / 2
    sum <- 0
    
    for (k in seq(1, (2^(i - 1)), by = 2)) {
      sum <- sum + f(a + k * h)
    }
    
    r[i, 1] <- 0.5 * r[i - 1, 1] + sum * h
    
    for (j in 2:i) {
      # Richardson extrapolation: the divisor for column j is 4^(j-1) - 1
      r[i, j] <- r[i, j - 1] + (r[i, j - 1] - r[i - 1, j - 1]) / (4^(j - 1) - 1)
    }
  }
  
  return(r)
}

# Save to file for later sourcing
dump("Romberg", file = "Romberg.R")

Example: Approximate the area under the curve \(y=\dfrac{4}{1+x^2}\) on the interval \([0,1]\).

\[ \text{Area} \;\approx\; \int_{0}^{1} \frac{4}{1+x^2}\,dx. \]

(This integral can be evaluated exactly as \(4 \arctan(1) = \pi\), but in practice it may also be approximated numerically.)

options(digits = 3)
Romberg(function(x) 4/(1 + x^2), a = 0, b = 1)
##      [,1] [,2] [,3] [,4] [,5]
## [1,] 3.00   NA   NA   NA   NA
## [2,] 3.10 3.13   NA   NA   NA
## [3,] 3.13 3.14 3.14   NA   NA
## [4,] 3.14 3.14 3.14 3.14   NA
## [5,] 3.14 3.14 3.14 3.14 3.14

Number of Prime Integers by the Logarithmic Integral

In number theory, the distribution of prime numbers up to a given value \(x\) is closely approximated by the logarithmic integral function:

\[ \pi(x) \;\approx\; \mathrm{li}(x) \;=\; \int_2^x \frac{dt}{\ln(t)} , \]

where \(\pi(x)\) is the prime-counting function (the number of primes less than or equal to \(x\)).

The logarithmic integral provides a much better approximation to \(\pi(x)\) than the simpler estimate \(\tfrac{x}{\ln(x)}\) from the Prime Number Theorem.
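A quick comparison at \(x = 200\), where \(\pi(200) = 46\), makes the gap concrete:

# Prime Number Theorem estimate x / ln(x) at x = 200: about 37.7,
# well below the true count pi(200) = 46, whereas li(200),
# computed below, is close to 50.
200 / log(200)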

Taken from (Kincaid & Cheney, 2008, p. 189):

The logarithmic integral is a special mathematical function defined by the equation

\[ \mathrm{li}(x) = \displaystyle{\int\limits_{2}^{x} \frac{1}{\ln{t}} \; dt} \]

For large \(x\), the number of prime integers less than or equal to \(x\) is closely approximated by \(\mathrm{li}(x)\). For example, \(\mathrm{li}(200)\) is around 50 and there are 46 primes less than 200.

Verify that \(\mathrm{li}(200) \approx 50\).

Solution:

Romberg(f = function(x) 1/log(x), a = 2, b = 200, n = 7)
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 161.5   NA   NA   NA   NA   NA   NA
## [2,] 102.2 82.4   NA   NA   NA   NA   NA
## [3,]  73.5 64.0 62.7   NA   NA   NA   NA
## [4,]  59.9 55.4 54.8 54.7   NA   NA   NA
## [5,]  53.6 51.5 51.3 51.2 51.2   NA   NA
## [6,]  50.9 49.9 49.8 49.8 49.8 49.8   NA
## [7,]  49.7 49.4 49.3 49.3 49.3 49.3 49.3

3.4 Adaptive Quadrature (Recursive Simpson)

Unlike the straight-line segments used in the trapezoidal rule, Simpson’s rule approximates an integral with quadratic polynomials (parabolic arcs). On a single panel \([a,b]\) with midpoint \(m=(a+b)/2\), \[ S[a,b] \;=\; \frac{b-a}{6}\times\Big[f(a) + 4f(m) + f(b)\Big], \] which is more accurate than trapezoids for smooth functions.
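For instance, for \(f(x) = \sin x\) on \([0, \pi]\), a single panel gives \[ S[0,\pi] = \frac{\pi}{6}\Big[\sin 0 + 4\sin\tfrac{\pi}{2} + \sin\pi\Big] = \frac{2\pi}{3} \approx 2.094, \] already close to the exact value \(2\).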

Adaptive quadrature applies such a rule on subintervals that are refined automatically. The algorithm subdivides an interval wherever the integrand is “difficult” (rapid variation, kinks, near-singular behavior), while keeping coarser panels where the function is smooth. A common strategy is adaptive Simpson: compare one-panel Simpson \(S[a,b]\) to the sum of two half-panel Simpson estimates \(S[a,m]+S[m,b]\); if the difference is small enough (e.g., \(\frac{\big| S[a,m]+S[m,b]-S[a,b]\big|}{15}\) below a tolerance), accept; otherwise recurse on each half.

For smooth, well-behaved integrands, adaptive methods are typically as efficient as fixed-step rules. Their key advantage appears on badly behaved integrands, where fixed rules struggle: adaptivity concentrates work where it matters most, delivering accuracy with fewer function evaluations overall.


R Implementation

As described in (Kincaid & Cheney, 2008, p. 224):

#' Adaptive Simpson's Rule for Numerical Integration
#'
#' Approximates the definite integral of a function \code{f} over the interval \code{[a, b]}
#' using adaptive Simpson's Rule. Optionally plots the function and shaded area.
#'
#' @param f A function to integrate.
#' @param a Lower bound of integration.
#' @param b Upper bound of integration.
#' @param from Plot range lower bound (default is \code{a}).
#' @param to Plot range upper bound (default is \code{b}).
#' @param epsilon Desired accuracy (default is \code{1e-6}).
#' @param level Internal recursion level (should not be manually set by user).
#' @param level.max Maximum recursion depth to avoid infinite recursion (default is 101).
#' @param plot Logical; if \code{TRUE}, plots the function with shaded integral region (default is \code{FALSE}).
#' @param title Optional title for the plot.
#'
#' @return The approximate value of the definite integral.
#' @examples
#' Simpson(function(x) sin(x), 0, pi, plot = TRUE, title = "∫ sin(x) dx from 0 to π")
#'
#' @export
#' 
Simpson <- function(f, a, b,
                    from = a, to = b,
                    epsilon = 1e-6,
                    level = 1, level.max = 101,
                    plot = FALSE, title = NULL) {

  # Optional: source once outside in your script, not here.
  # if (!exists("plot_integral")) source("plot_integral.R")

  # Increment depth (do it once per call)
  level <- level + 1

  # Midpoints
  m <- (a + b) / 2
  d <- (a + m) / 2
  e <- (m + b) / 2
  h <- b - a

  # Simpson on [a,b]
  fa <- f(a); fm <- f(m); fb <- f(b)
  if (!all(is.finite(c(fa, fm, fb)))) stop("Non-finite f() on [a,b].")
  one_simpson <- h * (fa + 4*fm + fb) / 6

  # Simpson on [a,m] and [m,b] combined
  fd <- f(d); fe <- f(e)
  if (!all(is.finite(c(fd, fe)))) stop("Non-finite f() on subintervals.")
  two_simpson <- h * (fa + 4*fd + 2*fm + 4*fe + fb) / 12

  # Tolerance test (classic |S2 - S1|/15)
  err <- abs(two_simpson - one_simpson) / 15

  # If depth cap or acceptable error, return improved estimate
  if (level >= level.max) {
    warning("Maximum recursion level reached; returning local estimate.")
    simpson_result <- two_simpson + (two_simpson - one_simpson) / 15
  } else if (err < epsilon) {
    simpson_result <- two_simpson + (two_simpson - one_simpson) / 15
  } else {
    # Recurse on halves — USE NAMED ARGS to avoid positional bugs
    left_simpson  <- Simpson(f, a = a, b = m,
                             from = from, to = to,
                             epsilon = epsilon/2,
                             level = level, level.max = level.max,
                             plot = FALSE, title = title)
    right_simpson <- Simpson(f, a = m, b = b,
                             from = from, to = to,
                             epsilon = epsilon/2,
                             level = level, level.max = level.max,
                             plot = FALSE, title = title)
    simpson_result <- left_simpson + right_simpson
  }

  # Optional plotting (do this only at the top call, e.g., when level == 2)
  if (plot && level == 2) {
    if (exists("plot_integral")) {
      plot_integral(f, a = from, b = to, from = from, to = to, title = title)
    } else {
      message("plot_integral() not found; skipping plot.")
    }
  }

  return(simpson_result)
}

# Save to working directory
dump("Simpson", file = "Simpson.R")

Example: Approximate the area under the curve

\(y=\dfrac{\cos(2x)}{e^{x}}\) on the interval \([0,\tfrac{5\pi}{4}]\).

\[ \text{Area} \;\approx\; \int_{0}^{\tfrac{5\pi}{4}} \frac{\cos(2x)}{e^{x}}\,dx. \]

Simpson(function(x){cos(2*x)/exp(x)}, a = 0, b = 5*pi/4, from = -pi/4, to = 3*pi/2, plot = TRUE, epsilon = 1e-10, level.max = 1e3, 
        title = expression(paste("y = ", frac(cos(2*x), e^{x}))))

## [1] 0.208
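To exercise the claim about badly behaved integrands, consider \(f(x) = \sqrt{x}\), whose derivative is unbounded at \(0\). A minimal sketch (no plotting; the exact value is \(2/3\)):

# sqrt(x) has an unbounded derivative at x = 0, so the recursion
# refines most heavily near that endpoint; the exact value is 2/3.
Simpson(function(x) sqrt(x), a = 0, b = 1, epsilon = 1e-8)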

R Implementation

#' Plot Simpson's Rule Approximation with Parabolic Arcs (Pretty, with alpha)
#'
#' @param f       Function to integrate.
#' @param a,b     Integration limits.
#' @param n       Number of subintervals (must be even). Default: 6.
#' @param title   Optional plot title.
#' @param col_fun Color for true function curve. Default: "steelblue".
#' @param col_arc Color for Simpson parabolic arcs. Default: "firebrick".
#' @param alpha_fill Alpha (0–1) for shaded area under each arc. Default: 0.25.
#' @param show_bins Logical; draw vertical bin lines at panel boundaries. Default: TRUE.
#' @param show_points Logical; show sample points used by Simpson (nodes + midpoints). Default: TRUE.
#' @return NULL (produces a plot)
plot_simpson_bins <- function(
  f, a, b, n = 6, title = NULL,
  col_fun = "steelblue", col_arc = "firebrick",
  alpha_fill = 0.25, show_bins = TRUE, show_points = TRUE
) {
  if (n %% 2 != 0) stop("n must be even for Simpson's rule.")
  if (!is.function(f)) stop("f must be a function of a single numeric vector.")

  # --- Sample the function at the Simpson nodes ---
  x <- seq(a, b, length.out = n + 1)
  y <- f(x)

  # Build a smooth reference curve for the true function
  x_dense <- seq(a, b, length.out = max(400, 50 * n))
  y_dense <- f(x_dense)

  # --- Plot base: function curve, grid, baseline ---
  op <- par(no.readonly = TRUE)
  on.exit(par(op), add = TRUE)
  par(lend = 2)  # square line ends

  # Determine y-limits with a little padding
  pad <- 0.05 * diff(range(y_dense, y, finite = TRUE))
  ylim <- range(y_dense, y, finite = TRUE)
  ylim <- c(ylim[1] - pad, ylim[2] + pad)

  plot(x_dense, y_dense, type = "n",
       xlab = "x", ylab = "f(x)",
       main = if (is.null(title)) "Simpson's Rule (Parabolic Panels)" else title,
       cex.main = 2,   # increase main title font size (default is 1.0)
       ylim = ylim)

  # Light grid
  abline(h = pretty(ylim), col = grDevices::adjustcolor("grey70", alpha.f = 0.5), lwd = 0.5)
  abline(v = pretty(c(a, b)), col = grDevices::adjustcolor("grey70", alpha.f = 0.5), lwd = 0.5)

  # Baseline y=0
  abline(h = 0, col = grDevices::adjustcolor("black", alpha.f = 0.5), lwd = 1)

  # True function curve (slightly thicker, semi-transparent)
  lines(x_dense, y_dense, col = grDevices::adjustcolor(col_fun, alpha.f = 0.9), lwd = 5)

  # Optional vertical bin lines
  if (show_bins) {
    abline(v = x, col = grDevices::adjustcolor("grey50", alpha.f = 0.3), lwd = 0.8, lty = "dotted")
  }

  # --- Draw each Simpson parabolic panel with shading ---
  for (i in seq(1, n, by = 2)) {
    # Three nodes for this Simpson panel: (x_i, x_{i+1}, x_{i+2})
    xs <- x[i:(i + 2)]
    ys <- y[i:(i + 2)]

    # Fit quadratic through the three points:
    # ys = a0 + a1*xs + a2*xs^2  (raw = TRUE keeps simple powers)
    # coef order: (Intercept, x, x^2)
    fit <- stats::lm(ys ~ stats::poly(xs, 2, raw = TRUE))
    cf  <- stats::coef(fit)

    # Smooth arc across this panel
    x_fine <- seq(xs[1], xs[3], length.out = 200)
    y_fine <- cf[1] + cf[2] * x_fine + cf[3] * x_fine^2

    # Shade area under the parabola down to baseline y=0 (classic visual)
    polygon(
      x = c(x_fine, rev(x_fine)),
      y = c(y_fine, rep(0, length(y_fine))),
      col = grDevices::adjustcolor(col_arc, alpha.f = alpha_fill),
      border = NA
    )

    # Draw the parabolic arc itself (thin, semi-transparent)
    lines(x_fine, y_fine, col = grDevices::adjustcolor(col_arc, alpha.f = 0.9), lwd = 2)
  }

  # Optional: mark Simpson nodes and midpoints
  if (show_points) {
    # Nodes at panel boundaries (black outline)
    points(x, y, pch = 21, bg = grDevices::adjustcolor("white", alpha.f = 0.9), col = grDevices::adjustcolor("black", alpha.f = 0.8), cex = 1)
    
    # Midpoints for each panel (these are x_{i+1} for i = 1,3,5,...)
    mids_idx <- seq(2, n, by = 2)
    points(x[mids_idx], y[mids_idx], pch = 21,
           bg = grDevices::adjustcolor(col_arc, alpha.f = 0.7),
           col = grDevices::adjustcolor(col_arc, alpha.f = 1.0), cex = 1.2)
  }

  # Legend with transparent swatches
  legend("topright",
         legend = c("f(x)", "Simpson parabolic arc", "Shaded Simpson panel", "Nodes", "Midpoints"),
         lwd = c(2.2, 1.2, NA, NA, NA),
         pch = c(NA, NA, 15, 21, 21),
         pt.bg = c(NA,
                   NA,
                   grDevices::adjustcolor(col_arc, alpha.f = alpha_fill),
                   grDevices::adjustcolor("white", alpha.f = 0.9),
                   grDevices::adjustcolor(col_arc, alpha.f = 0.7)),
         col = c(grDevices::adjustcolor(col_fun, alpha.f = 0.9),
                 grDevices::adjustcolor(col_arc, alpha.f = 0.9),
                 grDevices::adjustcolor(col_arc, alpha.f = alpha_fill),
                 grDevices::adjustcolor("black", alpha.f = 0.8),
                 grDevices::adjustcolor(col_arc, alpha.f = 1.0)),
         bty = "n", cex = .9, inset = 0.01)
}
f <- function(x){cos(2*x)/exp(x)}
plot_simpson_bins(f, 0, 5*pi/4, n = 8, title = "Simpson's Rule Approximation with Parabolic Arcs")


Normal Distribution in Statistics

The normal distribution (also called the Gaussian distribution) is one of the most important probability distributions in statistics.

  • It describes a continuous random variable whose values cluster symmetrically around a central mean.
  • Its probability density function (PDF) is given by:

\[ f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \]

where

  • \(\mu\) = mean (center of the distribution)
  • \(\sigma^2\) = variance (spread of the distribution)


Key Properties:

  • Shape: Symmetric, bell-shaped curve.
  • 68–95–99.7 Rule: About 68% of data lie within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
  • Applications: Widely used in hypothesis testing, confidence intervals, regression, and natural/measurement processes.

The normal distribution is central in statistics due to the Central Limit Theorem, which states that sums (or averages) of many independent random variables tend to follow a normal distribution, regardless of the original distribution.


Example: Approximate the area under the standard normal curve

The standard normal distribution is the normal distribution with mean \(\mu = 0\) and variance \(\sigma^2 = 1\). Its probability density function is:

\[ \phi(x) = \frac{1}{\sqrt{2\pi}} \, e^{-x^2/2}. \]

Approximating the area under the standard normal curve between two points corresponds to finding a probability: the probability that a standard normal random variable \(Z\) lies between two values \(a\) and \(b\) is

\[ P(a \leq Z \leq b) = \int_a^b \phi(x)\,dx, \]

where \(\phi(x)\) is the standard normal probability density function.

More generally, the probability density function of a normal distribution with mean \(\mu\) and standard deviation \(\sigma\) is

\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \]

and integrating it over the entire real line gives

\[ \int_{-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx = 1, \]

which confirms that the normal distribution is a valid probability distribution.

Since this integral has no closed-form expression, the area (probability) must be approximated numerically.


About 68% of the values drawn from a normal distribution lie within one standard deviation \(\sigma\) of the mean \(\mu\).

options(digits = 10)
Simpson(function(x){exp(-.5*x^2)/sqrt(2*pi)}, a = -1, b = 1, from = -4, to = 4, 
        plot = TRUE, epsilon = 1e-10, level.max = 1e5,
        title = expression(paste("y = ", frac(e^{-.5*x^2}, sqrt(2*pi))))) # 0.682689492137

## [1] 0.6826894921
f <- function(x){exp(-.5*x^2)/sqrt(2*pi)}
plot_simpson_bins(f, -1, 1, n = 4, title = "Simpson's Rule Approximation with Parabolic Arcs")


About 95% of the values lie within two standard deviations

# about 0.954499736104 of the values lie within two standard deviations 
Simpson(function(x){exp(-.5*x^2)/sqrt(2*pi)}, a = -2, b = 2, from = -4, to = 4, 
        plot = TRUE, title = expression(paste("y = ", frac(1, sigma * sqrt(2 * pi)),
                                               " ", e^{-frac((x - mu)^2, 2 * sigma^2)})))

## [1] 0.9544997497
# and about 0.997300203937 are within three standard deviations
Simpson(function(x){exp(-.5*x^2)/sqrt(2*pi)}, a = -3, b = 3, from = -4, to = 4, 
        plot = TRUE, title = expression(paste("y = ", frac(1, sigma * sqrt(2 * pi)),
                                              " ", e^{-frac((x - mu)^2, 2 * sigma^2)})))

## [1] 0.9973002044
# 0.999936657516
Simpson(function(x){exp(-.5*x^2)/sqrt(2*pi)}, a = -4, b = 4, from = -4, to = 4, 
        plot = TRUE, title = expression(paste("y = ", frac(1, sigma * sqrt(2 * pi)),
                                              " ", e^{-frac((x - mu)^2, 2 * sigma^2)})))
## [1] 0.9999366617
# 0.5 of the values lie below the mean
Simpson(function(x){exp(-.5*x^2)/sqrt(2*pi)}, a = -6, b = 0, from = -4, to = 4, 
        plot = TRUE, title = expression(paste("y = ", frac(1, sigma * sqrt(2 * pi)),
                                              " ", e^{-frac((x - mu)^2, 2 * sigma^2)})))
## [1] 0.5000000565

4 Systems of Linear Equations

A system of linear equations is a collection of equations where each equation is linear in the same set of unknowns.
In matrix form, it can be written as:

\[ A \mathbf{x} = \mathbf{b}, \]

where \(A\) is the coefficient matrix, \(\mathbf{x}\) is the vector of unknowns, and \(\mathbf{b}\) is the right-hand side vector.
The goal is to find values of \(\mathbf{x}\) that satisfy all equations simultaneously.


Applications in Statistics

Systems of linear equations arise naturally in many statistical procedures:

  • Linear Regression: solving the normal equations \((X^\top X)\beta = X^\top y\) to estimate regression coefficients (illustrated at the end of this chapter).

  • Generalized Least Squares: solving weighted systems for correlated errors.

  • Multivariate Analysis: computing covariance matrices, principal components, and discriminant analysis.

  • Numerical Methods: iterative solvers appear in optimization algorithms for maximum likelihood and Bayesian inference.

4.1 Naive Gaussian Elimination

Naive Gaussian elimination is the basic form of Gaussian elimination used to solve a linear system of equations
\[ A \mathbf{x} = \mathbf{b}, \]
where the coefficient matrix \(A\) and the right-hand side vector \(\mathbf{b}\) are given.

The method systematically applies row operations to transform \(A\) into upper-triangular form, after which the solution vector \(\mathbf{x}\) is obtained through back substitution. Unlike pivoting strategies, the naive approach performs no row exchanges, so it can be numerically unstable, or fail outright, for certain matrices; a failing case is shown after the worked example below.


R Implementation

As described in (Kincaid & Cheney, 2008, p. 252):

#' Naive Gaussian Elimination for Solving Linear Systems
#'
#' Solves the system \( A \mathbf{x} = \mathbf{b} \) using the Naive Gaussian Elimination method.
#'
#' @param A Numeric matrix representing the coefficient matrix \( A \).
#' @param b Numeric vector representing the right-hand side vector \( \mathbf{b} \).
#' @return Numeric vector \(\mathbf{x}\), the solution to the system.
#' @examples
#' A <- matrix(c(2,1,-1, -3,-1,2, -2,1,2), nrow=3, byrow=TRUE)
#' b <- c(8, -11, -3)
#' Naive.Gauss(A, b)
Naive.Gauss <- function(A, b) {
  a <- as.matrix(A)
  n <- nrow(a)
  
  x <- numeric(n)
  names(x) <- paste0("x", 1:n)
  
  # Forward elimination
  for (k in 1:(n - 1)) {
    for (i in (k + 1):n) {
      xmult <- a[i, k] / a[k, k]
      a[i, k] <- xmult
      
      for (j in (k + 1):n) {
        a[i, j] <- a[i, j] - xmult * a[k, j]
      }
      
      b[i] <- b[i] - xmult * b[k]
    }
  }
  
  # Back substitution
  x[n] <- b[n] / a[n, n]
  
  for (i in (n - 1):1) {
    sum <- b[i]
    for (j in (i + 1):n) {
      sum <- sum - a[i, j] * x[j]
    }
    x[i] <- sum / a[i, i]
  }
  
  return(x)
}

# Save the function to the working directory
dump("Naive.Gauss", file = "Naive.Gauss.R")

Solve the following system

\[ \begin{cases} 6x_1 - 2x_2 + 2x_3 + 4x_4 = 16 \\ 12x_1 - 8x_2 + 6x_3 + 10x_4 = 26 \\ 3x_1 - 13x_2 + 9x_3 + 3x_4 = -19 \\ -6x_1 + 4x_2 + x_3 - 18x_4 = -34 \\ \end{cases} \]

The coefficient matrix and right-hand side vector:

# the coefficient matrix
A <- matrix(c(6, -2, 2, 4,
              12, -8, 6, 10, 
              3, -13, 9, 3,
              -6, 4, 1, -18), 
            4, 4, byrow = TRUE)

# right-hand side vector
b <- matrix(c(16, 26, -19, -34))

Solution:

(x <- Naive.Gauss(A, b))
## x1 x2 x3 x4 
##  3  1 -2  1
# Verify solution from Naive Gaussian Elimination

# Compute Ax to check if it matches b
Ax <- A %*% x

# Print computed Ax
print("Computed Ax:")
## [1] "Computed Ax:"
print(Ax)
##      [,1]
## [1,]   16
## [2,]   26
## [3,]  -19
## [4,]  -34
# Print original right-hand side vector b
print("Original vector b:")
## [1] "Original vector b:"
print(b)
##      [,1]
## [1,]   16
## [2,]   26
## [3,]  -19
## [4,]  -34
# Explanation:
# If the Naive Gaussian Elimination solved the system correctly,
# then Ax should be (approximately) equal to b.
# Minor differences may occur due to floating-point arithmetic.
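The caveat about missing row exchanges is easy to demonstrate: a zero leading pivot breaks the naive method even when the system itself is perfectly well posed. A minimal failing case:

# The first pivot a[1,1] = 0 forces a division by zero, although the
# system x2 = 2, x1 + x2 = 3 clearly has the solution (1, 2).
A0 <- matrix(c(0, 1,
               1, 1), 2, 2, byrow = TRUE)
b0 <- c(2, 3)
Naive.Gauss(A0, b0)   # returns NaN values
solve(A0, b0)         # base R pivots and recovers x = (1, 2)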

Electrical Network Problem

Taken from (Kincaid & Cheney, 2008, p. 245):

A simple electrical network contains a number of resistances and a single source of electromotive force (a battery) as shown below.

[Figure: electrical network with loop currents \(x_1, x_2, x_3, x_4\), resistances, and a battery]

Using Kirchhoff’s laws and Ohm’s law, we can write a system of linear equations that govern this circuit. If \(x_1\), \(x_2\), \(x_3\), and \(x_4\) are the loop currents as shown, then the equations are

\[ \begin{matrix} 15x_1 & - & 2x_2 & - & 6x_3 & & & = & 300 \\ \\ -2x_1 & + & 12x_2 & - & 4x_3 & - & x_4 & = & 0 \\ \\ -6x_1 & - & 4x_2 & + & 19x_3 & - & 9x_4 & = & 0 \\ \\ & - & x_2 & - & 9x_3 & + & 21x_4 & = & 0 \\ \end{matrix} \]


Approximate the unknown vector x.

Solution:

# The coefficient matrix
A <- matrix(c(15, -2, -6, 0,
              -2, 12, -4, -1, 
              -6, -4, 19, -9,
              0, -1, -9, 21), 
            4, 4, byrow = TRUE)

# right-hand side vector
b <- matrix(c(300, 0, 0, 0))
options(digits = 2)
Naive.Gauss(A, b)
##   x1   x2   x3   x4 
## 26.5  9.4 13.3  6.1
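As before, the currents can be checked by substituting them back into the system:

# Sanity check: A %*% x should be close to (300, 0, 0, 0),
# up to floating-point error.
x <- Naive.Gauss(A, b)
A %*% x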

4.2 Gaussian Elimination with Scaled Partial Pivoting

Gaussian Elimination with Scaled Partial Pivoting is a variant of Gaussian elimination used to solve systems of linear equations.

  • Gaussian elimination reduces the system \(Ax = b\) to upper-triangular form by eliminating variables step by step.

  • Scaled partial pivoting improves numerical stability by:

    1. Scaling each row by its largest element (to account for different magnitudes).
    2. Choosing the pivot as the row with the largest ratio
      \[ \frac{|a_{ij}|}{s_i}, \] where \(s_i\) is the row’s scale factor.

👉 This prevents division by very small numbers and reduces round-off error compared to naive or even simple partial pivoting.
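For instance, for the \(3 \times 3\) system solved at the end of this section, the scale factors are \(s = (4, 4, 5)\), and the first-column ratios are \(\tfrac{2}{4} = 0.5\), \(\tfrac{1}{4} = 0.25\), and \(\tfrac{5}{5} = 1\), so the third row is selected as the first pivot row.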


Gaussian Elimination with Scaled Partial Pivoting — Pseudocode


Input:  matrix A ∈ ℝ^(n×n), vector b ∈ ℝ^n, tolerance tol
Output: solution x ∈ ℝ^n or failure (near-singular/inconsistent)

1. Row scales
   For each row i = 1..n:
       s[i] ← max_j |A[i,j]|
       If s[i] = 0 then
           If b[i] ≠ 0: FAIL (inconsistent row 0 = b[i])
           Else: keep s[i] = 0 (row carries no pivot)

2. Forward elimination (scaled partial pivoting)
   For k = 1..n−1:
       a. Choose pivot row
          p ← argmax_{i=k..n, s[i] > 0} ( |A[i,k]| / s[i] )
          If all candidates have |A[i,k]| = 0: FAIL 

       b. Swap rows k ↔ p in A, b, and s

       c. If s[k] = 0 or |A[k,k]| / max(1, s[k]) < tol:
              FAIL (near-singular pivot)

       d. For i = k+1..n:
              m ← A[i,k] / A[k,k]
              A[i,k] ← 0
              For j = k+1..n:
                  A[i,j] ← A[i,j] − m·A[k,j]
              b[i] ← b[i] − m·b[k]

3. Back substitution
   If |A[n,n]| / max(1, s[n]) < tol:
       FAIL (near-singular)

   x[n] ← b[n] / A[n,n]

   For i = n−1..1 (step −1):
       t ← b[i] − ∑_{j=i+1..n} A[i,j]·x[j]
       If |A[i,i]| / max(1, s[i]) < tol:
           FAIL (near-singular)
       x[i] ← t / A[i,i]

4. Return
   Return x

R Implementation

As described in (Kincaid & Cheney, 2008, p. 267):

## Gaussian Elimination with Scaled Partial Pivoting
#' Gaussian Elimination with Scaled Partial Pivoting
#'
#' Solves the linear system \(Ax = b\) using Gaussian elimination with
#' scaled partial pivoting to improve numerical stability.
#'
#' @param A Numeric matrix representing the coefficient matrix.
#' @param b Numeric vector or matrix representing the right-hand side vector.
#'
#' @return Numeric vector containing the solution vector \(x\).
#'
#' @details
#' The function performs forward elimination with scaled partial pivoting,
#' followed by back substitution to find the solution vector.
#' Scaled partial pivoting selects each pivot row by comparing the candidate
#' pivot elements against their rows' scale factors (the largest absolute
#' entry in each row), which helps minimize rounding errors.
#'
#' @examples
#' A <- matrix(c(2,1,1,4,3,3,8,7,9), 3, 3, byrow = TRUE)
#' b <- c(2,5,10)
#' Gauss(A, b)
#'
#' @export
Gauss <-
  function(A, b) {
    a <- as.matrix(A)
    b <- as.matrix(b)
    
    # forward elimination with scaled partial pivoting
    n <- nrow(a)
    s <- numeric(n)
    l <- 1:n
    
    for (i in 1:n) { 
      s[i] <- max(abs(a[i, ]))
    }
    
    for (k in 1:(n - 1)) { 
      rmax <- 0
      j <- k
      
      for (i in k:n) {
        r <- abs(a[l[i], k]) / s[l[i]]
        if (r > rmax) { 
          rmax <- r
          j <- i 
        }
      }
      temp <- l[j]
      l[j] <- l[k]
      l[k] <- temp
      
      for (i in (k + 1):n) {
        xmult <- a[l[i], k] / a[l[k], k] 
        a[l[i], k] <- xmult
        
        for (j in (k + 1):n) {
          a[l[i], j] <- a[l[i], j] - xmult * a[l[k], j]
        }
      }
    }
    
    # Back substitution phase
    x <- numeric(n)
    
    for (k in 1:(n - 1)) {
      for (i in (k + 1):n) {
        b[l[i]] <- b[l[i]] - a[l[i], k] * b[l[k]]
      }
    }
    
    x[n] <- b[l[n]] / a[l[n], n]
    
    for (i in (n - 1):1) {
      sum <- b[l[i]]
      
      for (j in (i + 1):n) {
        sum <- sum - a[l[i], j] * x[j]
      }
      
      x[i] <- sum / a[l[i], i]
    }
    
    names(x) <- paste0("x", 1:n)
    return(x)
  }

# Save to the working directory
dump("Gauss", file = "Gauss.R")

Example: Solve the following system of linear equations using Gaussian Elimination with Scaled Partial Pivoting

\[ \begin{cases} 2x_1 + 4x_2 - 2x_3 &= \;\; 6 \\ x_1 + 3x_2 + 4x_3 &= -1 \\ 5x_1 + 2x_2 &= \;\; 2 \\ \end{cases} \]


Solution:

# The coefficient matrix
A <- matrix(c(2, 4, -2,
              1, 3, 4,
              5, 2, 0), 
            3, 3, byrow = TRUE)

# right-hand side vector
b <- matrix(c(6, -1, 2))
Gauss(A, b)
## x1 x2 x3 
##  0  1 -1
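As an illustration of the regression application mentioned at the start of this chapter, the normal equations \((X^\top X)\beta = X^\top y\) can be handed to the same solver. A minimal sketch with simulated data (the variable names are illustrative only):

# Simulated data for a simple linear regression y = 2 + 3x + noise
set.seed(1)
x1 <- 1:10
y  <- 2 + 3 * x1 + rnorm(10)

# Design matrix with an intercept column; solve (X'X) beta = X'y
X <- cbind(1, x1)
Gauss(t(X) %*% X, t(X) %*% y)

# For comparison, coef(lm(y ~ x1)) should give (nearly) the same values.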

References

Burden, R. L., & Faires, J. D. (2011). Numerical analysis. Brooks/Cole.
Kincaid, D. R., & Cheney, E. W. (2008). Numerical mathematics and computing. Cengage Learning.