Introduction

The Black-Scholes-Merton equation (shortened as the Black-Scholes) is no doubt one of the most important partial differential equations in economics, which serves as a critical foundation to modern quantitative finance (so-called the “quant”). In this paper, we are first going to look briefly at the history of its development, before deriving the complete equation from the perspective of delta hedging. Next, we plan to provide a comprehensive solution to the Black-Scholes, including both an analytic solution via Fourier transform and a numerical approach using a forward difference method. We will also introduce concepts like Greek measures and rephrase the Black-Scholes in “finance” terminologies. Then, from a separate track, we will examine the probability model underlying the assumptions of the Black-Scholes - Brownian motion. Using a random walk, we will build Brownian motion from ``scratch’’, before going on to offer more complex statistical analysis. Meanwhile, we will look at some applications: how the Black-Scholes is used as a pricing model, how the Black-Scholes translates into different forms with different assets (e.g. the Black-Scholes formula with put and call, the Black formula with bond option and interest rate option, etc). R simulations of related topics will be provided along the way.

History

The Black-Scholes was first invented by Fischer Black and Myron Scholes in the Pricing of Options and Corporate Liabilities which was published in the Journal of Political Economy in 1973. The initial version of the equation was derived by means of delta hedging, a measure of risk prevention through the purchasing of underlying asset of a financial derivative. Several years later, Robert C. Merton published a paper in which he examined the mathematical properties of the Black-Scholes, and hence the name “Black-Scholes-Merton equation”. Ever since its initial proposal, the Black-Scholes has proved to be a powerful instrument in estimating the price of European style options and thus facilitating option trading. However, despite Black-Scholes’ great theoretical elegance, various reality constraints (such as extreme volatility, liquidity imperfection, etc.) have limited the accuracy of its predicting power. To combat these shortcomings, practitioners of the industry have adopted different variations to avoid specific risks.

Derivation

First, we need some preliminary knowledge about financial derivatives before we can discuss the motivation of Black-Scholes. A call is a contract which grants the purchaser the right, but not the obligation, to buy a security at a pre-determined price from the call issuer on a future date, whereas a put is a contract which gives similar right to sell the security at a pre-determined price on a future date. A European style call/put can only be exercised at a certain future date while an American call/put can be exercised any time before the expiry date. Call and put are the simplest forms of “option”, a term we give to a body of financial derivatives with such similar characteristics that grant the purchaser the right or “option” to buy or sell the underlying asset. Like stocks and bonds, various financial derivatives and options are crucial to our financial system in that they can be used for reallocating abstract economic resources like money and risk tolerance, or for purely speculative purposes. Large volumes of options are traded on exchange daily, so a natural question is - how can we price them?

First, let us be clear with some notations that will be used consistently throughout the following text:

\(V\) is the price of an option (in general).
\(C\) is the price of a call option.
\(P\) is the price of a put option.
\(S\) is the current price of the stock.
\(X\) is the strike price (the “predetermined” price negotiated between buyer and seller).
\(t\) is the time until expiry.
\(r\) is the risk free interest rate.
\(\sigma\) is the volatility of the stock price.
\(\mu\) is the expected rate of return on the stock.

To derive the Black-Scholes, we have looked at some outside resource (Watkins). Following its treatment, we need to make a few assumptions at the beginning:

Ito’s lemma which states that \[dC = [\frac{\partial C}{\partial t} + \frac{\partial C}{\partial S} \mu S + \frac{1}{2}\frac{\partial ^2 C}{\partial S^2}\sigma^2 S^2]dt + \frac{\partial C}{\partial S}\sigma SdZ\] This is a relatively simple assumption by taking the first two terms of the Taylor expansion of \(C\).
The change in stock price can be modeled by \(dS = \mu Sdt + \sigma SdZ\). Like in the Ito’s lemma, there is a \(dZ\) term in the change of stock price which warrants more explanation that will be done in later text. For now, it is enough that we know \(Z(t)\) is a random variable which represents some nondeterministic component of a price movement.

Now we are ready to execute the final derivation. First, we construct a portfolio which consists of a single written call (“written” means that you are on the sell side) and \(\delta\) shares of the underlying stock. Denote the value of the portfolio as \(V\). Then we can have the change in value of our portfolio as \[dV = \delta dS - dC\] To hedge our risk, we should aim to reduce the exposure of our portfolio to the price movement of the underlying stock, which can be done by various “hedging” methods, the most common among which is “delta” hedge (“delta” will be explained more thoroughly when we talk about Greeks later). This require us to buy \(\frac{\partial C}{\partial S}\) shares of stocks. Therefore, we should set \(\delta = \frac{\partial C}{\partial S}\). So, \[dV = \frac{\partial C}{\partial S}dS - dC\] Then, recalling Ito’s lemma, we can substitute \(dC\) by its \(dt\) and \(dZ\) parts. That is, \[dV = \frac{\partial C}{\partial S}dS - [\frac{\partial C}{\partial t} + \frac{\partial C}{\partial S} \mu S + \frac{1}{2}\frac{\partial ^2 C}{\partial S^2}\sigma^2 S^2]dt - \frac{\partial C}{\partial S}\sigma SdZ\] We can also plug in \(dS = \mu Sdt + \sigma SdZ\) and get \[dV = \frac{\partial C}{\partial S}[\mu Sdt + \sigma SdZ] - [\frac{\partial C}{\partial t} + \frac{\partial C}{\partial S} \mu S + \frac{1}{2}\frac{\partial ^2 C}{\partial S^2}\sigma^2 S^2]dt - \frac{\partial C}{\partial S}\sigma SdZ\] Combining and simplifying terms, we see that all the \(dZ\) terms cancel out and we are left with \[dV = [-\frac{\partial C}{\partial t} - \frac{1}{2}\frac{\partial ^2 C}{\partial S^2}\sigma^2 S^2]dt\] Since we have got rid of the random variable part of \(dV\), our portfolio will just earn the risk free rate: \[dV = rVdt = r[\frac{\partial C}{\partial S}S - C]dt\] Plugging this back into our original equation and reorganizing the terms, we will end up with: \[\frac{\partial C}{\partial S}rS+ \frac{1}{2}\frac{\partial ^2 C}{\partial S^2}\sigma^2 S^2 + \frac{\partial C}{\partial t} = rC\] Since the Black-Scholes covers the pricing of a large range of options other than the call, we often use \(V\) instead of \(C\) and write the equation as \[\frac{\partial V}{\partial S}rS+ \frac{1}{2}\frac{\partial ^2 V}{\partial S^2}\sigma^2 S^2 + \frac{\partial V}{\partial t} = rV\] This is the version of the Black-Scholes equation that we will be working with in subsequent sections.

Solving the Black-Scholes

In this section, we will provide a complete solution to the Black-Scholes under the assumption that both volatility \(\sigma\) and risk-free interest rate \(r\) are constant. Note that this seemingly harmless assumption is not always (in fact, never) true since volatility, as its name implies, is volatile! And interest rate fluctuates a lot as well. Acknowledging these facts, we will introduce some complications at the end of this section, but first we will restrict our sight to constant \(\sigma\) and \(r\).

In terms of the materials we have learned, we notice a similarity of the Black-Scholes to heat equation, but besides a Laplacian term (second order derivative), the Black-Scholes also has a \(\frac{\partial V}{\partial S}\) term, which complicates things a bit. In an exhaustive paper by Francois Coppex (Coppex 2009), a solution was provided by transforming the Black-Scholes first to a parabolic form via changes of variables, and then to a standardized diffusion equation, which can then be solved by means of Fourier transform. Below, I will outline his proof.

(a) Reduce the Black-Scholes to a “Parabolic” Form

Conduct the following changes of variables: \[S = Ke^x\] \[V(S, t) = Kv(x, \tau)\] \[\tau = (T - t)\sigma^2/2\] Then rewriting the partial derivatives, we get \[\frac{\partial V}{\partial t} = - K \frac{\sigma^2}{2} \frac{\partial v}{\partial \tau}\] \[\frac{\partial V}{\partial S} = \frac{K}{S}\frac{\partial v}{\partial x}\] \[\frac{\partial^2 V}{\partial S^2} = \frac{\partial}{\partial S}(\frac{K}{S}\frac{\partial v}{\partial x}) = - \frac{K}{S^2}\frac{\partial v}{\partial x} + \frac{K}{S^2}\frac{\partial^2 v}{\partial x^2}\] Plugging these back into the original Black-Scholes and letting \[a = \frac{2r}{\sigma^2} - 1\] and \[b = -\frac{2r}{\sigma^2} = -(1 + a)\] We will get \[\frac{\partial v}{\partial \tau} = \frac{\partial^2 v}{\partial x^2} + a\frac{\partial v}{\partial x} + bv\]

(b) Further Reduction to a Diffusion Equation

Let \[v(x, \tau) = f(\tau)g(x)h(x, \tau)\] Then we calculate the partial derivatives: \[\frac{\partial v}{\partial \tau} = (\partial_{\tau}f)gh + fg(\partial_{\tau}h)\] \[\frac{\partial v}{\partial x} = f(\partial_x g)h + fg(\partial_x h)\] \[\frac{\partial^2 v}{\partial x^2} = f(\partial_x^2 g)h + 2f(\partial_x g)(\partial_x h) + fg(\partial_x^2 h)\] Further, transform \(f(\tau)\) and \(g(x)\) such that \[f(\tau) = c_1e^{\hat{f}(\tau)}\] \[g(x) = c_2e^{\hat{g}(x)}\] Inserting all these changes of variables into the parabolic form, we will have \[(\partial_{\tau} h) = (\partial_x^2 h) + (\partial_x h)[2(\partial_g \hat{g}) + a] + h[-(\partial_{\tau} \hat{f}) + (\partial_x^2 \hat{g}) + (\partial_x \hat{g})^2 + a(\partial_x \hat{g}) + b]\] Setting the coefficients of \((\partial_x h)\) and \(h\) to zero, we will get \[\hat{g}(x) = - \frac{ax}{2} + c_1\] \[\hat{f}(\tau) = (b - \frac{a^2}{4})\tau + c_2\] Then we will be left with \[\frac{\partial h}{\partial \tau} = \frac{\partial^2 h}{\partial x^2}\] which is a standard diffusion equation.

(c) Solving the Diffusion with Fourier Transform

If we denote the Fourier transform of \(h\) as \(\mathcal{F}(h) = \hat{h}(k)\). Then the original diffusion equation \[\frac{\partial h}{\partial \tau} = \frac{\partial^2 h}{\partial x^2}\] will be transformed into \[\frac{\partial \hat{h}}{\partial \tau} = -k^2 \hat{h}\] where \(k = 2\pi ix\). Note that the derivative on the left side is with respect to \(\tau\), so \(k\) (though contains x) contributes as a contant coefficient in front of the \(\hat{h}\) term. We know the solution to this equation is \[\hat{h}(k, \tau) = \hat{h}(k, 0)e^{-k^2\tau}\] in which \(\hat{h}(k, 0)\) is the initial condition.

To get solutions in the original space of functions, we need to apply inverse Fourier transform. Let \[\mathcal{F}(h_1) = \hat{h_1} = e^{-k^2\tau}\] \[\mathcal{F}(h_2) = \hat{h_2} = \hat{h_2}(k, 0)\] Consulting the Fourier transform table, we have \[\mathcal{F}^{-1}(\hat{h_1}) = h_1 = \frac{1}{\sqrt{2\tau}}e^{- x^2/(4\tau)}\] \[\mathcal{F}^{-1}(\hat{h_2}) = h_2 = h(x, 0)\] Since the inverse Fourier of the product of two functions is the convolution of their respective inverse Fourier (Dyke 2014), we can have \[h(x, \tau) = (h_1*h_2)(x, \tau) = \frac{1}{\sqrt{4\pi \tau}} \int_{\mathbb{R}} e^{-\frac{(x - \lambda)^2}{4\tau}}h(\lambda, 0)d\lambda\] Thus the proof is complete.

(d) Numerical Approach

In the previous subsection, we have solved the Black-Scholes “formally”. However, it is not always guaranteed that the integral \[\int_{\mathbb{R}}e^{-\frac{(x - \lambda)^2}{4\tau}}h(\lambda, 0)d\lambda\] gives an analytic expression. At such time, numerical approximation can be used to recover the equation. Let us take on a concrete example. In \[\frac{\partial V}{\partial S}rS+ \frac{1}{2}\frac{\partial ^2 V}{\partial S^2}\sigma^2 S^2 + \frac{\partial V}{\partial t} = rV\] set \(r = 0\) (which is not an unrealistic assumption for interest rate during bad times, when the Federal Reserve does everything they can to push the economy). Further suppose \(\sigma S = 1\), that is, higher price leads to lower volatility in a perfectly reciprocal manner. Then we just need to solve \[V_t = -\frac{1}{2} V_{SS}\] For illustrative purpose, we plan to drop the coefficient in front of \(V_{SS}\) and just look at the standardized heat equation \[V_t = V_{SS}\] The numerical technique we are going to demonstrate is called forward difference method (Sauer 2012). The general idea is to lay down a “grid” in the coordinate plane of \(\mathbb{S} \times \mathbb{t}\) and to discretize the partial differential equation with a “mesh”. We begin by dividing the \(S\) interval of interest into \(M\) equally spaced pieces and the \(t\) interval into \(N\) pieces. Define \(h = (S_2 - S_1)/M\) and \(k = (t_2 - t_1)/N\) as the step sizes in the \(S\) and \(t\) direction. Applying the centered-difference formula for the second derivative to the \(S\) variable yields \[V_{SS}(S, t) = \frac{1}{h^2}(V(S + h, t) - 2V(S, t) + V(S - h, t))\] Similarly, the first derivative with respect to \(t\) can be approximated by \[V_t(S, t) = \frac{1}{k}(V(S, t + k) - V(S, t))\] Plugging these approximations into \((S_i, t_j)\) for \(V_t = V_{SS}\), we get \[\tilde{V}_{i, j + 1} = \sigma \tilde{V}_{i + 1, j} + (1 - 2\sigma)\tilde{V}_{i, j} + \sigma \tilde{V}_{i - 1, j}\] where \(\sigma = \frac{Dk}{h^2}\). Here we are using \(\tilde{V}\) instead of \(V\) to indicate the approximation of the true value. Switching to a matrix representation, we can expect \[ \begin{pmatrix} \tilde{V}_{1, j + 1}\\ . \\ . \\ . \\ \tilde{V}_{m, j + 1} \end{pmatrix} = \begin{pmatrix} 1 - 2\sigma & \sigma & 0 & ... & 0\\ \sigma & 1 - 2\sigma & \sigma & ... & 0 \\ 0 & \sigma & 1 - 2\sigma & ... & 0 \\ ... & ... & ... & ... \\ 0 & ... & 0 & \sigma & 1- 2\sigma \end{pmatrix} \begin{pmatrix} \tilde{V}_{1, j}\\ . \\ . \\ . \\ \tilde{V}_{m, j} \end{pmatrix} + \sigma \begin{pmatrix} \tilde{V}_{0, j}\\ 0 \\ ... \\ 0 \\ \tilde{V}_{m + 1, j} \end{pmatrix} \] where \(m = M - 1\). For intervals of interest, we set \(0 \leq t \leq 1\) and \(2100 \le S \le 2110\) (today’s S&P). Note that we also need proper boundary conditions to serve as the starting points for our simulation. In this example, let us assume a Dirichlet type (as opposed to Neumann type) boundary condition dictated by \[V(S, 0) = \delta (S - 2100) + V(2100, 0)\] \[V(2100, 0) = \theta t + V(2100, 0)\] \[V(2110, t) = \theta t + V(2110, 0)\] where \(\delta\) and \(\theta\) are parameters whose economic meanings will be explained more thoroughly in later sections. Building all these in R, we will have

forward_dif <- function(s1, s2, t1, t2, D, M, N, delta, theta){
  
  # Use forward difference method to approximate the solution of
  # a standardardized heat equation
  #
  # Args:
  #   s1: the lower bound of the s dimension.
  #   s2: the upper bound of the s dimension.
  #   t1: the lower bound of the t dimension.
  #   t2: the upper bound of the t dimension.
  #   D: coefficient in front of the second derivative in the heat
  #      equation.
  #   M: number of intervals along the s axis
  #   N: number of intervals along the t axis
  #   delta + theta: parameters
  #
  # Returns:
  #   A list, whose first slot is s, second slot is t, third slot is w
  #   (approximation of v) and fourth slot is sigma, a stability criterion.
  
  m <- M - 1; h <- (s2 - s1)/M; k <- (t2 - t1)/N
  sigma <- D*k/(h^2)
  w <- matrix(0, nrow = N + 1, ncol = M + 1)
  x <- matrix(rep(s1 + h*0:M, N + 1), nrow = N + 1, byrow = T)
  t <- matrix(rep(t1 + k*0:N, M + 1), ncol = M + 1)
  w[1, ] <- delta*h*0:M
  w[, 1] <- w[1, 1] + theta*k*0:N; w[, M + 1] <- w[1, M + 1] + theta*k*0:N
  A <- diag(rep(1 - 2*sigma, m)) +
    rbind(cbind(rep(0, m - 1), diag(rep(sigma, m - 1))), t(rep(0, m))) +
    rbind( t(rep(0, m)), cbind(diag(rep(sigma, m - 1)), rep(0, m - 1)))
  for(i in 1:N){
    w[i + 1, 2:M] <- A%*%w[i, 2:M] + c(w[i, 1], rep(0, m - 2), w[i, M + 1])
  }
  return(list(x, t, w, sigma))
}

result <- forward_dif(s1 = 2100, s2 = 2110,
                   t1 = 0, t2 = 10, D = 1,
                   M = 20, N = 1000,
                   delta = 10, theta = 10)

scatterplot3d(as.vector(result[[1]]),
              as.vector(result[[2]]),
              as.vector(result[[3]]),
              rainbow(length(as.vector(result[[1]]))),
              col.axis="black", col.grid="lightblue",
              xlab = "S", ylab = "t", zlab = "V",
              main = "Forward Difference Approximation of Heat Equation")

However, caution must be taken that the \(S\) and \(t\) intervals must be divided into pieces small enough. Otherwise, we might run the risk of a non-convergent, or, using the jargon of numerical analysis, a “unstable” solution. More specifically, the Von Neumann Theorem stipulates that the forward difference method is stable if and only if \[\frac{Dk}{h^2} \le \frac{1}{2}\] The proof of this theorem requires linear algebra knowledge about the relationship between the spectral radius of a square matrix and the convergence of its resulting linear transformation. We do not plan to dive into the technicalities here.

(e) Complications

Nonconstant interest rate

In the previous solution, we have assumed a constant risk free rate \(r\). However, interest rates fluctuate! There are many models (Damiano Brigo 2001) which explain its fluctuation. For example, the Vasicek model predicts the change in interest rate as \[dr = a(b - r)dt + \sigma_rdW(t)\] where \(a\), \(b\) and \(\sigma_r\) are all constants. This makes intuitive sense because we can expect the movement of interest rate to be determined by both a mean-reverting main term \(a(b - r)\) which might reflects some underlying fundamental economic driver, as well as a random Wiener process \(dW(t)\) whose effect is amplied through a constant volatility \(\sigma_r\). In practice, \(dW(t)\) and \(dB(t)\) are often used interchangeably and standard Brownian motion might be a more familiar name. Here, we will just mention the solution to the Vasicek model: \[r(t) = r(0)e^{-at} + b(1 - e^{-at}) + \sigma_r e^{-at}\int_0^t e^{as}dW(s)\] Of course, the Vasicek model is flawed in many ways (for example, it will allow interest rate to go to negative infinity), so many other models such as the Cox-Ingersoll-Ross model and the Black-Derman-Toy model were invented to fix its shortcomings. In fact, modeling short rate (i.e. instantaneous spot rate) is a nontrivial branch of financial economics by itself and our point here is simply that a constant \(r\) is too naive an assumption.

Nonconstant volatility

In our solution, we have assumed an unvarying volatility \(\sigma\), which may be unrealistic in practice. There is an enormous body of research (Ladokhin 2009) on volatility modeling. Of various models, historical volatility models (e.g., Exponential Weighted Moving Average), implied volatility, and autoregressive conditional heteroskedastic models (e.g., the GARCH family of models) are some of the common techniques. Here, we will mention a relatively simple one - Exponentially Weighted Moving Average (EWMA), which is a metric suggested by the Risk Metrics framework (J.P. Morgan/Reuters, 1996) defined as follows: \[\hat{\sigma}_{t + 1} = \sqrt{(1 - \lambda)\sum_{i = 1}^n \lambda^{i - 1}(r_{t + 1 - i} - \bar{r})^2}\] where \(\lambda\) is the parameter of the model (decay factor); \(r_i\)’s are previously observed returns; \(\bar{r}\) is exponentially weighted moving average mean of \(r_i\)’s given by \[\bar{r} = (1 - \lambda)\sum_{i = 1}^n \lambda^{i - 1}r_{t + 1 - i}\] Also note that we write \(\hat{\sigma}\) to indicate this is only an estimation of the true volatility \(\sigma\). To show why \(\hat{\sigma}_{t + 1}\) might differ from \(\hat{\sigma}_t\), i.e. varying volatility, we can write \(\hat{\sigma}_{t + 1}\) in a recursive expression as: \[\hat{\sigma}_{t + 1} = \sqrt{\lambda \hat{\sigma}_t^2 + (1 - \lambda)r_t^2}\] If we are to assume \(\hat{\sigma}_{t + 1} = \hat{\sigma}_t = \hat{\sigma}\), then we will get \[\hat{\sigma} = \sqrt{\lambda \hat{\sigma}^2 + (1 - \lambda)r_t^2}\] which gives \[\hat{\lambda} = r_t\] which does not necessarily (in fact, rarely) hold.

Greek Measures

For a given option, its so-called “Greeks” are simply a family of partial derivatives of its value with respect to various underlying characteristics. Here is a list of commonly used Greeks:

Delta: \[\Delta = \frac{\partial V}{\partial S}\] It is the partial derivative of the option value with respect to the price of the underlying asset.
Gamma: \[\Gamma = \frac{\partial^2 V}{\partial S^2}\] It is the second order partial derivative of the option value with respect to the price of the underlying asset.
Theta: \[\Theta = \frac{\partial V}{\partial t}\] It is the partial derivative of the option value with respect to the time to maturity, sometimes denoted on a daily basis as: \[\Theta = \frac{1}{365}\frac{\partial V}{\partial t}\]
Vega: \[v = \frac{\partial V}{\partial \sigma}\] It is the partial derivative of the option value with respect to the volatility, sometimes denoted by basis point as: \[v = 0.01 \frac{\partial V}{\partial \sigma}\]
Rho: \[\rho = \frac{\partial V}{\partial r}\] It is the partial derivative of the option value with respect to the risk free interest rate, sometimes denoted by basis point as: \[\rho = 0.01 \frac{\partial V}{\partial r}\]

Greeks are important for approximating microscopic changes in option prices as well as hedging positions. From an industry point of view, they are often examined against different times to maturity as the below graph (Ben-Michael 2013) shows.

The study of Greeks is a broad field by itself and interested readers should consult outside resources (Weishaus 2014). Here, we use Greeks simply to rephrase the Black-Scholes: \[\Delta rS+ \frac{1}{2}\Gamma \sigma^2 S^2 + \Theta = rV\] Now, we plan to spend two paragraphs introducing some applications of the Black-Scholes. For a European style call option, its value at the time of expiry is the difference between the asset price \(S\) and a pre-specified strike price \(K\), i.e., \(V(S, T) = max\{(S - K), 0\}\). With this explicit payoff formula, we can go through the changes of variables specified by our solution in the previous section to get \(h(\tau, 0)\) and calculate the integral, before changing everything back into the orginial variables. Omitting those computation, we simply point out that the integral turns out to be doable and yields a neat expression: \[C(S, t) = S^{-\delta (T - t)}\Phi(d_1) - K^{-r(T - t)}\Phi(d_2)\] Here, \[d_1 = \frac{\ln (S/K) + (r - \frac{\sigma^2}{2})(T - t)}{\sigma \sqrt{T - t}}\] \[d_2 = \frac{\ln (S/K) + (r + \frac{\sigma^2}{2})(T - t)}{\sigma \sqrt{T - t}}\] where \(\Phi\) is the cumulative probability function of the normal distribution; \(S\) is the initial price of the underlying asset; \(K\) is the strike price; \(\delta\) is the dividend of the underlying asset; \(T - t\) is the duration of the option; \(\sigma\) is the volatility; \(r\) is the risk free interest rate. Of course, we have made a simplified assumption that all the dividend payments of the asset will be fully reinvested.

Similarly, the price of a European style put option can be determined. In fact, we only need to make minor changes to the pricing formula of the call option to get \[P(S, t) = K^{-r(T - t)}\Phi(-d_2) - S^{-\delta (T - t)}\Phi(-d_1)\] At last, we want to mention that various other financial derivatives like future options, interest rate options, bond options, etc., can all be thus priced using the Black-Scholes framework.

From Random Walk to Brownian motion

A one-dimensional random walk (Mark A. Pinsky 2011) of a particle on integers can be defined as follows. Suppose a particle starts off at \(0\) and its position after taking \(n\) steps is denoted as \(S(n)\), which depends only on \(S(n - 1)\) and nothing beyond. More specifically, we have our “one-step” conditional probability as:

\[ S(n) = \begin{cases} S(n - 1) + 1 & \text{Probability} = p_n \\ S(n - 1) & \text{Probability} = r_n \\ S(n - 1) - 1 & \text{Probability} = q_n \end{cases} \]

where \(p_n + r_n + q_n = 1\). Here, \(p_n\) denotes the probability that the particle will move one unit right; \(r_n\) denotes the probability that the particles remains at the same position; and \(q_n\) detnotes the probability that the particles moves one unit left. Note that variations of this “baseline” model abound. For example, we might dictate that the particle cannot stay still for each move, in which case \(r_n = 0\). Now, let’s actually realize a random walk in R.

randwalk <- function(step = 1, start, t1, t2){
  
  # Computes the path of a one-dimensional random walk
  #
  # Args:
  #   step: the number of steps to be taken within each unit time;
  #         default is 1.
  #   start: the starting point of the random walk.
  #   t1: the starting time of the walk.
  #   t2: the end time of the walk.
  #
  # Returns:
  #   A n by 2 matrix whose first column is time indices and second column
  #   is the position at that time.
  
  x <- matrix(0, nrow = 1 + step*(t2 - t1), ncol = 2)
  x[1, 1] <- t1; x[1, 2] <- start
  for(i in 2:nrow(x)){
    x[i, 1] <- (i - 1)/step + t1
    x[i, 2] <- x[i - 1, 2] + sample(c(-1, 1), 1)/sqrt(step)
  }
  x <- data.frame(x); colnames(x) <- c("time", "price")
  return(x)
}

# Let's plot the path of a 500-step random walk
x <- randwalk(start = 0, t1 = 0, t2 = 500)
plot(x$price~x$time, col = rainbow(5), xlab = "time", ylab = "position")

This idea can be extended to accommodate higher dimensions in a similar fashion. The only additional thing we need is a distance metric and a correponding conditional probability distribution which are required for determining next steps. Below is a simulation for a random walk in 3D.

randwalk3d <- function(step = 1, start, t1, t2){
  
  # Computes the path of a one-dimensional random walk
  #
  # Args:
  #   step: the number of steps to be taken within each unit time;
  #         default is 1.
  #   start: the starting point of the random walk.
  #   t1: the starting time of the walk.
  #   t2: the end time of the walk.
  #
  # Returns:
  #   A n by 2 matrix whose first column is time indices and subsequent
  #   columns are the position indices x, y and z.
  
  x <- matrix(0, nrow = 1 + step*(t2 - t1), ncol = 4)
  x[1, 1] <- t1;
  x[1, 2] <- start[1]; x[1, 3] <- start[2]; x[1, 4] <- start[3]
  for(i in 2:nrow(x)){
    x[i, 1] <- (i - 1)/step + t1
    x[i, 2:4] <- x[i - 1, 2:4]; rand <- sample(0:2, 1)
    x[i, 2 + rand] <- x[i, 2 + rand] + sample(c(-1, 1), 1)/sqrt(step)
  }
  x <- data.frame(x); colnames(x) <- c("time", "x", "y", "z")
  return(x)
}

# plotting
x <- randwalk3d(step = 100, start = c(0, 0, 0), t1 = 0, t2 = 1000)
scatterplot3d(x[, 2], x[, 3], x[, 4], rainbow(nrow(x)),
              col.axis="black", col.grid="lightblue",
              xlab = "X", ylab = "Y", zlab = "Z",
              main = "3D random walk")

Random walks have great application in option pricing. For example, the price movement of a stock can be modeled by a binomial or multinomial tree. For each period, the price of a stock will go to one of the pre-specified nodes, with a probability that can be deduced from the risk-neutral assumption which equates the projected profit of buyers with that of sellers. It is not our intention to dive into too much technical details about binomial or multinomial tree pricing, but the below graph (“Constructing Binomial Tree to Describe Stock Price Movement” 2015) shows roughly how it works.

This is a binomial tree. At each node, there is a probability of \(p\) that the stock price will increase by a factor of \(u\) and a probablity of \(1 - p\) that the stock price will fall by a factor of \(d\). Starting from an initial price of \(S_0\), this will be a model of five periods. Of course, complications arise in many ways and the functionality of binomial trees is not restricted to plain stock either. In fact, more often than not, these trees are used for pricing complicated financial derivatives of nonlinear relationship and running Monte Carlo simulations.

While random walk and trees can model discrete changes nicely, the real world is much more volatile, and a closer-to-truth model that can describe continuous changes is needed. Standard Brownian motion is such a continuous model that can be viewed as a limit of basic random walks where \(p_i = q_i = 0.5\). Instead of shifting a single unit each unit time, we can engineer the particle to move \(\frac{1}{\sqrt{n}}\) units of distance each \(\frac{1}{n}\) units of time. This movement becomes a continuous process if we let \(n \to \infty\) to infinity. A standard Brownian \(B(t)\) defined this way has three important properties:

\(B(0) = 0\) and \(B(t)\) is a continuous function of \(t\).
Every increment \(B(s + t) - B(s)\) follows normal distribution with mean zero and variance \(t\).
For every \(0 \leq t_1 \leq t_2 \leq t_3 \leq t_4\), \(B(t_4) - B(t_3)\) is independent from \(B(t_2) - B(t_1)\).

In fact, these three properties are so crucial that some other literatures use them as an equivalent defination for the standard Brownian motion. However, since our definition is by means of random walk, these results should not appear to be immediately obvious. Serious readers should consult external sources for their proofs (Prasad Chalasani 1997), which is a good exercise of the Law of Large Numbers and the Central Limit Theorem. Note that multiple variations of Brownian motion exist. For example, an arithmetic Brownian motion is defined as: \[AB(t) = (\mu - \frac{1}{2}\sigma^2)t + \sigma B(t)\] where \(\mu\) is called drift factor and \(\sigma\) is called diffusion factor; for another example, a geometric Brownian motion is defined as: \[GB(t) = GB(t_0)e^{AB(t)}\] where \(AB(t)\) is an arithmetic Brownian motion. Other equivalent definitions exist. For instance, a geometric Brownian motion \(GB(t)\) can also be defined by: \[\frac{d(GB(t))}{GB(t)} = \mu dt + \sigma dB(t)\] In fact, solving this stochastic differential equation yields our first definition.

Brownian motion is favored by many researchers due to its Markov property, i.e., incrementals during nonoverlapping intervals in the time space are independent from each other. Of course, there are other nice properties that people want their models to fulfill, such as mean-reverting, etc, so various other stochastic models were invented, quite a few of which used Brownian motion as their basic building blocks. Here, it suffices for our purpose just to mention the names of some of these models: Gaussian process, Ornstein-Uhlenbeck process, position process, etc. Finally, we will end this section with a R simulation of stock price movement using a quasi-geometric Brownian motion.

geometric.randwalk <- function(step, start, t1, t2, mu, sigma){
  
  # Simulate a geometric Brownian motion using random walk
  #
  # Args:
  #   step: the number of steps to be taken within each unit time.
  #   start: the starting price of the stock at time 0.
  #   t1: the starting time of the process.
  #   t2: the end time of the process.
  #   mu: drift coefficient
  #   sigma: diffusion coefficient
  #
  # Returns:
  #   A n by 2 matrix whose first column is time indices and second
  #   column is the stock price at the corresponding time.
  
  x <- randwalk(step = step, start = 0, t1 = t1, t2 = t2)
  x[, 2] <- start*exp(sigma*x[, 2] +
                        (mu - 0.5*sigma^2)/step*1:nrow(x))
  return(x)
}

# plot the path of a stock price movement!
x <- geometric.randwalk(step = 500, start = 100,
                        t1 = 0, t2 = 1,
                        mu = 0.2, sigma = 0.1)
plot(x$price~x$time, col = rainbow(5),
     xlab = "time", ylab = "position")

Bibliography

Ben-Michael, Martha. 2013. “Options Selling and Stock Price Movement.” Variance Solutions, January.

“Constructing Binomial Tree to Describe Stock Price Movement.” 2015. Finance Train.

Coppex, Francois. 2009. “Solving the Black-Scholes Equation: A Demystification,” December.

Damiano Brigo, Fabio Mercurio. 2001. Interest Rate Models – Theory and Practice with Smile, Inflation and Credit (2nd ed. 2006 ed.). Springer Verlag.

Dyke, Phil. 2014. An Introduction to Laplace Transforms and Fourier Series. Springer.

Ladokhin, Sergiy. 2009. “Volatility Modeling in Financial Markets.” VU University Amsterdam, July.

Mark A. Pinsky, Samuel Karlin. 2011. An Introduction to Stochastic Modeling. Elsevier Inc.

Prasad Chalasani, Somesh Jha. 1997. Stochastic Calculus and Finance.

Sauer, Timothy. 2012. Numerical Analysis. Pearson Education, Inc.

Watkins, Thayer. “Derivation of the Black-Scholes Equation for Option Value.” San Jose State University Department of Economics.

Weishaus, Abraham. 2014. ASM Study Manual for Exam MFE-Financial Economics-9th ed, Seventh Printing NEW EDITION. Actuarial Study Material.

An Introduction to the Black-Scholes-Merton Equation

Yuanchu Dang

December, 2015