I’d like to acknowledge that I have faithfully, and to the best of my abilities, tried to abide by the norms Heriot-Watt University expects of its students. I have taken special care to list all of my references and claim none of the research as my own. The bibliography at the end of the paper should be sufficient to show the breadth of the research undertaken.
I’d also like to thank my colleagues Ms. Catherine McMaster, Mr. Muyang Wang and Mr. Alexandre Bryson for their invaluable help in explaining with clarity some of the mathematical concepts covered in this paper. For this I am indebted to them.
And lastly, I’d like to thank Prof. Johnston for this very practical assignment. Many concepts that I had run into while working on other assignments finally made sense as I set about applying them here. It was truly fun and a great note to end the semester on.
Ron Sam John
H00306135
The calibration of the Vasicek model can be carried out by maximum likelihood estimation. The likelihood function for the Vasicek model can be constructed from the normal density function given in [1]. This section derives the parameter estimators used within the algorithm. The normal density function is given by the equation:
\[\begin{equation} f(x;\theta) = \frac{1}{\sqrt{2 \pi \sigma^2}} \cdot exp(\frac{-1}{2}(\frac{x-\mu}{\sigma})^2) \end{equation}\]The exact solution of the Vasicek model is given by:
\[\begin{equation} r_t = \theta + (r_0-\theta)e^{-\kappa t} + \sigma \int_0^t e^{-\kappa(t-u)}dW_u \end{equation}\]The method of deriving equation (2) will be described briefly in a later section. From this expression we can calculate the conditional mean and variance; switching notation slightly (with \(x\) denoting the rate and \(\alpha\) the mean-reversion speed \(\kappa\)), they are expressed as:
\[\begin{equation} E_s[x_t] = \theta + (x_s - \theta) e^{-\alpha(t-s)} \end{equation}\] \[\begin{equation} Var_s[x_t] = \frac{\sigma^2}{2\alpha} (1 - e^{-2\alpha(t-s)}) \end{equation}\]Equations (3) and (4) can then be substituted into equation (1) to get the desired likelihood function. For simplicity’s sake, \(Var_s[x_t]\) will be written as \(V^2\). Thus we get the following:
\[\begin{equation} L = \prod_{i=1}^n \frac{1}{\sqrt{2\pi V^2}} \cdot exp(\frac{-1}{2}(\frac{x_i - \theta - (x_{i-1} - \theta)e^{-\alpha \delta t}}{V})^2) \end{equation}\]Now that we have the likelihood, we can take the log-likelihood to simplify things. Taking the log of equation (5) we get:
\[\begin{align} l &= log(L) \\ &= \Sigma^n_{i=1} log(\frac{1}{\sqrt{2\pi V^2}}) + \Sigma^n_{i=1} log[exp(\frac{-1}{2}(\frac{x_i - \theta - (x_{i-1} - \theta)e^{-\alpha \delta t}}{V})^2)] \\ &= -\frac{1}{2} \Sigma_{i=1}^n [log(2\pi) + log(V^2) + \frac{(x_i - b x_{i-1} - c)^2}{V^2}] \end{align}\]We are now faced with the challenge of estimating the parameters \(\theta\), \(b\) and \(V^2\), where \(b = e^{-\alpha \delta t}\) and \(c = \theta(1 - b)\). This should be as straightforward as carrying out the maximization of the log-likelihood above with respect to the specified parameters. The resultant derivations are given below.
Firstly, we set about determining \(V^2\)
\[\frac{\partial l}{\partial V^2} = -\frac{1}{2}\Sigma^n_{i=1}[\frac{1}{V^2} - \frac{(x_i - b x_{i-1} - c)^2}{V^4}] = 0\] Now isolating the \(V^2\) terms we get:
\[\frac{n}{V^2} = \frac{\Sigma^n_{i=1}(x_i - bx_{i-1} - c)^2}{V^4}\] Therefore, substituting \(c = \theta(1 - b)\), we finally get,
\[\begin{equation} V^2 = \frac{1}{n} \Sigma^n_{i=1}(x_i - bx_{i-1} - \theta(1 - b))^2 \end{equation}\]Secondly, we can determine \(\theta\) in like manner,
\[ \frac{\partial l}{\partial \theta} = -\frac{1}{2}\Sigma_{i=1}^n[\frac{1}{V^2} \cdot 2\{x_i - bx_{i-1} - \theta(1 - b)\}\{-(1-b)\}] = 0\] \[\Sigma^n_{i=1}(x_i - bx_{i-1}) = \theta\Sigma^n_{i=1} (1-b) = n\theta(1-b)\] And isolating \(\theta\) we finally get,
\[\begin{equation} \theta = \frac{\Sigma^n_{i=1}(x_i - bx_{i-1})}{n(1 - b)} \end{equation}\]And finally, we can work to estimate the \(b\) parameter. Carrying out similar steps, by taking the partial derivative of the log-likelihood with respect to \(b\) and equating it to \(0\), we get,
\[\frac{\partial l}{\partial b} = -\frac{1}{2}\Sigma^n_{i=1}\{0 + 0 + \frac{2}{V^2}(x_i - bx_{i-1} - c)(0 - x_{i-1})\} = 0\] \[\Sigma^n_{i=1} \frac{2}{V^2}(x_i - bx_{i-1} - c)(-x_{i-1}) = 0\] \[\Sigma^n _{i=1} (-x_ix_{i-1} + bx^2_{i-1} + c x_{i-1}) = 0\]
Isolating the \(b\) we get,
\[\begin{equation} b = \frac{\Sigma^n_{i=1} x_i x_{i-1} - \Sigma^n_{i=1}c x_{i-1}}{\Sigma^n_{i=1} x^2_{i-1}} \end{equation}\]We can now find a value for \(c\) in the above equation by taking the partial derivative of the log-likelihood function with respect to \(c\) and plugging the result back into the expression for \(b\) above.
\[\frac{\partial l}{\partial c} = -\frac{1}{2}\Sigma^n_{i=1}\{\frac{2}{V^2}(x_i - bx_{i-1} - c)(-1)\} = 0\] Isolating \(c\) we finally get, \[\begin{equation} c = \Sigma^n_{i=1}\frac{(x_i - bx_{i-1})}{n} \end{equation}\]Plugging this back into the expression for \(b\), we get,
\[b = \frac{\Sigma^n_{i=1}x_ix_{i-1}-\Sigma^n_{i=1}\frac{(x_i - bx_{i-1})}{n}\cdot \Sigma^n_{i=1}x_{i-1}}{\Sigma^n_{i=1}x^2_{i-1}}\] \[b[n\Sigma^n_{i=1}x_{i-1}^2 - (\Sigma^n_{i=1}x_{i-1})^2] = n\Sigma^n_{i=1}x_ix_{i-1} - \Sigma^n_{i=1}x_i\Sigma^n_{i=1}x_{i-1}\]
And we finally get,
\[\begin{equation} b = \frac{n\Sigma^n_{i=1}x_ix_{i-1} - \Sigma^n_{i=1}x_i\Sigma^n_{i=1}x_{i-1}}{n\Sigma^n_{i=1}x_{i-1}^2 - (\Sigma^n_{i=1}x_{i-1})^2} \end{equation}\]In the chunk of code given below, the likelihood estimation is carried out using the maximum likelihood estimator given in [1]:
\[\hat{\alpha} = (X^TX)^{-1}X^Tr\] where,
\[ X = {\left(\begin{array}{cccc} 1 & 1 & \cdots & 1\\ r_0 & r_1 & \cdots & r_{n-1} \end{array}\right)}^T \]
and,
\[r = (r_1,r_2,\cdots, r_n)^T\] This has been implemented in the function below. The algorithm calibrates the model with the included data as input and a time step of \(dt = 1\) (day).
#Chunk of code used to read in the LIBOR rates
data <- read.csv("D:/HWU MSc QFM/Semester 2/Advanced Derivative Pricing/Assign 3/data(2).csv", header = TRUE)
data <- data$D
data <- as.matrix(data)
#User-generated function for parameter calibration.
Vasicek_Params <- function(data, dt) {
  N <- length(data)
  ones <- rep(1, N - 1)
  x <- cbind(ones, data[1:(N-1)])
  #Maximum likelihood estimation via OLS regression of x_i on (1, x_{i-1})
  ols <- solve(t(x) %*% x) %*% (t(x) %*% data[2:N])
  resid <- data[2:N] - (x %*% ols)
  c <- ols[1]
  b <- ols[2]
  delta <- sd(resid)
  #Recovering the Vasicek parameters from the AR(1) coefficients
  alpha <- -(log(b)/dt)
  theta <- c/(1 - b)
  sigma <- delta/sqrt((b^2 - 1)*dt/(2 * log(b)))
  params <- c(alpha, theta, sigma)
  print(params)
}
The output of the algorithm gives us the desired parameters of the Vasicek model for the given data. The following snippet of code runs the function and outputs the parameter values: alpha, theta and sigma, respectively.
# Running function with input as the included data and dt = 1. Output
# yields the alpha, theta and sigma respectively
params <- Vasicek_Params(data, 1)
## [1] 0.05839170 3.08194128 0.05815179
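As a cross-check (not part of the assignment routine), the closed-form estimators for \(b\), \(\theta\) and \(V^2\) derived above can also be evaluated directly from the sums; a minimal sketch, assuming data and dt = 1 as in Vasicek_Params():
#Direct evaluation of the closed-form estimators derived above.
#Assumes data is the vector of rates read in earlier and dt = 1.
n <- length(data) - 1
x0 <- data[1:n]       #x_{i-1}
x1 <- data[2:(n + 1)] #x_i
b_hat <- (n * sum(x1 * x0) - sum(x1) * sum(x0)) / (n * sum(x0^2) - sum(x0)^2)
theta_hat <- sum(x1 - b_hat * x0) / (n * (1 - b_hat))
V2_hat <- mean((x1 - b_hat * x0 - theta_hat * (1 - b_hat))^2)
alpha_hat <- -log(b_hat) #since b = exp(-alpha * dt) and dt = 1
sigma_hat <- sqrt(2 * alpha_hat * V2_hat / (1 - b_hat^2)) #from V^2 = sigma^2 (1 - b^2)/(2 alpha)
c(alpha_hat, theta_hat, sigma_hat)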
Based on the parameters of the Vasicek model obtained from running the ordinary least squares regression, we can model the yield curve for the given rates in the question.
Looking at the case of, say, pricing a bond, we know that the price is given by the general expression:
\[\begin{equation} P(0,T) = E_Q[e^{-\int_0^Tr_s ds}] \end{equation}\]We can now focus on modelling the interest rate \(r_s\) in the above equation with the Vasicek model and subsequently calibrating it to get parameter values.
The Vasicek model of interest rates is given by the process below. We change the notation ever so slightly to be consistent with the underlying literature [6]:
\[\begin{equation} dr_t = \kappa(\theta - r_t)dt + \sigma dW_t \end{equation}\]\(\kappa\), \(\theta\) and \(\sigma\) are the parameters that we get from running the MLE estimation in Question 1. \(\kappa\) determines the speed at which mean reversion occurs, \(\theta\) is the long-run mean, i.e. the level around which mean reversion occurs, and lastly \(\sigma\) is the volatility. The exact solution of Equation (15) can be found using Ito’s Lemma and was given earlier as Equation (2).
We can therefore integrate it once more and use the result in the bond-pricing expectation above. We now have:
\[\begin{align} \int_0^T r_s ds &= \theta T + \int_0^T(r_0 - \theta) e^{-\kappa t}dt + \sigma \int_0^T\int_0^t e^{-\kappa(t-u)}dW_u\, dt \\ &= \theta T + \frac{1}{\kappa}(r_0 - \theta)(1 - e^{-\kappa T}) + \frac{\sigma}{\kappa}\int_0^T(1 - e^{-\kappa(T-u)})\, dW_u \end{align}\]We can then compute the conditional mean and variance of the above expression:
\[\begin{equation} E_Q[\int_0^Tr_s ds] = \theta T + \frac{1}{\kappa}(1-e^{-\kappa T})(r_0 - \theta) \end{equation}\] \[\begin{align} Var_Q[\int_0^T r_s ds] &= \int_0^T (\frac{\sigma}{\kappa}(1 - e^{-\kappa(T-u)}))^2 du\\ &= \frac{\sigma^2}{\kappa^2}(T - \frac{2}{\kappa}(1-e^{-\kappa T}) + \frac{1}{2\kappa}(1 - e^{-2\kappa T})) \end{align}\]Thereby, we can now model the yield curve by utilising the moment generating function, and we can thus write [6],
\[\begin{align} P(0,T) &= E_Q[e^{-\int_0^T r_s ds}]\\ &= exp\{-(\theta T + \frac{1}{\kappa}(1 - e^{-\kappa T})(r_0 - \theta)) + \frac{\sigma^2}{2 \kappa^2}(T - \frac{2}{\kappa}(1 - e^{-\kappa T}) + \frac{1}{2\kappa}(1 - e^{-2\kappa T}))\} \end{align}\]In general, Equation (22) can be simplified with the help of the following substitutions. For a bond at time \(t\) with maturity \(T\) we can write:
\[G(t,T) = \frac{1}{\kappa}(1 - e^{-\kappa(T-t)})\] \[H(t,T) = (\theta - \frac{\sigma^2}{2 \kappa^2})[G(t,T) - T + t] - \frac{\sigma^2}{4\kappa} G(t,T)^2\]
and thus we get,
\[P(t,T) = e^{H(t,T) - G(t,T)r_t}\] Implementing these equations in the code, the yield curve for each of the rates can be computed and plotted in Figure 1.
x_ti <- c(1.5,3,6) #Initial values of the short rate
T <- 1:220
#Setting up the equations for G and H to calculate the yield
G <- (1/params[1]) * (1 - exp(-params[1]*(T)))
H <- ((params[2] - (params[3]^2/(2*(params[1]^2)))) * (G - T)) -
  ((params[3]^2/(4*params[1])) * G^2)
#Initialising an empty matrix to store the results of the yield calculation:
#a 3 x 220 matrix containing the yields of the 3 rates in each row.
r_mat <- matrix(0, nrow = 3, ncol = length(T))
for(i in 1:3){
  for(j in 1:length(T)){
    P <- exp(H[j] - (G[j] * x_ti[i])) #Bond price for maturity T[j] and initial rate i
    r_short <- -(log(P)/T[j])         #Corresponding yield
    r_mat[i,j] <- r_short
  }
}
c1 <- max(r_mat)
c2 <- min(r_mat)
plot(x = T, y = r_mat[1,], type = "l", lwd = 2,
     xlab = "Time to Maturity", ylab = "Yield", ylim = c(c2,c1)) #1.5 in black
lines(x = T, y = r_mat[2,], type = "l", lwd = 2, col = 2) #3 in red
lines(x = T, y = r_mat[3,], type = "l", lwd = 2, col = 4) #6 in blue
Yield Curve Plot for given rates.
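For reuse at a general valuation date \(t\), the pricing formula \(P(t,T) = e^{H(t,T) - G(t,T)r_t}\) can be wrapped in a small helper; the sketch below is not part of the assignment code and assumes the calibrated parameters are passed in as kappa, theta and sigma.
#Sketch of a helper for the Vasicek zero-coupon bond price P(t, T).
#kappa, theta, sigma are the calibrated parameters; T_mat is the maturity date.
vasicek_P <- function(r_t, t, T_mat, kappa, theta, sigma) {
  G <- (1 / kappa) * (1 - exp(-kappa * (T_mat - t)))
  H <- (theta - sigma^2 / (2 * kappa^2)) * (G - (T_mat - t)) -
    (sigma^2 / (4 * kappa)) * G^2
  exp(H - G * r_t)
}
#Example: price and yield of a 120-day bond starting from a rate of 3
#P0 <- vasicek_P(3, 0, 120, params[1], params[2], params[3])
#-log(P0) / 120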
The essential idea of a Monte Carlo simulation is to generate multiple paths for the process and evaluate its expected value. That is, given a process \(X = (X_t)_{t \in [0,T]}\), we need to compute \(E[f(X)]\). Applying Monte Carlo simulation to the function and generating \(M\) independent paths, \(X^{(n,1)}\), \(X^{(n,2)}\), … , \(X^{(n,M)}\), by repeatedly solving the discretised function, we get [2]: \[E(f(X)) \approx \frac{1}{M}\Sigma^M_{j=1} f(X^{(n,j)})\] where \(M\) is the number of samples in the simulation, which in our case is \(M=5000\) for the standard Monte Carlo simulation. When we move on to using variance reduction with antithetic sampling, we will shrink the sample size down to \(M=2500\).
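In code, this estimator is simply an average of \(f\) evaluated over simulated paths. A minimal sketch is given below; simulate_path() is a hypothetical stand-in for any path generator returning the terminal value of one path.
#Generic Monte Carlo estimator of E[f(X)]: average f over M simulated paths.
#simulate_path() is a hypothetical stand-in returning one terminal value per call.
mc_mean <- function(M, simulate_path, f = identity) {
  mean(vapply(seq_len(M), function(j) f(simulate_path()), numeric(1)))
}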
Underlying any Monte Carlo simulation is a discretisation process that determines the paths. It therefore becomes important to define what form this process might take. For our analysis we will use the discretisation scheme described in [3], given by,
\[\begin{equation} x(t_i) = c + bx({t_{i-1}}) + \delta \epsilon(t_i) \end{equation}\]where, \[c = \theta(1 - e^{-\alpha\Delta t})\] \[b = e^{-\alpha \Delta t}\] \[\delta = \sigma\sqrt{(1-e^{-2\alpha \Delta t})/2\alpha}\] with \(\epsilon(t_i)\) being Gaussian white noise \(\sim N(0,1)\).
This is a formulation of an AR(1) process with \(0<b<1\) whenever \(\alpha>0\), which signifies that the AR(1) process is stationary and mean-reverting [3]. In implementing the scheme in the code, it is sufficient to use Equation (23) as the underlying process for the Monte Carlo simulation. The coefficients \(b\), \(c\) and \(\delta\) are obtained from the calibration carried out in Question 1, being the OLS coefficients underlying the likelihood estimation. As for the stochastic part of the process, it is sufficient to sample from a standard normal distribution.
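For reference, the coefficients of Equation (23) can also be computed directly from the calibrated \((\alpha, \theta, \sigma)\); a minimal sketch, assuming params from Question 1 holds (alpha, theta, sigma) and a time step of one day:
#Sketch: mapping (alpha, theta, sigma) to the AR(1) coefficients of Equation (23).
#Not the assignment code, which recovers c, b and delta from the OLS fit instead.
ar1_coeffs <- function(alpha, theta, sigma, dt = 1) {
  b <- exp(-alpha * dt)
  c_ <- theta * (1 - b)
  delta <- sigma * sqrt((1 - exp(-2 * alpha * dt)) / (2 * alpha))
  c(c = c_, b = b, delta = delta)
}
#ar1_coeffs(params[1], params[2], params[3])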
As part of our analysis, we will use a Monte Carlo simulation of 220 daily LIBOR rates to estimate the expected value of the LIBOR rate after 220 days and compare it with the actual data presented to us. The resulting errors can then be analysed for accuracy.
set.seed(500)
T <- 220
sd_ir <- sd(data)
mean_ir <- mean(data)
W <- rnorm(T)
dW <- cumsum(W)
t <- 1 #1 day window
#Calibrated Vasicek parameters from Question 1 (alpha, theta, sigma)
alpha <- params[1]
theta <- params[2]
sigma <- params[3]
N <- length(data)
ones <- rep(1, N - 1)
x <- cbind(ones, data[1:(N-1)])
#Maximum likelihood (OLS) estimation of the AR(1) coefficients c, b and delta
ols <- solve(t(x) %*% x) %*% (t(x) %*% data[2:N])
resid <- data[2:N] - (x %*% ols)
c <- ols[1]
b <- ols[2]
delta <- sd(resid)
N_1 <- 5000
r <- seq(0,0,length.out = T)
#Standard Monte Carlo
mat_MC <- matrix(0, nrow = N_1, ncol = T) #Initialising zero matrix
mat_MC[,1] <- data[1] #Setting initial data for all simulations
for(j in 1:N_1){
  r[1] <- data[1]
  for (i in 2:T){
    r[i] <- c + (b * r[i-1]) + (rnorm(1)*delta)
    mat_MC[j,i] <- r[i]
  }
}
matplot(t(mat_MC[1:100,]), col = 8, type = "l",
xlab = "Time Horizon", ylab = "Rates")
lines(x = 1:T, y = colMeans(mat_MC), col = 2, lwd = 2)
Standard Monte Carlo simulation of LIBOR rates for 220 time steps. The red line shows the average value of all the simulations in each time step. We can notice a slight upward drift in the expected value.
The expected value can be computed as the mean of the end points of all the simulations, as shown below.
#Expected value at time t = 220 after 5000 simulations.
mean_ir_MC <- mean(mat_MC[,T])
mean_ir_MC
## [1] 3.081743
Comparing it to the actual data at \(t = 220\), we observe a slight deviation of about 0.0248 in absolute terms (roughly 0.8%).
#The last data point in the original data.
data[221]
## [1] 3.10659
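For completeness, the deviation quoted above can be computed directly from the objects already defined:
#Absolute and relative deviation between the simulated mean and the observed rate
abs_dev <- data[221] - mean_ir_MC
rel_dev <- abs_dev / data[221]
c(abs_dev, rel_dev)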
This is not surprising, as elements of sampling error can creep in. But what we do see is an elegant and simple method for calculating expected values; it is no wonder that Monte Carlo simulations are in wide use in the finance industry. Another point to note is that, as the sample size increases, we should see the expected value converge to its true value. But this comes at a cost: the additional computational effort is the trade-off we make for better, more accurate results. In the next part we discuss how to circumvent this.
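One way to quantify this trade-off is the Monte Carlo standard error of the estimate, which shrinks only with the square root of the number of simulations; a short sketch using the simulation matrix above:
#Monte Carlo standard error of the terminal-rate estimate.
#Halving the error requires roughly four times as many simulations.
se_MC <- sd(mat_MC[, T]) / sqrt(nrow(mat_MC))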
One of the drawbacks of the Vasicek model is its non-zero probability of producing negative rates. While the chance of observing negative rates is quite low, the existence of the possibility throws a major wrench into pricing financial instruments. For this reason, many other models that preclude the possibility of negative rates have been developed; one of the more recognised is the Cox-Ingersoll-Ross (CIR) model. So the simplest solution would be to change the underlying model being used.
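For illustration only, a minimal Euler-type sketch of the CIR dynamics \(dr_t = \kappa(\theta - r_t)dt + \sigma\sqrt{r_t}dW_t\) is given below; the parameters would need to be re-calibrated for CIR, so the Vasicek estimates are reused purely as placeholders.
#Illustrative sketch of a CIR path (not calibrated; Vasicek parameters reused as placeholders).
#The max(r, 0) floor keeps the square-root diffusion term real.
simulate_CIR <- function(r0, kappa, theta, sigma, n_steps, dt = 1) {
  r <- numeric(n_steps)
  r[1] <- r0
  for (i in 2:n_steps) {
    r[i] <- r[i - 1] + kappa * (theta - r[i - 1]) * dt +
      sigma * sqrt(max(r[i - 1], 0)) * sqrt(dt) * rnorm(1)
  }
  r
}
#cir_path <- simulate_CIR(data[1], params[1], params[2], params[3], 220)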
We can now look at the case of using variance reduction techniques on the simulation. The whole reason for embarking on variance reduction is to improve computational efficiency. In the case of antithetic sampling, the method requires that, if sampling is done from a standard normal distribution [4], i.e. \(N(0,1)\), we utilise the pairs of samples given by \(Z\) and \(-Z\). Therefore, with half the number of simulations we obtain an equivalent number of samples, giving results within the desired limits of accuracy.
To illustrate this in Figure 3, the plot on the left shows modelled interest rates over 5000 simulations of 220 time steps, and the plot on the right shows the Monte Carlo simulation with a sample size of 2500 over 220 time steps. With the y-axis limits of the two graphs set equal, one can see that the variance of the paths in the simulation conducted with half the sample size is far smaller.
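The pairing itself is straightforward; a two-line sketch of how each draw \(Z\) is reused with its negative:
#Antithetic pairing of standard normal draws: each Z is reused as -Z,
#so 2500 draws yield 5000 negatively correlated samples.
Z <- rnorm(2500)
Z_pairs <- cbind(Z, -Z)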
#Monte Carlo with variance reduction using antithetic variates
N_2 <- 2500 #Modifying sample size to half the original
r_av_1 <- seq(0,0,length.out = T)
r_av_2 <- seq(0,0,length.out = T)
mat_MC_1 <- matrix(0, nrow = N_2, ncol = T)
mat_MC_1[,1] <- data[1]
sd_std <- seq(0,0,length.out = N_2)
sd_av <- seq(0,0,length.out = N_2)
for(j in 1:N_2){
  r_av_1[1] <- data[1]
  r_av_2[1] <- data[1]
  for (i in 2:T){
    Z <- rnorm(1) #Single draw shared by the antithetic pair (Z and -Z)
    r_av_1[i] <- c + (b * r_av_1[i-1]) + (Z*delta)
    r_av_2[i] <- c + (b * r_av_2[i-1]) - (Z*delta)
    mat_MC_1[j,i] <- (r_av_1[i] + r_av_2[i])/2
  }
  #Calculating the standard deviations for each simulated path
  sd_std[j] <- sd(r_av_1)
  sd_av[j] <- sd(mat_MC_1[j,])
}
par(mfrow = c(1,2))
matplot(t(mat_MC[1:100,]), type = "l", col = 8, ylim = c(2.4, 3.6)
,xlab = "Time Horizon", ylab = "Rates")
lines(x = 1:220, colMeans(mat_MC), col = 2, lwd = 2)
matplot(t(mat_MC_1[1:100,]), type = "l", col = 8, ylim = c(2.4, 3.6)
,xlab = "Time Horizon", ylab = "Rates")
lines(x = 1:220, colMeans(mat_MC_1), col = 2, lwd = 2)
The reduction in variance is apparent between the two graphs. On the left are the simulated paths with 5000 samples, and on the right are the simulated paths with variance reduction implemented.
This reduction in variance is further highlighted in the plot below, which shows the standard deviation of each simulated path without variance reduction (in red) and of each antithetic-averaged path with variance reduction (in blue). Only the first 500 standard deviations are plotted below for visual clarity (Figure 4).
plot(x = 1:500, y = sd_std[1:500], col = 2, type = "l", ylim = c(0.05,0.25),
     xlab = "Simulation Index (1-500)", ylab = "Standard Deviation")
lines(x = 1:500, y = sd_av[1:500], col = 4)
The two lines in the plot show the standard deviations of the simulated paths without variance reduction (red) and with variance reduction (blue). As one observes, the blue line is consistently below the red line, showing that the variance reduction is considerable.
Following the dictates of the question, the following plot shows one sample path from a Monte Carlo simulation over 220 time steps (black) compared against the plot of the original data (red). Although subtle, the mean reversion can be observed as the path of the Monte Carlo simulation weaves in and around the long-term mean (Figure 5).
set.seed(500)
MC_CV <- function(){
  N <- 5000
  #Preallocating the path vectors and setting the initial values
  r <- seq(0,0,length.out = T)
  r_1 <- seq(0,0,length.out = N)
  r[1] <- data[1]
  r_1[1] <- data[1]
  #Monte Carlo sim that generates a single path of 220 steps (standard MC)
  for(i in 2:(T+1)){
    r[i] <- c + (b * r[i-1]) + (rnorm(1)*delta)
  }
  plot(r, type = "l", ylim = c(2.6,3.6), xlab = "Time Horizon", ylab = "Rates") #Std MC path
  lines(x = 1:221, y = data, col = 2) #Actual data
  #Monte Carlo sim to generate 5000 samples of data
  for(i in 2:N){
    r_1[i] <- c + (b * r_1[i-1]) + (rnorm(1)*delta)
  }
  #List storing the two simulated series for later calibration
  list_of_vec <- vector("list", 2)
  list_of_vec[[1]] <- r   #Single path over 220 steps
  list_of_vec[[2]] <- r_1 #N = 5000 samples
  return(list_of_vec)
}
#Executes the user-generated function
results <- MC_CV()
A single Monte Carlo simulated path(black) compared to the movement of original LIBOR rates(red).
The snippet of code below runs the calibration, using the maximum likelihood estimation technique implemented in Question 1, to obtain the parameters of the Vasicek model for a single set of 220 data points generated by Monte Carlo simulation.
#Parameter calibration with 220 samples from the MC simulation
Vasicek_Params(results[[1]], 1)
## [1] 0.14612162 3.01316391 0.06201752
Comparing it with the parameters of the original data, one can see a large deviation from the expected figures. This is to be expected, as the sample size is quite small. Following the law of large numbers, it stands to reason that, once the sample size is increased, we should see the parameters converge toward the true values. The following snippet of code gives the result of the calibration for 5000 samples of data.
#Parameter calibration with 5000 samples from MC Simulation
Vasicek_Params(results[[2]], 1)
## [1] 0.05347091 3.07247847 0.05731334
Comparing this with the original parameters, obtained by running the calibration on the original data, we see that the sample size has a considerable influence on the accuracy of the results: the estimates from 5000 samples lie well within 10% of the original parameters.
#Parameter calibration with original data.
Vasicek_Params(data,1)
## [1] 0.05839170 3.08194128 0.05815179
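The claim that the 5000-sample estimates lie within 10% of the original parameters can be checked directly; a short sketch assuming the objects defined above (the function also prints the parameters as a side effect):
#Relative deviation of the 5000-sample calibration from the original-data calibration
orig <- Vasicek_Params(data, 1)
sim5k <- Vasicek_Params(results[[2]], 1)
abs(sim5k - orig) / orig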
The yield curve obtained from a given set of parameters is subject to variation: the yields can change considerably if the underlying model is substituted for another. The problem of accuracy in pricing bonds and other financial instruments therefore becomes an issue. This is why more sophisticated models, such as the Heath-Jarrow-Morton or Black-Karasinski models, have been developed to better capture market factors.
The elegant application of the law of large numbers in Question 4 is remarkable. The most obvious conclusion is that, as the sample size increases, the convergence to the expected values improves. This does come with the downside that better accuracy requires significantly more computational power, a drawback that institutions must be aware of when carrying out their analyses.
[1] J. Cerny, "Calibration of Interest Rates", WDS'12 Proceedings of Contributed Papers, Part I, pp. 25-30, Charles University, Faculty of Mathematics and Physics, Prague, Czech Republic, 2012.
[2] D. J. Higham, "An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations", SIAM Review, Vol. 43, No. 3, pp. 525-546, 2001.
[3] D. Brigo, A. Dalessandro, M. Neugebauer, F. Triki, "A Stochastic Processes Toolkit for Risk Management", 2007.
[4] P. Boyle, M. Broadie, P. Glasserman, "Monte Carlo Methods for Security Pricing", Journal of Economic Dynamics and Control.
[5] D. Brigo, F. Mercurio, Interest Rate Models: Theory and Practice, Springer Finance, 2001.
[6] Lecture Notes, Chapters 5 & 6.