The purpose of this dissertation is to analyse financial stock data, run various statistical analyses on the data and quantify uncertainty in the resulting estimates in ways that are pertinent to industry professionals. In addition, it will look at possible trends and inferences that can be drawn from the data. The analyses will include fitting various distributions to the data to ascertain the best fit; statistics such as the Akaike Information Criterion and the Bayesian Information Criterion will help us pinpoint this. The Value-at-Risk based on the fitted model will also be evaluated, and a series of bootstrap estimates will be calculated with the aim of finding confidence intervals for the various parameters of the data.
In recent times we have seen much fluctuation in the market. Since the onset of the financial crisis of 2007-2008, careful examination of financial trends and a holistic view of the market have become necessary. In addition to looking at the larger picture, the smaller pieces of the whole must also be subject to the greatest scrutiny, and this dissertation looks at a portion of those smaller pieces. Greater scrutiny of front-office operations, as well as newer and stricter regulations, means having to be constantly on top of the processes that lead to big decisions. These decisions may prove beneficial or fatal to the future of financial companies, which is why greater emphasis has been placed on quantitative accuracy while adhering to regulations and strengthening security measures.
Financial companies and corporations are tasked with wealth management, transfer and control operations. Therefore, many sophisticated methods are employed with the aim of regulating, controlling and forecasting expected losses/profits and other pertinent information. This project is interested in a specific subset of these methodologies: analysing the daily returns of a stock index in the marketplace. With the advent of information technology, data from various markets are now easily available; using the programming language R together with the quantmod package, obtaining them is straightforward.
The data under scrutiny for this project are stock data for General Motors (GM). The required dataset can easily be collated with the quantmod package in R, and all of the code used is detailed in the appendix. Stock data usually come as a table with multiple columns listing various numerics about the index. Our primary interest lies in the daily returns of the stock, so we are only interested in the "Adjusted Close Price"; we extract this by means of data wrangling and perform our analysis on this specific subset of the full dataset.
There are many reasons for choosing the adjusted close price. These prices are adjusted at the end of a trading day to stay in sync with the overall price flow. In addition, the data are usually transformed in some way to keep them parsimonious, which allows consistent methods of analysis to be carried out while maintaining the integrity of the results. In our specific case, we are interested in the difference of log prices (\(\Delta_t\)), i.e. \[\Delta_t = \log(price_t) - \log(price_{t-1})\]
Log prices come with the advantage that the change in prices over time can be interpreted as a percentage change in the stock price. Log prices also correspond to how certain models, such as the Black-Scholes model, describe prices over time.
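As a brief illustration, the log returns can be computed in R along the following lines (a minimal sketch using the quantmod package, as in the appendix; the object names here are illustrative):

```r
library(quantmod)

# Download GM data from Yahoo Finance and keep the adjusted close prices
getSymbols("GM", from = "2010-01-01")
adj.close <- na.omit(Ad(GM))

# Daily log returns: Delta_t = log(price_t) - log(price_{t-1})
log.returns <- na.omit(diff(log(adj.close)))
head(log.returns)
```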
As a first step it would do us good to analyse the time series plot of the daily adjusted close prices of the stock (Figure 1). This rudimentary graphical analysis gives an initial idea as to the performance of the stock over time, indicating increasing or decreasing prices.
Daily Adjusted Close Prices for General Motors.
A cursory look at the plot shows that the stock had a decreasing trend from 2010 to 2012, experiencing a considerable drop in value towards the end of 2013. Subsequently the price picks up and experiences frequent ups and downs or, in more rigorous terms, considerable volatility. Towards the present time we can see that the stock is on the rise and may prove a lucrative investment for interested investors.
A far more important quantity to analyse, though, is the change in the daily adjusted stock prices from day to day. The plot in Figure 2 shows exactly that: the day-to-day changes in log price move about a mean value.
Daily Adjusted Log Prices for General Motors.
The plot shows us an interesting phenomenon. Inspection reveals that there are sections of the plot that show clustering; that is, there are specific time periods in which volatility seems to aggregate. For example, in 2011 a large cluster of volatility is observed which then seemingly dies down. This is a clear indicator of the performance of the stock in the market, with prices fluctuating due to a number of market-related factors, although the details of what those factors might be are beyond the scope of this project. Similar clusters can be observed throughout the timeline.
Another interesting graphical analytic we can use is the histogram of the daily log returns, given in Figure 3. One interesting property of this plot is that the deviations in the log returns are roughly centred about \(0\).
Histogram of Log Returns
Also, a cursory glance at the plot seems to indicate excess kurtosis and slight skewness, although more rigorous statistical tests will have to be carried out to confirm this, since we cannot be certain of it at this point.
The ideal next step is to fit well-known distributions to the data to ascertain a good fit. Once that has been accomplished, we can refer to the parameters of the fitted distribution in order to understand the data. In our preliminary analysis we noted that the data seem to display excess kurtosis, i.e. the data exhibit fat tails. Before moving on to deeper analysis, let us posit that the data are normally distributed, \(\sim N(\mu,\sigma^2)\).
Let us take our null hypothesis to be \(H_0\): data \(\sim N(\mu, \sigma^2)\). The normal distribution is recognised as one of the cornerstone probability distributions in statistics. It is widely used in the natural and social sciences to model random variables whose parameters are unknown, and it is a continuous probability distribution. A random variable \(X\) is said to be normally distributed if its density is[1]: \[n(x;\mu,\sigma) = \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\]
for \(- \infty < x < \infty\) and \(\sigma > 0\).
From the probability density given above, it is clear that there are two parameters that define the distribution, namely the mean or expected value \(\mu\) and the standard deviation \(\sigma\). We can use the maximum likelihood technique to estimate these parameters: \[\hat\mu = \frac{1}{n}\sum_{i=1}^n x_i\] \[\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n(x_i - \hat\mu)^2\]
where \(n\) is the total number of datapoints in the set and \(x_i\) correspond to the datapoints themselves.
These are estimated to be:
| Mean | Std. Deviation |
|---|---|
| 0.0001428 | 0.0178529 |
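The table above uses the provided mle.normal helper; the same closed-form estimates can be reproduced directly (a minimal sketch, with log.returns as computed earlier):

```r
# Closed-form maximum likelihood estimates for the normal model
x <- as.numeric(log.returns)                # daily log returns as a plain vector
n <- length(x)

mu.hat    <- mean(x)                        # (1/n) * sum of x_i
sigma.hat <- sqrt(sum((x - mu.hat)^2) / n)  # note the divisor n, not n - 1

c(mean = mu.hat, sd = sigma.hat)
```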
Now that the MLE estimates are known, we can proceed with fitting the normal distribution over the histogram of the data to ascertain its fit. Figure 4 shows just that.
Fitting Normal Distribution with mean = 0.0001453 and std dev = 0.0178535, over histogram of Daily Adjusted Log Prices.
We can surmise that the normal distribution (magenta line) is a reasonable fit to the histogram, but we also observe that neither the peak of the histogram nor the tails are well covered by the normal. Hence our preliminary observation of excess kurtosis seems to hold. This can further be examined by plotting the QQ-plot for the data in Figure 5.
QQ plot for data fitted to Normal Distribution.
As expected, the QQ-plot reveals just how poorly the data fit the normal distribution. If the data were truly normally distributed, we would expect the sample quantiles to be more or less in line with the red line. Analysis of the tails shows that the points of the plot get steeper towards the left and the right. This indicates a fair degree of symmetry, but it also indicates fatter tails than the normal distribution. The blue dashed lines represent the 5% and 95% quantiles of the sample data: while roughly 90 percent of the data fall close to the red line, the points near the tails exhibit the greatest deviation. This is an important observation to note.
One final graphical analytic that can be used is the plot of the cumulative distribution function (CDF). By comparing the CDF of the data with that of a well-known distribution like the normal, the fit of the distribution to the data can be evaluated. This is displayed in Figure 6.
CDF Plot of data and Normal Distribution.
This plot clearly shows how poor a fit the normal distribution is to the data (the solid black line). In later sections we will deal with these issues by relying on distributions that account for fat tails.
More rigorous statistical tests will now be carried out to confirm our observation from the graphical tools that the normal distribution is a poor fit for the data. For this purpose we make use of the chi-squared goodness-of-fit test. It is also known as Pearson's chi-squared test and holds some benefits: the computation is fairly straightforward, the statistic converges rapidly to the chi-squared distribution as the counts increase, and the whole procedure is fairly intuitive.
In our case we split the range of the data into 100 equal-length cells into which the observations are classified. Subsequently, the \(\chi^2\) statistic can be calculated as[2]: \[\chi^2 = \sum\frac{(\text{observed} - \text{expected})^2}{\text{expected}}\]
The chi-squared test gives a \(\chi^2\) statistic of 277.4886 on 97 degrees of freedom and a p-value of effectively zero, which suggests that the assumption of normality can readily be rejected.
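The test itself was run with the provided chi2test.normal helper; a sketch of the same idea, binning the data into 100 equal-width cells and comparing observed and expected counts under the fitted normal, might look as follows (the exact binning and degrees-of-freedom convention of the helper may differ slightly):

```r
# Chi-squared goodness-of-fit check of the fitted normal (sketch)
nbin   <- 100
breaks <- seq(min(x), max(x), length.out = nbin + 1)
obs    <- table(cut(x, breaks, include.lowest = TRUE))      # observed counts

p.cell <- diff(pnorm(breaks, mean = mu.hat, sd = sigma.hat))
expct  <- length(x) * p.cell                                # expected counts

chi.sq <- sum((obs - expct)^2 / expct)
dof    <- nbin - 1 - 2                     # two parameters were estimated
p.val  <- pchisq(chi.sq, df = dof, lower.tail = FALSE)
c(statistic = chi.sq, df = dof, p.value = p.val)
```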
With the null hypothesis thoroughly rejected, we can posit the alternative hypothesis \(H_1\): data \(\nsim N(\mu, \sigma^2)\). We can make use of the Jarque-Bera test, which is based on statistics for the skewness and kurtosis, although other tests of normality could be used as well. The sample skewness and kurtosis coefficients are defined by the following formulae:
\[b = \frac{(1/n)\sum^n_{i=1}(X_i - \bar{X})^3}{((1/n)\sum^n_{i=1}(X_i - \bar{X})^2)^{3/2}}\] \[k = \frac{(1/n)\sum^n_{i=1}(X_i - \bar{X})^4}{((1/n)\sum^n_{i=1}(X_i - \bar{X})^2)^2}\]
With the Jarque-Bera statistic given by[3]:
\[T = \frac{1}{6}n(b^2 + \frac{1}{4}(k-3)^2)\]
We thereby obtain the following values of skewness, kurtosis, the test statistic and the corresponding p-value for the data under consideration.
| Skewness | Kurtosis | JB Statistic | P-value |
|---|---|---|---|
| 0.1046 | 7.09197 | 1538.195 | 0 |
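These quantities can be computed directly from the formulae above (a minimal sketch, with x the vector of log returns as before; packages such as tseries also provide jarque.bera.test):

```r
# Sample skewness b, kurtosis k and the Jarque-Bera statistic T
m2 <- mean((x - mean(x))^2)
b  <- mean((x - mean(x))^3) / m2^(3/2)     # skewness
k  <- mean((x - mean(x))^4) / m2^2         # kurtosis

Tstat <- length(x) / 6 * (b^2 + (k - 3)^2 / 4)
p.val <- pchisq(Tstat, df = 2, lower.tail = FALSE)   # asymptotically chi-squared(2)
c(skewness = b, kurtosis = k, JB = Tstat, p.value = p.val)
```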
Now that we have made conclusive remarks as to the non-normality of the data, it is time to see whether we will have better luck fitting a heavy-tailed distribution. It has been observed by others that the t-distribution is a good fit for daily returns, so we turn our attention to the t-distribution and the non-central t-distribution.
The Student's t-distribution was formulated by William Sealy Gosset while working for a brewery in Dublin and has become one of the most important distributions for statistical analysis. Its density function is given by[4]: \[f(t;\nu) = \frac{\Gamma(\frac{\nu + 1}{2})}{\sqrt{\pi\nu}\,\Gamma(\frac{\nu}{2})}\left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu +1}{2}}\]
where \(\nu\) is the degrees of freedom (df) and \(\Gamma(\cdot)\) refers to the gamma function, \[\Gamma(\alpha) = \int_0^\infty y^{\alpha-1} e^{-y}\,dy\] for all \(\alpha \in \mathbb{R}, \alpha > 0\).
The t-distribution can further be generalised by introducing location and scale parameters. Resorting to an alternative definition, a random variable \(T\) is said to have a standard t-distribution if it is of the form[5]: \[T = \frac{Z}{\sqrt{\chi^2_\nu/\nu}}\] where \(Z\) is a standard normal random variable, i.e. \(Z \sim N(0,1)\), and \(\chi^2_\nu\) is a random variable from a chi-squared distribution with \(\nu\) degrees of freedom.
A random variable \(X\) is said to have a non-standard or general t-distribution if it is of the form: \[X(t) = m + sT(t)\] where \(m\) and \(s\) are the aforementioned location and scale parameters respectively and \(T\) is a standard t random variable. It is this scaled variant of the t-distribution that we attempt to fit to the data. The moments of the scaled/general t-distribution with \(\nu\) degrees of freedom are given by: \[E[X(t)] = m\] \[Var[X(t)] = s^2 \frac{\nu}{\nu - 2}, \quad \text{provided } \nu>2\] \[Skewness[X(t)] = 0\] \[Kurtosis[X(t)] = \frac{3(\nu-2)}{\nu-4}, \quad \text{provided } \nu>4\]
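The metRology package used in the appendix provides this location-scale t directly; a small sketch of the density relation \(f_X(x) = \frac{1}{s}f_T\left(\frac{x-m}{s}\right)\) and of the moment formulae above, with purely illustrative parameter values:

```r
library(metRology)                    # provides dt.scaled() for the location-scale t

m <- 0; s <- 0.0128; nu <- 5          # illustrative values
x0 <- 0.01
dt.scaled(x0, df = nu, mean = m, sd = s)
dt((x0 - m) / s, df = nu) / s         # same value, computed by hand

# Moments of X = m + s*T (nu > 4 so that the kurtosis exists)
c(mean     = m,
  variance = s^2 * nu / (nu - 2),
  kurtosis = 3 * (nu - 2) / (nu - 4))
```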
Turning our attention to fitting the t-distribution to the data: first, an easy, albeit rudimentary, way to test this is an iterative process of fitting multiple distributions while tweaking the parameters. In our case we tweak the df parameter, taking the values 3, 5, 7, 9, 20 and 25, to see for which df the data seem best fitted (Figure 7). We can then be more rigorous in our analysis and optimise the parameters to obtain the best fit.
Fitting General T-Distribution to data.
We can perform a similar rudimentary analysis on the QQ-plots as well; this is shown in Figure 8. Looking at the QQ-plots, the quantiles for 5 degrees of freedom seem to show the most promise, as the points fall more or less on the line and a consistent linear relationship is observed. Of course, to form a solid conclusion we can optimise the fit using the likelihood of the distribution, and this next stage of the analysis is carried out now.
QQ plot for different df’s
The MLE estimates from optimising the parameters of the t-distribution are given in the table below. These estimates are calculated using the mle.t function provided to us, which uses the built-in R function optim for numerical optimisation to obtain the maximum likelihood estimates of the parameters.
Optimised fit of t-distribution to data with parameters mean = 0.000279, std dev = 0.0128352 and df = 3.945898
| Mean | Std. Deviation | df |
|---|---|---|
| 0.0002761 | 0.0128457 | 3.955347 |
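The mle.t helper was provided as part of the project; a sketch of the same idea, maximising the scaled-t log-likelihood numerically with optim (with x the vector of log returns and dt.scaled from metRology as in the earlier sketches; starting values here are illustrative):

```r
# Negative log-likelihood of the location-scale t-distribution
negll.t <- function(par, x) {
  m <- par[1]; s <- par[2]; nu <- par[3]
  if (s <= 0 || nu <= 0) return(Inf)        # keep the search in the valid region
  -sum(log(dt.scaled(x, df = nu, mean = m, sd = s)))
}

# Numerical optimisation starting from the sample moments
start <- c(mean(x), sd(x), 5)
fit   <- optim(start, negll.t, x = x)
fit$par                                     # MLE of (m, s, df)
```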
Subsequently, we can also analyse the QQ plot in Figure 10 to confirm the goodness of fit. As can be observed, the points in the plot show a considerable linear relationship and seem to lie on the \(x=y\) diagonal, which further confirms that the fit is quite good.
QQ Plot of optimised t-distribution to data.
Similarily we can also look at the CDF plots in Figure 11 for the data and observe how well it fits the data.
Optimised CDF plot of data and t-distribution.
The fit in Figure 9 seems quite good, capturing the peak of the histogram while staying close to the tails of the distribution.
And while the data do appear symmetric about the mean, this can be hard to gauge by merely observing the plot. From the Jarque-Bera test we have already observed a sample skewness of \(b = 0.10777\). While this may not be much, we can make use of the non-central t-distribution to try to obtain a better fit.
Similar to the definition of a t-distributed random variable, we can generate random variables from a non-central t-distribution by introducing a non-centrality parameter \(\delta\). The formulation is given by[6]: \[T(t) = \frac{Z(t) + \delta}{\sqrt{\chi^2_\nu/\nu}}\] where \(Z\) and \(\chi^2_\nu\) are as previously defined, while \(\delta \in \mathbb{R}\) is some constant.
The density function is given by[6]: \[f(t;\nu,\delta) = \frac{e^{-\delta^2/2}}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}\sum^\infty_{r = 0}\frac{(t\delta)^r}{r!\,\nu^{r/2}}\left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + r + 1}{2}} \cdot 2^{r/2}\,\Gamma\!\left(\frac{\nu + r +1}{2}\right)\]
Following the Hand-book on Statistical Distributions for experimentalists[6], the moments of the non-central t-distribution are given as: \[E[X(t)] = \sqrt{\frac{\nu}{2}} \frac{\Gamma(\frac{\nu-1}{2})}{\Gamma(\frac{\nu}{2})}\delta\] \[Var[X(t)] = \frac{\nu}{\nu-2}(1 + \delta^2)\] \[Skewness[X(t)] = \nu^{3/2}\frac{\sqrt2\,\Gamma(\frac{\nu-3}{2})}{4\Gamma(\frac{\nu}{2})}\delta(3 + \delta)\] \[Kurtosis[X(t)] = \frac{\nu^2}{(\nu-2)(\nu-4)}(\delta^4 + 6\delta^2 + 3)\] Now that the theory has been set down, let us carry out the fitting of the distribution to the data. As with the t-distribution, the mle.nct function that was provided will help us calculate the required estimates for the non-central t-distribution. The following plot shows just that, and the estimates are given in Table 4. The fit in Figure 12, at first glance, looks quite good now that the (slight) non-centrality has been accounted for.
Optimised fit of NCT to data with parameter values, mean = 0.00186, std dev = 0.0128, df = 3.92147 and ncp = -0.11271.
| Mean | Std. Deviation | Df | NCP |
|---|---|---|---|
| 0.0018786 | 0.0128095 | 3.969669 | -0.1159325 |
Further analysis of the QQ plot in Figure 13 below also supports our hypothesis that the NCT distribution is a good fit to the data; we observe only a slight deviation at the left end of the plot, suggesting that the data are flatter on this side. Otherwise, the fit is quite good.
QQ Plot of optimised NCT fit to data.
We can also plot the CDF of the data and the NCT to examine the fit. The plot in Figure 14 shows that the fit is quite similar to that of the t-distribution. While it may be difficult to ascertain from a graph which is better, we use better techniques in later sections to determine which theoretical distribution provides the best fit of all.
CDF plot of data and optimised NCT fit.
The purpose of fitting these well-known distributions to the data lies primarily in the convenience of knowing their properties. Such knowledge gives us an idea of, for example, the fatness of the tails, and for daily stock returns this is an important thing to know. We take a deeper look at the estimation of the Value-at-Risk in the next section; for this section we content ourselves with observing the fit of the distributions considered previously to the tails of the data. The following plot shows the upper and lower tails (of the CDF) and the corresponding fits of the distributions (Figure 15).
Checking the fit of the Normal (magenta), t (blue) and NCT (red) distributions to the tails of the data.
As can be observed, the normal distribution (magenta) is a poor fit to extreme values in the distribution. On the other hand, the t and non-central t-distributions track the data in the tails quite well. This is further supported by the Akaike Information Criterion (AIC) of the three distributions. The AIC is a statistic that helps us assess the fit of a model compared to other models. Suppose we have \(m\) models \(M_1, M_2, \cdots, M_m\) and that model \(j\) has \(k_j\) parameters denoted by \(\theta_j = (\theta_{j1}, \cdots,\theta_{jk_j})\), with likelihood function \(L_j(\theta_j;X)\). We should then choose the model that minimises[3]: \[AIC(M_j) = -2\ln L_j(\hat\theta_j;X) + 2k_j\] Here \(\hat\theta_j\) are the maximum likelihood estimates of the parameters \(\theta_j\). The number of parameters imposes a penalty on the log-likelihood for each parameter of the model under consideration. The parameter estimates of all three distributions and their respective AICs are tabulated below. It is clear, looking at the AIC, that it is smallest for the non-central t-distribution, which thus gives the best fit for the data under consideration.
Another criterion that can be used to compare the goodness of fit of distributions is the Bayesian Information Criterion (BIC), or Schwarz Criterion. This statistic too is based on the log-likelihood and is closely related to the AIC. Increasing the number of parameters of a distribution can increase the likelihood, but also runs the risk of overfitting[4]. Similar to the AIC, we should choose the model that minimises[8]: \[BIC(M_j) = -2\ln L_j(\hat\theta_j;X) + k_j \ln(n_j)\] where the log-likelihood is as described above and \(k_j\) is the number of parameters of model \(j\). The only difference is the factor \(\ln(n_j)\), where \(n_j\) is the number of data points for model \(j\), which penalises the estimate; this penalty is larger than the one incurred in the AIC.
| Mean | Sigma | DoF | NCP | AIC | BIC | |
|---|---|---|---|---|---|---|
| Normal | 0.00014 | 0.01785 | 0.00000 | 0.00000 | -11458.05 | -11440.97 |
| T | 0.00028 | 0.01285 | 3.95535 | 0.00000 | -11735.41 | -11712.63 |
| NCT | 0.00188 | 0.01281 | 3.96967 | -0.11593 | -11736.06 | -11718.97 |
Looking at the estimates of the AIC and BIC, we can thereby conclude that the non-central t-distribution is the best fitting model, since it attains the lowest values of both criteria.
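As a small sketch of how these criteria are obtained from a maximised log-likelihood (the appendix uses the provided ll.t and ll.nct helpers; here the fitted object fit from the earlier optim sketch is reused):

```r
# AIC and BIC from a maximised log-likelihood
ic <- function(loglik, k, n) {
  c(AIC = -2 * loglik + 2 * k,
    BIC = -2 * loglik + k * log(n))
}

loglik.t <- -fit$value          # optim minimised the negative log-likelihood
ic(loglik.t, k = 3, n = length(x))
```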
It was alluded to in the earlier section that finding distributions which accurately fit the tails, or extreme values, of the data can play an important role in the accuracy of our analysis. As a simple example, we can apply this to determining the Value-at-Risk (VaR) of the stock. Mathematically, the VaR is defined for some confidence level \(\alpha\in(0,1)\) as[3]: \[VaR_\alpha(L) = \inf\{l\in\mathbb{R}:P(L>l)\leq1-\alpha\}=\inf\{l\in\mathbb{R}:F_L(l)\geq\alpha\}\] That is to say, \(VaR_\alpha\) is the \(\alpha\)-quantile of the loss distribution \(F_L\). For time series data (like the data we are using), the definition varies slightly. For log-returns \(R_t = \log(X_{t+1}) - \log(X_t)\), \(t = 1,2,\cdots,n\), the \(VaR_\alpha\) under a normal distribution with mean \(\mu\) and variance \(\sigma^2\) is: \[VaR_\alpha = \mu + \sigma\Phi^{-1}(\alpha)\] where \(\Phi^{-1}(\alpha)\) is the \(\alpha\)-quantile of the standardised log-returns, i.e. of \[Z(t) = \frac{R_t - \mu}{\sigma}\] Similarly, for the general t-distribution with location and scale parameters \(m\) and \(s\) respectively (whose estimates were given in the previous section), suppose our returns \(R_t\) are such that \((R_t - m)/s\) has a standard t-distribution with \(\nu\) degrees of freedom; the VaR is then given by: \[VaR_\alpha = m + s\,t^{-1}_\nu(\alpha)\]

This is an example of parametric estimation of the VaR. In the preceding sections we obtained fits and parameter estimates for three different distributions. With this knowledge in hand, we can proceed with calculating the VaRs, displayed in the table below. One interesting point to note is that the VaR from the normal distribution considerably overestimates the loss. While this may not necessarily be a point of contention, when financial institutions deal with large amounts of capital (in the millions or billions), more accurate estimates go a long way towards things like capital allocation. We have already ascertained that the NCT provides the best fit; therefore, the VaR from the NCT should be the most reliable, at least in practice.
| Normal | General t | NCT |
|---|---|---|
| -0.0432226 | -0.0288858 | -0.0278929 |
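These figures are produced in the appendix; as a minimal sketch, the parametric formulae above can be applied directly to the fitted normal and scaled-t models at the 5% level (the 5% return quantile reported in the table), using qt.scaled from metRology and the objects mu.hat, sigma.hat and fit from the earlier sketches. Small numerical differences from the table are to be expected, since the appendix follows a slightly different route:

```r
# Parametric VaR: 5% quantile of the fitted return distributions
alpha <- 0.05

VaR.norm <- mu.hat + sigma.hat * qnorm(alpha)            # mu + sigma * Phi^{-1}(alpha)
VaR.t    <- qt.scaled(alpha, df = fit$par[3],
                      mean = fit$par[1], sd = fit$par[2]) # m + s * t_nu^{-1}(alpha)
c(normal = VaR.norm, student.t = VaR.t)
```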
Bootstrapping is a computer-based resampling method applied to data to aid in estimating measures of accuracy for statistical quantities. In this section we carry out this same idea for the estimation of the following parameters: the mean, variance, 5% quantile, 95% quantile, coefficient of variation (CV) and median. Suppose we have data points denoted by \(x = (x_1,x_2,\cdots,x_n)\) and we compute a statistic of interest, say \(s(x)\); in our scenario this statistic corresponds to the parameters just mentioned. A bootstrap sample can be obtained by sampling \(n\) times from the original data points, with replacement.
In our case, the original data points correspond to the daily adjusted log returns, and the sample size \(n\) is the number of daily adjusted log returns in the original dataset. This procedure is repeated for bootstrap iteration counts of 100, 200, 300, 400, 500, 750, 1000 and 10000. Relying on the well-known central limit theorem for a preliminary analysis, the accuracy of the estimates should theoretically be greater for larger numbers of iterations; we examine this in our analysis. It should be mentioned that the type of bootstrapping being performed here is the non-parametric bootstrap. It is called so because we do not consider the underlying distribution of the data points, but merely sample from them with replacement as mentioned above, then look at the distribution of the estimates and ascertain what information we may.
The bootstrap algorithm essentially generates a large number of independent bootstrap samples \(x^{*1}, x^{*2},\cdots,x^{*B}\), each of size \(n\) (the total number of data points in our dataset). Corresponding to each bootstrap sample is a bootstrap replication of \(s(x)\). For our purposes, the numbers of bootstrap iterations are \(B = 100, 200, 300, 400, 500, 750, 1000, 10000\). If the statistic under consideration is the mean, then \(s(x^*)\) is the mean of the bootstrap sample under consideration. When estimating these bootstrap replicates, it is useful to gauge the variation that might be observed from sample estimate to sample estimate. For this we can make use of the standard error, which is the standard deviation of the bootstrap replications, given by[9]: \[\hat{se}_{boot} = \left\{\frac{\sum^B_{b=1}[s(x^{*b})-s(\cdot)]^2}{B-1}\right\}^{1/2}\] where \(s(\cdot) = \sum^B_{b=1}s(x^{*b})/B\)
Standard errors are a simple yet intuitive measure of statistical accuracy. More sophisticated measures can be formulated, but for our purposes the simple standard error will give us insight into how the accuracy of the estimates varies across differing bootstrap iterations.
Another useful measure of statistical accuracy concerns the bias. The bias can be defined as the simple difference between the expectation of an estimator \(\hat\theta\) and the actual quantity \(\theta\), i.e. \[bias = E(\hat\theta) - \theta\] Intuitively, we can guess that a large bias is undesirable, as it may indicate an unusually large deviation, thereby leading us to conclude that our algorithm may need to be adjusted. We should also expect some bias, because inherent in the bootstrap is the assumption of randomness in the samples; greater accuracy of the estimated bias is achieved asymptotically. The plots of the bias in Figures 17-22 for varying bootstrap iterations tend to show some trend, converging to a final value which need not strictly be \(0\).
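A minimal sketch of the bootstrap standard error and bias for a single statistic (the mean) and a single value of \(B\), with x the vector of log returns as before; the appendix loops over all six statistics and all the iteration counts listed above:

```r
# Non-parametric bootstrap of the mean: standard error and bias
set.seed(1)                           # for reproducibility of the sketch
B <- 1000
boot.means <- replicate(B, mean(sample(x, size = length(x), replace = TRUE)))

se.boot   <- sd(boot.means)                 # standard deviation of the replicates
bias.boot <- mean(boot.means) - mean(x)     # mean of replicates minus sample estimate
c(std.error = se.boot, bias = bias.boot)
```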
The plot in Figure 16 shows the variation of the estimates, i.e. \(s(x^*)\), for each of the parameters under consideration for the different bootstrap iterations, while the horizontal red line indicates the point estimates of the parameters calculated from the data.
Parameter estimates from bootstrapping with the horizontal red line showing the sample point estimates.
While no significant trend can be observed for the variance, 5% quantile and median (though a mild tendency towards the true value can be inferred), the mean shows a clear trend towards the point estimate. The coefficient of variation shows a great deal of fluctuation. This is to be expected: following its definition as the ratio of the standard deviation to the mean, any minor change in the denominator will affect the value, and since our values of the mean are quite small and centred around \(0\), this only amplifies the fluctuations of the coefficient of variation. The following table shows the parameter estimates for each bootstrap iteration count, along with the sample point estimates.
| Mean | Variance | 5% quantile | 95% quantile | CV | Median | |
|---|---|---|---|---|---|---|
| 100 | 0.0001191 | 0.0003200 | -0.0287562 | 0.0278674 | 199.20571 | 0.0005396 |
| 200 | 0.0001816 | 0.0003204 | -0.0287878 | 0.0277303 | -41.92483 | 0.0005658 |
| 300 | 0.0001553 | 0.0003186 | -0.0289706 | 0.0277512 | 35.70358 | 0.0005774 |
| 400 | 0.0001322 | 0.0003181 | -0.0289274 | 0.0277619 | 76.92921 | 0.0005252 |
| 500 | 0.0001368 | 0.0003193 | -0.0288606 | 0.0277998 | -51.72112 | 0.0005659 |
| 750 | 0.0001185 | 0.0003178 | -0.0288440 | 0.0277771 | 15.41792 | 0.0005884 |
| 1000 | 0.0001438 | 0.0003182 | -0.0288680 | 0.0277962 | 73.56596 | 0.0005598 |
| 10000 | 0.0001375 | 0.0003187 | -0.0288520 | 0.0277304 | 111.84316 | 0.0005701 |
| Sample estimate | 0.0001428 | 0.0003189 | -0.0290586 | 0.0280242 | 125.02587 | 0.0005807 |
The following table shows us the variation of the bias with the different bootstrap sample sizes.
| Mean | Variance | 5% quantile | 95% quantile | CV | Median | |
|---|---|---|---|---|---|---|
| Bias 100 | 2.38e-05 | -1.1e-06 | -0.0003024 | 0.0001568 | -74.17983 | 4.11e-05 |
| Bias 200 | -3.87e-05 | -1.5e-06 | -0.0002708 | 0.0002939 | 166.95070 | 1.49e-05 |
| Bias 300 | -1.24e-05 | 3.0e-07 | -0.0000880 | 0.0002731 | 89.32229 | 3.20e-06 |
| Bias 400 | 1.06e-05 | 8.0e-07 | -0.0001312 | 0.0002623 | 48.09667 | 5.55e-05 |
| Bias 500 | 6.00e-06 | -4.0e-07 | -0.0001980 | 0.0002244 | 176.74700 | 1.47e-05 |
| Bias 750 | 2.43e-05 | 1.1e-06 | -0.0002146 | 0.0002472 | 109.60795 | -7.70e-06 |
| Bias 1000 | -9.00e-07 | 7.0e-07 | -0.0001906 | 0.0002281 | 51.45992 | 2.08e-05 |
| Bias 10000 | 5.30e-06 | 2.0e-07 | -0.0002066 | 0.0002938 | 13.18271 | 1.06e-05 |
Similarly, the table below shows us the variation of the standard errors.
| Mean | Variance | 5% quantile | 95% quantile | CV | Median | |
|---|---|---|---|---|---|---|
| Std Err 100 | 2.4e-06 | 1e-07 | 3.04e-05 | 1.58e-05 | 7.4553539 | 4.1e-06 |
| Std Err 200 | 2.7e-06 | 1e-07 | 1.92e-05 | 2.08e-05 | 11.8348216 | 1.1e-06 |
| Std Err 300 | 7.0e-07 | 0e+00 | 5.10e-06 | 1.58e-05 | 5.1656414 | 2.0e-07 |
| Std Err 400 | 5.0e-07 | 0e+00 | 6.60e-06 | 1.31e-05 | 2.4078450 | 2.8e-06 |
| Std Err 500 | 3.0e-07 | 0e+00 | 8.90e-06 | 1.00e-05 | 7.9122822 | 7.0e-07 |
| Std Err 750 | 9.0e-07 | 0e+00 | 7.80e-06 | 9.00e-06 | 4.0049875 | 3.0e-07 |
| Std Err 1000 | 0.0e+00 | 0e+00 | 6.00e-06 | 7.20e-06 | 1.6281197 | 7.0e-07 |
| Std Err 10000 | 1.0e-07 | 0e+00 | 2.10e-06 | 2.90e-06 | 0.1318337 | 1.0e-07 |
Standard error (left) and bias (right) of the mean estimate for the different bootstrap iteration counts.
Standard error (left) and bias (right) of the variance estimate for the different bootstrap iteration counts.
Standard error (left) and bias (right) of the 5% quantile estimate for the different bootstrap iteration counts.
Standard error (left) and bias (right) of the 95% quantile estimate for the different bootstrap iteration counts.
Standard error (left) and bias (right) of the coefficient of variation estimate for the different bootstrap iteration counts.
Standard error (left) and bias (right) of the median estimate for the different bootstrap iteration counts.
The series of plots in Figures 17-22 shows how the estimates of the standard error and bias change with the number of bootstrap iterations for each parameter estimate.
A trend towards \(0\) is clearly observed in the plots of the standard errors (on the left): as expected, the standard error tends to zero as the number of bootstrap iterations increases, indicating that the parameter estimates become more accurate asymptotically. The plots of the bias (on the right) are, however, a different story; they suggest that the bias converges to the true bias as the number of iterations increases. As mentioned previously, sampling error plays a role in the estimation of the bias, which can be corrected for if an accurate estimate of the bias is known. These values are given in the bias table above.
Confidence intervals as a concept are reasonably straightforward. Suppose we have a parameter estimated from some data; it becomes important to have confidence in the estimation of said parameter. Confidence intervals help us gauge this by giving an idea of the possible deviations of the parameter, i.e. an interval estimate is always more practical than a point estimate. This can help make further inferences down the road.
We can similarly find confidence intervals for the parameters under consideration from the bootstrap samples. Relying on the asymptotic property that we get better accuracy with larger samples, we can use the quantile function in R to calculate confidence intervals. The theoretical confidence interval for a parameter \(\theta\) depends on the estimate \(\hat\theta\) and the estimated standard error \(\hat{se}\), and the usual 90% confidence interval for \(\theta\) is: \[CI = \hat\theta \pm 1.645\cdot \hat{se}\] To put it in more rigorous terms, suppose we randomly sample from a dataset of unknown distribution \(F\), i.e. \(F\rightarrow x = (x_1,x_2,...,x_n)\). We know from the previous section that there exists a bootstrap standard error \(\hat{se}\) for the parameter estimate \(\hat\theta\). Proceeding under the assumption of normality (which need not strictly hold) of the bootstrapped distribution, the estimates are distributed as[9]: \[\hat\theta \sim N(\theta,\hat{se}^2)\] or equivalently, \[\frac{\hat\theta-\theta}{\hat{se}}\sim N(0,1)\] With the above approximation we now have: \[Prob_F\left\{z^{(\alpha)}\leq\frac{\hat\theta-\theta}{\hat{se}}\leq z^{(1-\alpha)}\right\} = 1 - 2\alpha\] where \(z^{(\alpha)}\) indicates the \(100\cdot\alpha^{th}\) percentile of a \(N(0,1)\) distribution, and similarly for \(z^{(1 - \alpha)}\). In general, we obtain the confidence interval from the above relation as: \[[\hat\theta - z^{(1-\alpha)}\cdot\hat{se},\;\hat\theta - z^{(\alpha)}\cdot\hat{se}]\] This is the standard confidence interval with coverage probability equal to \((1-2\alpha)\), and it can be rewritten in its more familiar form: \[\hat\theta \pm z^{(1-\alpha)}\cdot\hat{se}\] The following table shows the parameter estimates along with their 90% confidence intervals.
| Mean | Variance | 5% quantile | 95% quantile | CV | Median | |
|---|---|---|---|---|---|---|
| Lower | -0.0004900 | 0.0002900 | -0.0307800 | 0.0252300 | -351.0523 | 0.0000000 |
| Upper | 0.0007700 | 0.0003500 | -0.0268200 | 0.0298600 | 366.6939 | 0.0011700 |
| Sample Estimate | 0.0001428 | 0.0003189 | -0.0290586 | 0.0280242 | 125.0259 | 0.0005807 |
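Continuing the bootstrap sketch for the mean from the earlier section (boot.means and se.boot as defined there), the standard normal-approximation interval and the percentile interval used for the table above can be obtained as follows:

```r
# 90% confidence intervals for the mean from the bootstrap replicates
theta.hat <- mean(x)

# Standard (normal-approximation) interval: theta.hat +/- 1.645 * se
c(lower = theta.hat - 1.645 * se.boot,
  upper = theta.hat + 1.645 * se.boot)

# Percentile interval taken directly from the replicates
quantile(boot.means, c(0.05, 0.95))
```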
Correlation is a statistic that can be used to measure relationships between data, and such analysis may provide inferences that help us better understand the data. In our analysis we are interested in the correlation coefficient, which ranges from \(-1\) to \(+1\). A correlation coefficient of \(-1\) suggests that the data pairs are perfectly negatively correlated, while a value of \(+1\) suggests perfect positive correlation. One important point to keep in mind is that just because data pairs are correlated does not mean that a change in one variable causes any meaningful change in the other; to put it more succinctly, correlation does not equal causation.
For this section we look at bivariate data on the stock indices of General Motors and Ford (Figure 23). Companies in the same sector may show similar reactions to market forces, so analysis of the correlation of this bivariate data can prove insightful. The following plot shows the time series of the adjusted close prices of both indices. As can be observed, from 2010 to mid-2015 similar patterns of high and low points can be seen, which suggests that the indices react similarly to market forces. It can also be observed that post mid-2015 the stock prices diverge, with Ford displaying an overall downward trend while General Motors displays an upward trend. Though not immediately apparent, even in the diverging trends similar pockets of volatility can be observed; this will be dealt with later. A final inference that can be made is that both indices appear to be on the rise in 2019.
Time series plots of Adjusted Close Prices of General Motors and Ford.
The plot in Figure 24 shows the daily log returns of both stock indices. Similar clusters of volatility can be observed over similar time periods (2011-2012).
Daily log prices of General Motors and Ford.
The correlation for our bivariate data, consisting of two vectors \(X_i\) and \(X_j\) (the daily log returns of General Motors and Ford, respectively), is given by[3]: \[\rho_{i,j} = \rho(X_i,X_j) = \frac{cov(X_i,X_j)}{\sqrt{var(X_i)var(X_j)}}\] where the ordinary pairwise covariance between \(X_i\) and \(X_j\), \(cov(X_i,X_j)\), is defined as \[\sigma_{ij} = cov(X_i,X_j) = E(X_iX_j) - E(X_i)E(X_j)\] The correlation coefficient as calculated by R is:
## [1] "Correlation coeff: 0.324733274495016"
This indicates that the two stocks are weakly positively correlated. An intuitive interpretation of this statistic is that if the price of either stock were to go up, it would be more likely than not that the other would go up as well.
For the next stage in the analysis, we ask the question: how would the correlation coefficient vary over a rolling time window? The plot below shows just that. For our purposes the rolling window is taken as \(25\) days. The choice of number is somewhat arbitrary, but it allows us to obtain a plot showing much more variation than if the window were larger, while if the window were too small, no meaningful inference might be obtained.
Plot of correlations over a rolling window of 25 days for Daily Close Prices(above) and Daily Log Prices(below).
The plot in Figure 25 shows the varying correlations for the adjusted close prices (top) and the daily log returns (bottom). At first glance we see that the correlations are relatively high and move towards \(+1\), although pockets of negative correlation are also present. This seems to imply that the performance of the indices is linked with market forces in such a manner that market stress pushes the correlation between the stocks towards \(1\)[3]. Although this is an interesting inference, formal hypotheses that account for relationships between higher volatilities and correlations must be carefully considered; as stated earlier, correlation does not equal causation.
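A rolling correlation of this kind is computed with the cor_wind helper given in the appendix; an alternative sketch, assuming the zoo package is available and adj.logR1 is the bivariate return series constructed in the appendix, is:

```r
library(zoo)

# 25-day rolling correlation between the GM and Ford daily log returns
wind <- 25
roll.cor <- rollapply(adj.logR1, width = wind,
                      FUN = function(z) cor(z[, 1], z[, 2]),
                      by.column = FALSE, align = "right")

plot(roll.cor, type = "l", ylab = "Correlation",
     main = "25-day rolling correlation of daily log returns")
```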
Similar to how we performed the bootstrap analysis on a number of statistics, we now perform a similar analysis on the correlation coefficient. We consider a large number of bootstrap iterations, \(B=10000\). The process, as outlined in an earlier section, comprises sampling with replacement from the bivariate data of daily close prices, with the length of each resampled dataset limited to the original number of data points. Subsequently, the correlation coefficient is calculated for each bootstrap sample. The plot in Figure 26 shows the distribution of these correlation coefficients with a fitted normal distribution (red line) plotted over it. Since the distribution of correlations appears to be well approximated by a normal distribution, the standard 90% confidence interval can safely be estimated from it.
Distribution of bootstrap estimates for correlation.
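A sketch of the resampling step described above, drawing paired rows of the bivariate price data with replacement and recomputing the correlation each time (port.prices.adj1 is the two-column price object built in the appendix):

```r
# Bootstrapping the correlation coefficient of the GM/Ford adjusted close prices
set.seed(1)
B <- 10000
prices <- as.matrix(port.prices.adj1)
n <- nrow(prices)

r.boot <- replicate(B, {
  idx <- sample(n, size = n, replace = TRUE)     # resample whole rows (paired data)
  cor(prices[idx, 1], prices[idx, 2])
})

quantile(r.boot, c(0.05, 0.95))                  # percentile-based 90% interval
```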
Another method we can use to formulate confidence intervals uses Fisher's transformation. This is defined as[10]: \[z' = \frac{1}{2}\ln\left(\frac{1+\rho}{1 - \rho}\right)\] The 90% confidence interval can then be calculated using: \[z' \pm 1.65\cdot\sigma_{z'}\] where \(\sigma_{z'} = 1/\sqrt{N-3}\), with \(N\) being the total number of data points. The upper and lower limits can then be converted back to the scale of the correlation coefficient using: \[\rho = \frac{e^{2z'} - 1}{e^{2z'}+1}\] Our estimated correlation coefficient from the earlier section was \(\rho = 0.329377937560887\). The 90% confidence intervals computed using the standard bootstrap method and using Fisher's z-transform are shown below; our estimate falls within both intervals.
| 5% | 95% | |
|---|---|---|
| Fisher CI | 0.2924548 | 0.3554085 |
| Bootstrap CI | 0.2843808 | 0.3629051 |
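As a sketch, Fisher's transformation can be applied to a single estimated correlation as follows (prices as in the bootstrap sketch above; the appendix instead applies the transform to every bootstrap replicate and averages, so the numbers differ slightly):

```r
# Fisher z-transform 90% confidence interval for the correlation coefficient
rho.hat <- cor(prices[, 1], prices[, 2])
N <- nrow(prices)

z    <- 0.5 * log((1 + rho.hat) / (1 - rho.hat))   # Fisher transform
se.z <- 1 / sqrt(N - 3)
z.ci <- z + c(-1, 1) * 1.65 * se.z                 # 90% interval on the z scale

(exp(2 * z.ci) - 1) / (exp(2 * z.ci) + 1)          # back-transform to the rho scale
```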
Throughout the different sections of this dissertation we have seen how important it is to represent our data accurately. The assumption of normality is not always the wise one; it can lead to severe underestimation or even overestimation. With regard to financial stock data, we can conclude that, since distributions of this nature exhibit excess kurtosis, i.e. they are highly peaked with fat tails, a better distribution to assume is the general t-distribution. Stock returns also tend to be symmetrically distributed about a mean value of \(0\). Pursuant to our analysis, we gauged that the non-central t-distribution was the best fit, accounting for the slight skewness that was observed; the AIC and BIC statistics helped in this determination.
Further, to answer the question of quantifying uncertainty in the financial data, bootstrap methodologies were employed. Bootstrapping, though computationally intensive, can help us estimate, with high confidence, the range of values a parameter estimate can take. This methodology is very useful because in most circumstances the true parameter values can never be predicted with absolute confidence; the estimates obtained from bootstrapping provide a better sense of what to expect.
Lastly, when evaluating the correlation coefficient, it is always safe to keep in mind the maxim that correlation does not imply causation. This is clearly observed in our analysis: though the stocks of GM and Ford overall exhibit weak positive correlation, the trends observed in the time series plots were clearly diverging. Rather, this information should be used alongside other business strategies. Finally, when it comes to evaluating uncertainty, the decisions that go into making the necessary choices should always be supported by sound mathematical and practical foundations. Other methods are available, for example Monte Carlo (MC) methods and Polynomial Chaos Expansion (PCE); at present a great deal of focus is on MC methods, especially by banks. All in all, the area of quantifying uncertainty is ripe for additional research and better methodologies, though the day when we can account for all uncertainty may never materialise.
[1] Irwin Miller & Marylees Miller, John E. Freund’s Mathematical Statistics with Applications, Eighth Edition, Pearson Education Limited (2014)
[2] I. P. Sprent, N.C. Smeeton, Applied Non-Parametric Statistical Methods, 3rd Edition, Chapman & Hall/CRC texts in statistical science series.
[3] Alexander J. McNeil, Rudiger Frey, Paul Embrechts, Quantitative Risk Management: Concepts, Techniques and Tools, Revised Edition, Princeton University Press (2015).
[4] Andrew R. Liddle, Information criteria for astrophysical model selection, Astronomy Centre, University of Sussex, *(https://arxiv.org/PS_cache/astro-ph/pdf/0701/0701113v2.pdf)* (2008)
[5] George Streftaris, Quantitative Risk Analysis, HWU lecture notes (2018).
[6] Christian Walck, Hand-book on Statistical Distributions for experimentalists, Particle Physics Group, Fysikum, University of Stockholm (2007)
[8] Gideon Schwarz, Estimating the Dimension of a Model, Annals of Statistics 6 (2): 461-464. doi:10.1214/aos/1176344136. MR468014 (1978)
[9] Bradley Efron, Robert J. Tibshirani, An Introduction to the Bootstrap, Chapman & Hall/CRC (1998)
[10] Anthony J. Bishara & James B. Hittner, Confidence intervals for correlations when data are not normal, Psychonomic Society, Inc, *(https://link.springer.com/content/pdf/10.3758%2Fs13428-016-0702-8.pdf)*
[11] Marco Taboga, Lectures on Probability Theory and Mathematical Statistics, 3rd Edition (2017)
[12] Estimation methods for Value at Risk, *(https://minerva.it.manchester.ac.uk/~saralees/chap14.pdf)*
[13] A.C. Davison, D.V. Hinkley, Bootstrap Methods and their Applications, Cambridge Series on Statistical and Probabilistic Mathematics, Cambridge University Press (1998)
[14] Michael R. Chernick, Bootstrap Methods: A Practitioner's Guide, Wiley Series in Probability and Statistics, John Wiley & Sons, Inc (1999)
library(quantmod)
library(dplyr)
library(tidyr)
library(metRology)
#Reading in data#########################
ticker <- "GM"
getSymbols(ticker, from = "2010-01-01")
options("getSymbols.yahoo.warning"=FALSE)
port.prices.adj <- na.omit(Ad(GM))
#Plotting Time series data for GM
plot(port.prices.adj, type = "l", xlab = "Time", ylab = "Adj Close Prices",
main = "Time Series plot of Adj Close Prices.")
#Plotting Daily log prices of GM
adj.logR <- na.omit(diff(log(port.prices.adj)))
plot(adj.logR, ylim = c(-0.2,0.2), type = "l",
xlab = "Time", ylab = "Daily Returns",
main = "Daily Log Returns", lwd = 1)
#Histogram plot of Daily log prices
hist(as.numeric(adj.logR), breaks = 60, col = 8, freq = F,
xlab = "Log Prices", ylab = "Density",
main = "Histogram plot of price fluctuations.") #Histogram plot (freq = F, so the y-axis is a density)
#Tabulating MLE estimates of Normal
mle_n <- mle.normal(as.numeric(adj.logR), pr = 0)
m <- mle_n[1]
s <- mle_n[2]
tmp <- cbind(m,s)
colnames(tmp) <- c("Mean", "Std. Deviation")
knitr::kable(tmp, caption = "MLE Estimates")
##Fitting Normal Distribution over histogram
tmp <- range(as.numeric(adj.logR))
x <- seq(tmp[1], tmp[2], by = 0.001)
lines(x = x, y = dnorm(x, mle_n[1], mle_n[2]), col = 6, lwd = 2.5)
legend("topright", legend = "Normal Distribution",
lwd = 2.5, lty = 1,
col = 6)
## QQ plot for Normal
qqnorm(as.numeric(adj.logR))
qqline(y = as.numeric(adj.logR), col = 2, lwd = 2)
abline(h = quantile(as.numeric(adj.logR), c(0.05,0.95)), col = 4, lty = 2)
## CDF Plot for Normal
tmp <- range(as.numeric(adj.logR))
x <- seq(tmp[1], tmp[2], by = 0.001)
cdfplot(as.numeric(adj.logR))
lines(x, pnorm(x, m, s), col = 6, lty = "dashed", lwd = 2)
legend("bottomright", legend = c("Data","Normal Distribution"),
lwd = 2.5, lty = c(1,2),
col = c(1,6))
title(main = "CDF Plot of Data and Normal Distribution.",
xlab = "Ordered data points",
ylab = "Probability")
# Test for Normality with Chi-squared test
cs <- chi2test.normal(as.numeric(adj.logR), nbin = 100)
#JB Test
# jb_test is assumed to be produced by a provided Jarque-Bera helper
# (returning the skewness, kurtosis, JB statistic and p-value)
colnames(jb_test) <- c("Skewness", "Kurtosis", "JB Statistic", "P-value")
knitr::kable(jb_test, caption = "Jarque-Bera Test estimates.")
## Rudimentary test of fitting t-dist with various df's
df_c <- c(3, 5, 7, 9, 20, 25) #vector of df's
y <- matrix(0, length(df_c), length(x))
for(i in 1:length(df_c)){
  y[i,] <- dt.scaled(x, df = df_c[i], mean = m, sd = s, ncp = 0)
}
hist(adj.logR, breaks = 50, col = 8, freq = F,
main = "Histogram Plot of Daily Log Returns",
xlab = "Prices",
ylab = "Density") #Histogram plot (freq = F, so the y-axis is a density)
matlines(x = x ,y = t(y), col = c(2:length(df_c)), lwd = 2.5,
lty = "dashed")#fitting t-distribution
legend("topright", legend = df_c, col = c(2:length(df_c)),
lty = "dashed", lwd = 2.5)
## QQ plots for various df's
n <- length(as.numeric(adj.logR))
q_grid = (1:n) / (n + 1)
par(mfrow = c(3,2))
for(df in df_c){
qqplot(as.numeric(adj.logR), qt(q_grid,df),
main = paste("df = ", df),
xlab = "Theoretical Quantiles", ylab = "Sample Quantiles" )
abline(lm(qt(c(0.25, 0.75), df = df) ~
quantile(as.numeric(adj.logR), c(0.25, 0.75))), col = 2, lwd = 2)
}
par(mfrow = c(1,1))
### Fitting T-distribution(optimised)
mle_t <- mle.t(as.numeric(adj.logR), pr = 0)
hist(adj.logR, breaks = 50, col = 8, freq = F,
main = paste("MLE ests. mu =", round(mle_t[1], digits = 5),
"sd =", round(mle_t[2], digits = 5),
"& df =", round(mle_t[3], digits = 5)), xlab = "Distribution")
lines(x = x, y = dt.scaled(x = x, df = mle_t[3], mean = mle_t[1],
sd = mle_t[2]), col = 4, lwd = 2.5)
legend("topright", legend = "T-Distribution",
lwd = 2.5, lty = 1,
col = 4)
#Tabulating t-dist estimates(optimised)
mle_t_tmp <- mle.t(as.numeric(adj.logR), pr = 0)
m <- mle_t_tmp[1]
s <- mle_t_tmp[2]
df <- mle_t_tmp[3]
tmp <- cbind(m,s,df)
colnames(tmp) <- c("Mean", "Std. Deviation", "Dof")
knitr::kable(tmp, caption = "MLE Estimates for T-distribution")
# QQ Plot of t-dist(optimised)
qqt(as.numeric(adj.logR), mle = 1)
abline(lm(quantile(as.numeric(adj.logR),
c(0.05, 0.95)) ~ qt(c(0.05, 0.95), df = mle_t[3])), col = 2, lwd = 2)
title( main = paste("T-Dist. QQ Plot with dof = ", round(mle_t[3], digits = 5)))
#CDF of t-dist(optimised)
cdfplot(as.numeric(adj.logR))
lines(x, pt((x-mle_t[1])/mle_t[2], df = mle_t[3]), col = 4,
lty = "dashed", lwd = 2)
legend("bottomright", legend = c("Data","T-Distribution"),
lwd = 2.5, lty = c(1,2),
col = c(1,4))
title(main = "CDF Plot of Data and Std T-Distribution.",
xlab = "Ordered data points",
ylab = "Probability")
## Non-central t-dist Fitting
mle_nct <- mle.nct(as.numeric(adj.logR), pr = 0)
hist(adj.logR, breaks = 50, col = 8, freq = F,
main = paste("MLE ests. mu =", round(mle_nct[1], digits = 5),
"sd =", round(mle_nct[2], digits = 5),
"df =", round(mle_nct[3], digits = 5),
"& ncp =", round(mle_nct[4], digits = 5)), xlab = "Distribution")
lines(x = x, y = dt.scaled(x = x, df = mle_nct[3], ncp = mle_nct[4],
mean = mle_nct[1], sd = mle_nct[2]), col = 2, lwd = 2.5)
legend("topright", legend = "NCT-Distribution",
lwd = 2.5, lty = 1, col = 2)
## Tabulating NCT Estimates
mle_nct_tmp <- mle.nct(as.numeric(adj.logR), pr = 0)
m <- mle_nct_tmp[1]
s <- mle_nct_tmp[2]
df <- mle_nct_tmp[3]
ncp <- mle_nct_tmp[4]
tmp <- cbind(m,s,df, ncp)
colnames(tmp) <- c("Mean", "Std. Deviation", "Dof", "NCP")
knitr::kable(tmp, caption = "MLE Estimates for NCT-distribution")
## QQplot of NCT
qqnct(as.numeric(adj.logR))
abline(lm(quantile(as.numeric(adj.logR), c(0.05, 0.95)) ~
qt.scaled(c(0.05,0.95), df = mle_nct[3],
ncp = mle_nct[4], mean = mle_nct[1], sd = mle_nct[2])),
col = 2, lwd = 2)
title( main = paste("NCT Dist. QQ Plot with dof = ", round(mle_nct[3], digits = 5)))
## CDF of NCT
cdfplot(as.numeric(adj.logR))
lines(x, pt((x-mle_nct[1])/mle_nct[2], df = mle_nct[3], ncp = mle_nct[4]), col = 2,
lty = "dashed", lwd = 2)
legend("bottomright", legend = c("Data","NCT-Distribution"),
lwd = 2.5, lty = c(1,2),
col = c(1,2))
title(main = "CDF Plot of Data and Std T-Distribution.", xlab = "Ordered data points",
ylab = "Probability")
# Comparing tails of the 3 fits(Normal, T and NCT)
tmp <-range(as.numeric(adj.logR))
par(mfrow = c(2,1))
##Lower tails
setaxes(x0 = tmp[1], x1 = quantile(as.numeric(adj.logR), 0.05), y0 = 0, y1 = 0.05)
cdfplot(as.numeric(adj.logR), a = 1)
lines(x, pnorm((x-mle_n[1])/mle_n[2]), col = 6,
lty = "dashed", lwd = 2)
lines(x, pt((x-mle_t[1])/mle_t[2], df = mle_t[3]), col = 2,
lty = "dashed", lwd = 2)
lines(x, pt((x-mle_nct[1])/mle_nct[2], df = mle_nct[3], ncp = mle_nct[4]), col = 4,
lty = "dashed", lwd = 2)
legend("topleft", legend = c("Data","NCT-Distribution", "T-Distribution", "Normal"),
lwd = 2.5, lty = c(1,2, 2,2),
col = c(1,2,4,6))
#Upper Tails
setaxes(x0 = quantile(as.numeric(adj.logR), 0.95), x1 = tmp[2], y0 = 0.95, y1 = 1)
cdfplot(as.numeric(adj.logR), a = 1)
lines(x, pnorm((x-mle_n[1])/mle_n[2]), col = 6,
lty = "dashed", lwd = 2)
lines(x, pt((x-mle_t[1])/mle_t[2], df = mle_t[3]), col = 2,
lty = "dashed", lwd = 2)
lines(x, pt((x-mle_nct[1])/mle_nct[2], df = mle_nct[3], ncp = mle_nct[4]), col = 4,
lty = "dashed", lwd = 2)
legend("bottomright", legend = c("Data","NCT-Distribution", "T-Distribution", "Normal"),
lwd = 2.5, lty = c(1,2, 2,2),
col = c(1,2,4,6))
par(mfrow = c(1,1))
#Tabulating MLE estimates and statistics for the
#different distributions.
param_fit <- matrix(0,3,4)
colnames(param_fit) <- c("Mean", "Sigma", "DoF", "NCP")
rownames(param_fit) <- c("Normal","T", "NCT")
param_fit[1,] <- round(c(mle_n[1], mle_n[2],0,0), digits = 5)
param_fit[2,] <- round(c(mle_t[1], mle_t[2], mle_t[3], 0), digits = 5)
param_fit[3,] <- round(c(mle_nct[1], mle_nct[2], mle_nct[3], mle_nct[4]), digits = 5 )
#Tabulating AIC values
AIC_n = 2 * -mle_n[3] + 2 * length(mle_n)
AIC_std1 = 2 * -ll.t(mle_t, as.numeric(adj.logR)) + 2 * length(mle_t)
AIC_ntd1 = 2 * -ll.nct(mle_nct, as.numeric(adj.logR)) + 2 * (length(mle_nct))
AIC_val <- matrix(0,1,3)
colnames(AIC_val) <- c("Normal","NCT", "T")
rownames(AIC_val) <- c("AIC")
#AIC_val[1,] <- c(AIC_ntd, AIC_std)
AIC_val[1,] <- c(AIC_n, AIC_ntd1, AIC_std1)
#BIC
BIC_n <- 2 * -mle_n[3] + length(mle_n)*log(length(as.numeric(adj.logR)))
BIC_t <- 2 * -ll.t(mle_t, as.numeric(adj.logR)) + length(mle_t) *
log(length(as.numeric(adj.logR)))
BIC_nct <- 2 * -ll.nct(mle_nct, as.numeric(adj.logR)) +
length(mle_nct) * log(length(as.numeric(adj.logR)))
BIC_val <- matrix(0,1,3)
colnames(BIC_val) <- c("Normal","NCT", "T")
rownames(BIC_val) <- c("BIC")
#AIC_val[1,] <- c(AIC_ntd, AIC_std)
BIC_val[1,] <- c(BIC_n, BIC_nct, BIC_t)
param_fit <- cbind(param_fit, t(AIC_val), t(BIC_val))
knitr::kable(param_fit, caption = "MLE Estimates, AIC and BIC.")
# Calculating the VaR for the different distributions
ret_tmp <- (as.numeric(adj.logR) - m)/s
mle_t_tmp <- mle.t(ret_tmp)
mle_nct_tmp <- mle.nct(ret_tmp)
var_tmp_n <- mean(ret_tmp) + (sd(ret_tmp) *
qnorm(0.05, mean(ret_tmp), sd(ret_tmp)))
var_tmp_t <- mle_t_tmp[1] +
(mle_t_tmp[2] * qt.scaled(0.05, df = mle_t_tmp[3],
mean = mle_t_tmp[1], sd = mle_t_tmp[2]))
var_tmp_nct <- mle_nct_tmp[1] +
(mle_nct_tmp[2] * qt.scaled(0.05, df = mle_nct_tmp[3],
ncp = mle_nct_tmp[4], mean = mle_nct_tmp[1],
sd = mle_nct_tmp[2]))
v_n <- (var_tmp_n * s) + m
v_t <- (var_tmp_t * s) + m
v_nct <- (var_tmp_nct * s) + m
VAR1 <- cbind(v_n,v_t,v_nct)
colnames(VAR1) <- c("Normal", "Std T", "NCT")
knitr::kable(VAR1, caption = "90% VaR for the fitted distributions.")
# Performing bootstraping on mean, variance, 5% and 95% quantiles,
# coefficient of variation and median
##Vector of bootstrap sample sizes
B <- c(100, 200, 300, 400, 500, 750, 1000,10000)
names.param <- c("Mean", "Variance", "5% quantile", "95% quantile", "CV", "Median")
# Initializing a list to be used to enlist matrix data
# for each bootstrap sample in B
MATS <- list()
a_tmp <- 1
for(tmp in B){
B.mat <- matrix(0, ncol = 6, nrow = tmp)
for (j in 1:6){
for(i in 1:nrow(B.mat)){
if(j == 1) B.mat[i,j] = mean(sample(as.numeric(adj.logR),
size=nrow(adj.logR), replace=TRUE))
if(j == 2) B.mat[i,j] = var(sample(as.numeric(adj.logR),
size=nrow(adj.logR), replace=TRUE))
if(j == 3) B.mat[i,j] = quantile(sample(as.numeric(adj.logR),
size=nrow(adj.logR), replace=TRUE), 0.05)
if(j == 4) B.mat[i,j] = quantile(sample(as.numeric(adj.logR),
size=nrow(adj.logR), replace=TRUE), 0.95)
if(j == 5){
# use a separate name so the outer loop variable 'tmp' is not overwritten
samp_tmp <- sample(as.numeric(adj.logR), size=nrow(adj.logR), replace=TRUE)
B.mat[i,j] = sd(samp_tmp)/mean(samp_tmp)
}
if(j == 6) B.mat[i,j] = median(sample(as.numeric(adj.logR),
size=nrow(adj.logR), replace=TRUE))
}
}
MATS[[a_tmp]] <- B.mat
a_tmp = a_tmp +1
}
# str(MATS)
# head(B.mat)
#Changing col names to better understand data
for(j in 1:length(B)){
colnames(MATS[[j]]) <- names.param
#head(MATS[[1]])
}
# Parameter estimates from Bootstrap for B samples
param.hat <- matrix(0, nrow = 6, ncol = length(B))
colnames(param.hat) <- B
rownames(param.hat) <- names.param
for(i in 1:length(B)){
param.hat[,i] <- apply(MATS[[i]], 2, FUN = mean)
}
# Actual parameter values
cv <- sd(as.numeric(adj.logR))/mean(as.numeric(adj.logR))
param.actual <- c(mean(as.numeric(adj.logR)), var(as.numeric(adj.logR)),
quantile(as.numeric(adj.logR), 0.05),
quantile(as.numeric(adj.logR), 0.95),
cv, median(as.numeric(adj.logR)))
# Plotting param estimates in param.hat with varying
# bootstrap iterations in B
plot.hat <- function(){
par(mfrow = c(2,3))
for(j in 1:5){
plot(param.hat[j,], type = "l",
ylim = c(range(param.hat[j,])[1]*0.975, range(param.hat[j,])[2]*1.055),
main = names.param[j], xlab = "Bootstrap Iterations", ylab = "Bootstrap values")
points(param.hat[j,], pch = 9)
abline(h = param.actual[j], col = 2)
}
plot(param.hat[6,], type = "l",
ylim = c(range(param.hat[6,])[1]*1.85, range(param.hat[6,])[2]-0.0005),
main = names.param[6], xlab = "Bootstrap Iterations", ylab = "Bootstrap values")
points(param.hat[6,], pch = 9)
abline(h = param.actual[6], col = 2)
par(mfrow = c(1,1))
}
plot.hat()
param.est <- cbind(param.hat, param.actual)
# Tabulating Parameter estimates after bootstrapping
knitr::kable(t(param.est), caption = "Values for parameter estimates")
# Calculating bias
bias.plot <- function(){
bias.est <- matrix(0, nrow = 6, ncol = length(B))
for(j in 1:length(B)){
bias.est[,j] <- t(param.est[,9] - param.est[,j])
}
rownames(bias.est) <- names.param; colnames(bias.est) <- paste("Bias",B);bias.est
#matplot(x = B, y = t(bias.est), type = "l", col = c(2,3,4,6))
#matpoints(x = B, y = t(bias.est), pch = 19, col = c(2,3,4,6))
bias.est
#a1_tmp <- numeric(4)
#for(j in 1:4) a1_tmp[j] <- min(bias.est[j,])
#a1_tmp
}
tmp_y <- bias.plot()
knitr::kable(t(tmp_y), caption = "Bias for parameter estimates")
# Calculating Standard Errors
names(MATS) <- B
std.err.plot <- function(){
# Calculating std error
std.err <- matrix(0,nrow = 6, ncol = length(B))
for(i in 1:6){
for(j in 1:length(B)){
std.err[i,j]<- sqrt((1/(B[j]-1)) * sum((param.est[i,9] - mean(MATS[[j]][,i]))^2))
# std.err[i,j] <- sd(MATS[[j]][,i])
}
}
colnames(std.err) <- paste("Std Err",B); rownames(std.err)<- names.param; std.err
#matplot(x = B, y = t(std.err), type = "l", col = c(2,3,4,6))
#matpoints(x = B, y = t(std.err), pch = 19, col = c(2,3,4,6))
std.err
#print(std.err)
#for(j in 1:4) print(min(std.err[j,]))
}
tmp_x <- std.err.plot()
knitr::kable(t(tmp_x), caption = "Std error for parameter estimates.")
#Bias and Std Error plots
## Bias and Std error of mean
par(mfrow = c(1,2))
plot(x = tmp_x[1,], type = "l", main = "Std Error of Mean.",
xlab = "Bootstrap Iterations", ylab = "Std. Error")
points(x = tmp_x[1,], pch = 19)
plot(x = tmp_y[1,], type = "l", main = "Bias of mean.",
xlab = "Bootstrap Iterations", ylab = "Bias")
points(x = tmp_y[1,], pch = 19)
par(mfrow = c(1,1))
par(mfrow = c(1,2))
## Bias and Std error of Variance
plot(x = tmp_x[2,], type = "l", main = "Std Error of variance.",
xlab = "Bootstrap Iterations", ylab = "Std. Error")
points(x = tmp_x[2,], pch = 19)
plot(x = tmp_y[2,], type = "l", main = "Bias of variance.",
xlab = "Bootstrap Iterations", ylab = "Bias")
points(x = tmp_y[2,], pch = 19)
par(mfrow = c(1,1))
par(mfrow = c(1,2))
## Bias and Std error of Quantile 5%
plot(x = tmp_x[3,], type = "l", main = "Std Error of 5% quantile.",
xlab = "Bootstrap Iterations", ylab = "Std Error.")
points(x = tmp_x[3,], pch = 19)
plot(x = tmp_y[3,], type = "l", main = "Bias of 5% quantile.",
xlab = "Bootstrap Iterations", ylab = "Bias")
points(x = tmp_y[3,], pch = 19)
par(mfrow = c(1,1))
par(mfrow = c(1,2))
## Bias and Std error of Quantile 95%
plot(x = tmp_x[4,], type = "l", main = "Std Error of 95% quantile.",
xlab = "Bootstrap Iterations", ylab = "Std. Error")
points(x = tmp_x[4,], pch = 19)
plot(x = tmp_y[4,], type = "l", main = "Bias of 95% quantile",
xlab = "Bootstrap Iterations", ylab = "Bias")
points(x = tmp_y[4,], pch = 19)
par(mfrow = c(1,1))
par(mfrow = c(1,2))
#Bias and Std error of CV
plot(x = tmp_x[5,], type = "l", main = "Std Error of CV.",
xlab = "Bootstrap Iterations", ylab = "Std. Error")
points(x = tmp_x[5,], pch = 19)
plot(x = tmp_y[5,], type = "l", main = "Bias of CV",
xlab = "Bootstrap Iterations", ylab = "Bias")
points(x = tmp_y[5,], pch = 19)
par(mfrow = c(1,2))
par(mfrow = c(1,2))
#Bias and Std error of median
plot(x = tmp_x[6,], type = "l", main = "Std Error of median.",
xlab = "Bootstrap Iterations", ylab = "Std. Error")
points(x = tmp_x[6,], pch = 19)
plot(x = tmp_y[6,], type = "l", main = "Bias of median",
xlab = "Bootstrap Iterations", ylab = "Bias")
points(x = tmp_y[6,], pch = 19)
par(mfrow = c(1,1))
# CI for parameters
# B=10000 appears to give least std error
# Tabulating estimates of all parameters
qua <- function(x) quantile(x, c(0.05,0.95))
CI.95 <- round(apply(MATS[[8]], 2, FUN = qua), digits = 5)
#round(CI.95, digits = 5)
tmp <- rbind(CI.95, param.actual)
knitr::kable(tmp, caption = "90% Confidence Intervals for
parameter estimates and actual estimates.")
# Analyzing Correlation between 2 stock indices
ticker1 <- c("GM", "F")
getSymbols(ticker1, from = "2010-01-01")
options("getSymbols.yahoo.warning" = FALSE)
port.prices.adj1 <- na.omit(cbind(GM[,6], F[,6]))
difflog <- function(x) na.omit(diff(log(x)))
adj.logR1<-as.ts(apply(port.prices.adj1, 2, difflog))
colnames(port.prices.adj1) <- ticker1
colnames(adj.logR1) <- ticker1
#head(port.prices.adj1)
# Time series plot of the Adj. Close prices of GM and Ford
y <- index(port.prices.adj1)
par(mfrow = c(2,1))
plot(port.prices.adj1[,1], main = "GM")
plot(port.prices.adj1[,2], main = "F")
par(mfrow = c(1,1))
# Daily log prices plot of GM and Ford
par(mfrow = c(2,1))
plot(na.omit(diff(log(port.prices.adj1[,1]))), main = "GM")
plot(na.omit(diff(log(port.prices.adj1[,2]))), main = "Ford")
par(mfrow = c(1,1))
# Calculating the Correlation coeff.
cr_port <- cor(as.matrix(port.prices.adj1))
print(paste("Correlation coeff:",cr_port[1,2]))
# Plotting variation of Correlation coeff. on a rolling window
# of 25 days
## Correlation window
cor_wind <- function(wind, x, y){
index_tmp1 <- length(index(x))
cr_tmp1 <- numeric(index_tmp1 - wind + 1)
#For adjusted close prices
for(i in 1:(length(cr_tmp1)-1)){
cr_tmp1[i] <- cor(x[i:(i+wind),])[1,2]
}
cr_tmp1 <- na.omit(cr_tmp1)
index_tmp2 <- length(index(y))
cr_tmp2 <- numeric(index_tmp2- wind + 1)
#For daily log prices
for(i in 1:(length(cr_tmp2)-1)){
cr_tmp2[i] <- cor(y[i:(i+wind),])[1,2]
}
cr_tmp2 <- na.omit(cr_tmp2)
#print(cat(length(cr_tmp1), length(cr_tmp2), index_tmp1, index_tmp2))
par(mfrow = c(2,1))
plot(x = cr_tmp1, type = "l", ylim = c(-1,1),
main = paste("Adj Close Price, Window (days) =", wind),
xlab = "No. of Windows", ylab = "Correlation", lwd = 2)
plot(cr_tmp2, type = "l", ylim = c(-1,1),
main = paste("Daily log prices, Window (days) =", wind),
xlab = "No. of Windows", ylab = "Correlation", lwd = 2)
par(mfrow = c(1,1))
}
cor_wind(25, port.prices.adj1, adj.logR1)
# Correlation Bootstrapping
B <- 10000
r.B <- numeric(B)
for (i in 1:B) {
boot.index.sample = sample_n(as.data.frame(port.prices.adj1),
length(as.numeric(adj.logR)),replace=T)
r.B[i] = cor(as.matrix(boot.index.sample))[2]
}
hist(r.B, col = 8, breaks = 35, prob = TRUE) # density scale so the fitted normal curve overlays correctly
lines(x = seq(0.25,0.40, by = 0.001),
y = dnorm(seq(0.25,0.40, by = 0.001), mean(r.B), sd(r.B)), col = 2, lwd = 2)
# carrying Fisher transformation for estimation of
# Confidence intervals of correlation coeff
## Fisher transform
w <- 0.5 * log((1+r.B)/(1-r.B))
# calculation of 90% CI
w_l <- w - (1.65/sqrt(nrow(port.prices.adj1) - 3))
w_u <- w + (1.65/sqrt(nrow(port.prices.adj1) - 3))
# Retranforming back to rho
w_l <- (exp(2*w_l) - 1)/(exp(2*w_l) + 1)
w_u <- (exp(2*w_u) - 1)/(exp(2*w_u) + 1)
ci.w <- cbind(w_l, w_u)
#Tabulating final estimates
ci.w_avg <- apply(ci.w,2,mean)
CI_r <- rbind(ci.w_avg, quantile(r.B, c(0.025,0.975)))
rownames(CI_r) <- c("Fisher CI", "Bootstrap CI")
colnames(CI_r) <- c("5%", "95%")
knitr::kable(CI_r, caption = "90% Confidence Intervals for correlation coefficient.")