Sampling distribution

If you followed last week's notes, you should have an object called BAC.stats in the workspace. You can get the previous notes from this link. The BAC.stats object holds the mean, standard deviation, skewness, kurtosis, maximum and minimum of the BAC log returns.

The code that creates it is

BAC <- read.csv("../Data/BAC.csv")
BAC$Date <- as.Date(BAC$Date, format = "%d/%m/%Y")
BAC <- BAC[,c(1, 7)]
colnames(BAC) <- c("Date", "Close")
BAC <- BAC[rev(rownames(BAC)),]
BACR <- diff(BAC$Close) / BAC$Close[-length(BAC$Close)] 
BACr <- diff(log(BAC$Close))
BAC <- cbind(BAC[-1,], BACR, BACr)
head(BAC)

##           Date    Close        BACR        BACr
## 167 2000-02-01 14.73385 -0.05032254 -0.05163287
## 166 2000-03-01 16.98035  0.15247242  0.14190956
## 165 2000-04-03 15.86722 -0.06555424 -0.06780170
## 164 2000-05-01 18.11670  0.14176910  0.13257891
## 163 2000-06-01 14.05218 -0.22435177 -0.25405617
## 162 2000-07-03 15.48191  0.10174426  0.09689461
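BACR holds the simple returns and BACr the log returns. The short sketch below shows how the two are related. It uses a hypothetical price vector called price (not the BAC data) so that it runs on its own.

```r
# Hypothetical prices, for illustration only (not the BAC data)
price <- c(100, 105, 99, 110)

R <- diff(price) / price[-length(price)]  # simple returns, as for BACR
r <- diff(log(price))                     # log returns, as for BACr

# The two are linked by r = log(1 + R)
all.equal(r, log(1 + R))

# Log returns add up over time: their sum is the log return for the whole period
sum(r)
log(price[length(price)] / price[1])
```

For small returns the two measures are close, which is why the BACR and BACr columns above are similar in most months and differ most in the large fall of June 2000.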

require(moments)
statNames <- c("mean", "std dev", "skewness", "kurtosis", "max", "min")
BAC.stats <- c(mean(BACr), sd(BACr), skewness(BACr), kurtosis(BACr), 
               max(BACr), min(BACr))
names(BAC.stats) <- statNames
round(BAC.stats, 4)

##     mean  std dev skewness kurtosis      max      min 
##  -0.0001   0.1323  -1.3077  11.8954   0.5489  -0.7607

With these statistics, the standard errors of the estimates are easy to calculate. For example, the standard error of the mean is

\[ SE_{\hat{\mu}} = \frac{s}{\sqrt{T}} \]

where $SE_{\hat{\mu}}$ is the standard error of the mean, $s$ is the estimated standard deviation and $T$ is the number of observations.

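The standard error and an approximate 95% confidence interval (the mean plus or minus two standard errors) can be calculated using the following

SEmeanBAC <- BAC.stats[2] / sqrt(length(BACr))  # std dev divided by sqrt(T)
lowerconfBAC <- BAC.stats[1] - 2 * SEmeanBAC    # mean minus two standard errors
upperConfBAC <- BAC.stats[1] + 2 * SEmeanBAC    # mean plus two standard errors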

Confidence intervals for the standard deviation, skewness and kurtosis can be built in the same way once the standard error of each statistic is known.
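As a rough sketch of what that could look like, the code below uses the usual large-sample approximations that hold when the observations are independent and normally distributed: $s/\sqrt{2T}$ for the standard deviation, $\sqrt{6/T}$ for skewness and $\sqrt{24/T}$ for kurtosis. These formulas are an assumption added here, not part of last week's notes. The series x is simulated as a stand-in for BACr, and the helper m is a hypothetical function that avoids the need for the moments package.

```r
set.seed(42)
x <- rnorm(200, mean = 0, sd = 0.1)  # simulated stand-in for BACr
nObs <- length(x)

# Hypothetical helper: k-th central moment
m <- function(x, k) mean((x - mean(x))^k)
skew <- m(x, 3) / m(x, 2)^1.5
kurt <- m(x, 4) / m(x, 2)^2          # equals 3 for a normal distribution

est <- c(sd = sd(x), skewness = skew, kurtosis = kurt)

# Approximate standard errors, valid only under normality (assumption)
SE <- c(sd = sd(x) / sqrt(2 * nObs),
        skewness = sqrt(6 / nObs),
        kurtosis = sqrt(24 / nObs))

round(cbind(estimate = est, lower = est - 2 * SE, upper = est + 2 * SE), 4)
```

With a kurtosis as high as the one reported for BACr, the normality assumption behind these formulas is doubtful, so the resulting intervals should be read with caution.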

Autocorrelation

The autocovariance or autocorrelation coefficients can be calculated with the acf function. Take a look at the documentation with ?acf.

acf(BACr)

The plot is drawn automatically (unless turned off with the argument plot = FALSE). The blue dashed lines are the confidence bounds for the estimated correlation coefficients. The first correlation, at lag 0, is the series with itself, which is of course 1. The function acf returns a list with several components, described under Value in the documentation. To find out what is available, type names(acf(BACr, plot = FALSE)). Each component can be accessed with the $ operator. To get the correlation coefficients for each of the lags

names(acf(BACr, plot = FALSE))

## [1] "acf"    "type"   "n.used" "lag"    "series" "snames"

acf(BACr, plot = FALSE)$acf

## , , 1
## 
##               [,1]
##  [1,]  1.000000000
##  [2,]  0.169952718
##  [3,]  0.042480048
##  [4,]  0.040962280
##  [5,] -0.150726957
##  [6,] -0.092506445
##  [7,] -0.192226651
##  [8,] -0.106231040
##  [9,]  0.221902901
## [10,] -0.044545803
## [11,]  0.030791710
## [12,]  0.113212097
## [13,] -0.030161143
## [14,] -0.032657273
## [15,] -0.046622596
## [16,] -0.071234876
## [17,]  0.036839168
## [18,] -0.077613283
## [19,]  0.024389485
## [20,]  0.038937806
## [21,] -0.001167009
## [22,]  0.089261313
## [23,] -0.020748271
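The coefficients can also be compared with the confidence bounds numerically rather than by eye. The sketch below assumes the default settings of the plot, where the dashed lines sit at plus and minus qnorm(0.975) divided by the square root of n.used. It runs on a simulated series x as a stand-in for BACr.

```r
set.seed(1)
x <- rnorm(200)                          # simulated stand-in for BACr

a <- acf(x, plot = FALSE)
bound <- qnorm(0.975) / sqrt(a$n.used)   # height of the dashed lines on the default plot

rho <- drop(a$acf)[-1]                   # coefficients without lag 0, which is always 1
lags <- drop(a$lag)[-1]

lags[abs(rho) > bound]                   # lags whose coefficient lies outside the 95% band
```

With 95% bounds, roughly one lag in twenty is expected to fall outside the band by chance even when the series has no autocorrelation.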

Bootstrap

If you want to know more about this method, see the finance application from Burns Statistics and Quick R.

n <- 1000 # number of bootstrap samples
MeanSample <- numeric(n)
for(i in 1:n){
  temp <- BACr[sample(length(BACr), length(BACr), replace = TRUE)]
  MeanSample[i] <- mean(temp, na.rm = TRUE)
}
hist(MeanSample)

quantile(MeanSample, 0.025)

##       2.5% 
## -0.0190047

quantile(MeanSample, 0.975)

##    97.5% 
## 0.020226

The code works as follows:
n sets the number of bootstrap samples. It can be raised to 100,000 if you are willing to wait.
Create a numeric vector called MeanSample of length n.
Run a loop from 1 to n. On each pass, a temporary vector called temp holds the random sample, which is created by drawing random integers from 1 to the length of BACr with replacement and using them as the index into BACr.
Calculate the mean of the sample and place the value in the $i^{th}$ position of MeanSample.
Plot a histogram of the calculated means.
Use the quantile function to find the 2.5% and 97.5% quantiles, which give the 95% confidence interval.
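The same steps can be written more compactly with replicate. This is only a sketch of an alternative, not a replacement for the loop. It uses a simulated series x as a stand-in for BACr, and set.seed is added so that the random draws can be repeated.

```r
set.seed(123)                          # makes the resampling reproducible
x <- rnorm(200, mean = 0, sd = 0.1)    # simulated stand-in for BACr

nBoot <- 1000                          # number of bootstrap samples
bootMeans <- replicate(nBoot, mean(sample(x, replace = TRUE)))

sd(bootMeans)                          # bootstrap standard error of the mean
sd(x) / sqrt(length(x))                # analytic standard error, for comparison
quantile(bootMeans, c(0.025, 0.975))   # both ends of the 95% interval in one call
```

The standard deviation of the bootstrap means should be close to the analytic standard error of the mean, which is a useful check that the resampling is doing what is intended.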

Rob Hayward

17 October 2015