require(fBasics)
require(data.table)
da1<-fread("d-3stocks9908.txt",header=T)
basicStats(da1[,2:4])
da2=log(da1[,2:4]+1)   # convert simple returns to log returns: r = log(1 + R)
basicStats(da2)
t.test(da2$axp)
One Sample t-test
data: da2$axp
t = -0.31555, df = 2514, p-value = 0.7524
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.0011134506 0.0008047664
sample estimates:
mean of x
-0.0001543421
Since the p-value > 5%, we cannot reject H0 at the 5% level; the data are consistent with a zero mean for the AXP log returns (failing to reject does not prove the mean is exactly zero).
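As a cross-check, the t-statistic can be reproduced from the summary statistics; a minimal sketch (assuming da2 is still in memory):
# Reproduce the one-sample t-test by hand: t = xbar / (s / sqrt(T))
x <- da2$axp
Tn <- length(x)                                # T = 2515 daily log returns
tstat <- mean(x) / (sd(x) / sqrt(Tn))
pval <- 2 * pt(-abs(tstat), df = Tn - 1)       # two-sided p-value
c(t = tstat, p.value = pval)                   # should match t.test(da2$axp)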
hist(da2$axp,nclass=40)
plot(density(da2$axp))
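To see the heavy tails against a Gaussian benchmark, a normal density with the same mean and standard deviation can be overlaid on the empirical density; a minimal sketch:
# Overlay a normal density with matching mean and sd (dashed line)
plot(density(da2$axp), main = "AXP daily log returns")
curve(dnorm(x, mean = mean(da2$axp), sd = sd(da2$axp)), add = TRUE, lty = 2)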
Method 1: Jarque-Bera test
normalTest(da2$axp,method="jb")
Title:
Jarque - Bera Normalality Test
Test Results:
STATISTIC:
X-squared: 4466.8422
P VALUE:
Asymptotic p Value: < 2.2e-16
Description:
Wed Jul 10 15:35:59 2019 by user: Pann
Since the p-value < 5%, we reject H0; the log returns of AXP do not follow a normal distribution.
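The JB statistic combines the skewness and excess-kurtosis tests, JB = T(S^2/6 + K^2/24); a sketch that reproduces it from sample moments (computed directly, so it does not depend on any package defaults, and may differ slightly from the printed value because of how the moments are normalized):
# Jarque-Bera statistic from sample moments
x <- da2$axp
Tn <- length(x)
m <- mean(x); s <- sd(x)
S <- mean((x - m)^3) / s^3                     # sample skewness
K <- mean((x - m)^4) / s^4 - 3                 # sample excess kurtosis
JB <- Tn * (S^2 / 6 + K^2 / 24)
c(JB = JB, p.value = pchisq(JB, df = 2, lower.tail = FALSE))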
Method 2: Test for symmetry
S1=skewness(da2$axp)/sqrt(6/2515)   # skewness z-statistic: S-hat / sqrt(6/T), T = 2515
print(S1)
[1] -6.892126
Since |S1| > Z_{1-5%/2} = 1.96, we reject H0 of zero skewness; the log returns of AXP are not symmetric and hence do not follow a normal distribution.
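Equivalently, the decision can be stated through a two-sided p-value for S1, which is asymptotically N(0,1) under H0:
2 * pnorm(-abs(S1))                            # two-sided p-value of the skewness test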
da3<-fread("m-gm3dx7508.txt",header=T)
basicStats(da3[,2:5])
da4=log(da3[,2:5]+1)   # convert simple returns to log returns: r = log(1 + R)
basicStats(da4)
t.test(da4$gm)
One Sample t-test
data: da4$gm
t = 0.23206, df = 407, p-value = 0.8166
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.008231636 0.010435276
sample estimates:
mean of x
0.00110182
Since the p-value > 5%, we cannot reject H0 at the 5% level; the data are consistent with a zero mean for the GM log returns.
hist(da4$gm,nclass=40)
plot(density(da4$gm))
Method 1: Jarque-Bera test
normalTest(da4$gm,method="jb")
Title:
Jarque - Bera Normalality Test
Test Results:
STATISTIC:
X-squared: 351.3549
P VALUE:
Asymptotic p Value: < 2.2e-16
Description:
Wed Jul 10 22:40:33 2019 by user: Pann
Since the p-value is less than 5%, we reject H0; the log returns of GM do not follow a normal distribution.
Method 2: Test for tail thickness
K1=(kurtosis(da4$gm)-3)/sqrt(24/408)   # excess-kurtosis (tail-thickness) z-statistic, T = 408
print(K1)
[1] 16.72041
Since K1 > Z_{1-5%/2} = 1.96, we reject H0 of zero excess kurtosis; the log returns of GM have heavy tails and do not follow a normal distribution.
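A two-sided p-value can be attached to the tail-thickness statistic in the same way. One caveat worth flagging: fBasics::kurtosis() returns excess kurtosis by default (method = "excess"), so the extra "-3" above is only appropriate when a raw (moment) kurtosis is used; the sketch below computes the excess kurtosis explicitly to avoid the ambiguity:
# Excess-kurtosis z-statistic from an explicit moment estimate
x <- da4$gm
Tn <- length(x)                                # T = 408 monthly log returns
Kex <- mean((x - mean(x))^4) / sd(x)^4 - 3     # sample excess kurtosis
K1chk <- Kex / sqrt(24 / Tn)                   # asymptotically N(0,1) under H0
c(K1chk = K1chk, p.value = 2 * pnorm(-abs(K1chk)))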
t.test(da3$vw)
One Sample t-test
data: da3$vw
t = 4.5341, df = 407, p-value = 7.619e-06
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
0.005731219 0.014504752
sample estimates:
mean of x
0.01011799
Since the p-value < 5%, we reject H0; the mean return of the value-weighted index is significantly different from zero.
S2=skewness(da3$vw)/sqrt(6/408)
print(S2)
[1] -6.146732
Since |S2| > Z_{1-5%/2} = 1.96, we reject H0; the returns of the value-weighted index do not follow a normal distribution.
K2=(kurtosis(da3$vw)-3)/sqrt(24/408)
print(K2)
[1] 11.10727
Since K2 > Z_{1-5%/2} = 1.96, we reject H0; the returns of the value-weighted index do not follow a normal distribution.
S3=skewness(da2$axp)/sqrt(6/2515)
print(S3)
[1] -6.892126
Since |S3| > Z_{1-5%/2} = 1.96, we reject H0; the skewness of the AXP log returns is significantly different from zero.
K3=(kurtosis(da2$axp)-3)/sqrt(24/2515)
print(K3)
[1] 66.47812
Since K3 > Z_{1-5%/2} = 1.96, we reject H0; the excess kurtosis of the AXP log returns is significantly different from zero.
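Since the same pair of tests is applied to several series, they can be wrapped in a small helper; a minimal sketch (the function name normTests is just illustrative):
# Skewness and excess-kurtosis z-tests for one return series
normTests <- function(x) {
  Tn <- length(x)
  m <- mean(x); s <- sd(x)
  S <- mean((x - m)^3) / s^3                   # sample skewness
  K <- mean((x - m)^4) / s^4 - 3               # sample excess kurtosis
  z <- c(skew = S / sqrt(6 / Tn), kurt = K / sqrt(24 / Tn))
  rbind(z = z, p.value = 2 * pnorm(-abs(z)))
}
normTests(da2$axp)                             # AXP daily log returns
normTests(da3$vw)                              # value-weighted index returns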
da5<-fread("d-exuseu.txt",header=T)
da6<-diff(log(da5$VALUE))   # daily log returns of the exchange rate
basicStats(da6)
plot(density(da6))
t.test(da6)
One Sample t-test
data: da6
t = 0.24489, df = 3565, p-value = 0.8066
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.0001870737 0.0002404769
sample estimates:
mean of x
2.670158e-05
Since the p-value > 5%, we cannot reject H0 at the 5% level; the data are consistent with a zero mean log return for the exchange rate.
da7<-fread("SP.csv",header=T)
da8<-fread("IBM.csv",header=T)
da9<-diff(log(da7$SP.Adjusted))
da10<-diff(log(da8$IBM.Adjusted))
da11=data.frame(x=da9,y=da10)
plot(da11)
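For a simple regression, R-squared equals the squared sample correlation between the two series, which gives a quick preview of the fit reported below:
# R-squared of a simple regression = squared correlation
r <- cor(da11$x, da11$y)
c(correlation = r, r.squared = r^2)            # r^2 should match Model 1's R-squared (about 0.125)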
Model 1: Simple linear regression with intercept
m1 <- lm(y ~ x, da11)
summary(m1)
Call:
lm(formula = y ~ x, data = da11)
Residuals:
Min 1Q Median 3Q Max
-0.086931 -0.006241 0.000360 0.006798 0.093286
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0001795 0.0002345 0.766 0.444
x 0.2314457 0.0109979 21.045 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.01308 on 3111 degrees of freedom
Multiple R-squared: 0.1246, Adjusted R-squared: 0.1243
F-statistic: 442.9 on 1 and 3111 DF, p-value: < 2.2e-16
AIC(m1)
[1] -18160.95
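For a Gaussian lm fit, AIC can be reproduced from the residual sum of squares, which makes clear what the later model comparison is based on; a minimal sketch:
# AIC of a Gaussian lm: n*(log(2*pi) + log(RSS/n) + 1) + 2*(p + 1), p = number of coefficients
n <- length(residuals(m1))
rss <- sum(residuals(m1)^2)
p <- length(coef(m1))
n * (log(2 * pi) + log(rss / n) + 1) + 2 * (p + 1)   # should match AIC(m1)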
Model 2: Simple linear regression without intercept
m2 <- lm(y ~ -1 + x, da11)
summary(m2)
Call:
lm(formula = y ~ -1 + x, data = da11)
Residuals:
Min 1Q Median 3Q Max
-0.086752 -0.006060 0.000539 0.006977 0.093462
Coefficients:
Estimate Std. Error t value Pr(>|t|)
x 0.2315 0.0110 21.05 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.01308 on 3112 degrees of freedom
Multiple R-squared: 0.1247, Adjusted R-squared: 0.1244
F-statistic: 443.2 on 1 and 3112 DF, p-value: < 2.2e-16
AIC(m2)
[1] -18162.37
idx <- c(1:length(da9))[da9 <= 0]              # positions where the SP log return is non-positive
nsp <- rep(0,length(da9))
nsp[idx] = da9[idx]                            # nsp = SP log return when negative, 0 otherwise
c1 <- rep(0,length(da9))
c1[idx] = 1                                    # c1 = indicator (dummy) for a non-positive SP log return
da12 <- data.frame(x = da9, y = da10, c1, nsp)
head(da12)
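The same indicator and negative-part variables can be built more compactly with vectorized operations; a sketch, with a check that it matches the construction above:
# Equivalent construction of c1 and nsp
c1.alt <- as.numeric(da9 <= 0)                 # 1 when the SP log return is non-positive
nsp.alt <- pmin(da9, 0)                        # SP log return when negative, 0 otherwise
stopifnot(all(c1.alt == c1), all(nsp.alt == nsp))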
Model 3: With different intercepts (alpha) for positive and negative SP log returns
m3 <- lm(y ~ c1+x, da12)
summary(m3)
Call:
lm(formula = y ~ c1 + x, data = da12)
Residuals:
Min 1Q Median 3Q Max
-0.086854 -0.006247 0.000366 0.006818 0.093189
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.841e-05 4.065e-04 0.242 0.809
c1 1.614e-04 6.608e-04 0.244 0.807
x 2.341e-01 1.550e-02 15.107 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.01308 on 3110 degrees of freedom
Multiple R-squared: 0.1246, Adjusted R-squared: 0.1241
F-statistic: 221.4 on 2 and 3110 DF, p-value: < 2.2e-16
AIC(m3)
[1] -18159.01
Model 4: With different coefficients (beta) for positive and negative SP log returns
m4 <- lm(y ~ nsp + x, da12)
summary(m4)
Call:
lm(formula = y ~ nsp + x, data = da12)
Residuals:
Min 1Q Median 3Q Max
-0.086771 -0.006218 0.000382 0.006816 0.092673
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.566e-06 3.303e-04 0.008 0.994
nsp -2.357e-02 3.099e-02 -0.760 0.447
x 2.432e-01 1.900e-02 12.803 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.01308 on 3110 degrees of freedom
Multiple R-squared: 0.1248, Adjusted R-squared: 0.1242
F-statistic: 221.7 on 2 and 3110 DF, p-value: < 2.2e-16
AIC(m4)
[1] -18159.53
Model 5: With different intercepts (alpha) and coefficients (beta) for positive and negative SP log returns
m5 <- lm(y ~ c1 + nsp + x, da12)
summary(m5)
Call:
lm(formula = y ~ c1 + nsp + x, data = da12)
Residuals:
Min 1Q Median 3Q Max
-0.086690 -0.006189 0.000400 0.006812 0.092568
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -8.292e-05 4.707e-04 -0.176 0.860
c1 1.685e-04 6.609e-04 0.255 0.799
nsp -2.368e-02 3.100e-02 -0.764 0.445
x 2.461e-01 2.202e-02 11.173 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.01308 on 3109 degrees of freedom
Multiple R-squared: 0.1248, Adjusted R-squared: 0.124
F-statistic: 147.8 on 3 and 3109 DF, p-value: < 2.2e-16
AIC(m5)
[1] -18157.6
Since Model 2 has the smallest AIC, the preferred linear model is y = 0.2315x, i.e., the IBM log return is roughly 0.23 times the contemporaneous SP log return; a side-by-side comparison is sketched below.
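A minimal sketch of the side-by-side comparison (smaller AIC is better):
# Compare all five candidate models in one call
AIC(m1, m2, m3, m4, m5)                        # Model 2 (no intercept) has the smallest AIC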