Problem 1

The hypothesis test for each of the 4 stocks is \(H_0\): \(\beta = 0\) against the two-sided alternative \(\beta \neq 0\).
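
For reference, the regression estimated for each stock below is the excess-return form of the CAPM,

\[ R_{i,t} - R_{f,t} = \alpha_i + \beta_i\,(R_{m,t} - R_{f,t}) + u_{i,t}, \]

where \(R_{m,t}\) is the S&P 500 return and \(R_{f,t}\) is the 3-month T-bill rate.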

Setting Up the CAPM Model

#Problem one
require(mosaic)
require(MASS)
require(openintro)
setwd("C:\\Users\\J\\Desktop\\R-stuffs\\data-files")

#READ IN DATAFRAME
capm = read.csv("capm.csv")

#CONVERT USTB3M FROM PERCENT TO DECIMAL, DROPPING THE FIRST MONTH SO IT
#ALIGNS WITH THE DIFFERENCED RETURNS COMPUTED BELOW
RF = capm$USTB3M[-1]/100

#CONVERT PRICE TO CONTINUOUSLY COMPOUNDED RETURNS
SANDP_RETURNS =  diff(log(capm$SANDP))
FORD_RETURNS = diff(log(capm$FORD))
GE_RETURNS = diff(log(capm$GE))
MSOFT_RETURNS = diff(log(capm$MICROSOFT))
ORACLE_RETURNS = diff(log(capm$ORACLE))

#CONVERT ALL RETURNS TO EXCESS RETURNS BY SUBTRACTING THE RISK-FREE RATE
SANDP_RETURNS = SANDP_RETURNS - RF
FORD_RETURNS = FORD_RETURNS - RF
GE_RETURNS = GE_RETURNS - RF
MSOFT_RETURNS = MSOFT_RETURNS - RF
ORACLE_RETURNS = ORACLE_RETURNS - RF


#CREATE DATA FRAMES
FORD.DATA = data.frame(SANDP_RETURNS = SANDP_RETURNS, FORD_RETURNS = FORD_RETURNS )
GE.DATA = data.frame(SANDP_RETURNS = SANDP_RETURNS, GE_RETURNS = GE_RETURNS )
MSOFT.DATA = data.frame(SANDP_RETURNS = SANDP_RETURNS, MSOFT_RETURNS = MSOFT_RETURNS )
ORACLE.DATA = data.frame(SANDP_RETURNS = SANDP_RETURNS, ORACLE_RETURNS = ORACLE_RETURNS)

Critical values

#ONE-SIDED TAIL AREAS CORRESPONDING TO TWO-SIDED 10%, 5% AND 1% TESTS
significance.level = c(.05, 0.025, .005)
t_crit = qt(1-significance.level, df = 134) #134 = n - 2 RESIDUAL DEGREES OF FREEDOM

Results

90% = 1.656, 95% = 1.978, 99% = 2.613
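
The decision rule used throughout Problem 1 is to reject \(H_0\): \(\beta = 0\) whenever the magnitude of the t-statistic exceeds the critical value,

\[ |t| = \left|\frac{\hat{\beta} - 0}{SE(\hat{\beta})}\right| > t_{crit}. \]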

Run a regression against the market for each stock to find \(\beta\) and \(\alpha\)

Ford

#FORD
FORDFIT = lm(FORD_RETURNS ~ SANDP_RETURNS, data = FORD.DATA)
summary(FORDFIT)

Results

Residuals:
     Min       1Q   Median       3Q      Max 
-0.49646 -0.06669 -0.00409  0.05178  0.62584 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)    0.01166    0.01117   1.044    0.298    
SANDP_RETURNS  1.97179    0.21822   9.036 1.58e-15 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1253 on 134 degrees of freedom
Multiple R-squared:  0.3786,    Adjusted R-squared:  0.374 
F-statistic: 81.65 on 1 and 134 DF,  p-value: 1.58e-15

Ford Conclusion

Recall that the critical values are 90% = 1.656, 95% = 1.978, and 99% = 2.613. In Ford's case, the t-value for \(\beta\) is 9.036, which exceeds all three critical values, so we reject the null hypothesis that \(\beta = 0\) at the 90%, 95%, and 99% confidence levels. However, the t-value for \(\alpha\) is only 1.044, so we fail to reject \(\alpha = 0\) at any of the three levels.
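
The same comparison can also be done programmatically; a minimal sketch using the fit above, where the three logical values line up with the 90%, 95%, and 99% critical values:

#SKETCH: PULL THE SLOPE T-STATISTIC FROM THE SUMMARY TABLE AND COMPARE IT
#WITH THE CRITICAL VALUES COMPUTED EARLIER
t.ford.beta = coef(summary(FORDFIT))["SANDP_RETURNS", "t value"]
abs(t.ford.beta) > t_crit #TRUE AT A GIVEN LEVEL MEANS REJECT H0: BETA = 0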

GE

#GE
GEFIT = lm(GE_RETURNS ~ SANDP_RETURNS, data = GE.DATA)
summary(GEFIT)

Results


Residuals:
      Min        1Q    Median        3Q       Max 
-0.176746 -0.028076 -0.002496  0.037119  0.214075 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)   0.001015   0.005120   0.198    0.843    
SANDP_RETURNS 1.295586   0.099992  12.957   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.05741 on 134 degrees of freedom
Multiple R-squared:  0.5561,    Adjusted R-squared:  0.5528 
F-statistic: 167.9 on 1 and 134 DF,  p-value: < 2.2e-16

GE Conclusion

Recall that the critical values are 90% = 1.656, 95% = 1.978, and 99% = 2.613. In GE's case, the t-value for \(\beta\) is 12.957, which exceeds all three critical values, so we reject the null hypothesis that \(\beta = 0\) at the 90%, 95%, and 99% confidence levels. However, the t-value for \(\alpha\) is only 0.198, so we fail to reject \(\alpha = 0\) at any of the three levels.

Microsoft

#MSOFT
MSOFTFIT = lm(MSOFT_RETURNS ~ SANDP_RETURNS, data = MSOFT.DATA)
summary(MSOFTFIT)

Results


Residuals:
      Min        1Q    Median        3Q       Max 
-0.144113 -0.036321 -0.002089  0.029645  0.212258 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -0.001683   0.004811   -0.35    0.727    
SANDP_RETURNS  0.996292   0.093957   10.60   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.05394 on 134 degrees of freedom
Multiple R-squared:  0.4563,    Adjusted R-squared:  0.4522 
F-statistic: 112.4 on 1 and 134 DF,  p-value: < 2.2e-16

Microsoft conclusion

Recall that the critical values are 90% = 1.656, 95% = 1.978, and 99% = 2.613. In Microsoft's case, the t-value for \(\beta\) is 10.60, which exceeds all three critical values, so we reject the null hypothesis that \(\beta = 0\) at the 90%, 95%, and 99% confidence levels. However, the t-value for \(\alpha\) is only -0.35, so we fail to reject \(\alpha = 0\) at any of the three levels.

Oracle


#ORACLE
ORACLEFIT = lm(ORACLE_RETURNS ~ SANDP_RETURNS, data = ORACLE.DATA)
summary(ORACLEFIT)

Results

Residuals:
      Min        1Q    Median        3Q       Max 
-0.301732 -0.038969  0.002515  0.036819  0.255207 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)   0.003060   0.006496   0.471    0.638    
SANDP_RETURNS 1.052564   0.126883   8.296 1.01e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.07285 on 134 degrees of freedom
Multiple R-squared:  0.3393,    Adjusted R-squared:  0.3344 
F-statistic: 68.82 on 1 and 134 DF,  p-value: 1.015e-13

Oracle conclusion

Recall that the critical values are 90% = 1.656, 95% = 1.978, and 99% = 2.613. In Oracle's case, the t-value for \(\beta\) is 8.296, which exceeds all three critical values, so we reject the null hypothesis that \(\beta = 0\) at the 90%, 95%, and 99% confidence levels. However, the t-value for \(\alpha\) is only 0.471, so we fail to reject \(\alpha = 0\) at any of the three levels.
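
Another way to summarize the same evidence is with confidence intervals for the four betas; a minimal sketch using confint() on the fits above (none of the 95% intervals should contain 0, which matches the rejections above):

#SKETCH: 95% CONFIDENCE INTERVALS FOR THE FOUR ESTIMATED BETAS
fits = list(FORD = FORDFIT, GE = GEFIT, MICROSOFT = MSOFTFIT, ORACLE = ORACLEFIT)
lapply(fits, confint, parm = "SANDP_RETURNS", level = 0.95)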

Problem 2

We test the hypothesis \(H_0\): \(\beta = 1\) by setting the true \(\beta\) close to 1 and running a simulation.
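
The test statistic in each run is the usual t-ratio against the hypothesized value of 1,

\[ t = \frac{\hat{\beta} - 1}{SE(\hat{\beta})}, \]

which is compared with the same style of two-sided critical values as in Problem 1.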

Setting up the Model where \(\beta\) = .97 and the number of loops is 1,000

#Problem 2 
require(mosaic)
require(MASS)
require(openintro)

setwd("C:\\Users\\J\\Desktop\\R-stuffs\\data-files")
set.seed(12345)
#set up the parameters; these determine the PRF (population regression function)
T = 100
alpha = 0.0
beta = .97
x = runif(n=T)
Null = 1 #hypothesized value of beta under the null




#run the main simulation loop
M = 1000 #the number of simulations
beta.hat = rep(0, M)
for(i in 1:20)
{
  size = M * i #sample size for this pass
  u = rnorm(size)
  x = runif(size)
  y = alpha + beta*x + u 
  srf = lm(y ~ x) #sample regression function
  beta.hat.i = coef(srf)[2] #this stores the slope coef; only the final pass (size = 20*M) is used below
}

#standard error
se.beta = coef(summary(srf))[, "Std. Error"][2]

#t-test
t.beta = (beta.hat.i - Null)/se.beta
significance.level = c(.05, 0.025, .005)
t_crit = qt(1-significance.level, df = T - 2)

#print the values
summary(srf)
se.beta
t_crit
abs(t.beta)
print(c(i, size, as.character(abs(t.beta) > t_crit))) 
#False = Fail to Reject 
#True = Reject
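
As a side note, the loop above keeps only the last fit; a minimal sketch of how the t-statistic against \(H_0\): \(\beta = 1\) could be recorded at every pass (the names t.stats and fit are illustrative, not part of the original script):

#sketch: store the H0: beta = 1 t-statistic for each sample size in the loop
t.stats = rep(NA, 20)
for(i in 1:20)
{
  size = M * i
  u = rnorm(size)
  x = runif(size)
  y = alpha + beta*x + u
  fit = lm(y ~ x)
  t.stats[i] = (coef(fit)[2] - Null)/coef(summary(fit))[2, "Std. Error"]
}
round(t.stats, 3) #|t| tends to grow as the sample size grows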

Critical Values for 1,000 data points

#critical values for T = 1000 data points
T = 1000
significance.level = c(.05, 0.025, .005)
t_crit = qt(1-significance.level, df = T - 2)

Critical Values

90% = 1.646, 95% = 1.962, 99% = 2.581

Now we run the \(\beta\) = .97 test with 1,000 loops.

Results

Residuals:
    Min      1Q  Median      3Q     Max 
-3.8500 -0.6720 -0.0035  0.6782  4.0064 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.002228   0.014141   0.158    0.875    
x           0.961344   0.024529  39.191   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.003 on 19998 degrees of freedom
Multiple R-squared:  0.07133,   Adjusted R-squared:  0.07128 
F-statistic:  1536 on 1 and 19998 DF,  p-value: < 2.2e-16

> se.beta
         x 
0.02452948 
> t_crit
[1] 1.646382 1.962344 2.580765
> abs(t.beta)
       x 
1.575901 
> print(c(i, size, as.character(abs(t.beta) > t_crit))) 
[1] "20"    "20000" "FALSE" "FALSE" "FALSE"
> #False = Fail to Reject 
> #True = Reject

Conclusion for \(\beta\) of .97 and 1,000 loops

Recall that the critical values are 90% = 1.646, 95% = 1.962, and 99% = 2.581. In this case the t-value is 1.576, so we fail to reject the null at all three confidence levels.

Re-run the experiment with 10,000 as the number of simulations.

#run the main simulation loop
M = 10000 #the number of simulations

Results


Residuals:
    Min      1Q  Median      3Q     Max 
-4.6714 -0.6752  0.0011  0.6739  4.6111 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.002468   0.004470   0.552    0.581    
x           0.969184   0.007730 125.372   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9996 on 199998 degrees of freedom
Multiple R-squared:  0.07287,   Adjusted R-squared:  0.07286 
F-statistic: 1.572e+04 on 1 and 199998 DF,  p-value: < 2.2e-16

> se.beta
          x 
0.007730443 
> t_crit
[1] 1.646382 1.962344 2.580765
> abs(t.beta)
       x 
3.986294 
> print(c(i, size, as.character(abs(t.beta) > t_crit))) 
[1] "20"    "2e+05" "TRUE"  "TRUE"  "TRUE" 
> #False = Fail to Reject 
> #True = Reject

Conclusion for \(\beta\) of .97 and 10,000 loops

Recall that the critical values are 90% = 1.646, 95% = 1.962, and 99% = 2.581. In this case the t-value is 3.986, so we reject the null at the 90%, 95%, and 99% confidence levels.

Re-run the experiment with \(\beta\) = .98

T = 100
alpha = 0.0
beta = .98
x = runif(n=T)
Null = 1

Results

Residuals:
    Min      1Q  Median      3Q     Max 
-3.8500 -0.6720 -0.0035  0.6782  4.0064 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.002228   0.014141   0.158    0.875    
x           0.971344   0.024529  39.599   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.003 on 19998 degrees of freedom
Multiple R-squared:  0.07271,   Adjusted R-squared:  0.07266 
F-statistic:  1568 on 1 and 19998 DF,  p-value: < 2.2e-16

> se.beta
         x 
0.02452948 
> t_crit
[1] 1.646382 1.962344 2.580765
> abs(t.beta)
       x 
1.168228 
> print(c(i, size, as.character(abs(t.beta) > t_crit))) 
[1] "20"    "20000" "FALSE" "FALSE" "FALSE"
> #False = Fail to Reject 
> #True = Reject

Conclusion for \(\beta\) of .98 with 1,000 loops

Recall that the critical values are 90% = 1.646, 95% = 1.962, and 99% = 2.581. In this case the t-value is 1.168, so we fail to reject the null at all three confidence levels.

Re-run the experiment with 10,000 loops

#run the main simulation loop
M = 10000 #the number of simulations

Results

Residuals:
    Min      1Q  Median      3Q     Max 
-4.6714 -0.6752  0.0011  0.6739  4.6111 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.002468   0.004470   0.552    0.581    
x           0.979184   0.007730 126.666   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9996 on 199998 degrees of freedom
Multiple R-squared:  0.07426,   Adjusted R-squared:  0.07426 
F-statistic: 1.604e+04 on 1 and 199998 DF,  p-value: < 2.2e-16

> se.beta
          x 
0.007730443 
> t_crit
[1] 1.646382 1.962344 2.580765
> abs(t.beta)
       x 
2.692707 
> print(c(i, size, as.character(abs(t.beta) > t_crit))) 
[1] "20"    "2e+05" "TRUE"  "TRUE"  "TRUE" 
> #False = Fail to Reject 
> #True = Reject

Conclusion for \(\beta\) of .98 and 10,000 loops

Recall that the critical values are 90% = 1.646, 95% = 1.962, and 99% = 2.581. In this case the t-value is 2.693, so we reject the null at the 90%, 95%, and 99% confidence levels.

Re-run the experiment with \(\beta\) = .99

T = 100
alpha = 0.0
beta = .99
x = runif(n=T)
Null = 1

Results

Residuals:
    Min      1Q  Median      3Q     Max 
-3.8500 -0.6720 -0.0035  0.6782  4.0064 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.002228   0.014141   0.158    0.875    
x           0.981344   0.024529  40.007   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.003 on 19998 degrees of freedom
Multiple R-squared:  0.0741,    Adjusted R-squared:  0.07406 
F-statistic:  1601 on 1 and 19998 DF,  p-value: < 2.2e-16

> se.beta
         x 
0.02452948 
> t_crit
[1] 1.646382 1.962344 2.580765
> abs(t.beta)
        x 
0.7605556 
> print(c(i, size, as.character(abs(t.beta) > t_crit))) 
[1] "20"    "20000" "FALSE" "FALSE" "FALSE"
> #False = Fail to Reject 
> #True = Reject

Conclusion for \(\beta\) of .99 and 1,000 loops

Recall that the critical values are 90% = 1.646, 95% = 1.962, and 99% = 2.581. In this case the t-value is 0.761, so we fail to reject the null at all three confidence levels.

Re-run the experiment with 10,000 loops


#run the main simulation loop
M = 10000 #the number of simulations

Results

Residuals:
    Min      1Q  Median      3Q     Max 
-4.6714 -0.6752  0.0011  0.6739  4.6111 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.002468   0.004470   0.552    0.581    
x           0.989184   0.007730 127.960   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.9996 on 199998 degrees of freedom
Multiple R-squared:  0.07567,   Adjusted R-squared:  0.07567 
F-statistic: 1.637e+04 on 1 and 199998 DF,  p-value: < 2.2e-16

> se.beta
          x 
0.007730443 
> t_crit
[1] 1.646382 1.962344 2.580765
> abs(t.beta)
      x 
1.39912 
> print(c(i, size, as.character(abs(t.beta) > t_crit))) 
[1] "20"    "2e+05" "FALSE" "FALSE" "FALSE"
> #False = Fail to Reject 
> #True = Reject

Conclusion for \(\beta\) of .99 and 10,000 loops

Recall that the critical values are 90% = 1.646, 95% = 1.962, and 99% = 2.581. In this case the t-value is 1.399, so we fail to reject the null at all three confidence levels.

Re-run the experiment with 100,000 loops

#run the main simulation loop
M = 100000 #the number of simulations

Results

Residuals:
    Min      1Q  Median      3Q     Max 
-4.9765 -0.6756  0.0004  0.6754  4.7521 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.0004196  0.0014159   0.296    0.767    
x           0.9881531  0.0024524 402.933   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.001 on 1999998 degrees of freedom
Multiple R-squared:  0.07508,   Adjusted R-squared:  0.07508 
F-statistic: 1.624e+05 on 1 and 1999998 DF,  p-value: < 2.2e-16

> se.beta
          x 
0.002452398 
> t_crit
[1] 1.646382 1.962344 2.580765
> abs(t.beta)
       x 
4.830761 
> print(c(i, size, as.character(abs(t.beta) > t_crit))) 
[1] "20"    "2e+06" "TRUE"  "TRUE"  "TRUE" 
> #False = Fail to Reject 
> #True = Reject

Conclusion for \(\beta\) of .99 and 100,000 loops

Recall that the critical values are 90% = 1.646, 95% = 1.962, and 99% = 2.581. In this case the t-value is 4.831, so we reject the null at the 90%, 95%, and 99% confidence levels.

What we learn from this experiment

A frequentist viewing these results might reason, "given the null hypothesis, the probability of drawing a sample that produces a t-statistic this extreme is less than X%, so I am going to reject the null." Frequentist analysis depends heavily, though indirectly, on the distribution generating the data, and the standard error of \(\hat{\beta}\) shrinks as the sample size grows. Because of this it is very easy to misinterpret the results and reject the null hypothesis when, as in this experiment, the null is actually very close to the true value.
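
A rough way to see why the rejections appear only once the sample is large: under the classical assumptions,

\[ SE(\hat{\beta}) \approx \frac{\sigma_u}{\sqrt{n}\,\mathrm{sd}(x)}, \qquad t = \frac{\hat{\beta} - 1}{SE(\hat{\beta})} \approx \frac{(\beta - 1)\sqrt{n}\,\mathrm{sd}(x)}{\sigma_u}, \]

so for a fixed true \(\beta\) of .97, .98 or .99, the expected magnitude of the t-statistic grows roughly like \(\sqrt{n}\), and a large enough sample will eventually reject a null that is merely close to, but not exactly equal to, the truth.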