G👀LE

Advanced Statistical Inference

Alphabet Inc. Stock Analysis

Elena Wenpei Huang

December 17, 2019

Tuesday

😎

Getting the data needed…

## 'data.frame':    1265 obs. of  7 variables:
##  $ date    : Date, format: "2015-01-02" "2015-01-05" ...
##  $ open    : num  533 527 520 511 502 ...
##  $ high    : num  536 528 521 511 508 ...
##  $ low     : num  528 518 506 504 495 ...
##  $ close   : num  530 519 507 505 507 ...
##  $ volume  : num  1324000 2059100 2722800 2345900 3652700 ...
##  $ adjusted: num  530 519 507 505 507 ...

Some data cleaning…

What does the data frame look like now?

## 'data.frame':    1265 obs. of  14 variables:
##  $ date     : Date, format: "2015-01-02" "2015-01-05" ...
##  $ open     : num  533 527 520 511 502 ...
##  $ high     : num  536 528 521 511 508 ...
##  $ low      : num  528 518 506 504 495 ...
##  $ close    : num  530 519 507 505 507 ...
##  $ volume   : num  1324000 2059100 2722800 2345900 3652700 ...
##  $ adjusted : num  530 519 507 505 507 ...
##  $ weekday  : int  5 1 2 3 4 5 1 2 3 4 ...
##  $ weekdayf : Ord.factor w/ 7 levels "Su"<"Sa"<"F"<..: 3 7 6 5 4 3 7 6 5 4 ...
##  $ week     : num  0 1 1 1 1 1 2 2 2 2 ...
##  $ monthf   : Ord.factor w/ 12 levels "Jan"<"Feb"<"Mar"<..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ yearmonth: Factor w/ 61 levels "Jan 2015","Feb 2015",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ monthweek: num  1 2 2 2 2 2 3 3 3 3 ...
##  $ percent  : num  0.573 1.459 2.663 1.135 1.077 ...
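
The cleaning code itself is not shown, so the following is only a sketch of how the seven extra columns could be derived in base R. The helper name `add_calendar_columns` is hypothetical, and the definition of `percent` (absolute open-to-close change as a percentage of the open) is an assumption that happens to be consistent with the first rows printed above.

```r
# Hypothetical helper: derive the calendar columns from `date`, `open`, `close`.
add_calendar_columns <- function(stock) {
  lt <- as.POSIXlt(stock$date)

  # 0 = Sunday ... 6 = Saturday, so Friday is 5 and Monday is 1
  stock$weekday <- lt$wday

  # Ordered factor running Su < Sa < F < R < W < T < M, as in the str() output
  stock$weekdayf <- factor(stock$weekday,
                           levels = rev(c(1:6, 0)),
                           labels = rev(c("M", "T", "W", "R", "F", "Sa", "Su")),
                           ordered = TRUE)

  # Week of the year with Monday as the first day (days before it are week 0)
  stock$week <- as.numeric(format(stock$date, "%W"))

  # month.abb is the built-in English abbreviation vector, so no locale issues
  stock$monthf <- factor(lt$mon + 1, levels = 1:12, labels = month.abb,
                         ordered = TRUE)
  ym <- paste(month.abb[lt$mon + 1], lt$year + 1900)
  stock$yearmonth <- factor(ym, levels = unique(ym))  # assumes rows sorted by date

  # Week index within each month, starting at 1
  stock$monthweek <- 1 + stock$week - ave(stock$week, stock$yearmonth, FUN = min)

  # ASSUMPTION: absolute intraday move as a percentage of the opening price
  stock$percent <- abs(stock$close - stock$open) / stock$open * 100
  stock
}

# Tiny made-up example (not the real GOOG prices)
demo <- data.frame(date  = as.Date(c("2015-01-02", "2015-01-05", "2015-01-06")),
                   open  = c(100, 200, 50),
                   close = c(101, 198, 50))
out <- add_calendar_columns(demo)
str(out)
```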

HYPOTHESIS 1

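Null Hypothesis: There is no correlation between adjusted stock price and day of the week, i.e. the mean adjusted price is the same on every trading day.

Alternative Hypothesis: There is a correlation between adjusted stock price and day of the week, i.e. the mean adjusted price differs for at least one trading day.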

ANOVA Test

##               Df   Sum Sq Mean Sq F value Pr(>F)
## weekdayf       4     9853    2463   0.047  0.996
## Residuals   1260 65711693   52152
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = adjusted ~ weekdayf, data = Stock)
## 
## $weekdayf
##          diff       lwr      upr     p adj
## R-F -4.374492 -59.56880 50.81982 0.9995122
## W-F -7.931393 -63.01884 47.15606 0.9949471
## T-F -5.793433 -60.82799 49.24113 0.9985070
## M-F -2.027067 -58.31460 54.26047 0.9999788
## W-R -3.556901 -58.59021 51.47641 0.9997827
## T-R -1.418940 -56.39931 53.56143 0.9999944
## M-R  2.347425 -53.88713 58.58198 0.9999619
## T-W  2.137961 -52.73514 57.01106 0.9999710
## M-W  5.904326 -50.22535 62.03400 0.9985113
## M-T  3.766365 -52.31140 59.84413 0.9997469

RESULTS FROM ANOVA TEST

  • P-value is not significant (0.996, very close to 1)

  • Observed differences between days are consistent with chance; every Tukey pairwise interval contains 0

  • Cannot reject the null hypothesis

  • No evidence of a correlation between adjusted stock price and day of the week
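
The `Fit:` line in the output shows the model call `aov(adjusted ~ weekdayf, data = Stock)`. A minimal sketch of the full step, run on simulated data rather than the real `Stock` frame (the simulated prices and the unordered five-level factor are assumptions made so the block runs on its own), might look like this:

```r
set.seed(1)

# Simulated stand-in for Stock: 50 observations per trading day, no day effect
days <- c("M", "T", "W", "R", "F")
Stock <- data.frame(
  weekdayf = factor(rep(days, times = 50), levels = days),
  adjusted = rnorm(250, mean = 900, sd = 200)
)

# One-way ANOVA: does mean adjusted price differ by weekday?
fit <- aov(adjusted ~ weekdayf, data = Stock)
anova_table <- summary(fit)[[1]]

# Tukey HSD: all pairwise weekday differences at a 95% family-wise level
tukey <- TukeyHSD(fit, conf.level = 0.95)

print(anova_table)
print(tukey)
```

With five weekday groups, the Tukey table has 10 pairwise rows, which is why the output above lists 10 comparisons.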

HYPOTHESIS 2

Null Hypothesis: There is no correlation between percentage change in stock price and trading volume.

Alternative Hypothesis: There is a correlation between percentage change in stock price and trading volume.

Checking for non-meaningful 0 values…

##       date                 open             high             low        
##  Min.   :2015-01-02   Min.   : 499.2   Min.   : 500.3   Min.   : 490.9  
##  1st Qu.:2016-04-06   1st Qu.: 749.0   1st Qu.: 755.3   1st Qu.: 743.6  
##  Median :2017-07-07   Median : 948.0   Median : 954.2   Median : 941.0  
##  Mean   :2017-07-07   Mean   : 929.5   Mean   : 937.4   Mean   : 921.3  
##  3rd Qu.:2018-10-08   3rd Qu.:1120.2   3rd Qu.:1134.0   3rd Qu.:1112.0  
##  Max.   :2020-01-10   Max.   :1429.5   Max.   :1434.9   Max.   :1419.6  
##                                                                         
##      close            volume            adjusted         weekday     
##  Min.   : 497.1   Min.   :  520600   Min.   : 497.1   Min.   :1.000  
##  1st Qu.: 750.4   1st Qu.: 1312900   1st Qu.: 750.4   1st Qu.:2.000  
##  Median : 948.5   Median : 1636400   Median : 948.5   Median :3.000  
##  Mean   : 929.7   Mean   : 1861701   Mean   : 929.7   Mean   :3.026  
##  3rd Qu.:1122.9   3rd Qu.: 2098000   3rd Qu.:1122.9   3rd Qu.:4.000  
##  Max.   :1429.0   Max.   :12858100   Max.   :1429.0   Max.   :5.000  
##                                                                      
##  weekdayf      week           monthf       yearmonth      monthweek    
##  Su:  0   Min.   : 0.00   Aug    :112   Aug 2016:  23   Min.   :1.000  
##  Sa:  0   1st Qu.:13.00   Oct    :111   Mar 2017:  23   1st Qu.:2.000  
##  F :255   Median :26.00   Mar    :109   Aug 2017:  23   Median :3.000  
##  R :256   Mean   :26.26   Jan    :108   Aug 2018:  23   Mean   :2.955  
##  W :258   3rd Qu.:39.00   May    :107   Oct 2018:  23   3rd Qu.:4.000  
##  T :259   Max.   :53.00   Jun    :107   Oct 2019:  23   Max.   :5.000  
##  M :237                   (Other):611   (Other) :1127                  
##     percent      
##  Min.   :0.0000  
##  1st Qu.:0.2788  
##  Median :0.6233  
##  Mean   :0.8490  
##  3rd Qu.:1.1881  
##  Max.   :5.6368  
## 

open, close, adjusted, and volume – all looking good
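
Reading minimums off `summary()` works, but the same check can be done by counting zeros directly. This is a small sketch; `count_zeros` is a hypothetical helper, not something from the original analysis.

```r
# Hypothetical helper: number of exact zeros in each numeric column
count_zeros <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  vapply(df[is_num], function(x) sum(x == 0, na.rm = TRUE), integer(1))
}

# Made-up example frame with a few zeros planted in it
demo <- data.frame(date   = as.Date("2015-01-02") + 0:2,
                   open   = c(1, 0, 3),
                   volume = c(0, 0, 5))
zeros <- count_zeros(demo)
print(zeros)
```

A zero in `percent` can be legitimate (no change that day), whereas a zero in a price or volume column would usually point to a missing record.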

Linear Regression

## 
## Call:
## lm(formula = percent ~ volume, data = Stock)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4050 -0.4468 -0.1284  0.3084  4.0987 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 9.707e-02  4.476e-02   2.169   0.0303 *  
## volume      4.039e-07  2.146e-08  18.822   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7176 on 1263 degrees of freedom
## Multiple R-squared:  0.219,  Adjusted R-squared:  0.2184 
## F-statistic: 354.3 on 1 and 1263 DF,  p-value: < 2.2e-16
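
The `Call:` line shows the model as `lm(percent ~ volume, data = Stock)`. The sketch below fits the same formula on simulated data (the simulated `volume` and `percent` values are assumptions, chosen only so the block is runnable) and pulls out the three numbers the results slide reports.

```r
set.seed(42)

# Simulated stand-in for Stock: percent rises slightly with volume
n <- 200
volume  <- runif(n, min = 5e5, max = 5e6)
percent <- 0.1 + 4e-07 * volume + abs(rnorm(n, sd = 0.5))
Stock <- data.frame(volume = volume, percent = percent)

# Simple linear regression of percent on volume
fit <- lm(percent ~ volume, data = Stock)
s <- summary(fit)

slope     <- coef(fit)[["volume"]]           # estimated coefficient for volume
r_squared <- s$r.squared                     # multiple R-squared
p_value   <- coef(s)["volume", "Pr(>|t|)"]   # p-value of the slope's t-test

print(s)
```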

Post Hoc Power Analysis

## 
##      Multiple regression power calculation 
## 
##               u = 1
##               v = 1242
##              f2 = 0.28139
##       sig.level = 0.05
##           power = 1
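
The header of this output matches what `pwr.f2.test()` from the `pwr` package prints, although the call itself is not shown. As a sketch of what such a calculation involves, the base R code below computes power from the noncentral F distribution, assuming the noncentrality parameter is f2 × (u + v + 1) and that the effect size comes from Cohen's f2 = R² / (1 − R²). The function names `f2_power` and `r2_to_f2` are hypothetical.

```r
# Cohen's f2 effect size from a model's R-squared
r2_to_f2 <- function(r2) r2 / (1 - r2)

# Power of the overall F-test with u numerator and v denominator df
# ASSUMPTION: noncentrality parameter lambda = f2 * (u + v + 1)
f2_power <- function(u, v, f2, sig.level = 0.05) {
  lambda <- f2 * (u + v + 1)
  crit <- qf(sig.level, u, v, lower.tail = FALSE)   # critical F under the null
  pf(crit, u, v, ncp = lambda, lower.tail = FALSE)  # P(reject | effect f2)
}

# Inputs copied from the output above
power <- f2_power(u = 1, v = 1242, f2 = 0.28139)
print(power)
```

With more than a thousand observations and an effect of this size, the power is effectively at its ceiling, so it adds little information beyond the p-value.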

RESULTS FROM LINEAR REGRESSION TEST

  • P-value is significant: < 2.2e-16

  • R-squared: 0.219

  • Estimated coefficient for predictor (volume): 4.039e-07

  • Power: 1

  • Can reject the null hypothesis with high confidence and conclude that there is a positive correlation between percentage change in stock price and trading volume

How has the adjusted stock price of Alphabet Inc.

been changing over the past 5 years?

One visualization that explains it all!

thank u, next
:)