This is for my esteemed friend (Sim Boon Hwa), who wants to use analytics to predict stock prices. I just want to show him that the price variable alone cannot predict whether a stock will go up or down.
Downloaded UOB weekly prices from January 4, 2016 to August 29, 2016.
Or https://sg.finance.yahoo.com/q/hp?s=U11.SI&a=00&b=3&c=2016&d=07&e=30&f=2016&g=w
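The loading step is not shown here. A minimal sketch, assuming the Yahoo Finance export was saved as a CSV file (the file name `U11.SI.csv` and the name `prices_head` are hypothetical), could look like the following. To stay runnable on its own, it reads only the first three rows of the table from an inline string.

```r
# Hypothetical: with a real download you would read the file instead, e.g.
#   prices <- read.csv("U11.SI.csv", stringsAsFactors = FALSE)
csv_text <- "Date,Open,High,Low,Close,Volume,Adj.Close
1/4/2016,19.69,19.69,18.26,18.41,3127100,17.72
1/11/2016,18.30,18.35,17.55,17.60,3247100,16.94
1/18/2016,17.35,17.83,17.01,17.62,4308600,16.96"

prices_head <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# The dates are month/day/year strings; parse them into Date objects
prices_head$Date <- as.Date(prices_head$Date, format = "%m/%d/%Y")

# Inspect the structure
str(prices_head)
```

Parsing the dates makes it easy to confirm that the series starts in January rather than April.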
## 'data.frame': 35 obs. of 7 variables:
## $ Date : chr "1/4/2016" "1/11/2016" "1/18/2016" "1/25/2016" ...
## $ Open : num 19.7 18.3 17.4 17.9 18.1 ...
## $ High : num 19.7 18.4 17.8 18.2 18.3 ...
## $ Low : num 18.3 17.6 17 17.4 17.1 ...
## $ Close : num 18.4 17.6 17.6 18.1 17.9 ...
## $ Volume : int 3127100 3247100 4308600 2999700 2398600 1688000 4846800 3847600 4693200 2215500 ...
## $ Adj.Close: num 17.7 16.9 17 17.4 17.2 ...
## Date Open High Low Close Volume Adj.Close
## 1 1/4/2016 19.69 19.69 18.26 18.41 3127100 17.72
## 2 1/11/2016 18.30 18.35 17.55 17.60 3247100 16.94
## 3 1/18/2016 17.35 17.83 17.01 17.62 4308600 16.96
## 4 1/25/2016 17.94 18.17 17.42 18.09 2999700 17.42
## 5 2/1/2016 18.09 18.26 17.10 17.87 2398600 17.20
## 6 2/8/2016 17.87 17.87 17.14 17.56 1688000 16.91
## 7 2/15/2016 17.66 18.06 17.10 17.24 4846800 16.60
## 8 2/22/2016 17.17 17.35 16.80 17.05 3847600 16.42
## 9 2/29/2016 17.19 18.64 16.91 18.50 4693200 17.81
## 10 3/7/2016 18.51 18.70 18.05 18.65 2215500 17.96
## 11 3/14/2016 18.79 19.49 18.68 19.29 2427400 18.57
## 12 3/21/2016 19.32 19.32 18.52 18.65 2347200 17.96
## 13 3/28/2016 18.65 19.10 18.39 18.76 2685000 18.06
## 14 4/4/2016 18.85 18.95 18.30 18.53 2694900 17.84
## 15 4/11/2016 18.38 19.75 18.36 19.63 2799600 18.90
## 16 4/18/2016 19.40 20.00 19.35 19.65 2573800 18.92
## 17 4/25/2016 19.55 19.73 18.59 18.60 2400500 18.24
## 18 5/2/2016 18.60 18.80 17.70 17.79 2782600 17.44
## 19 5/9/2016 17.83 17.90 17.45 17.77 2427600 17.42
## 20 5/16/2016 17.72 18.07 17.52 17.94 2060200 17.59
## 21 5/23/2016 18.02 18.28 17.78 18.21 1745400 17.85
## 22 5/30/2016 18.19 18.57 18.04 18.30 1905900 17.94
## 23 6/6/2016 18.47 19.17 18.45 18.61 3091100 18.25
## 24 6/13/2016 18.25 18.34 17.88 17.92 2599100 17.57
## 25 6/20/2016 18.18 18.45 17.74 17.85 2379300 17.50
## 26 6/27/2016 17.74 18.72 17.41 18.52 4006700 18.16
## 27 7/4/2016 18.60 18.69 18.01 18.15 2422900 17.80
## 28 7/11/2016 18.37 18.89 18.28 18.74 2097100 18.37
## 29 7/18/2016 18.80 19.10 18.61 19.05 2099700 18.68
## 30 7/25/2016 19.07 19.11 18.19 18.20 3247900 17.84
## 31 8/1/2016 18.30 18.40 17.88 17.93 4976000 17.58
## 32 8/8/2016 18.02 18.22 17.72 17.92 2872900 17.57
## 33 8/15/2016 17.67 17.85 17.51 17.56 2296000 17.56
## 34 8/22/2016 17.55 18.17 17.51 18.05 2385500 18.05
## 35 8/29/2016 17.97 18.18 17.97 18.00 2798500 18.00
I use linear regression to predict the price for the NEXT 1 to 2 WEEKS using past prices.
# Create the weeks vector (1 to 35, one entry per weekly observation)
weeks <- 1:nrow(prices)
# Fit a linear model to predict
price_lm <- lm(prices$Adj.Close ~ weeks)
# Plot the original series
plot(prices$Adj.Close, type="l", col="blue", lwd=2, ylab="Adjusted Closing Prices", main="Price", xlab = "Weeks")
# Predict the price for the next 1 to 2 weeks (weeks 36 and 37), which is what you want
future_weeks <- data.frame(weeks = 36:37)
price_pred <- predict(price_lm, future_weeks)
# Plot historical data and predictions
plot(prices$Adj.Close ~ weeks, type="l", col="blue", lwd=2, ylab="Adjusted Closing Prices", main="Price", xlab = "Weeks", xlim = c(1, 40))
points(36:37, price_pred, col = "green")
summary(price_lm)
##
## Call:
## lm(formula = prices$Adj.Close ~ weeks)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.12011 -0.38796 -0.06182 0.35486 1.20597
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.364202 0.188815 91.964 <2e-16 ***
## weeks 0.021989 0.009148 2.404 0.022 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5466 on 33 degrees of freedom
## Multiple R-squared: 0.149, Adjusted R-squared: 0.1232
## F-statistic: 5.777 on 1 and 33 DF, p-value: 0.02201
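The summary already hints at why the trend line says little about next week: the slope is about 0.022 per week, while the residual standard error is about 0.55. A minimal sketch of this, assuming the 35 `Adj.Close` values are typed in from the table above (the names `adj_close`, `dat`, `fit`, and `pred` are introduced here for illustration), asks `predict()` for a 95% prediction interval and checks whether that interval commits to a direction relative to the last observed price.

```r
# Adj.Close values copied from the weekly table (weeks 1 to 35)
adj_close <- c(17.72, 16.94, 16.96, 17.42, 17.20, 16.91, 16.60, 16.42,
               17.81, 17.96, 18.57, 17.96, 18.06, 17.84, 18.90, 18.92,
               18.24, 17.44, 17.42, 17.59, 17.85, 17.94, 18.25, 17.57,
               17.50, 18.16, 17.80, 18.37, 18.68, 17.84, 17.58, 17.57,
               17.56, 18.05, 18.00)
dat <- data.frame(weeks = seq_along(adj_close), adj_close = adj_close)

# Same linear trend model, fitted through a data frame
fit <- lm(adj_close ~ weeks, data = dat)

# Point forecasts plus 95% prediction intervals for weeks 36 and 37
pred <- predict(fit, newdata = data.frame(weeks = 36:37),
                interval = "prediction", level = 0.95)
print(pred)

# TRUE only if the whole interval lies above or below the last price
last_price <- tail(adj_close, 1)
calls_direction <- pred[, "lwr"] > last_price | pred[, "upr"] < last_price
print(calls_direction)
```

If the interval contains the last observed price, the model cannot say whether the next move is up or down, even though the slope is statistically significant.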
suppressWarnings(suppressMessages(library(h2o)))
localH2O <- h2o.init(nthreads = -1)
##
## H2O is not running yet, starting it now...
##
## Note: In case of errors look at the following log files:
## C:\Users\admin\AppData\Local\Temp\RtmpOMHDUV/h2o_admin_started_from_r.out
## C:\Users\admin\AppData\Local\Temp\RtmpOMHDUV/h2o_admin_started_from_r.err
##
##
## Starting H2O JVM and connecting: Connection successful!
##
## R is connected to the H2O cluster:
## H2O cluster uptime: 1 seconds 251 milliseconds
## H2O cluster version: 3.8.3.3
## H2O cluster name: H2O_started_from_R_admin_mum924
## H2O cluster total nodes: 1
## H2O cluster total memory: 7.10 GB
## H2O cluster total cores: 8
## H2O cluster allowed cores: 8
## H2O cluster healthy: TRUE
## H2O Connection ip: localhost
## H2O Connection port: 54321
## H2O Connection proxy: NA
## R Version: R version 3.3.0 (2016-05-03)
#convert to H2O frame
train.h2o <- as.h2o(prices)
# Column indices used below
y.dep <- 7 # response: column 7 (Adj.Close)
x.indep <- c(2:6) # predictors: columns 2 to 6 (Open, High, Low, Close, Volume)
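Positional indices are easy to get wrong if the column order ever changes. A small sketch, assuming the same seven column names as in the `str()` output above (the names `cols`, `y_name`, and `x_names` are illustrative), shows how the indices map to names. `h2o.gbm()` accepts column names as well as indices for `x` and `y`, so either form can be passed.

```r
# Column names as listed in the str() output
cols <- c("Date", "Open", "High", "Low", "Close", "Volume", "Adj.Close")

# Names selected by the positional indices 7 and 2:6
y_name  <- cols[7]
x_names <- cols[2:6]
print(y_name)
print(x_names)

# Going the other way: recover the indices from the names
y_idx <- match("Adj.Close", cols)
x_idx <- match(c("Open", "High", "Low", "Close", "Volume"), cols)
print(y_idx)
print(x_idx)
```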
#GBM
gbm.model <- h2o.gbm(y=y.dep, x=x.indep, training_frame = train.h2o, ntrees = 1000, max_depth = 4, learn_rate = 0.01, seed = 1122)
#see which variables are important.
h2o.varimp(gbm.model)
## Variable Importances:
## variable relative_importance scaled_importance percentage
## 1 Close 386.340607 1.000000 0.751719
## 2 Volume 52.520115 0.135943 0.102191
## 3 High 42.168064 0.109147 0.082048
## 4 Low 23.390825 0.060545 0.045512
## 5 Open 9.523388 0.024650 0.018530
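That `Close` dominates is to be expected: on Yahoo Finance, `Adj.Close` is the closing price adjusted for dividends and splits, so the model is mostly learning that adjustment rather than anything about future prices. A quick check, assuming the `Close` and `Adj.Close` columns are typed in from the table above (the names `close_px`, `adj_close`, and `ratio` are illustrative), looks at the correlation and the ratio between the two.

```r
# Close and Adj.Close copied from the weekly table (weeks 1 to 35)
close_px <- c(18.41, 17.60, 17.62, 18.09, 17.87, 17.56, 17.24, 17.05,
              18.50, 18.65, 19.29, 18.65, 18.76, 18.53, 19.63, 19.65,
              18.60, 17.79, 17.77, 17.94, 18.21, 18.30, 18.61, 17.92,
              17.85, 18.52, 18.15, 18.74, 19.05, 18.20, 17.93, 17.92,
              17.56, 18.05, 18.00)
adj_close <- c(17.72, 16.94, 16.96, 17.42, 17.20, 16.91, 16.60, 16.42,
               17.81, 17.96, 18.57, 17.96, 18.06, 17.84, 18.90, 18.92,
               18.24, 17.44, 17.42, 17.59, 17.85, 17.94, 18.25, 17.57,
               17.50, 18.16, 17.80, 18.37, 18.68, 17.84, 17.58, 17.57,
               17.56, 18.05, 18.00)

# How closely the response tracks the Close predictor in the same week
print(cor(close_px, adj_close))

# Adjustment factor per week, rounded to show how few distinct values it takes
ratio <- adj_close / close_px
print(table(round(ratio, 2)))
```

If the ratio takes only a few distinct values, that is consistent with a dividend adjustment, and a high importance for `Close` reflects same-week information, not predictive power.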
# New observation: only Close is supplied; Open, High, Low and Volume are not
myprice <- data.frame(Close=18.23)
# Convert to H2O frame
result <- as.h2o(myprice)
predict.gbm <- as.data.frame(h2o.predict(gbm.model, result))