This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
kodk<-read.csv("/Users/Luke/Documents/BC/Predictive Analytics/KODK.csv")
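# Parse the date column (month/day/year).
kodk$Date <- as.Date(kodk$Date, format = "%m/%d/%Y")
# Split chronologically into a training set (before the cutoff) and a test set (cutoff onward).
# The cutoff is written as year 0021 to match the parsed dates, presumably because the CSV stores two-digit years, which "%Y" reads as the year 0021.
cutoff <- as.Date("0021-10-25")
kodk80per <- filter(kodk, Date < cutoff)
kodk20per <- filter(kodk, Date >= cutoff)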
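The split above relies on a hard-coded cutoff date. As a minimal sketch of an alternative, the cutoff can be derived from the desired proportion instead. The data frame `prices` and its values are synthetic stand-ins (not the KODK data), and the names `n_train`, `train`, `test` and `cutoff` are hypothetical.

```r
# Synthetic stand-in for the price data (hypothetical values, not KODK).
set.seed(1)
prices <- data.frame(
  Date  = seq(as.Date("2021-01-04"), by = "day", length.out = 100),
  Close = 10 + cumsum(rnorm(100))
)

# Sort chronologically so the test set is strictly later than the training set.
prices <- prices[order(prices$Date), ]

# Number of rows that go into the training set for an 80/20 split.
n_train <- floor(0.8 * nrow(prices))

train <- prices[seq_len(n_train), ]
test  <- prices[(n_train + 1):nrow(prices), ]

# The first test date plays the role of the cutoff date.
cutoff <- min(test$Date)
```

Splitting by position rather than at random keeps every test observation later in time than every training observation, which matters when the model uses the date as a predictor.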
Simple plot of all the data with a basic linear regression line
plot1 <- ggplot(kodk, aes(x = Date, y = Close)) +
  geom_line() +
  xlab("Date") +
  geom_smooth(method = "lm")
plot1
## `geom_smooth()` using formula 'y ~ x'
Model of daily closing price on date and trading volume
model1 <- lm(formula = Close ~ Date + Volume, data=kodk80per)
summary(model1)
##
## Call:
## lm(formula = Close ~ Date + Volume, data = kodk80per)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -16.7323  -1.8079  -0.1932   1.2857  18.0467
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.566e+03  5.384e+02   12.19   <2e-16 ***
## Date        9.214e-03  7.564e-04   12.18   <2e-16 ***
## Volume      6.943e-08  4.874e-09   14.24   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.547 on 401 degrees of freedom
## Multiple R-squared: 0.4366, Adjusted R-squared: 0.4337
## F-statistic: 155.3 on 2 and 401 DF, p-value: < 2.2e-16
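lm() uses the numeric value of a Date, which is a count of days since 1970-01-01, so the size of the intercept depends on where the dates sit on that axis, while the slopes and fitted values do not. The sketch below illustrates this on synthetic data (not the KODK series). The offset of 730485 days (2000 calendar years) and all variable names are assumptions made for the example.

```r
# Synthetic data with a linear trend in time plus a volume effect.
set.seed(42)
n <- 200
d <- data.frame(
  Date   = seq(as.Date("2021-01-01"), by = "day", length.out = n),
  Volume = runif(n, 1e6, 5e7)
)
d$Close <- 5 + 0.01 * seq_len(n) + 5e-8 * d$Volume + rnorm(n)

# Fit on the dates as given.
fit_a <- lm(Close ~ Date + Volume, data = d)

# Shift every date back by a fixed number of days and refit.
d_shift <- d
d_shift$Date <- d$Date - 730485
fit_b <- lm(Close ~ Date + Volume, data = d_shift)

# Slopes agree; only the intercept absorbs the shift.
slopes_a <- coef(fit_a)[c("Date", "Volume")]
slopes_b <- coef(fit_b)[c("Date", "Volume")]
print(rbind(slopes_a, slopes_b))
print(c(intercept_a = unname(coef(fit_a)[1]), intercept_b = unname(coef(fit_b)[1])))
```

Under these assumptions, the Date and Volume coefficients can be read as "change in Close per day" and "change in Close per unit of volume" regardless of the date origin, whereas the intercept on its own has no direct interpretation.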
Predicted Values
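# Predictions with 95% prediction intervals (the default level) for the training data
pred.train <- predict(model1, kodk80per, interval = "prediction")
# Predictions with 95% prediction intervals for the held-out test data
pred.test <- predict(model1, kodk20per, level = 0.95, interval = "prediction")
preddf <- as.data.frame(pred.test)
# Store the point predictions alongside the actual test values
kodk20per$fitted <- preddf$fit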
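The predictions above can also be compared with the actual values numerically. The following is a minimal sketch on synthetic data (not the KODK series) that computes the root mean squared error, the mean absolute error, and the share of test observations falling inside the 95% prediction interval. The names `rmse`, `mae` and `coverage` are hypothetical.

```r
# Synthetic data with the same column names as the model above.
set.seed(7)
n <- 250
d <- data.frame(
  Date   = seq(as.Date("2021-01-01"), by = "day", length.out = n),
  Volume = runif(n, 1e6, 5e7)
)
d$Close <- 5 + 0.01 * seq_len(n) + 5e-8 * d$Volume + rnorm(n)

# Chronological split: first 200 rows for training, last 50 for testing.
train <- d[1:200, ]
test  <- d[201:250, ]

fit  <- lm(Close ~ Date + Volume, data = train)
pred <- as.data.frame(predict(fit, test, interval = "prediction", level = 0.95))

# Out-of-sample error of the point predictions.
err  <- test$Close - pred$fit
rmse <- sqrt(mean(err^2))
mae  <- mean(abs(err))

# Fraction of actual values that fall inside the prediction interval.
coverage <- mean(test$Close >= pred$lwr & test$Close <= pred$upr)

print(c(rmse = rmse, mae = mae, coverage = coverage))
```

If the model is adequate, the coverage should be close to the nominal 0.95. A much lower value suggests that the linear trend does not carry over to the test period.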
Plot of predicted values vs. actual.
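plotnew <- ggplot(kodk20per, aes(x = Date)) +
  # Map each series to its legend label, then assign the colours explicitly
  # so that the labels cannot be attached to the wrong line.
  geom_line(aes(y = fitted, colour = "Predicted")) +
  geom_line(aes(y = Close, colour = "Actual")) +
  xlab("Date") +
  ylab("Price") +
  scale_colour_manual(
    name = NULL,
    values = c("Predicted" = "darkred", "Actual" = "blue"))
plotnew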