R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:

library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
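kodk <- read.csv("/Users/Luke/Documents/BC/Predictive Analytics/KODK.csv")
# Format date values.
kodk$Date <- as.Date(kodk$Date, format = "%m/%d/%Y")

# Split chronologically into training and test sets.
# Note: the cutoff year 0021 suggests the CSV stores two-digit years, which
# "%Y" reads as year 21; "%y" with a "2021-10-25" cutoff would give real dates.
kodk80per <- filter(kodk, Date < "0021-10-25")
kodk20per <- filter(kodk, Date >= "0021-10-25")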
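
The cutoff year of 0021 above hints at how the dates were parsed. The following is a minimal sketch, assuming the CSV stores dates with two-digit years such as `10/25/21` (an assumption, since the file itself is not shown). The `raw` data frame and its values are hypothetical. It contrasts `%Y` with `%y` and then repeats the chronological split with a four-digit cutoff, using base R only.

```r
# Hypothetical sample in the assumed CSV layout (two-digit years).
raw <- data.frame(
  Date  = c("10/21/21", "10/22/21", "10/25/21", "10/26/21", "10/27/21"),
  Close = c(7.10, 7.05, 7.20, 7.35, 7.30)
)

# "%Y" expects a year with century, so "21" is read as the year 21.
as_Y <- as.Date(raw$Date, format = "%m/%d/%Y")
# "%y" expects a two-digit year, so "21" is read as 2021.
as_y <- as.Date(raw$Date, format = "%m/%d/%y")

raw$Date <- as_y

# Chronological split at a cutoff date: earlier rows train, later rows test.
cutoff <- as.Date("2021-10-25")
train  <- raw[raw$Date <  cutoff, ]
test   <- raw[raw$Date >= cutoff, ]
```

Switching the real analysis to `%y` would leave the `Date` slope unchanged but shift the intercept, because R stores a `Date` as the number of days since 1970-01-01.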

Simple plot of all data with a basic linear regression line

plot1 <- ggplot(kodk, aes(x = Date, y = Close)) +
  geom_line() +
  xlab("Date") +
  # Overlay an ordinary least squares trend line.
  geom_smooth(method = "lm")
plot1
## `geom_smooth()` using formula 'y ~ x'
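
As a rough illustration of what the smoothing layer draws, the sketch below fits the same kind of straight line with `lm()` and computes a 95% confidence band for the mean, which is what `geom_smooth()` shows by default. The data frame `d` and its columns `day` and `close` are hypothetical stand-ins for `Date` and `Close`, not the KODK data.

```r
set.seed(42)  # reproducible hypothetical data

# Hypothetical series standing in for Date and Close.
d <- data.frame(day = 1:50)
d$close <- 10 + 0.1 * d$day + rnorm(50)

# The straight line: ordinary least squares of close on day.
fit <- lm(close ~ day, data = d)

# Fitted values plus a 95% confidence band for the mean response.
band <- as.data.frame(
  predict(fit, newdata = d, interval = "confidence", level = 0.95)
)
head(band)
```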

Model of daily closing price including date and trading volume

model1 <- lm(formula = Close ~ Date + Volume, data=kodk80per)
summary(model1)
## 
## Call:
## lm(formula = Close ~ Date + Volume, data = kodk80per)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.7323  -1.8079  -0.1932   1.2857  18.0467 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 6.566e+03  5.384e+02   12.19   <2e-16 ***
## Date        9.214e-03  7.564e-04   12.18   <2e-16 ***
## Volume      6.943e-08  4.874e-09   14.24   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.547 on 401 degrees of freedom
## Multiple R-squared:  0.4366, Adjusted R-squared:  0.4337 
## F-statistic: 155.3 on 2 and 401 DF,  p-value: < 2.2e-16
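
Written out with the estimates from the summary above, the fitted model is approximately the following, where $\text{Date}_t$ is R's numeric date (days since 1970-01-01) and $\text{Volume}_t$ is the trading volume:

$$
\widehat{\text{Close}}_t = 6566 + 0.009214\,\text{Date}_t + 6.943\times10^{-8}\,\text{Volume}_t
$$

Holding volume fixed, each later day is associated with a closing price about 0.0092 higher. Holding the date fixed, an extra million shares traded is associated with a closing price about $6.943\times10^{-8}\times10^{6}\approx 0.069$ higher. The multiple R-squared of 0.4366 means the model accounts for roughly 44% of the variance in the training-set closing price.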

Predicted Values

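# Prediction intervals for the training set (not used further below).
pred.train <- predict(model1, kodk80per, interval = "prediction")
# Point predictions and 95% prediction intervals for the held-out test set.
pred.test <- predict(model1, kodk20per, level = 0.95, interval = "prediction")
preddf <- as.data.frame(pred.test)
# Attach the point predictions to the test data for plotting.
kodk20per$fitted <- preddf$fit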
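
One way to summarise held-out predictions numerically is sketched below, using hypothetical simulated data rather than the KODK series. It computes the root mean squared error of the point predictions and the share of test observations that fall inside the 95% prediction interval. The names `dat`, `rmse`, and `coverage` are illustrative and not part of the original analysis.

```r
set.seed(1)  # reproducible hypothetical data

# Hypothetical data: a noisy linear trend standing in for the price series.
dat <- data.frame(x = 1:100)
dat$y <- 2 + 0.5 * dat$x + rnorm(100)

# Chronological 80/20 split.
train <- dat[1:80, ]
test  <- dat[81:100, ]

fit  <- lm(y ~ x, data = train)
pred <- as.data.frame(
  predict(fit, newdata = test, interval = "prediction", level = 0.95)
)

# Root mean squared error of the point predictions on the held-out rows.
rmse <- sqrt(mean((test$y - pred$fit)^2))

# Share of held-out observations inside the 95% prediction interval.
coverage <- mean(test$y >= pred$lwr & test$y <= pred$upr)
```

The same two lines could be applied to `kodk20per$Close` and the `fit`, `lwr`, and `upr` columns of `preddf`.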

Plot of predicted values vs. actual.

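plotnew <- ggplot(kodk20per, aes(x = Date)) +
  # Map each series to its legend label; colours are set in the manual scale.
  # (Positional labels on a discrete scale follow alphabetical order of the
  # mapped strings, which swapped "Predicted" and "Actual".)
  geom_line(aes(y = fitted, colour = "Predicted")) +
  geom_line(aes(y = Close, colour = "Actual")) +
  xlab("Date") +
  ylab("Price") +
  scale_colour_manual(
    name = NULL,
    values = c(Predicted = "darkred", Actual = "blue"))
plotnew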
