In this project, I predict and plot the revenue of Oregon Campsite using a neural network autoregression model and R Shiny. This method can capture non-linear relationships between revenue at different time points.
The steps are as follows:
Step 1: Load and preprocess the data, then convert it to time series format
Step 2: Build the time series model with the nnetar function and forecast the revenue
Step 3: Visualize the forecast and create an interactive plot using R Shiny
# Import the camping revenue data as revenue
revenue <- read.csv("/Users/User/Desktop/learning/time series/camping_revenue_97_17.csv", sep = '"', header = FALSE)
# Remove the leftover quote characters in two places
library(tidyr)
revenue <- separate(revenue, col = V2,
                    sep = -1, into = c("data", "comma"))
# Keep the useful columns only
revenue <- revenue[c("data", "V4")]
# Convert to a monthly time series starting in January 1997
myts <- ts(as.numeric(revenue$V4),
           start = 1997, frequency = 12)
## Warning in is.data.frame(data): NAs introduced by coercion
# data is still not clean (outliers and NAs)
summary(myts)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 3 18980 23218 36912 26816 3334333 4
# all in one cleaning tool
library(forecast)
myts <- tsclean(myts) # replace outliers and interpolate NAs
# check the data
summary(myts)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 14209 19280 23267 23282 26658 34366
plot(myts)
The plot above shows the monthly revenue for Oregon Campsite since 1997. The seasonal pattern is clear: summers tend to have higher revenue than winters. In addition, revenue increases from year to year. Both the trend and the seasonal pattern should be taken into account when predicting revenue.
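One way to check the trend and seasonal observations numerically is a classical decomposition. The sketch below uses a synthetic monthly series (`demo_ts`, with hypothetical values rather than the campsite data) so that it runs on its own; with the real data, `myts` would take its place.

```r
# Synthetic monthly series with an upward trend and a summer peak
# (hypothetical values, standing in for myts)
set.seed(1)
months <- 1:120
demo_ts <- ts(20000 + 50 * months +                       # linear trend
                3000 * sin(2 * pi * (months - 4) / 12) +  # yearly cycle peaking in July
                rnorm(120, sd = 200),                     # noise
              start = c(2008, 1), frequency = 12)

# Classical additive decomposition into trend, seasonal and remainder
parts <- decompose(demo_ts)

# Estimated seasonal effect for each calendar month
seasonal_by_month <- parts$figure
names(seasonal_by_month) <- month.abb
round(seasonal_by_month)

# Average month-to-month change of the estimated trend
mean(diff(parts$trend), na.rm = TRUE)

plot(parts)
```

A positive average trend change together with large positive seasonal effects in the summer months would support the reading of the plot given above.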
#set up a Neural Network model
mynnetar <- nnetar(myts)
mynnetar
## Series: myts
## Model: NNAR(3,1,2)[12]
## Call: nnetar(y = myts)
##
## Average of 20 networks, each of which is
## a 4-2-1 network with 13 weights
## options were - linear output units
##
## sigma^2 estimated as 1123929
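The model is a seasonal NNAR(3,1,2)[12]: it uses the last three observations and one seasonal lag (the value 12 months earlier) as inputs, and has two nodes in the hidden layer. The four inputs, two hidden nodes, and single output match the 4-2-1 network reported above, and the seasonal lag is in line with the seasonal pattern observed in the plot.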
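To make the NNAR(3,1,2)[12] notation concrete, the sketch below builds the four lagged inputs by hand for a stand-in series and recounts the 13 weights of a 4-2-1 network. It uses base R only. The helper `nnar_inputs` and the series `demo_ts` are hypothetical names introduced for illustration, and the code shows the input layout rather than reproducing the internals of `nnetar`.

```r
# Stand-in monthly series (hypothetical values, not the campsite data)
demo_ts <- ts(1:48, start = c(2014, 1), frequency = 12)

# Build the inputs of an NNAR(p, P, k)[m] model: the last p values
# plus the values P seasons back (lags m, 2m, ..., P * m)
nnar_inputs <- function(y, p = 3, P = 1, m = frequency(y)) {
  lags <- unique(c(seq_len(p), m * seq_len(P)))
  max_lag <- max(lags)
  values <- as.numeric(y)
  target_idx <- (max_lag + 1):length(values)
  X <- sapply(lags, function(l) values[target_idx - l])
  colnames(X) <- paste0("lag", lags)
  list(X = X, y = values[target_idx])
}

inp <- nnar_inputs(demo_ts)
head(inp$X, 3)  # columns: lag1, lag2, lag3, lag12
ncol(inp$X)     # 4 inputs

# Weights of a network with 4 inputs, 2 hidden nodes and 1 output:
# each hidden node has 4 input weights plus a bias,
# and the output node has 2 weights plus a bias
n_in <- 4
n_hidden <- 2
n_weights <- (n_in + 1) * n_hidden + (n_hidden + 1)
n_weights       # 13
```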
# Forecast 3 years of revenue with the model
nnetforecast <- forecast(mynnetar, h = 36,
                         PI = TRUE # compute prediction intervals
)
# Visualize the forecasted values
library(ggplot2)
autoplot(nnetforecast)
As in the pattern prior to 2018, the predicted revenue follows an upward
trend and peaks in summer.
# data we need for the graph
data <- nnetforecast$x
lower <- nnetforecast$lower[,2] # 95% prediction interval (column 1 is 80%)
upper <- nnetforecast$upper[,2]
pforecast <- nnetforecast$mean
mydata <- cbind(data, lower, upper,
pforecast)
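The `cbind` call works here because all four objects are `ts` objects: they are aligned on a shared time index, and each column is padded with `NA` wherever that series has no observation. The sketch below shows this on two small stand-in series (`past` and `future`, hypothetical values).

```r
# Two stand-in monthly series covering different periods (hypothetical values)
past   <- ts(c(10, 12, 15, 11), start = c(2017, 9), frequency = 12)  # Sep-Dec 2017
future <- ts(c(12, 13, 16),     start = c(2018, 1), frequency = 12)  # Jan-Mar 2018

# cbind() on ts objects takes the union of the two time ranges
combined <- cbind(past, future)
combined

# The combined index runs from Sep 2017 to Mar 2018
start(combined)
end(combined)
```

This is why the observed data and the forecast can share one chart: the history column is `NA` over the forecast horizon and the forecast columns are `NA` over the history.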
library(dygraphs)
#Fetch the dataset and create the title
dygraph(mydata, main = "Oregon Campsite Restaurant") %>%
# the zoom-in tool
dyRangeSelector() %>%
#map the time series data
dySeries(name = "data", label = "Revenue Data") %>%
#map the predicted data
dySeries(c("lower","pforecast","upper"), label = "Revenue Forecast") %>%
#Add the legend
dyLegend(show = "always", hideOnMouseOut = FALSE) %>%
#y axis
dyAxis("y", label = "Monthly Revenue USD") %>%
#highlight effect
dyHighlight(highlightCircleSize = 5,
highlightSeriesOpts = list(strokeWidth = 2)) %>%
#axis and grid line color
dyOptions(axisLineColor = "navy", gridLineColor = "grey") %>%
#annotation: the CF flag at the bottom
dyAnnotation("2010-8-1", text = "CF", tooltip = "Camp Festival", attachAtBottom = TRUE)
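To serve the chart from R Shiny, the dygraph can be wrapped in a small app. The following is a minimal sketch, assuming the `shiny` and `dygraphs` packages are installed. The output id `revenue_plot`, the panel title, and the stand-in series `demo_ts` are hypothetical; in the real app, the full `dygraph(mydata, ...)` pipeline above would go inside `renderDygraph()`.

```r
library(shiny)
library(dygraphs)

# Stand-in series (hypothetical values) so the sketch runs on its own
set.seed(1)
demo_ts <- ts(100 + cumsum(rnorm(48)), start = c(2014, 1), frequency = 12)

# UI: a title and a placeholder for the interactive chart
ui <- fluidPage(
  titlePanel("Monthly revenue"),
  dygraphOutput("revenue_plot")
)

# Server: render the dygraph into the placeholder
server <- function(input, output) {
  output$revenue_plot <- renderDygraph({
    dygraph(demo_ts) %>%
      dyRangeSelector()
  })
}

# Launch the app only in an interactive session
if (interactive()) {
  shinyApp(ui = ui, server = server)
}
```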