Math 228/Hon 309 Assignment M12
Please answer the questions in this assignment in the form of an RMarkdown document in HTML. Please submit this document in the Dropbox for Assignment M12. All data files are in the Data Sets folder on the course Moodle site.
Name(s): Aiza K
Open the data set US_POP.
US_POP <- read.csv("C:/Users/aizax94/Downloads/US_POP.CSV")
View(US_POP)
plot(US_POP$population ~ US_POP$year, main = "Plot of Population by Year", ylab = "population", xlab = "year")
The plot shows a positive, upward-curving (roughly exponential) trend: population increases with year at an accelerating rate.
model <- lm(population ~ year + I(year^2), US_POP)
model
##
## Call:
## lm(formula = population ~ year + I(year^2), data = US_POP)
##
## Coefficients:
## (Intercept) year I(year^2)
## 2.188e+10 -2.431e+07 6.755e+03
The model is Population = (2.188 × 10^10) - (2.431 × 10^7) * year + (6.755 × 10^3) * year^2.
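As a rough check, the equation can be evaluated by hand for a given year. The sketch below uses the coefficients exactly as printed by lm(), which are rounded to four significant figures; the helper name quad_pop is hypothetical. Because the three terms are each on the order of 10^10 and nearly cancel, rounding the coefficients shifts the result noticeably, so predict() or coef(model) at full precision is preferable for actual predictions.

```r
# Coefficients as printed by lm(), rounded to four significant figures.
b0 <- 2.188e10
b1 <- -2.431e7
b2 <- 6.755e3

# Hypothetical helper: evaluate the fitted quadratic at a given year.
quad_pop <- function(year) b0 + b1 * year + b2 * year^2

# Approximate prediction for 2020 from the rounded coefficients.
approx_2020 <- quad_pop(2020)
approx_2020
```

With these rounded coefficients the 2020 value comes out near 336.9 million, compared with 334.6 million from predict() on the full-precision model.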
range(US_POP$year)
## [1] 1790 2010
x <- seq(1790,2010)
k <- data.frame(year = x)
y <- predict(model, k)
plot(population ~ year, US_POP,
main = "Plot of Population against Year")
lines(x, y, col = "red")
(d) Use your model to predict the value for Y2020, the population of the U.S. in the year 2020.
library(forecast)
## Warning: package 'forecast' was built under R version 3.2.5
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.2.5
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: timeDate
## This is forecast 7.1
pts <- ts(US_POP$population, frequency = 0.1, start = 1790)
mod <- ets(pts, model = "ZZZ")
fcst <- forecast(mod, 1)
fcst
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2020 335221263 329146957 341295568 325931414 344511112
Using the exponential smoothing model fitted with ets(), we predict the population of the U.S. in the year 2020 to be 335,221,263, with an 80% interval of 329,146,957 to 341,295,568.
k <- data.frame(year = 2020)
predict(model, k, interval = "confidence", level = 0.9)
## fit lwr upr
## 1 334607733 331083989 338131477
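The quadratic model predicts a 2020 population of 334,607,733, with a 90% confidence interval of 331,083,989 to 338,131,477. Because interval = "confidence" was used, this interval describes the uncertainty in the fitted mean only; a prediction interval for the actual 2020 population (interval = "prediction") would be wider.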
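The difference between the two interval types in predict() can be illustrated with a small sketch. The data below are simulated stand-ins (a noisy quadratic), not the US_POP values, and the object names toy, fit, conf_int and pred_int are hypothetical.

```r
# Simulated stand-in data: a noisy quadratic, NOT the US_POP values.
set.seed(1)
toy <- data.frame(year = seq(1790, 2010, by = 10))
toy$population <- 5 + 0.006 * (toy$year - 1790)^2 + rnorm(nrow(toy), sd = 3)

# Same model form as in the assignment.
fit <- lm(population ~ year + I(year^2), data = toy)
new_year <- data.frame(year = 2020)

# Confidence interval: uncertainty in the fitted mean at year 2020.
conf_int <- predict(fit, new_year, interval = "confidence", level = 0.9)

# Prediction interval: uncertainty in a single new observation at year 2020.
pred_int <- predict(fit, new_year, interval = "prediction", level = 0.9)

rbind(confidence = conf_int[1, ], prediction = pred_int[1, ])
```

Both intervals share the same point estimate, but the prediction interval also includes the residual variance, so it is always the wider of the two.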
Open the data set souvenir.csv. (I have cleaned it up a little.)
souvenir <- read.table("C:/Users/aizax94/Downloads/souvenir.csv", quote="\"", comment.char="")
View(souvenir)
Plot sales. Do connect the points. Carefully describe the components of the plot.
plot(souvenir$V1, type = "l", main = "Plot of monthly souvenir sales",
ylab = "Sales", xlab = "Month")
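The series shows an upward trend and strong seasonality: the same pattern repeats every 12 months, with a peak at the end of each year. The seasonal swings grow as the level of sales rises, which suggests that the components combine multiplicatively rather than additively on the original scale.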
Create a new variable (lsales) by taking the natural logs of the sales. Use the ts function to create a time series object called lsts. Obtain a plot of lsts. Carefully describe the components of the plot.
lsales <- log(souvenir$V1)
lsts <- ts(lsales, start = 1987, frequency = 12)
plot(lsts)
After the log transformation the components are approximately additive: there is an upward trend, and the seasonal pattern keeps roughly the same size from year to year. Overall, log sales increase, with the same seasonal pattern repeating each year.
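Why taking logs changes the look of the seasonal pattern can be sketched with a simulated series. The values below are made up for illustration (a hypothetical growth rate and seasonal factors, not the souvenir data); the point is only that log(trend * season) = log(trend) + log(season).

```r
# Simulated monthly series with multiplicative seasonality (NOT the souvenir data).
months <- 1:84
trend <- exp(0.02 * months)   # hypothetical steady growth
season <- rep(c(0.8, 0.9, 1.2, 1.0, 1.0, 1.0, 1.1, 1.1, 1.2, 1.3, 1.8, 3.0), times = 7)
toy_sales <- ts(100 * trend * season, start = 1987, frequency = 12)

# Size of the within-year swing (max minus min) on each scale.
year_id <- rep(1987:1993, each = 12)
swing_raw <- tapply(as.numeric(toy_sales), year_id, function(v) diff(range(v)))
swing_log <- tapply(log(as.numeric(toy_sales)), year_id, function(v) diff(range(v)))

round(swing_raw, 1)   # grows every year on the original scale
round(swing_log, 3)   # constant on the log scale
```

On the original scale the yearly swing keeps growing with the level, while on the log scale it stays the same size, which is the pattern an additive model expects.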
Fit an appropriate exponential smoothing model to the series lsts.
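# ma() from the forecast package computes a centered moving average (order 5).
# This smooths the series for plotting but is not an exponential smoothing model;
# the ets() fit used for forecasting appears in the next part.
lsts.5 <- ma(lsts, 5)
plot(lsts.5)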
Use your model to predict log_sales and then sales for the first four months of 1995. Each of your four predictions should be accompanied by an 80% confidence interval. Write a brief summary.
mod <- ets(lsts, model = "ZZZ")
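# h = 16 reaches April 1995, since the first forecast month is January 1994.
# forecast() for ets models has no 'interval' argument, so it is dropped here.
fcst2 <- forecast(mod, h = 16, level = 0.8)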
fcst2
## Point Forecast Lo 80 Hi 80
## Jan 1994 9.680343 9.490263 9.870423
## Feb 1994 9.949966 9.740940 10.158991
## Mar 1994 10.446730 10.220256 10.673204
## Apr 1994 10.106603 9.863856 10.349351
## May 1994 10.144243 9.886174 10.402313
## Jun 1994 10.200476 9.927875 10.473076
## Jul 1994 10.381921 10.095460 10.668382
## Aug 1994 10.384274 10.084530 10.684018
## Sep 1994 10.512556 10.200032 10.825079
## Oct 1994 10.609431 10.284572 10.934290
## Nov 1994 11.106421 10.769621 11.443220
## Dec 1994 11.896833 11.548441 12.245225
## Jan 1995 9.993443 9.633785 10.353101
## Feb 1995 10.263066 9.892433 10.633699
## Mar 1995 10.759830 10.378487 11.141173
## Apr 1995 10.419704 10.027894 10.811513
We predict log sales for the first four months of 1995 to be 9.99, 10.26, 10.76, and 10.42, respectively. The 80% intervals for log sales are 9.63 to 10.35 for January, 9.89 to 10.63 for February, 10.38 to 11.14 for March, and 10.03 to 10.81 for April.
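One way to move from log sales back to sales is to exponentiate the log-scale forecasts and interval limits. The sketch below copies the four 1995 rows from the output above; the object names log_fc and sales_fc are hypothetical. Note that exp() of a log-scale point forecast estimates the median of sales rather than the mean, so it need not match a model fitted directly to the untransformed series.

```r
# Log-scale forecasts for early 1995, copied from the fcst2 output above.
log_fc <- data.frame(
  month = c("Jan 1995", "Feb 1995", "Mar 1995", "Apr 1995"),
  point = c(9.993443, 10.263066, 10.759830, 10.419704),
  lo80  = c(9.633785, 9.892433, 10.378487, 10.027894),
  hi80  = c(10.353101, 10.633699, 11.141173, 10.811513)
)

# Back-transform: exp() undoes the natural log taken when creating lsales.
sales_fc <- cbind(
  log_fc["month"],
  round(exp(log_fc[c("point", "lo80", "hi80")]))
)
sales_fc
```

Because exp() is increasing, the back-transformed limits keep their order and still bracket the point forecast, although the interval is no longer symmetric around it.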
sales <- ts(souvenir$V1, start = 1987, frequency = 12)
mod <- ets(sales, model = "ZZZ")
# forecast() for ets models has no 'interval' argument, so it is dropped here.
fcst3 <- forecast(mod, h = 16, level = 0.8)
fcst3
## Point Forecast Lo 80 Hi 80
## Jan 1994 15100.83 12182.58 18019.08
## Feb 1994 20270.17 15954.06 24586.29
## Mar 1994 31496.11 24225.70 38766.52
## Apr 1994 22761.86 17131.58 28392.14
## May 1994 23130.81 17053.07 29208.56
## Jun 1994 22968.07 16600.80 29335.34
## Jul 1994 27232.77 19310.78 35154.76
## Aug 1994 27483.04 19131.14 35834.95
## Sep 1994 29760.71 20347.68 39173.74
## Oct 1994 32722.20 21984.05 43460.34
## Nov 1994 51418.21 33958.60 68877.82
## Dec 1994 111921.09 72688.52 151153.66
## Jan 1995 16126.42 10302.58 21950.27
## Feb 1995 21639.10 13602.73 29675.47
## Mar 1995 33611.27 20795.05 46427.49
## Apr 1995 24281.95 14789.20 33774.70
Fitting the exponential smoothing model directly to the untransformed sales, we predict sales for the first four months of 1995 to be 16126.42, 21639.10, 33611.27, and 24281.95, respectively. The 80% intervals for sales are 10302.58 to 21950.27 for January, 13602.73 to 29675.47 for February, 20795.05 to 46427.49 for March, and 14789.20 to 33774.70 for April.