Instructions: Please ensure that your graphs and visuals have proper titles and axis labels, where necessary. Refer to the output, whenever appropriate, when discussing the results. Lastly, remember that creativity (coupled with relevance) will be rewarded.
Carbon Emissions is an ever-evolving and exciting issue among environmental activists and academics alike.
quantmod
package to pull the Transportation
Carbon Dioxide Emissions, All Fuels for United States
(EMISSCO2TOTVTCTOUSA
) – measured in Million Metric Tons
CO2– from FRED and declare as a ts
object with the
appropriate start date. Be sure to check that the dates are
correct.## [1] "EMISSCO2TOTVTCTOUSA"
## EMISSCO2TOTVTCTOUSA
## 1970-01-01 1131.668
## 1971-01-01 1176.331
## 1972-01-01 1246.494
## 1973-01-01 1312.130
## 1974-01-01 1278.184
## 1975-01-01 1288.565
In the plot below there it is weak stationarity. There is a trend in the data and there is clear variance in the data.
autoplot(co2_ts) + labs(title = "Transportation CO₂ Emissions (All Fuels, US)",
subtitle = "Monthly, in Million Metric Tons, from Jan 2000",x = "Year",y = "CO₂ Emissions (Million Metric Tons)") +theme_minimal()
Perform formal unit root testing. You will need to specify the null hypothesis (hypotheses) and corresponding conclusions of the unit root test(s) used. The null hypothesis is the time series has a unit root. The p-value is greater than .05, so we fail to reject the null. The time series has a unit root.
present the AIC and BIC statistics of your candidate
model(s) in a table using the kable()
function. Columns and
Rows should be properly labeled.
Explain how you arrived at your preferred model.
Both models have the identical AIC and BIC statistics. As a result, we want the simpliest of the two models. This is ARIMA(1,1,1)(0,1,1). The Ljung Box test has a p-value of .9921
I used the Ljung-Box test p-values to determine the preferred model. This dhows there is no significant autocorrelation in the residuals. The residuals behave like white noise.
##
## Augmented Dickey-Fuller Test
##
## data: co2_ts
## Dickey-Fuller = -1.1313, Lag order = 3, p-value = 0.9093
## alternative hypothesis: stationary
library(forecast)
co2_diff <- diff(co2_ts, differences = 1)
ggtsdisplay(co2_diff, main = "1st Difference of Transportation CO₂ Emissions")
fit1 <- auto.arima(co2_ts, seasonal = TRUE)
fit2 <- auto.arima(co2_ts, seasonal = TRUE, stepwise = FALSE, approximation = FALSE)
library(knitr)
model_compare <- data.frame(Model = c("Auto.ARIMA (Stepwise)", "Auto.ARIMA (Full Search)"),AIC = c(fit1$aic, fit2$aic),BIC = c(fit1$bic, fit2$bic))
kable(model_compare, caption = "Model Comparison Table (ARIMA Candidates)")
Model | AIC | BIC |
---|---|---|
Auto.ARIMA (Stepwise) | 565.3253 | 569.189 |
Auto.ARIMA (Full Search) | 565.3253 | 569.189 |
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,0)(0,0,1)[12]
## Q* = 1.9593, df = 9, p-value = 0.9921
##
## Model df: 1. Total lags used: 10
## Series: co2_ts
## ARIMA(0,1,0)(0,0,1)[12]
##
## Coefficients:
## sma1
## 0.4232
## s.e. 0.1814
##
## sigma^2 = 3435: log likelihood = -280.66
## AIC=565.33 AICc=565.58 BIC=569.19
model_compare <- data.frame(Model = c("ARIMA(1,1,1)(0,1,1)[12]", "ARIMA(2,1,0)(0,1,1)[12]"),AIC = c(fit1$aic, fit2$aic),BIC = c(fit1$bic,fit2$bic))
kable(model_compare, caption = "Model Comparison: AIC and BIC")
Model | AIC | BIC |
---|---|---|
ARIMA(1,1,1)(0,1,1)[12] | 565.3253 | 569.189 |
ARIMA(2,1,0)(0,1,1)[12] | 565.3253 | 569.189 |
forecast_5yr <- forecast(fit2, h = 60)
autoplot(forecast_5yr) +labs(title = "Forecast: Transportation CO₂ Emissions (Next 5 Years)",x = "Year",y = "CO₂ Emissions (Million Metric Tons)",color="Series", fill="Confidence Interval") +
theme_light()
Using the Box-Jenkins procedure, use an appropriate ARIMA model to
forecast the next 2 weeks of closing prices for the goog
series from the fpp2
package.
data("goog")
autoplot(goog) +labs(title = "GOOG Daily Closing Prices",x = "Time",y = "Price (USD)") +theme_minimal()
##
## Augmented Dickey-Fuller Test
##
## data: goog
## Dickey-Fuller = -2.5417, Lag order = 9, p-value = 0.349
## alternative hypothesis: stationary
goog_diff <- diff(goog)
autoplot(goog_diff) +labs(title = "Differenced GOOG Prices") +theme_minimal()
## Series: goog
## ARIMA(0,1,0) with drift
##
## Coefficients:
## drift
## 0.4213
## s.e. 0.2760
##
## sigma^2 = 76.19: log likelihood = -3581.45
## AIC=7166.89 AICc=7166.9 BIC=7176.7
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.0003924086 8.719767 5.831708 -0.01050304 0.9753774 1.000395
## ACF1
## Training set 0.03871058
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,0) with drift
## Q* = 13.134, df = 10, p-value = 0.2163
##
## Model df: 0. Total lags used: 10
forecast_goog <- forecast(fit_goog, h = 10)
autoplot(forecast_goog) +labs(title = "Forecast: GOOG Closing Prices (Next 2 Weeks)",x = "Day",y = "Price (USD)",color = "Series",fill = "Prediction Interval"
) +
theme_light()
You have responsibility for JohnBear’s sales forecasting. The sales
department forwarded data for 115 consecutive months, ending in December
2022. The data is presented in the JohnBear.csv
file on
GitHub. You can navigate to the Data folder using this link: https://github.com/Shamar-Stewart/MS_Forecasting/tree/main/Data
Your tasks are to:
ts()
object with
the appropriate frequency, end date, etc. You can read the file directly
by clicking on the link above, clicking on the file, and then clicking
the “Raw” button. You will then copy the URL and use the
read.csv()
function with the argument
header = FALSE
to read in the data. The URL should have the
following format:
https://raw.githubusercontent.com/...
.url <- "https://raw.githubusercontent.com/Shamar-Stewart/MS_Forecasting/main/Data/JohnBear.csv"
johnbear_data <- read.csv(url, header = FALSE)
johnbear_ts <- ts(johnbear_data$V1, start = c(2013, 8), frequency = 12)
autoplot(johnbear_ts) +labs(title = "JohnBear Monthly Sales (Aug 2013 – Dec 2022)",x = "Year",y = "Sales") +theme_minimal()
Yes, this is quarterly data and it has a seasonal rythm to the data.
How about a trend?
There appears to be an upward trend.
Does it appear to be nonstationary?
Yes, the variance is visible
No formal testing is required here, just a discussion of how you came to this conclusion from the plot.
autoplot(johnbear_ts) +labs(title = "JohnBear Monthly Sales (Aug 2013 – Dec 2022)",x = "Year",y = "Sales") +theme_minimal()
Due to a review of the data, we have identified the data had a strong seasonality, an upward trend, and non-stationary. As a result we developed a 12 month sales forecast based on Seasonal ARIMA (1,0,1)(0,1,1) model with a drift. The coefficient of the AR1 is .8715 which captures the persistent effects of past sales. The coefficient of the MA1 is -.7561 which helps to smooth the short-term fluctuations. The coefficient of the SMA1 is -.6399 which adjusts for recurring annual effects. The coefficient of the Drift is 7.27 which indicates a steady upward momentum in month to month sales. The p-value is above .05 which indicates no significant autocorrelation in the residuals. The MAPE is 4.72% which reflects high forecasting accuracy.Below, we are showing a 12-month forecast at an 80% and 95% confidence interval. This shows the best case and worst case scenarios throughout the year. This data should increase the company’s ability to plan inventory, staffing, and marketing strategies.
fit_johnbear <- auto.arima(johnbear_ts, seasonal = TRUE, stepwise = FALSE, approximation = FALSE)
summary(fit_johnbear)
## Series: johnbear_ts
## ARIMA(1,0,1)(0,1,1)[12] with drift
##
## Coefficients:
## ar1 ma1 sma1 drift
## 0.8715 -0.7561 -0.6399 7.2674
## s.e. 0.0989 0.1186 0.0950 0.7538
##
## sigma^2 = 12344: log likelihood = -632.4
## AIC=1274.8 AICc=1275.42 BIC=1287.97
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 1.471479 103.087 77.63983 -0.8789694 4.720894 0.6131654
## ACF1
## Training set -0.08629146
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,1)(0,1,1)[12] with drift
## Q* = 12.949, df = 20, p-value = 0.8795
##
## Model df: 3. Total lags used: 23
johnbear_forecast <- forecast(fit_johnbear, h = 12)
autoplot(johnbear_forecast) +labs(title = "Forecast: JohnBear Sales for the Next 12 Months",x = "Year",y = "Sales",color = "Series",fill = "Confidence Interval") +theme_minimal()