1 Description of the Dataset

This dataset contains the monthly prices of natural gas in nominal dollars from 2004 to 2020.

data = read.csv("C:/Users/qinfa/Desktop/school/STA 321/monthly_csv.csv")

## get last 200 observations
natural_gas <- data %>% slice_tail(n=200)

2 Create the Time Series Object

gas.ts = ts(natural_gas[,2], frequency = 12, start = c(2004, 1))
par(mar=c(2,2,2,2))
plot(gas.ts, main="Natural Gas Monthly Prices from Jaunary, 2004 and , August 2020", ylab="Monthly Rate", xlab="")

3 Forecasting with Decomposition

The different behaviors of the different methods of decomposition are displayed below.

Classical Decomposition:

cls.decomp = decompose(gas.ts)
par(mar=c(2,2,2,2))
plot(cls.decomp)

STL Decomposition:

stl.decomp=stl(gas.ts, s.window = 12)
par(mar=c(2,2,2,2))
plot(stl.decomp)

4 Training and Testing Data

We will hold back the last 7 periods for testing our forecast model. We will also consider the influence of sample size on the creation of our forecast model by defining training datasets with different sizes. The training sets will be created with sizes of 193, 143, 93, and 43. The same seven periods will be used to calculate the prediction error with each case.

ini.data = natural_gas[,2]
n0 = length(ini.data)
##
train.data01 = natural_gas[1:(n0-7), 2]
train.data02 = natural_gas[51:(n0-7), 2]
train.data03 = natural_gas[101:(n0-7), 2]
train.data04 = natural_gas[151:(n0-7), 2]
## last 7 observations
test.data = natural_gas[(n0-6):n0,2]
##
train01.ts = ts(train.data01, frequency = 12, start = c(2004, 1))
train02.ts = ts(train.data02, frequency = 12, start = c(2008, 3))
train03.ts = ts(train.data03, frequency = 12, start = c(2012, 5))
train04.ts = ts(train.data04, frequency = 12, start = c(2016, 7))
##
stl01 = stl(train01.ts, s.window = 12)
stl02 = stl(train02.ts, s.window = 12)
stl03 = stl(train03.ts, s.window = 12)
stl04 = stl(train04.ts, s.window = 12)
## Forecast with decomposing
fcst01 = forecast(stl01,h=7, method="naive")
fcst02 = forecast(stl02,h=7, method="naive")
fcst03 = forecast(stl03,h=7, method="naive")
fcst04 = forecast(stl04,h=7, method="naive")

Next, we perform error analysis.

## To compare different errors, we will not use the percentage for MAPE
PE01=(test.data-fcst01$mean)/fcst01$mean
PE02=(test.data-fcst02$mean)/fcst02$mean
PE03=(test.data-fcst03$mean)/fcst03$mean
PE04=(test.data-fcst04$mean)/fcst04$mean
###
MAPE1 = mean(abs(PE01))
MAPE2 = mean(abs(PE02))
MAPE3 = mean(abs(PE03))
MAPE4 = mean(abs(PE04))
###
E1=test.data-fcst01$mean
E2=test.data-fcst02$mean
E3=test.data-fcst03$mean
E4=test.data-fcst04$mean
##
MSE1=mean(E1^2)
MSE2=mean(E2^2)
MSE3=mean(E3^2)
MSE4=mean(E4^2)
###
MSE=c(MSE1, MSE2, MSE3, MSE4)
MAPE=c(MAPE1, MAPE2, MAPE3, MAPE4)
accuracy=cbind(MSE=MSE, MAPE=MAPE)
row.names(accuracy)=c("n.193", "n.143", "n. 93", "n. 43")
kable(accuracy, caption="Error comparison between forecast results with different sample sizes")

Error comparison between forecast results with different sample sizes
	MSE	MAPE
n.193	0.0514482	0.0855067
n.143	0.0499262	0.0836555
n. 93	0.0559320	0.0862282
n. 43	0.1332113	0.1871343

plot(1:4, MSE, type="b", col="darkred", ylab="Error", xlab="",
     ylim=c(0,.25),xlim = c(0.5, 4.5), main="Error Curves", axes=FALSE)
labs=c("n=193", "n=143", "n=93", "n=43")
axis(1, at=1:4, label=labs, pos=0)
axis(2)
lines(1:4, MAPE, type="b", col="blue")
text(1:4, MAPE+0.03, as.character(round(MAPE,4)), col="blue", cex=0.7)
text(1:4, MSE-0.03, as.character(round(MSE,4)), col="darkred", cex=0.7)
legend(1.5, 0.63, c("MSE", "MAPE"), col=c("darkred","blue"), lty=1, bty="n", cex=0.7)

5 Comparing Forecast Errors

When training the same algorithm with the sample sizes of 193, 143, 93, and 43, the values of MSE and MAPE show that we yield the best results with a training data set size of 143. However, the sample sizes of 193, 143, and 93 all performed similarly, with the sample size of 43 having a much larger error value than any of the other three values.

STA 321 Assignment 12: Deconstructed Time Series of Natural Gas Prices from 2004 to 2020

Alice Xiang

11/19/2023

1 Description of the Dataset

2 Create the Time Series Object

3 Forecasting with Decomposition

4 Training and Testing Data

5 Comparing Forecast Errors