This R analysis is the week 3 discussion assignment for the Predictive Analytics course. The task: go to any data website of your choice (e.g. the one used last week, a financial website, the Census Bureau, or…) and pick a time series dataset of interest to you. Using this data (be sure to sort it in ascending order), build two or three ETS models. Which performs better? What might explain that?
The purpose of this section is to preprocess and describe the dataset. The dataset used is the Air Quality dataset, which contains the responses of a gas multisensor device deployed in the field in an Italian city. Hourly averaged responses are recorded along with reference gas concentrations from a certified analyzer.
library(readxl)    # provides read_excel()
library(forecast)  # provides ets()
data <- read_excel("C:/Users/MMENDEZ/Downloads/AirQualityUCI/AirQualityUCI.xlsx")
summary(data)
##       Date                          Time                         CO(GT)
##  Min.   :2004-03-10 00:00:00   Min.   :1899-12-31 00:00:00   Min.   :-200.00
##  1st Qu.:2004-06-16 00:00:00   1st Qu.:1899-12-31 05:00:00   1st Qu.:   0.60
##  Median :2004-09-21 00:00:00   Median :1899-12-31 11:00:00   Median :   1.50
##  Mean   :2004-09-21 04:30:05   Mean   :1899-12-31 11:29:55   Mean   : -34.21
##  3rd Qu.:2004-12-28 00:00:00   3rd Qu.:1899-12-31 18:00:00   3rd Qu.:   2.60
##  Max.   :2005-04-04 00:00:00   Max.   :1899-12-31 23:00:00   Max.   :  11.90
##   PT08.S1(CO)      NMHC(GT)          C6H6(GT)        PT08.S2(NMHC)
##  Min.   :-200   Min.   :-200.0   Min.   :-200.000   Min.   :-200.0
##  1st Qu.: 921   1st Qu.:-200.0   1st Qu.:   4.005   1st Qu.: 711.0
##  Median :1052   Median :-200.0   Median :   7.887   Median : 894.5
##  Mean   :1049   Mean   :-159.1   Mean   :   1.866   Mean   : 894.5
##  3rd Qu.:1221   3rd Qu.:-200.0   3rd Qu.:  13.636   3rd Qu.:1104.8
##  Max.   :2040   Max.   :1189.0   Max.   :  63.741   Max.   :2214.0
##     NOx(GT)         PT08.S3(NOx)       NO2(GT)         PT08.S4(NO2)
##  Min.   :-200.0   Min.   :-200.0   Min.   :-200.00   Min.   :-200
##  1st Qu.:  50.0   1st Qu.: 637.0   1st Qu.:  53.00   1st Qu.:1185
##  Median : 141.0   Median : 794.2   Median :  96.00   Median :1446
##  Mean   : 168.6   Mean   : 794.9   Mean   :  58.14   Mean   :1391
##  3rd Qu.: 284.2   3rd Qu.: 960.2   3rd Qu.: 133.00   3rd Qu.:1662
##  Max.   :1479.0   Max.   :2682.8   Max.   : 339.70   Max.   :2775
##   PT08.S5(O3)           T                  RH                AH
##  Min.   :-200.0   Min.   :-200.000   Min.   :-200.00   Min.   :-200.0000
##  1st Qu.: 699.8   1st Qu.:  10.950   1st Qu.:  34.05   1st Qu.:   0.6923
##  Median : 942.0   Median :  17.200   Median :  48.55   Median :   0.9768
##  Mean   : 975.0   Mean   :   9.777   Mean   :  39.48   Mean   :  -6.8376
##  3rd Qu.:1255.2   3rd Qu.:  24.075   3rd Qu.:  61.88   3rd Qu.:   1.2962
##  Max.   :2522.8   Max.   :  44.600   Max.   :  88.72   Max.   :   2.2310
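Every measurement column in the summary has a minimum of exactly -200, and several means are pulled far below their medians (for CO(GT), the mean is -34.21 against a median of 1.50). This is consistent with -200 being used as a missing-value code rather than a real reading. A minimal sketch of recoding it, assuming that interpretation holds and using a small made-up data frame (`toy`, `co`, and `no2` are illustrative names, not columns from the real file):

```r
# Toy data frame standing in for the real sheet (names and values are made up).
toy <- data.frame(
  co  = c(2.6, -200, 1.5, 0.6),
  no2 = c(96, 133, -200, 53)
)

# Assumption: -200 marks a missing reading. Recode it as NA in numeric columns.
num_cols <- vapply(toy, is.numeric, logical(1))
toy[num_cols] <- lapply(toy[num_cols], function(x) replace(x, x == -200, NA))

summary(toy)          # summaries now ignore the sentinel and report NA counts
colSums(is.na(toy))   # number of missing readings per column
```

After recoding, the gaps would still need to be imputed or otherwise handled before a model is fitted to the series.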
Initially, we plot the time series.
The series appears to show yearly seasonality and a trend.
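The report does not show how the spreadsheet was turned into the series that is plotted and later passed to `ets()`. The sketch below is one possible way to do that step, not the original code: it sorts a data frame by date in ascending order, builds a `ts` object, and plots it together with a classical decomposition to check for trend and seasonality. The data are synthetic, the names `df`, `value`, and `y` are hypothetical, and `frequency = 12` is an assumption based on the twelve seasonal states reported by the fitted models further down.

```r
# Synthetic stand-in for the real series: 36 values with trend and seasonality.
set.seed(1)
df <- data.frame(
  date  = seq(as.Date("2004-01-01"), by = "month", length.out = 36),
  value = 100 + 0.5 * (1:36) + 10 * sin(2 * pi * (1:36) / 12) + rnorm(36)
)
df <- df[sample(nrow(df)), ]   # shuffle rows to mimic unsorted input

# Sort in ascending date order, as the assignment asks.
df <- df[order(df$date), ]

# Build the time series; frequency = 12 is an assumption (see lead-in).
y <- ts(df$value, start = c(2004, 1), frequency = 12)

plot(y)                 # raw series
parts <- decompose(y)   # classical additive decomposition
plot(parts)             # trend, seasonal, and remainder panels
```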
Here the model type is not specified, so ets() selects the error, trend, and seasonal components automatically.
model1 <- ets(data)
model1
## ETS(A,N,A)
##
## Call:
## ets(y = data)
##
## Smoothing parameters:
## alpha = 0.4019
## gamma = 0.0001
##
## Initial states:
## l = 89.1428
## s = -7.5895 -5.2046 -141.3016 -1.2466 7.6667 14.9726
## 18.831 25.1657 24.3907 36.9894 22.9984 4.3278
##
## sigma: 50.4785
##
## AIC AICc BIC
## 6758.912 6759.945 6821.550
plot(model1)
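For reference, the ETS(A,N,A) form selected above can be written as the state-space equations below. The notation is the conventional textbook one rather than anything defined in this report: $\ell_t$ is the level, $s_t$ the seasonal state, $\varepsilon_t$ the error, and $m$ the seasonal period (apparently 12 here, given the twelve initial seasonal states in the output).

```latex
\begin{aligned}
y_t    &= \ell_{t-1} + s_{t-m} + \varepsilon_t \\
\ell_t &= \ell_{t-1} + \alpha\,\varepsilon_t \\
s_t    &= s_{t-m} + \gamma\,\varepsilon_t
\end{aligned}
```

With the fitted gamma = 0.0001, the seasonal states are almost never updated, so the seasonal pattern stays essentially fixed at its initial values, while alpha = 0.4019 lets the level adapt moderately quickly to new observations.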
Here an AAA model is used: additive error, additive trend, and additive seasonality.
model2 <- ets(data, model = "AAA")
model2
## ETS(A,A,A)
##
## Call:
## ets(y = data, model = "AAA")
##
## Smoothing parameters:
## alpha = 0.3979
## beta = 0.0001
## gamma = 0.0014
##
## Initial states:
## l = 56.7906
## b = 0.8667
## s = -6.9189 -4.9511 -139.511 0.7708 5.1454 14.8143
## 23.3837 25.0197 23.4346 36.2292 21.618 0.9653
##
## sigma: 50.742
##
## AIC AICc BIC
## 6765.857 6767.179 6836.847
plot(model2)
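The AAA specification carries a slope state $b_t$ in addition to the level and seasonal states. A sketch of the standard additive form, using conventional textbook symbols that are not defined in this report ($\ell_t$ level, $b_t$ slope, $s_t$ seasonal state, $\varepsilon_t$ error, $m$ seasonal period):

```latex
\begin{aligned}
y_t    &= \ell_{t-1} + b_{t-1} + s_{t-m} + \varepsilon_t \\
\ell_t &= \ell_{t-1} + b_{t-1} + \alpha\,\varepsilon_t \\
b_t    &= b_{t-1} + \beta\,\varepsilon_t \\
s_t    &= s_{t-m} + \gamma\,\varepsilon_t
\end{aligned}
```

The fitted beta = 0.0001 means the slope barely moves from its initial value b = 0.8667, so the trend behaves like a nearly constant drift.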
Here an ANN model is used: additive error, no trend, and no seasonality (simple exponential smoothing).
model3 <- ets(data, model = "ANN")
model3
## ETS(A,N,N)
##
## Call:
## ets(y = data, model = "ANN")
##
## Smoothing parameters:
## alpha = 0.3585
##
## Initial states:
## l = 104.0471
##
## sigma: 69.3249
##
## AIC AICc BIC
## 7052.317 7052.367 7064.845
plot(model3)
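To answer which model performs better, the three fits can be compared on the information criteria printed above (lower is better). All three use additive errors on the same series, so the values are directly comparable. The sketch below simply copies the printed numbers into a table and ranks them. `fits` and `delta_AICc` are names introduced here for illustration.

```r
# Values copied from the printed summaries of model1, model2, and model3.
fits <- data.frame(
  model = c("ETS(A,N,A)", "ETS(A,A,A)", "ETS(A,N,N)"),
  sigma = c(50.4785, 50.742, 69.3249),
  AIC   = c(6758.912, 6765.857, 7052.317),
  AICc  = c(6759.945, 6767.179, 7052.367),
  BIC   = c(6821.550, 6836.847, 7064.845)
)

# Rank by AICc and report each model's gap to the best one.
fits <- fits[order(fits$AICc), ]
fits$delta_AICc <- fits$AICc - min(fits$AICc)
print(fits, row.names = FALSE)
```

On these numbers the automatically selected ETS(A,N,A) has the lowest AIC, AICc, and BIC. ETS(A,A,A) is close behind: its trend term adds parameters without reducing sigma (50.742 against 50.4785), and the fitted beta of 0.0001 suggests the trend contributes little. ETS(A,N,N) is clearly worse, with sigma near 69, which is what one would expect when a seasonal series is fitted without a seasonal component. These are in-sample criteria, so a hold-out forecast comparison would be a useful complementary check.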