HORA12
The input variables include MONTH, DAY, WEEKDAY, HORA.x, SEASON, O3MAXY1 (maximum ozone level one day before), O3N (ozone level), NOx, NO2, RH, TMP, WDR, WSP, CO and SO2, all taken at 12 o'clock (midday).
load("~/prepareData/H12.RData")
library(knitr)
library(caret)
library(caretEnsemble)
source("~/function/NormalizeAndTrainingFunction.r")
H12Target1 <- H12[, "O3"]
H12Inputs1 <- H12[, c("MONTH", "DAY", "WEEKDAY", "HORA.x", "SEASON", "O3MAXY1",
"O3N", "NOx", "NO2", "RH", "TMP", "WDR", "WSP", "CO", "SO2")]
library(RSNNS)
PreData("H12", H12Inputs1, H12Target1)
load("H12 TrainingAndTesting.RData")
Training(inputsTrain, targetsTrain, inputsTest, targetsTest)
load("lmFit.RData")
load("svmFit.RData")
load("rfFit.RData")
load("nnetFit.RData")
load("linearFit.RData")
load("greedyFit.RData")
load("modelsErrorsTotal.RData")
modelsErrorsTotal
##        lmFit  svmFit   rfFit nnetFit linearFit greedyFit
## MAE  0.05124 0.04681 0.04954 0.04624   0.04256   0.04362
## RMSE 0.07180 0.06186 0.06314 0.06054   0.05561   0.05666
## RELE 0.39886 0.32348 0.28732 0.23695   0.23092   0.24987
summary(lmFit)
##
## Call:
## lm(formula = modFormula, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.4890 -0.0645 -0.0121 0.0482 0.4281
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.322619 0.029942 -10.77 < 2e-16 ***
## MONTH 0.009564 0.031230 0.31 0.7595
## DAY -0.003614 0.008062 -0.45 0.6540
## WEEKDAY -0.021926 0.007108 -3.08 0.0021 **
## HORA.x 0.763402 0.034623 22.05 < 2e-16 ***
## SEASON -0.000549 0.026758 -0.02 0.9836
## O3MAXY1 0.222975 0.017366 12.84 < 2e-16 ***
## O3N 0.759930 0.024246 31.34 < 2e-16 ***
## NOx 0.160609 0.075603 2.12 0.0338 *
## NO2 -0.060209 0.079787 -0.75 0.4506
## RH 0.040185 0.016189 2.48 0.0131 *
## TMP -0.063827 0.018558 -3.44 0.0006 ***
## WDR -0.006906 0.008213 -0.84 0.4005
## WSP -0.191733 0.030885 -6.21 6.6e-10 ***
## CO 0.067347 0.033212 2.03 0.0427 *
## SO2 -0.022508 0.019688 -1.14 0.2531
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.101 on 1934 degrees of freedom
## Multiple R-squared: 0.665, Adjusted R-squared: 0.662
## F-statistic: 256 on 15 and 1934 DF, p-value: <2e-16
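The multiple R-squared is 0.665, so the linear least-squares model explains about two thirds of the variance in O3, which is a reasonably good fit. In terms of the MAE, RMSE and RELE errors, linearFit has the lowest values overall, followed by greedyFit on MAE and RMSE; among lmFit, svmFit, rfFit and nnetFit, nnetFit is the most accurate.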
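The three error measures compared above can be sketched as follows. This is a minimal illustration, not the code from Training(), which is not shown here: the helper name errorMetrics is hypothetical, the numbers are made up, and the definition of RELE as the mean absolute relative error is an assumption.

```r
# Hypothetical helper: Training() is not shown in this report, so the
# definition of RELE below (mean absolute relative error) is an assumption.
errorMetrics <- function(actual, predicted) {
  err <- predicted - actual
  c(MAE  = mean(abs(err)),                # mean absolute error
    RMSE = sqrt(mean(err^2)),             # root mean squared error
    RELE = mean(abs(err) / abs(actual)))  # assumed: mean relative error
}

# Made-up values on a 0-1 scale, for illustration only
actual    <- c(0.20, 0.40, 0.50, 0.80)
predicted <- c(0.25, 0.35, 0.50, 0.70)

metrics <- errorMetrics(actual, predicted)
print(round(metrics, 5))
```

RMSE is never smaller than MAE, because squaring weights large errors more heavily; this matches the pattern in every column of modelsErrorsTotal.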
The step() function uses the Akaike Information Criterion (AIC) to perform the model search. When no scope argument is given, the search uses backward elimination by default.
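As a small illustration of the criterion, the sketch below reproduces the AIC value that step() reports for a linear model and runs an explicit backward search. It uses the built-in mtcars data as a stand-in, since the H12 data is not available here; the variable names full, reduced, manualAIC and fullAIC are only illustrative.

```r
# Stand-in data: mtcars ships with R; the H12 data set is not used here.
full <- lm(mpg ~ ., data = mtcars)

# AIC as reported by step() for a linear model (up to an additive constant):
#   n * log(RSS / n) + 2 * (number of estimated coefficients)
n   <- nrow(mtcars)
rss <- sum(residuals(full)^2)
k   <- length(coef(full))
manualAIC <- n * log(rss / n) + 2 * k

# extractAIC() returns c(equivalent degrees of freedom, AIC)
fullAIC <- extractAIC(full)[2]

# Backward elimination made explicit; trace = 0 suppresses the step log
reduced <- step(full, direction = "backward", trace = 0)
print(formula(reduced))
print(c(full = fullAIC, reduced = extractAIC(reduced)[2]))
```

At each step the term whose removal gives the lowest AIC is dropped, and the search stops once no removal improves on the current model (the `<none>` row).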
lmH12 <- lm(O3 ~ ., data = H12[, c("O3", "MONTH", "DAY", "WEEKDAY", "HORA.x",
"SEASON", "O3MAXY1", "O3N", "NOx", "NO2", "RH", "TMP", "WDR", "WSP", "CO",
"SO2")])
summary(lmH12)
##
## Call:
## lm(formula = O3 ~ ., data = H12[, c("O3", "MONTH", "DAY", "WEEKDAY",
## "HORA.x", "SEASON", "O3MAXY1", "O3N", "NOx", "NO2", "RH",
## "TMP", "WDR", "WSP", "CO", "SO2")])
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.15150 -0.02046 -0.00365 0.01422 0.14719
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -9.67e-02 1.04e-02 -9.30 < 2e-16 ***
## MONTH 4.12e-04 8.46e-04 0.49 0.6258
## DAY -6.01e-05 8.26e-05 -0.73 0.4665
## WEEKDAY -9.03e-04 3.50e-04 -2.58 0.0100 **
## HORA.x 9.82e-03 4.36e-04 22.52 < 2e-16 ***
## SEASON -1.43e-04 2.65e-03 -0.05 0.9571
## O3MAXY1 2.03e-01 1.38e-02 14.77 < 2e-16 ***
## O3N 9.07e-01 2.77e-02 32.78 < 2e-16 ***
## NOx 2.02e-01 8.84e-02 2.29 0.0221 *
## NO2 -9.38e-02 1.08e-01 -0.87 0.3835
## RH 1.59e-04 5.52e-05 2.88 0.0040 **
## TMP -6.99e-04 2.48e-04 -2.82 0.0049 **
## WDR -5.60e-06 6.77e-06 -0.83 0.4082
## WSP -9.80e-03 1.22e-03 -8.01 1.9e-15 ***
## CO 2.26e-03 7.74e-04 2.92 0.0035 **
## SO2 -7.60e-02 6.39e-02 -1.19 0.2346
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0328 on 2279 degrees of freedom
## Multiple R-squared: 0.68, Adjusted R-squared: 0.678
## F-statistic: 323 on 15 and 2279 DF, p-value: <2e-16
step(lmH12)
## Start: AIC=-15668
## O3 ~ MONTH + DAY + WEEKDAY + HORA.x + SEASON + O3MAXY1 + O3N +
## NOx + NO2 + RH + TMP + WDR + WSP + CO + SO2
##
## Df Sum of Sq RSS AIC
## - SEASON 1 0.000 2.45 -15670
## - MONTH 1 0.000 2.45 -15670
## - DAY 1 0.001 2.45 -15669
## - WDR 1 0.001 2.45 -15669
## - NO2 1 0.001 2.45 -15669
## - SO2 1 0.002 2.46 -15668
## <none> 2.45 -15668
## - NOx 1 0.006 2.46 -15664
## - WEEKDAY 1 0.007 2.46 -15663
## - TMP 1 0.009 2.46 -15662
## - RH 1 0.009 2.46 -15661
## - CO 1 0.009 2.46 -15661
## - WSP 1 0.069 2.52 -15606
## - O3MAXY1 1 0.235 2.69 -15460
## - HORA.x 1 0.546 3.00 -15209
## - O3N 1 1.157 3.61 -14783
##
## Step: AIC=-15670
## O3 ~ MONTH + DAY + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + NO2 +
## RH + TMP + WDR + WSP + CO + SO2
##
## Df Sum of Sq RSS AIC
## - DAY 1 0.001 2.45 -15671
## - WDR 1 0.001 2.45 -15671
## - NO2 1 0.001 2.45 -15671
## - SO2 1 0.002 2.46 -15670
## <none> 2.45 -15670
## - MONTH 1 0.003 2.46 -15669
## - NOx 1 0.006 2.46 -15666
## - WEEKDAY 1 0.007 2.46 -15665
## - TMP 1 0.009 2.46 -15664
## - RH 1 0.009 2.46 -15663
## - CO 1 0.009 2.46 -15663
## - WSP 1 0.069 2.52 -15608
## - O3MAXY1 1 0.235 2.69 -15462
## - HORA.x 1 0.546 3.00 -15211
## - O3N 1 1.158 3.61 -14784
##
## Step: AIC=-15671
## O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + NO2 + RH +
## TMP + WDR + WSP + CO + SO2
##
## Df Sum of Sq RSS AIC
## - WDR 1 0.001 2.46 -15672
## - NO2 1 0.001 2.46 -15672
## - SO2 1 0.001 2.46 -15672
## <none> 2.45 -15671
## - MONTH 1 0.003 2.46 -15670
## - NOx 1 0.006 2.46 -15668
## - WEEKDAY 1 0.007 2.46 -15666
## - TMP 1 0.009 2.46 -15665
## - RH 1 0.009 2.46 -15665
## - CO 1 0.009 2.46 -15665
## - WSP 1 0.069 2.52 -15610
## - O3MAXY1 1 0.235 2.69 -15463
## - HORA.x 1 0.545 3.00 -15213
## - O3N 1 1.160 3.61 -14785
##
## Step: AIC=-15672
## O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + NO2 + RH +
## TMP + WSP + CO + SO2
##
## Df Sum of Sq RSS AIC
## - NO2 1 0.001 2.46 -15674
## - SO2 1 0.001 2.46 -15673
## <none> 2.46 -15672
## - MONTH 1 0.003 2.46 -15672
## - NOx 1 0.006 2.46 -15669
## - WEEKDAY 1 0.007 2.46 -15668
## - TMP 1 0.008 2.46 -15667
## - RH 1 0.009 2.46 -15666
## - CO 1 0.009 2.46 -15666
## - WSP 1 0.071 2.53 -15609
## - O3MAXY1 1 0.234 2.69 -15465
## - HORA.x 1 0.552 3.01 -15209
## - O3N 1 1.159 3.61 -14787
##
## Step: AIC=-15674
## O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + RH + TMP +
## WSP + CO + SO2
##
## Df Sum of Sq RSS AIC
## - SO2 1 0.001 2.46 -15675
## <none> 2.46 -15674
## - MONTH 1 0.003 2.46 -15673
## - WEEKDAY 1 0.007 2.46 -15669
## - TMP 1 0.009 2.46 -15668
## - RH 1 0.009 2.47 -15667
## - CO 1 0.009 2.47 -15667
## - NOx 1 0.023 2.48 -15655
## - WSP 1 0.071 2.53 -15610
## - O3MAXY1 1 0.235 2.69 -15466
## - HORA.x 1 0.551 3.01 -15211
## - O3N 1 1.442 3.90 -14616
##
## Step: AIC=-15675
## O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + RH + TMP +
## WSP + CO
##
## Df Sum of Sq RSS AIC
## <none> 2.46 -15675
## - MONTH 1 0.003 2.46 -15674
## - TMP 1 0.008 2.46 -15669
## - WEEKDAY 1 0.008 2.46 -15669
## - CO 1 0.008 2.47 -15669
## - RH 1 0.010 2.47 -15667
## - NOx 1 0.024 2.48 -15654
## - WSP 1 0.072 2.53 -15610
## - O3MAXY1 1 0.234 2.69 -15468
## - HORA.x 1 0.550 3.01 -15213
## - O3N 1 1.441 3.90 -14618
##
## Call:
## lm(formula = O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N +
## NOx + RH + TMP + WSP + CO, data = H12[, c("O3", "MONTH",
## "DAY", "WEEKDAY", "HORA.x", "SEASON", "O3MAXY1", "O3N", "NOx",
## "NO2", "RH", "TMP", "WDR", "WSP", "CO", "SO2")])
##
## Coefficients:
## (Intercept) MONTH WEEKDAY HORA.x O3MAXY1
## -0.099125 0.000357 -0.000933 0.009820 0.201393
## O3N NOx RH TMP WSP
## 0.895550 0.133907 0.000167 -0.000627 -0.009963
## CO
## 0.001890
Finally, the variables MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + RH + TMP + WSP + CO are kept as the new input matrix for modelling. I will compare the different input matrices in another knit for further analysis.
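One possible way to turn a step() result into a reduced input matrix is to read the retained term labels from the fitted model rather than typing them by hand. The sketch below again uses mtcars as a stand-in for H12, and the object names (selected, inputsReduced, refit) are hypothetical; it assumes every retained term is a plain column name, as is the case for the model above.

```r
# Stand-in data: mtcars replaces H12, which is not available here.
full    <- lm(mpg ~ ., data = mtcars)
reduced <- step(full, trace = 0)   # backward elimination by AIC

# Names of the predictors that survived the search
selected <- attr(terms(reduced), "term.labels")
print(selected)

# Reduced input matrix, analogous to subsetting H12Inputs1 by column name
inputsReduced <- mtcars[, selected, drop = FALSE]

# Refitting on the reduced inputs gives the same coefficients as `reduced`
refit <- lm(mpg ~ ., data = cbind(mpg = mtcars$mpg, inputsReduced))
print(coef(refit))
```

Applied to the H12 model, the same pattern would yield the ten variable names listed above, which could then be used to subset H12Inputs1.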
Build a database using the functions in “timeDifferentFunction”.
source("~/function/timeDifferentFunction.r")
timeDifTest(H12Inputs1, 12)
load("timeDifTestData.RData")