HORA12

The input variables are MONTH, DAY, WEEKDAY, HORA.x, SEASON, O3MAXY1 (maximum ozone level on the previous day), O3N (ozone level), NOx, NO2, RH, TMP, WDR, WSP, CO and SO2, all taken at 12 o'clock noon.

load("~/prepareData/H12.RData")
library(knitr)
library(caret)
library(caretEnsemble)
source("~/function/NormalizeAndTrainingFunction.r")
H12Target1 <- H12[, "O3"]
H12Inputs1 <- H12[, c("MONTH", "DAY", "WEEKDAY", "HORA.x", "SEASON", "O3MAXY1", 
    "O3N", "NOx", "NO2", "RH", "TMP", "WDR", "WSP", "CO", "SO2")]
library(RSNNS)
PreData("H12", H12Inputs1, H12Target1)
load("H12 TrainingAndTesting.RData")
Training(inputsTrain, targetsTrain, inputsTest, targetsTest)
load("lmFit.RData")
load("svmFit.RData")
load("rfFit.RData")
load("nnetFit.RData")
load("linearFit.RData")
load("greedyFit.RData")
load("modelsErrorsTotal.RData")
modelsErrorsTotal
##        lmFit  svmFit   rfFit nnetFit linearFit greedyFit
## MAE  0.05124 0.04681 0.04954 0.04624   0.04256   0.04362
## RMSE 0.07180 0.06186 0.06314 0.06054   0.05561   0.05666
## RELE 0.39886 0.32348 0.28732 0.23695   0.23092   0.24987
summary(lmFit)
## 
## Call:
## lm(formula = modFormula, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.4890 -0.0645 -0.0121  0.0482  0.4281 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.322619   0.029942  -10.77  < 2e-16 ***
## MONTH        0.009564   0.031230    0.31   0.7595    
## DAY         -0.003614   0.008062   -0.45   0.6540    
## WEEKDAY     -0.021926   0.007108   -3.08   0.0021 ** 
## HORA.x       0.763402   0.034623   22.05  < 2e-16 ***
## SEASON      -0.000549   0.026758   -0.02   0.9836    
## O3MAXY1      0.222975   0.017366   12.84  < 2e-16 ***
## O3N          0.759930   0.024246   31.34  < 2e-16 ***
## NOx          0.160609   0.075603    2.12   0.0338 *  
## NO2         -0.060209   0.079787   -0.75   0.4506    
## RH           0.040185   0.016189    2.48   0.0131 *  
## TMP         -0.063827   0.018558   -3.44   0.0006 ***
## WDR         -0.006906   0.008213   -0.84   0.4005    
## WSP         -0.191733   0.030885   -6.21  6.6e-10 ***
## CO           0.067347   0.033212    2.03   0.0427 *  
## SO2         -0.022508   0.019688   -1.14   0.2531    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.101 on 1934 degrees of freedom
## Multiple R-squared:  0.665,  Adjusted R-squared:  0.662 
## F-statistic:  256 on 15 and 1934 DF,  p-value: <2e-16

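The multiple R-squared is 0.665, so the linear least-squares model fits reasonably well, explaining about two thirds of the variance in O3 in the data it was fitted to. In terms of MAE, RMSE and RELE, nnetFit is the most accurate of lmFit, svmFit, rfFit and nnetFit, while linearFit has the lowest value of all six fits on all three measures.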
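For reference, here is a minimal sketch of how error measures of this kind are commonly computed. MAE and RMSE follow their standard definitions. The actual RELE computation lives in NormalizeAndTrainingFunction.r, which is not shown here, so the `rele` function below (mean absolute error relative to the observed value) is only an assumed definition. All function names and sample values are hypothetical.

```r
# Hypothetical helpers; not the ones defined in NormalizeAndTrainingFunction.r.
mae  <- function(obs, pred) mean(abs(pred - obs))
rmse <- function(obs, pred) sqrt(mean((pred - obs)^2))
# ASSUMPTION: RELE is taken as the mean absolute error relative to the observation.
rele <- function(obs, pred) mean(abs(pred - obs) / abs(obs))

# Made-up values for illustration only (not taken from H12).
obs  <- c(0.20, 0.40, 0.50, 0.80)
pred <- c(0.25, 0.35, 0.50, 0.70)

errors <- c(MAE = mae(obs, pred), RMSE = rmse(obs, pred), RELE = rele(obs, pred))
print(round(errors, 5))
```

Lower values are better on all three measures, which is how the columns of modelsErrorsTotal are compared.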

step() uses the Akaike Information Criterion (AIC) to perform a model search. Because no scope argument is given, the search defaults to backward elimination.
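As a minimal sketch of the same procedure on R's built-in `mtcars` data (a stand-in, not the H12 data): for an `lm` fit, the AIC that `step()` reports is `n * log(RSS / n) + 2 * p`, where `p` is the number of estimated coefficients. The object names below are illustrative.

```r
# Stand-in data: mtcars ships with R; H12 is not used here.
full <- lm(mpg ~ ., data = mtcars)

# AIC as reported by step() for lm fits (computed by extractAIC):
# n * log(RSS / n) + 2 * (number of estimated coefficients)
n   <- nrow(mtcars)
rss <- sum(residuals(full)^2)
p   <- length(coef(full))
aic_by_hand <- n * log(rss / n) + 2 * p
aic_step    <- extractAIC(full)[2]

# No scope is given, so the search direction defaults to backward elimination.
# trace = 0 suppresses the step-by-step tables.
reduced <- step(full, trace = 0)

# Names of the predictors that survive the elimination.
kept <- attr(terms(reduced), "term.labels")
print(c(by_hand = aic_by_hand, extractAIC = aic_step))
print(kept)
```

At each step the term whose removal lowers the AIC the most is dropped, and the search stops when `<none>` is at the top of the table.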

lmH12 <- lm(O3 ~ ., data = H12[, c("O3", "MONTH", "DAY", "WEEKDAY", "HORA.x", 
    "SEASON", "O3MAXY1", "O3N", "NOx", "NO2", "RH", "TMP", "WDR", "WSP", "CO", 
    "SO2")])
summary(lmH12)
## 
## Call:
## lm(formula = O3 ~ ., data = H12[, c("O3", "MONTH", "DAY", "WEEKDAY", 
##     "HORA.x", "SEASON", "O3MAXY1", "O3N", "NOx", "NO2", "RH", 
##     "TMP", "WDR", "WSP", "CO", "SO2")])
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.15150 -0.02046 -0.00365  0.01422  0.14719 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -9.67e-02   1.04e-02   -9.30  < 2e-16 ***
## MONTH        4.12e-04   8.46e-04    0.49   0.6258    
## DAY         -6.01e-05   8.26e-05   -0.73   0.4665    
## WEEKDAY     -9.03e-04   3.50e-04   -2.58   0.0100 ** 
## HORA.x       9.82e-03   4.36e-04   22.52  < 2e-16 ***
## SEASON      -1.43e-04   2.65e-03   -0.05   0.9571    
## O3MAXY1      2.03e-01   1.38e-02   14.77  < 2e-16 ***
## O3N          9.07e-01   2.77e-02   32.78  < 2e-16 ***
## NOx          2.02e-01   8.84e-02    2.29   0.0221 *  
## NO2         -9.38e-02   1.08e-01   -0.87   0.3835    
## RH           1.59e-04   5.52e-05    2.88   0.0040 ** 
## TMP         -6.99e-04   2.48e-04   -2.82   0.0049 ** 
## WDR         -5.60e-06   6.77e-06   -0.83   0.4082    
## WSP         -9.80e-03   1.22e-03   -8.01  1.9e-15 ***
## CO           2.26e-03   7.74e-04    2.92   0.0035 ** 
## SO2         -7.60e-02   6.39e-02   -1.19   0.2346    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0328 on 2279 degrees of freedom
## Multiple R-squared:  0.68,   Adjusted R-squared:  0.678 
## F-statistic:  323 on 15 and 2279 DF,  p-value: <2e-16
step(lmH12)
## Start:  AIC=-15668
## O3 ~ MONTH + DAY + WEEKDAY + HORA.x + SEASON + O3MAXY1 + O3N + 
##     NOx + NO2 + RH + TMP + WDR + WSP + CO + SO2
## 
##           Df Sum of Sq  RSS    AIC
## - SEASON   1     0.000 2.45 -15670
## - MONTH    1     0.000 2.45 -15670
## - DAY      1     0.001 2.45 -15669
## - WDR      1     0.001 2.45 -15669
## - NO2      1     0.001 2.45 -15669
## - SO2      1     0.002 2.46 -15668
## <none>                 2.45 -15668
## - NOx      1     0.006 2.46 -15664
## - WEEKDAY  1     0.007 2.46 -15663
## - TMP      1     0.009 2.46 -15662
## - RH       1     0.009 2.46 -15661
## - CO       1     0.009 2.46 -15661
## - WSP      1     0.069 2.52 -15606
## - O3MAXY1  1     0.235 2.69 -15460
## - HORA.x   1     0.546 3.00 -15209
## - O3N      1     1.157 3.61 -14783
## 
## Step:  AIC=-15670
## O3 ~ MONTH + DAY + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + NO2 + 
##     RH + TMP + WDR + WSP + CO + SO2
## 
##           Df Sum of Sq  RSS    AIC
## - DAY      1     0.001 2.45 -15671
## - WDR      1     0.001 2.45 -15671
## - NO2      1     0.001 2.45 -15671
## - SO2      1     0.002 2.46 -15670
## <none>                 2.45 -15670
## - MONTH    1     0.003 2.46 -15669
## - NOx      1     0.006 2.46 -15666
## - WEEKDAY  1     0.007 2.46 -15665
## - TMP      1     0.009 2.46 -15664
## - RH       1     0.009 2.46 -15663
## - CO       1     0.009 2.46 -15663
## - WSP      1     0.069 2.52 -15608
## - O3MAXY1  1     0.235 2.69 -15462
## - HORA.x   1     0.546 3.00 -15211
## - O3N      1     1.158 3.61 -14784
## 
## Step:  AIC=-15671
## O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + NO2 + RH + 
##     TMP + WDR + WSP + CO + SO2
## 
##           Df Sum of Sq  RSS    AIC
## - WDR      1     0.001 2.46 -15672
## - NO2      1     0.001 2.46 -15672
## - SO2      1     0.001 2.46 -15672
## <none>                 2.45 -15671
## - MONTH    1     0.003 2.46 -15670
## - NOx      1     0.006 2.46 -15668
## - WEEKDAY  1     0.007 2.46 -15666
## - TMP      1     0.009 2.46 -15665
## - RH       1     0.009 2.46 -15665
## - CO       1     0.009 2.46 -15665
## - WSP      1     0.069 2.52 -15610
## - O3MAXY1  1     0.235 2.69 -15463
## - HORA.x   1     0.545 3.00 -15213
## - O3N      1     1.160 3.61 -14785
## 
## Step:  AIC=-15672
## O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + NO2 + RH + 
##     TMP + WSP + CO + SO2
## 
##           Df Sum of Sq  RSS    AIC
## - NO2      1     0.001 2.46 -15674
## - SO2      1     0.001 2.46 -15673
## <none>                 2.46 -15672
## - MONTH    1     0.003 2.46 -15672
## - NOx      1     0.006 2.46 -15669
## - WEEKDAY  1     0.007 2.46 -15668
## - TMP      1     0.008 2.46 -15667
## - RH       1     0.009 2.46 -15666
## - CO       1     0.009 2.46 -15666
## - WSP      1     0.071 2.53 -15609
## - O3MAXY1  1     0.234 2.69 -15465
## - HORA.x   1     0.552 3.01 -15209
## - O3N      1     1.159 3.61 -14787
## 
## Step:  AIC=-15674
## O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + RH + TMP + 
##     WSP + CO + SO2
## 
##           Df Sum of Sq  RSS    AIC
## - SO2      1     0.001 2.46 -15675
## <none>                 2.46 -15674
## - MONTH    1     0.003 2.46 -15673
## - WEEKDAY  1     0.007 2.46 -15669
## - TMP      1     0.009 2.46 -15668
## - RH       1     0.009 2.47 -15667
## - CO       1     0.009 2.47 -15667
## - NOx      1     0.023 2.48 -15655
## - WSP      1     0.071 2.53 -15610
## - O3MAXY1  1     0.235 2.69 -15466
## - HORA.x   1     0.551 3.01 -15211
## - O3N      1     1.442 3.90 -14616
## 
## Step:  AIC=-15675
## O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + RH + TMP + 
##     WSP + CO
## 
##           Df Sum of Sq  RSS    AIC
## <none>                 2.46 -15675
## - MONTH    1     0.003 2.46 -15674
## - TMP      1     0.008 2.46 -15669
## - WEEKDAY  1     0.008 2.46 -15669
## - CO       1     0.008 2.47 -15669
## - RH       1     0.010 2.47 -15667
## - NOx      1     0.024 2.48 -15654
## - WSP      1     0.072 2.53 -15610
## - O3MAXY1  1     0.234 2.69 -15468
## - HORA.x   1     0.550 3.01 -15213
## - O3N      1     1.441 3.90 -14618
## 
## Call:
## lm(formula = O3 ~ MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + 
##     NOx + RH + TMP + WSP + CO, data = H12[, c("O3", "MONTH", 
##     "DAY", "WEEKDAY", "HORA.x", "SEASON", "O3MAXY1", "O3N", "NOx", 
##     "NO2", "RH", "TMP", "WDR", "WSP", "CO", "SO2")])
## 
## Coefficients:
## (Intercept)        MONTH      WEEKDAY       HORA.x      O3MAXY1  
##   -0.099125     0.000357    -0.000933     0.009820     0.201393  
##         O3N          NOx           RH          TMP          WSP  
##    0.895550     0.133907     0.000167    -0.000627    -0.009963  
##          CO  
##    0.001890

The final model keeps MONTH + WEEKDAY + HORA.x + O3MAXY1 + O3N + NOx + RH + TMP + WSP + CO, and these variables form the new input matrix for modelling. I will compare the different input matrices in another knit for further analysis.
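One way to build such a reduced input matrix is to read the retained terms off the object returned by `step()` rather than retyping them. The sketch below uses the built-in `mtcars` data as a stand-in. With the real data one would store the result of `step(lmH12)` and subset `H12` in the same way. All object names are illustrative.

```r
# Stand-in for the H12 workflow: full model, then AIC-based backward elimination.
full    <- lm(mpg ~ ., data = mtcars)
reduced <- step(full, trace = 0)

# Term labels equal column names only when the formula has plain variables
# (no interactions or transformations), as is the case here.
selected <- attr(terms(reduced), "term.labels")

# Reduced inputs and target, analogous to H12Inputs1 and H12Target1.
reducedInputs <- mtcars[, selected, drop = FALSE]
reducedTarget <- mtcars[, "mpg"]
str(reducedInputs)
```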

Build a data set using the functions in “timeDifferentFunction.r”.

source("~/function/timeDifferentFunction.r")
timeDifTest(H12Inputs1, 12)
load("timeDifTestData.RData")