Statistical Validation and Testing Procedures

Introduction

The following examples describe some commonly used statistical testing procedures in R.

We will use several data sets as examples for each statistical testing procedure.

library (dynlm)
## Warning: package 'dynlm' was built under R version 3.5.3
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.5.3
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library (foreign)
## Warning: package 'foreign' was built under R version 3.5.3
library (car)
## Warning: package 'car' was built under R version 3.5.3
## Loading required package: carData
## Warning: package 'carData' was built under R version 3.5.3
library (lmtest)
## Warning: package 'lmtest' was built under R version 3.5.3

If any of the packages ‘dynlm’, ‘foreign’, ‘car’ or ‘lmtest’ is not installed, install it first. An internet connection is required to install packages:

install.packages ('dynlm')
install.packages ('foreign')
install.packages ('car')
install.packages ('lmtest')

The ‘dynlm’ package is needed to incorporate lag terms in the model.
The ‘foreign’ package is needed to read the Stata (.dta) data file from the URL.
The ‘car’ package provides the vif() function to test for multicollinearity.
The ‘lmtest’ package is needed to perform the Durbin-Watson test.
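
The installation step can also be made conditional, so that only packages that are actually missing are reported. The following is a minimal sketch; the helper name `find_missing` is hypothetical and not part of any package.

```r
# Return the names of packages that cannot be loaded on this machine.
# `find_missing` is an illustrative helper, not an existing function.
find_missing <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

pkgs <- c("dynlm", "foreign", "car", "lmtest")
missing_pkgs <- find_missing(pkgs)

if (length(missing_pkgs) > 0) {
  # Installation needs an internet connection, so it is only suggested here
  message("Run: install.packages(c(",
          paste0("'", missing_pkgs, "'", collapse = ", "), "))")
} else {
  message("All required packages are installed.")
}
```

Passing the result to install.packages() would install only what is missing.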

Model Validation and Testing

data1<-longley
colnames (longley)

## [1] "GNP.deflator" "GNP"          "Unemployed"   "Armed.Forces" "Population"  
## [6] "Year"         "Employed"

reg1<-lm(Employed~GNP.deflator + GNP + Unemployed + Armed.Forces + Population, data=data1)
summary(reg1)

## 
## Call:
## lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
##     Population, data = data1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.55324 -0.36478  0.06106  0.20550  0.93359 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  92.461308  35.169248   2.629   0.0252 *
## GNP.deflator -0.048463   0.132248  -0.366   0.7217  
## GNP           0.072004   0.031734   2.269   0.0467 *
## Unemployed   -0.004039   0.004385  -0.921   0.3788  
## Armed.Forces -0.005605   0.002838  -1.975   0.0765 .
## Population   -0.403509   0.330264  -1.222   0.2498  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4832 on 10 degrees of freedom
## Multiple R-squared:  0.9874, Adjusted R-squared:  0.9811 
## F-statistic: 156.4 on 5 and 10 DF,  p-value: 3.699e-09

1. General Fitness of the Model (F-Statistics)

Based on the output above, the F-statistic is equal to 156.4 and is highly significant (p-value = 3.699e-09 < \(\alpha\) = 0.05). We reject the null hypothesis that all slope coefficients are jointly zero, so overall the estimated model fits the data well.
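
As a check on the summary output, the F-statistic can be recomputed from R-squared. The sketch below refits the same model on the built-in longley data; the names `f_manual` and `p_manual` are illustrative.

```r
# Refit the model from the text on the built-in longley data
reg1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + Population,
           data = longley)
s <- summary(reg1)

n  <- nobs(reg1)              # number of observations
k  <- length(coef(reg1)) - 1  # number of slope coefficients
r2 <- s$r.squared

# F = (R^2 / k) / ((1 - R^2) / (n - k - 1))
f_manual <- (r2 / k) / ((1 - r2) / (n - k - 1))

# Upper-tail p-value from the F distribution with (k, n - k - 1) df
p_manual <- pf(f_manual, df1 = k, df2 = n - k - 1, lower.tail = FALSE)

f_reported <- unname(s$fstatistic["value"])
print(c(manual = f_manual, reported = f_reported, p_value = p_manual))
```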

2. Regression Coefficients (t-statistics)

Next, we test the individual significance of each estimated coefficient using the p-value of the t-test. The null hypothesis of the t-test is that the coefficient equals zero, \(\beta\) = 0, which implies that the individual variable should not be included in the model.

Based on the output above, only GNP is significant at the 5% level, with p-value = 0.0467 < \(\alpha\) = 0.05. For the other variables we fail to reject the null hypothesis at the 5% significance level, although Armed.Forces (p-value = 0.0765) is significant at the 10% level.
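
The same decision can be read off the coefficient table programmatically. This is a small sketch that refits the model on the built-in longley data; the variable names `p_values`, `significant`, `t_gnp` and `p_gnp` are illustrative.

```r
reg1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + Population,
           data = longley)

coef_table <- coef(summary(reg1))        # estimates, std. errors, t and p values
p_values <- coef_table[-1, "Pr(>|t|)"]   # drop the intercept row

alpha <- 0.05
significant <- names(p_values)[p_values < alpha]
print(significant)

# The t-statistic is the estimate divided by its standard error;
# the two-sided p-value uses the residual degrees of freedom
t_gnp <- coef_table["GNP", "Estimate"] / coef_table["GNP", "Std. Error"]
p_gnp <- 2 * pt(abs(t_gnp), df = df.residual(reg1), lower.tail = FALSE)
```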

3. Goodness of Fit (R-squared and Adjusted R-Squared)

We can measure the goodness of fit of the model using the R-squared and/or the adjusted R-squared value. R-squared is bounded between 0 and 1; the adjusted R-squared is at most 1 and can fall slightly below 0 for a very poor fit.

R-squared is interpreted as the proportion of the total variation in \(y\) that is explained by the independent variable(s). In this example, 98.74% of the total variation in Employed is explained by the independent variables, while the remaining 1.26% is left unexplained by the model (due to other factors).

It is advisable to evaluate the goodness of fit based on the adjusted R-squared, which penalizes adding independent variables. The closer the value is to 1, the better the fit. In this example, the adjusted R-squared = 0.9811, suggesting that the estimated model fits the data well.

Note that, based on the t-statistics above, there might be only one variable that contributes to the high values of R-squared and adjusted R-squared, so further investigation is needed when carrying out this validation and testing procedure.
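
The R-squared and adjusted R-squared values discussed above can be recomputed from the sums of squares. This is a sketch on the built-in longley data; `r2_manual` and `adj_r2_manual` are illustrative names.

```r
reg1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + Population,
           data = longley)

y   <- longley$Employed
rss <- sum(residuals(reg1)^2)    # residual (unexplained) sum of squares
tss <- sum((y - mean(y))^2)      # total sum of squares

n <- nobs(reg1)
k <- length(coef(reg1)) - 1      # number of independent variables

r2_manual     <- 1 - rss / tss
# The adjustment rescales the unexplained share by (n - 1) / (n - k - 1)
adj_r2_manual <- 1 - (1 - r2_manual) * (n - 1) / (n - k - 1)

print(round(c(R2 = r2_manual, adjR2 = adj_r2_manual), 4))
```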

4. Multicollinearity

vif (reg1)

## GNP.deflator          GNP   Unemployed Armed.Forces   Population 
##   130.829201   639.049777    10.786858     2.505775   339.011693

tol <- 1/vif(reg1)

Collinearity <- data.frame (VIF = vif (reg1), Tolerance = tol)
Collinearity

##                     VIF   Tolerance
## GNP.deflator 130.829201 0.007643554
## GNP          639.049777 0.001564823
## Unemployed    10.786858 0.092705399
## Armed.Forces   2.505775 0.399078059
## Population   339.011693 0.002949751

Multicollinearity is indicated if

i. the largest VIF is greater than 10.
ii. the tolerance statistic is below 0.1.

Based on the output, there is a serious multicollinearity problem: four of the five VIF values are greater than 10 (tolerance below 0.1), suggesting that remedial action is needed to improve the model.
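
To see where these numbers come from, the VIF of each predictor can be recomputed without the ‘car’ package as 1 / (1 - R-squared), where R-squared comes from an auxiliary regression of that predictor on the remaining ones. The following is a sketch on the built-in longley data; `vif_manual`, `tolerance_manual` and `flagged` are illustrative names.

```r
predictors <- c("GNP.deflator", "GNP", "Unemployed", "Armed.Forces", "Population")

# For each predictor, regress it on the other predictors and
# convert the auxiliary R-squared into a VIF
vif_manual <- sapply(predictors, function(v) {
  aux_formula <- reformulate(setdiff(predictors, v), response = v)
  aux_fit <- lm(aux_formula, data = longley)
  1 / (1 - summary(aux_fit)$r.squared)
})

tolerance_manual <- 1 / vif_manual

# Apply the rule of thumb from the text: VIF > 10 (tolerance < 0.1)
flagged <- names(vif_manual)[vif_manual > 10]

print(data.frame(VIF = vif_manual, Tolerance = tolerance_manual))
print(flagged)
```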

5. Serial Correlation

Serial correlation is also known as autocorrelation. For this example, we will use the phillips data to illustrate serial correlation.

phillips<-read.dta("http://fmwww.bc.edu/ec-p/data/wooldridge/phillips.dta")
tsdata <- ts(phillips, start = 1948) # declare a yearly time series starting in 1948
reg.s <- dynlm(inf ~ unem, data = tsdata, end = 1996) # static Phillips curve
reg.ea <- dynlm(d(inf) ~ unem, data = tsdata, end = 1996) # expectations-augmented Phillips curve; d() takes the first difference

Durbin-Watson test for Serial Correlation

dwtest(reg.s)

## 
##  Durbin-Watson test
## 
## data:  reg.s
## DW = 0.8027, p-value = 7.552e-07
## alternative hypothesis: true autocorrelation is greater than 0

dwtest(reg.ea)

## 
##  Durbin-Watson test
## 
## data:  reg.ea
## DW = 1.7696, p-value = 0.1783
## alternative hypothesis: true autocorrelation is greater than 0

The null hypothesis of the Durbin-Watson test is that there is no serial correlation in the model. Since the p-value for reg.s (7.552e-07) is lower than \(\alpha\) = 0.05, we reject the null hypothesis: serial correlation is present in the model based on the static Phillips curve. For the second regression, reg.ea, the p-value (0.1783) is greater than \(\alpha\) = 0.05, so we fail to reject the null hypothesis and find no evidence of autocorrelation. If only the Durbin-Watson statistic is given, we can use the rule of thumb discussed in class: a value close to 2 suggests no first-order serial correlation, while a value well below 2 suggests positive serial correlation.
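
The Durbin-Watson statistic itself is simple to compute from the residuals: it is the sum of squared successive residual differences divided by the residual sum of squares. The sketch below uses simulated data rather than the phillips data so that it runs without an internet connection; the helper `dw_stat`, the seed, the sample size and the AR(1) coefficient of 0.8 are all assumptions made for illustration.

```r
# Durbin-Watson statistic computed directly from model residuals
# (`dw_stat` is an illustrative helper, not a function from lmtest)
dw_stat <- function(model) {
  e <- as.numeric(residuals(model))
  sum(diff(e)^2) / sum(e^2)
}

set.seed(123)   # illustrative seed
n <- 200
x <- rnorm(n)

# Case 1: errors follow an AR(1) process (positive serial correlation)
e_ar <- as.numeric(arima.sim(model = list(ar = 0.8), n = n))
y_ar <- 1 + 2 * x + e_ar
fit_ar <- lm(y_ar ~ x)

# Case 2: errors are independent white noise
e_wn <- rnorm(n)
y_wn <- 1 + 2 * x + e_wn
fit_wn <- lm(y_wn ~ x)

dw_ar <- dw_stat(fit_ar)   # expected to be well below 2
dw_wn <- dw_stat(fit_wn)   # expected to be close to 2
print(c(AR1_errors = dw_ar, white_noise = dw_wn))
```

Applied to reg.s and reg.ea, the same helper should give the DW values reported by dwtest(), which additionally supplies the p-value.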

Reference

Mohd. Alias Lazim. (2007). Introductory Business Forecasting: A Practical Approach. University Publication Centre (UPENA).