In this topic you will learn:
3.1 Introduction to R Programming
3.2 Econometric Model Definition
3.3 Model Construction Issues
3.4 The Assumption
3.5 Validation and Testing and
3.6 Model Estimation Procedure.
R is a language and environment for statistical computing and graphics. It is an open-source solution for data analysis, supported by a large and active worldwide research community.
An object is basically anything that can be assigned a value; for example:
Example 1
x <- 3
x
## [1] 3
We assigned the value 3 to an object named x.
Example 2
y <- c(2, 1, 3)
y
## [1] 2 1 3
We created a vector named y containing three values, 2, 1 and 3.
R uses the symbol ‘<-’ for assignment rather than the typical ‘=’. R does allow the ‘=’ symbol for object assignment, but it is not the standard syntax, and there are some situations in which it won’t work, as the sketch below shows.
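A minimal sketch of one such situation, using the built-in median() function (whose first argument happens to be named x): inside a function call, ‘=’ is treated as argument matching rather than assignment, so no object is created in the workspace.

median(x = 1:10)   # x is matched to the function argument; in a fresh session, no object x is created
median(x <- 1:10)  # assigns 1:10 to the object x in the workspace, then computes the median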
The workspace is our current R working environment and includes any user-defined objects. At the end of an R session, we can save an image of the current workspace that is automatically reloaded the next time R starts, as in the sketch below.
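A minimal sketch of working with the workspace (the filename is hypothetical):

ls()                           # list the objects in the current workspace
save.image("mysession.RData")  # save an image of the current workspace to a file
load("mysession.RData")        # reload the saved image in a later session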
The current working directory is the directory from which R will read files and to which it will save results by default. We can find out the current working directory by using the getwd() function.
getwd()
## [1] "C:/Users/Asmui/Documents/time series notes"
We can set the current working directory by using the setwd() function or by navigating to Session > Set Working Directory > Choose Directory in RStudio. Note that if we need to input (or read) a file that is not in the current working directory, we have to use the full pathname in the call, as in the sketch below.
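For illustration, a minimal sketch (the data file and its folder are hypothetical):

setwd("C:/Users/Asmui/Documents/time series notes")  # set the working directory
sales.data <- read.csv("C:/Data/sales.csv")          # full pathname for a file outside the working directory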
Packages are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored on our computer is called the library.
R comes with a standard set of packages (including base, utils, stats, graphics, and many more). These standard packages are included in every R installation and do not need to be loaded with the library() function.
In addition, thousands of user-contributed packages are available as extensions to R’s capabilities. These packages are available for download and installation. Once installed, they must be loaded into the session (every new session) before they can be used.
To install a package for the first time:
install.packages ("tseries")
or use the RStudio menu: Tools > Install Packages.
Load the installed package:
library (tseries)
A scalar: holds only a single atomic value at a time.
h <- 3.5
Vector: a one-dimensional array that can hold numeric, character, or logical data. The combine function c() is used to form a vector.
a <- c(1, 2, 4, -1)
b <- c("blue", "yellow", "green")
c <- c(TRUE, TRUE, FALSE, FALSE, TRUE)
The outputs are given below:
a
## [1] 1 2 4 -1
b
## [1] "blue" "yellow" "green"
c
## [1] TRUE TRUE FALSE FALSE TRUE
Matrix: a two-dimensional array in which each element has the same mode (numeric, character, or logical).
y1 <- matrix (1:20, nrow=5, ncol=4)
y1
## [,1] [,2] [,3] [,4]
## [1,] 1 6 11 16
## [2,] 2 7 12 17
## [3,] 3 8 13 18
## [4,] 4 9 14 19
## [5,] 5 10 15 20
y2 <- matrix (1:20, nrow=5, ncol=4, byrow=TRUE)
y2
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 5 6 7 8
## [3,] 9 10 11 12
## [4,] 13 14 15 16
## [5,] 17 18 19 20
Data frame: more general than a matrix (two-dimensional) in that different columns can contain different modes of data (numeric, character, and so on). It is the most common data structure that we deal with in R.
age <- c(25, 34, 28, 52)
bloodtype <- c("O", "A", "B", "AB")
blood.data <- data.frame (age, bloodtype)
blood.data
## age bloodtype
## 1 25 O
## 2 34 A
## 3 28 B
## 4 52 AB
Time series: represents data sampled chronologically at equally spaced discrete time intervals. The ts() function is used to create time-series objects.
sales <- c(18, 33, 41, 7, 34, 35, 24, 25, 24, 21, 25, 20, 22, 31,
40, 29, 25, 21, 22, 54, 31, 25, 26, 35)
sales.ts <- ts(sales, start=c(2018, 1), frequency=12)
sales.ts
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 2018 18 33 41 7 34 35 24 25 24 21 25 20
## 2019 22 31 40 29 25 21 22 54 31 25 26 35
We can plot the time series object, sales.ts:
plot(sales.ts, type="o", pch=19, col="blue")
Some important functions related to time series in R Programming are listed below; a short demonstration using the sales.ts object follows the table.
| Function | Package | Use |
|---|---|---|
| ts() | stats | Creates a time-series object. |
| plot() | graphics | Plots a time series. |
| start() | stats | Returns the starting time of a time series. |
| end() | stats | Returns the ending time of a time series. |
| frequency() | stats | Returns the period of a time series. |
| window() | stats | Subsets a time-series object. |
| ma() | forecast | Fits a simple moving-average model. |
| stl() | stats | Decomposes a time series into seasonal, trend, and irregular components using loess. |
| monthplot() | stats | Plots the seasonal components of a time series. |
| seasonplot() | forecast | Generates a season plot. |
| HoltWinters() | stats | Fits an exponential smoothing model. |
| forecast() | forecast | Forecasts future values of a time series. |
| accuracy() | forecast | Reports fit measures for a time-series model. |
| ets() | forecast | Fits an exponential smoothing model. Includes the ability to automate the selection of a model. |
| lag() | stats | Returns a lagged version of a time series. |
| Acf() | forecast | Estimates the autocorrelation function. |
| Pacf() | forecast | Estimates the partial autocorrelation function. |
| diff() | base | Returns lagged and iterated differences. |
| ndiffs() | forecast | Determines the level of differencing needed to remove trends in a time series. |
| adf.test() | tseries | Computes an Augmented Dickey-Fuller test that a time series is stationary. |
| arima() | stats | Fits autoregressive integrated moving-average models. |
| Box.test() | stats | Computes a Ljung-Box test that the residuals of a time series are independent. |
| bds.test() | tseries | Computes the BDS test that a series consists of independent, identically distributed random variables. |
| auto.arima() | forecast | Automates the selection of an ARIMA model. |
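As a short demonstration, several of these functions can be applied directly to the sales.ts object created earlier:

start(sales.ts)      # starting time of the series: 2018, period 1
end(sales.ts)        # ending time of the series: 2019, period 12
frequency(sales.ts)  # 12, since the data are monthly
window(sales.ts, start=c(2018, 6), end=c(2019, 3))  # subset from Jun 2018 to Mar 2019
diff(sales.ts)       # lagged differences of the series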
The variables used may be categorized as:
dependent/endogenous/response variable and
independent/exogenous/explanatory variables.
For example, suppose we try to investigate the factors affecting sales, represented as:
\[y_t=f(x_{1t}, x_{2t}, ..., x_{mt})\tag{1}\] where \(y_t\) is the total sales at time \(t\), and \(x_{1t},x_{2t},x_{3t}, ..., x_{mt}\) are the \(m\) possible factors affecting total sales at time \(t\), which may include the price of the product, consumers’ income level, interest rate, and so forth.
Equation (1) states that the total sales, \(y_t\), is influenced by the factors \(x_{1t},x_{2t},x_{3t}, ..., x_{mt}\), which are defined as independent variables, and that the relationship between these variables is established based on historical data.
Dependent variable:
\(y_t\)
Independent variables:
\(x_{1t}, x_{2t},x_{3t},...,x_{mt}\)
\(x_{j(t-p)}\) (lag \(p\) of the \(j^{th}\) variable) for \(j=1,2,3,...,m\)
\(y_{t-1},y_{t-2},...,y_{t-p}\), for \(p=1,2,3,...\) (lags of the dependent variable used as independent variables)
and it can be put in compact form: \[ y_t=\beta_0+\sum_{j=1}^m\beta_jx_{jt}+\varepsilon_t \tag{2}\]
A general model with lagged variables used as independent variables can be written as:
\[ y_t=\beta_0+\sum_{k=1}^K\beta_kx_{kt}+\sum_{j=1}^{P}\phi_jy_{t-j}+\sum_{k=1}^K\sum_{j=1}^q\omega_{kj}x_{k(t-j)}+\varepsilon_t \tag{3}\] Equation (2) assumes that the relationship between \(y_t\) and the \(x_{jt}\)’s is linear, that the matrix of all the \(x_{jt}\) variables is non-stochastic (non-random, with specified fixed values), and that \(y_t\) is a random variable.
To explain the concept, let us consider a regression model with one independent variable, \(x_{1t}\),
\[y_t=\beta_{0}+\beta_1x_{1t}+\varepsilon_t \tag{4}\]
where \(\beta_0\) and \(\beta_1\) are unknown parameters to be estimated, and \(\varepsilon_t\) is independently, identically and normally distributed with mean zero and variance \(\sigma_{\varepsilon}^2\).
Next, it follows that the estimated regression equation can be written as \[\hat{y}_t=\hat{\beta}_0+\hat{\beta}_1x_{1t}\tag{5}\] where \(\hat{\beta}_0\) and \(\hat{\beta}_1\) are unbiased estimators of \(\beta_0\) and \(\beta_1\), respectively. From Equation (4), therefore, we have
\[ \begin{aligned} e_t&=y_t-\hat{y}_t\\ &=y_t-(\hat{\beta}_0+\hat{\beta}_1x_{1t}) \end{aligned} \] and therefore \[ \sum_{t=1}^ne_t^2=\sum_{t=1}^n[y_t-(\hat{\beta}_0+\hat{\beta}_1x_{1t})]^2\tag{6}\] Thus, minimising \(\sum_{t=1}^ne_t^2\) is also minimising \(\sum_{t=1}^n[y_t-(\hat{\beta}_0+\hat{\beta}_1x_{1t})]^2\). Further explanation is given in the textbook, pages 175-178.
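Setting the partial derivatives of Equation (6) with respect to \(\hat{\beta}_0\) and \(\hat{\beta}_1\) equal to zero yields the familiar closed-form OLS estimators for the simple regression case: \[\hat{\beta}_1=\frac{\sum_{t=1}^n(x_{1t}-\bar{x}_1)(y_t-\bar{y})}{\sum_{t=1}^n(x_{1t}-\bar{x}_1)^2},\qquad \hat{\beta}_0=\bar{y}-\hat{\beta}_1\bar{x}_1\]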
The goal of OLS is to select the model parameters (intercept and slopes) that minimise the sum of squared differences between the actual response values and those predicted by the model.
Generally, to properly interpret the coefficients of the OLS model, we must satisfy a number of statistical assumptions, which we will discuss later.
Since we will be formulating models that assume the dependent variable, \(y_t\), is a linear function of a set of independent variables, we will discuss how to fit a multiple linear regression model using R.
The basic function for fitting a linear model is lm(). The format is
model1 <- lm(formula, data)
where formula describes the model to be fit and data is the data frame containing the data to be used in fitting the model. The resulting object (model1) is a list that contains extensive information about the model. The formula is typically written as; \[Y \sim X_1+X_2+...+X_k \] where the ~ separates the response variable on the left from the predictor variables on the right, and the predictor variables are separated by + signs.
Example using the built-in dataset state.x77. It is always a good idea to examine the relationships among the variables two at a time. Bivariate correlations are provided by the cor() function, and scatter plots are generated by the scatterplotMatrix() function in the car package.
states <- as.data.frame(state.x77[, c("Murder", "Population", "Illiteracy", "Income", "Frost")])
cor(states)
## Murder Population Illiteracy Income Frost
## Murder 1.0000000 0.3436428 0.7029752 -0.2300776 -0.5388834
## Population 0.3436428 1.0000000 0.1076224 0.2082276 -0.3321525
## Illiteracy 0.7029752 0.1076224 1.0000000 -0.4370752 -0.6719470
## Income -0.2300776 0.2082276 -0.4370752 1.0000000 0.2262822
## Frost -0.5388834 -0.3321525 -0.6719470 0.2262822 1.0000000
library(car)
## Loading required package: carData
scatterplotMatrix (states, spread=FALSE, smoother.args=list(lty=2), main="Scatter Plot Matrix")
Next, we will fit the multiple regression model by using the lm() function:
fit <- lm(Murder ~ Population + Illiteracy + Income + Frost, data = states)
summary(fit)
##
## Call:
## lm(formula = Murder ~ Population + Illiteracy + Income + Frost,
## data = states)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.7960 -1.6495 -0.0811 1.4815 7.6210
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.235e+00 3.866e+00 0.319 0.7510
## Population 2.237e-04 9.052e-05 2.471 0.0173 *
## Illiteracy 4.143e+00 8.744e-01 4.738 2.19e-05 ***
## Income 6.442e-05 6.837e-04 0.094 0.9253
## Frost 5.813e-04 1.005e-02 0.058 0.9541
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.535 on 45 degrees of freedom
## Multiple R-squared: 0.567, Adjusted R-squared: 0.5285
## F-statistic: 14.73 on 4 and 45 DF, p-value: 9.133e-08
| Function | Action |
|---|---|
| summary() | Displays detailed results for the fitted model |
| coefficients() | Lists the model parameters (intercept and slopes) for the fitted model |
| confint() | Provides confidence intervals for the model parameters (95% by default) |
| fitted() | Lists the predicted values in a fitted model |
| residuals() | Lists the residual values in a fitted model |
| anova() | Generates an ANOVA table for a fitted model, or an ANOVA table comparing two or more fitted models |
| vcov() | Lists the covariance matrix for model parameters |
| AIC() | Prints Akaike’s Information Criterion |
| plot() | Generates diagnostic plots for evaluating the fit of a model |
| predict() | Uses a fitted model to predict response values for a new dataset |
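As a quick sketch, some of these functions can be applied to the fitted object fit from above:

coefficients(fit)     # intercept and slopes of the fitted model
confint(fit)          # 95% confidence intervals for the model parameters
AIC(fit)              # Akaike's Information Criterion
head(fitted(fit))     # first few predicted values
head(residuals(fit))  # first few residuals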
A good model builder will initially address several important issues prior to actually starting to develop the model. Building an econometric model for forecasting purposes is not simply the act of finding the dependent variable to be forecast and then determining the independent variable(s) to explain it. These are some issues that need to be settled before we construct the model.
The following examples describe some of the commonly used statistical testing procedures in R Programming.
We will use several data sets as examples for each statistical testing procedure.
library (dynlm)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library (foreign)
library (car)
library (lmtest)
If there is no package called ‘dynlm’, ‘foreign’, ‘car’, and/or ‘lmtest’, install the packages as needed (an internet connection is required):
install.packages ('dynlm')
install.packages ('foreign')
install.packages ('car')
install.packages ('lmtest')
The ‘dynlm’ package is needed to incorporate lag terms in the model.
The ‘foreign’ package is needed to read the data from a URL.
The ‘car’ package provides the vif() function to test for multicollinearity.
The ‘lmtest’ package is needed to perform the Durbin-Watson test.
data1<-longley
colnames (longley)
## [1] "GNP.deflator" "GNP" "Unemployed" "Armed.Forces" "Population"
## [6] "Year" "Employed"
reg1<-lm(Employed~GNP.deflator + GNP + Unemployed + Armed.Forces + Population, data=data1)
summary(reg1)
##
## Call:
## lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
## Population, data = data1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.55324 -0.36478 0.06106 0.20550 0.93359
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 92.461308 35.169248 2.629 0.0252 *
## GNP.deflator -0.048463 0.132248 -0.366 0.7217
## GNP 0.072004 0.031734 2.269 0.0467 *
## Unemployed -0.004039 0.004385 -0.921 0.3788
## Armed.Forces -0.005605 0.002838 -1.975 0.0765 .
## Population -0.403509 0.330264 -1.222 0.2498
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4832 on 10 degrees of freedom
## Multiple R-squared: 0.9874, Adjusted R-squared: 0.9811
## F-statistic: 156.4 on 5 and 10 DF, p-value: 3.699e-09
Based on the output above, the F-statistic is equal to 156.4 and is highly significant (p-value = 3.699e-09 < \(\alpha = 0.05\)). Overall, the estimated model fits the data well.
Next, we carry out individual significance tests on the estimated model by using the p-values of the t-tests. The null hypothesis of each t-test implies that the individual variable should not be included in the model, i.e. \(\beta=0\).
Based on the output above, only GNP is significant, with p-value = 0.0467, which is less than \(\alpha = 0.05\); the other variables are insignificant at the 5% significance level (we fail to reject the null hypothesis).
We can measure the goodness of fit of the model by using the R-squared and/or the adjusted R-squared value, which is bounded between 0 and 1.
The R-squared value is interpreted as the proportion of the total variation in \(y\) that is explained by the independent variable(s). In this example, 98.74% of the total variation in Employed is explained by the independent variables, while the remaining 1.26% is explained by other factors.
It is advisable to evaluate the goodness of fit of the model based on the adjusted R-squared value: the closer it is to 1, the better the fit. For this example, the adjusted \(R^2\) = 0.9811, suggesting the estimated model fits the data well.
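For reference, the adjusted R-squared penalises R-squared for the number of independent variables: \[\bar{R}^2=1-(1-R^2)\frac{n-1}{n-k-1}\] where \(n\) is the number of observations and \(k\) is the number of independent variables. Here, with \(n=16\) and \(k=5\), \(1-(1-0.9874)\times 15/10 = 0.9811\), matching the output.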
Note that, based on the earlier t-statistics, there might be only one variable contributing to the high R-squared and adjusted R-squared values; thus, further investigation is needed during this validation and testing procedure.
The usual practice in econometric modelling is to assume the error variance is constant over all times and locations (homoscedasticity).
If we do not have constant variance, then we have heteroscedasticity. When this issue is present, the parameters obtained by the OLS method are no longer minimum-variance unbiased estimators, and over time the estimates of the dependent variable become less and less predictable.
A common procedure to eliminate the heteroscedasticity problem is to apply a suitable transformation, such as a log transformation (a sketch follows the Breusch-Pagan example below).
In addition to residual plots, the White test and the Breusch-Pagan test are commonly used to detect heteroscedasticity.
bptest (reg1)
##
## studentized Breusch-Pagan test
##
## data: reg1
## BP = 3.9203, df = 5, p-value = 0.5609
For the regression model fitted earlier, the Breusch-Pagan test produces a p-value greater than 0.05, so we fail to reject the null hypothesis that the variance of the residuals is constant. Thus, the model is homoscedastic.
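Had heteroscedasticity been detected, the log transformation mentioned earlier could be applied and the test repeated. A minimal illustrative sketch (the object name reg1.log is hypothetical):

reg1.log <- lm(log(Employed) ~ GNP.deflator + GNP + Unemployed + Armed.Forces + Population, data = data1)  # refit with a log-transformed response
bptest(reg1.log)  # re-check the constant-variance hypothesis on the transformed model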
When the independent variables are related to each other, multicollinearity among the variables is said to exist.
vif (reg1)
## GNP.deflator GNP Unemployed Armed.Forces Population
## 130.829201 639.049777 10.786858 2.505775 339.011693
tol <- 1/vif(reg1)
Collinearity <- data.frame (VIF = vif (reg1), Tolerance = tol)
Collinearity
## VIF Tolerance
## GNP.deflator 130.829201 0.007643554
## GNP 639.049777 0.001564823
## Unemployed 10.786858 0.092705399
## Armed.Forces 2.505775 0.399078059
## Population 339.011693 0.002949751
Multicollinearity can be detected if:
the largest VIF is greater than 10, or
the tolerance statistic is below 0.1.
Based on the output, there is a serious multicollinearity problem: most of the VIFs are greater than 10 (tolerance below 0.1), suggesting that remedial action is needed to improve the model fit.
Remedial actions (a small sketch follows the list):
If the presence of multicollinearity does not affect forecasting performance, retain the variables.
The usual procedure when multicollinearity exists is to drop the offending variable, or alternatively to drop the variable that provides the lesser contribution towards model improvement.
Increase the sample size, since a larger data set is presumed to provide more accurate estimates.
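A minimal illustrative sketch of the dropping strategy (the object name reg1b is hypothetical): refit without the variables showing the largest VIFs, then re-check whether the VIFs have fallen.

reg1b <- lm(Employed ~ GNP + Unemployed + Armed.Forces, data = data1)  # drop GNP.deflator and Population
vif(reg1b)  # re-check the variance inflation factors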
Serial correlation is also known as autocorrelation. It occurs when the error terms, \(\varepsilon_t\), corresponding to different periods of time are related to each other.
For this example, we will use the phillips data to demonstrate serial correlation.
phillips<-read.dta("http://fmwww.bc.edu/ec-p/data/wooldridge/phillips.dta")
tsdata<-ts(phillips, start=1948) #define yearly time series data in 1948
reg.s<-dynlm(inf~unem, data=tsdata, end=1996) #estimation of static Phillips curve
reg.ea<-dynlm(d(inf)~unem, data=tsdata, end=1996) #same with expectations-augmented Phillips curve
Durbin-Watson test for Serial Correlation
dwtest(reg.s)
##
## Durbin-Watson test
##
## data: reg.s
## DW = 0.8027, p-value = 7.552e-07
## alternative hypothesis: true autocorrelation is greater than 0
dwtest(reg.ea)
##
## Durbin-Watson test
##
## data: reg.ea
## DW = 1.7696, p-value = 0.1783
## alternative hypothesis: true autocorrelation is greater than 0
The null hypothesis of the Durbin-Watson test is that there is no serial correlation in the model. Since the p-value for reg.s is lower than \(\alpha = 0.05\), there is serial correlation in the model estimated from the static Phillips curve.
However, for the second regression, reg.ea, the p-value is greater than \(\alpha = 0.05\), which means the regression shows no evidence of autocorrelation. If only the Durbin-Watson statistic value is given, we can use the rule of thumb discussed during the class session.
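As an additional hedged sketch, the Breusch-Godfrey test from the ‘lmtest’ package is a common alternative to the Durbin-Watson test and can test for higher-order serial correlation:

bgtest(reg.s, order = 2)  # H0: no serial correlation up to order 2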
library (dynlm)
Download the Example6_3.csv data set.
Read the Example6_3.csv data set into the R console:
example6.3 <- read.csv (file.choose (), header = TRUE)
and choose the Example6_3 data set in your downloaded folder.
Or, copy and paste the data set into your file directory and run this code in the R console:
example6.3 <- read.csv ("Example6_3.csv", header = TRUE)
example6.3 <- ts (example6.3, start = 1962, frequency = 1)
str (example6.3) # check the structure
## Time-Series [1:36, 1:9] from 1962 to 1997: 1962 1963 1964 1965 1966 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:9] "Year" "Cars" "UnemplRa" "GDP" ...
head (example6.3) # check the first 6 observations
## Year Cars UnemplRa GDP Export PopSize AvCarLan PerCapIn CPI
## [1,] 1962 11.9 7.9 10426 2626 7.4 7.5 1409 31.3
## [2,] 1963 14.1 7.8 13077 2705 8.9 7.2 1469 32.2
## [3,] 1964 16.5 7.8 13932 2781 9.2 7.2 1514 32.1
## [4,] 1965 18.1 7.9 15400 3103 9.4 7.5 1638 32.9
## [5,] 1966 17.6 7.8 16376 3120 9.7 7.5 1688 33.4
## [6,] 1967 16.3 7.8 16612 3723 10.0 7.4 1661 34.8
List all the variable names:
colnames (example6.3)
## [1] "Year" "Cars" "UnemplRa" "GDP" "Export" "PopSize" "AvCarLan"
## [8] "PerCapIn" "CPI"
Plot all the variables except the ‘Year’ variable:
plot (example6.3[,-1], main = "Plotting for All Variables over Times (1962-1997)")
Comment: (Answer 1)
We will demonstrate how to perform the Model Estimation Procedure (general-to-specific approach) using R Programming.
reg1 <- dynlm(Cars~UnemplRa+GDP+Export+PopSize+AvCarLan+PerCapIn+CPI+L(Cars), data = example6.3)
summary (reg1)
##
## Time series regression with "ts" data:
## Start = 1963, End = 1997
##
## Call:
## dynlm(formula = Cars ~ UnemplRa + GDP + Export + PopSize + AvCarLan +
## PerCapIn + CPI + L(Cars), data = example6.3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -44.487 -7.154 -0.404 8.736 42.431
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -68.490721 120.329724 -0.569 0.574111
## UnemplRa 6.041032 7.192849 0.840 0.408640
## GDP -0.005416 0.004463 -1.213 0.235877
## Export 0.001423 0.001180 1.206 0.238688
## PopSize -2.310819 16.809494 -0.137 0.891718
## AvCarLan -5.795693 5.671624 -1.022 0.316255
## PerCapIn 0.106478 0.057699 1.845 0.076396 .
## CPI -0.064020 1.438768 -0.044 0.964849
## L(Cars) 0.932328 0.215747 4.321 0.000201 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.83 on 26 degrees of freedom
## Multiple R-squared: 0.963, Adjusted R-squared: 0.9517
## F-statistic: 84.67 on 8 and 26 DF, p-value: < 2.2e-16
Comment: (Answer 2)
reg2 <- dynlm(Cars~UnemplRa+Export+PopSize+AvCarLan+PerCapIn+CPI+L(Cars), data = example6.3)
summary (reg2)
##
## Time series regression with "ts" data:
## Start = 1963, End = 1997
##
## Call:
## dynlm(formula = Cars ~ UnemplRa + Export + PopSize + AvCarLan +
## PerCapIn + CPI + L(Cars), data = example6.3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -40.352 -8.066 0.741 8.500 44.422
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.345e+01 5.198e+01 1.221 0.232760
## UnemplRa 6.086e+00 7.255e+00 0.839 0.408962
## Export 6.707e-05 3.822e-04 0.175 0.862023
## PopSize -1.615e+01 1.245e+01 -1.297 0.205589
## AvCarLan -6.531e+00 5.688e+00 -1.148 0.261001
## PerCapIn 7.724e-02 5.288e-02 1.461 0.155663
## CPI -7.812e-01 1.323e+00 -0.590 0.559821
## L(Cars) 8.779e-01 2.129e-01 4.124 0.000318 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.99 on 27 degrees of freedom
## Multiple R-squared: 0.9609, Adjusted R-squared: 0.9508
## F-statistic: 94.89 on 7 and 27 DF, p-value: < 2.2e-16
Comment: (Answer 3)
reg3 <- dynlm(Cars~UnemplRa+Export+AvCarLan+PerCapIn+CPI+L(Cars), data = example6.3)
summary (reg3)
##
## Time series regression with "ts" data:
## Start = 1963, End = 1997
##
## Call:
## dynlm(formula = Cars ~ UnemplRa + Export + AvCarLan + PerCapIn +
## CPI + L(Cars), data = example6.3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -41.216 -4.940 -0.819 8.499 53.431
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 47.5385857 51.1243344 0.930 0.360390
## UnemplRa 0.1373883 5.6903397 0.024 0.980909
## Export 0.0003430 0.0003214 1.067 0.294909
## AvCarLan -6.6973355 5.7557271 -1.164 0.254408
## PerCapIn 0.0230321 0.0327941 0.702 0.488275
## CPI -0.9711533 1.3309798 -0.730 0.471663
## L(Cars) 0.8708293 0.2153761 4.043 0.000374 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 19.22 on 28 degrees of freedom
## Multiple R-squared: 0.9585, Adjusted R-squared: 0.9496
## F-statistic: 107.8 on 6 and 28 DF, p-value: < 2.2e-16
Comment: (Answer 4)
reg4 <- dynlm(Cars~Export+AvCarLan+PerCapIn+CPI+L(Cars), data = example6.3)
summary (reg4)
##
## Time series regression with "ts" data:
## Start = 1963, End = 1997
##
## Call:
## dynlm(formula = Cars ~ Export + AvCarLan + PerCapIn + CPI + L(Cars),
## data = example6.3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -41.278 -4.893 -0.847 8.518 53.452
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 48.3930460 36.2537524 1.335 0.192
## Export 0.0003476 0.0002544 1.366 0.182
## AvCarLan -6.6159001 4.5828373 -1.444 0.160
## PerCapIn 0.0227653 0.0303402 0.750 0.459
## CPI -0.9679251 1.3012286 -0.744 0.463
## L(Cars) 0.8670720 0.1463046 5.926 1.95e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.89 on 29 degrees of freedom
## Multiple R-squared: 0.9585, Adjusted R-squared: 0.9514
## F-statistic: 134 on 5 and 29 DF, p-value: < 2.2e-16
Comment: (Answer 5)
reg5 <- dynlm(Cars~Export+AvCarLan+CPI+L(Cars), data = example6.3)
summary (reg5)
##
## Time series regression with "ts" data:
## Start = 1963, End = 1997
##
## Call:
## dynlm(formula = Cars ~ Export + AvCarLan + CPI + L(Cars), data = example6.3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -39.970 -6.370 0.206 7.429 56.067
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 66.0337831 27.3952319 2.410 0.02227 *
## Export 0.0004972 0.0001570 3.167 0.00353 **
## AvCarLan -8.0146917 4.1559314 -1.928 0.06330 .
## CPI -0.0268940 0.3443225 -0.078 0.93826
## L(Cars) 0.9015169 0.1379005 6.537 3.14e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.75 on 30 degrees of freedom
## Multiple R-squared: 0.9577, Adjusted R-squared: 0.9521
## F-statistic: 169.8 on 4 and 30 DF, p-value: < 2.2e-16
Comment: (Answer 6)
reg6 <- dynlm(Cars~Export+AvCarLan+L(Cars), data = example6.3)
summary (reg6)
##
## Time series regression with "ts" data:
## Start = 1963, End = 1997
##
## Call:
## dynlm(formula = Cars ~ Export + AvCarLan + L(Cars), data = example6.3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -40.084 -6.157 0.384 7.248 55.736
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 66.1549333 26.9092562 2.458 0.019741 *
## Export 0.0004911 0.0001340 3.663 0.000922 ***
## AvCarLan -8.1770225 3.5407782 -2.309 0.027751 *
## L(Cars) 0.8998982 0.1341310 6.709 1.66e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.45 on 31 degrees of freedom
## Multiple R-squared: 0.9577, Adjusted R-squared: 0.9536
## F-statistic: 233.9 on 3 and 31 DF, p-value: < 2.2e-16
Comment: (Answer 7)
Comment on each part:
Answer 1:
Answer 2:
Answer 3:
Answer 4:
Answer 5:
Answer 6:
Answer 7:
Notes compiled by:
Muhammad Asmu’i Abdul Rahim
Email: asmui@tmsk.uitm.edu.my
Updated on: 11 Nov 2020