October 2, 2017
\[ \mathsf{E}\{y_t | \boldsymbol{z}_t\} = \alpha' \boldsymbol{z}_t + g(\boldsymbol{z}_t) \label{eqn:first} \] where \(y_t\) is the dependent variable and \(\boldsymbol{z}_t\) is the vector of explanatory variables, which may include lagged values of \(y_t\).
The model for the conditional mean is said to be linear if \(g(\boldsymbol{z}_t) \equiv 0\).
This definition is used in Lee, White and Granger (1993).
Linear models cannot capture asymmetry; lack of symmetry is a form of nonlinearity.
The theory is not very specific about how nonlinear the data will turn out to be.
The amount of nonlinearity may be reduced when moving from high-frequency to low-frequency data.
For nonlinear models, conditional quantities such as the conditional distribution, the conditional mean, and the conditional variance are much more important than their unconditional counterparts.
In most models in the book, strict stationarity and existence of at least second‐order moments are assumed.
A non-stationary series may possibly be transformed into a stationary one.
\[ y_t = \varepsilon_t + \sum_{j=1}^q \theta_j \varepsilon_{t-j} \]
Note that \(q\) is not necessarily finite, and that the representation is an identity in the mean square sense.
A formal nonlinear generalization, the Volterra expansion:
\[ y_t = \sum_{i=0}^q \theta_i \varepsilon_{t-i} + \sum_{i=0}^q \sum_{j=i}^q \theta_{ij} \varepsilon_{t-i} \varepsilon_{t-j} + \sum_{i=0}^q \sum_{j=i}^q \sum_{k=j}^q \theta_{ijk} \varepsilon_{t-i} \varepsilon_{t-j} \varepsilon_{t-k} + ... \]
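As an illustration, a minimal simulation sketch of a truncated second-order Volterra series; the kernel coefficients below are assumed for illustration only:

```python
import numpy as np

rng = np.random.default_rng(42)
T, q = 500, 2
eps = rng.standard_normal(T + q)   # eps[t + q - i] stores eps_{t-i}

# Illustrative kernel coefficients (assumed values, not from the text)
theta1 = {0: 1.0, 1: 0.5, 2: 0.3}       # linear terms theta_i, i = 0, ..., q
theta2 = {(0, 1): 0.4, (1, 1): -0.2}    # quadratic terms theta_ij, i <= j

y = np.empty(T)
for t in range(T):
    lin = sum(coef * eps[t + q - i] for i, coef in theta1.items())
    quad = sum(coef * eps[t + q - i] * eps[t + q - j]
               for (i, j), coef in theta2.items())
    y[t] = lin + quad
```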
An example, the disequilibrium model of demand and supply: \[ D_t = \alpha_0' x_t^D + \alpha_1 p_t + \varepsilon_t^D \] \[ S_t = \beta_0' x_t^S + \beta_1 p_t + \varepsilon_t^S, \] with the "min-condition" \[ D_t^{obs} = \min(D_t, S_t) \] and the price adjustment equation \[ p_t - p_{t-1} = \gamma (D_t - S_t). \]
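A minimal simulation sketch of this disequilibrium system, with the exogenous vectors \(x_t^D, x_t^S\) reduced to intercepts; all parameter values are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
p = np.empty(T + 1); p[0] = 1.0
D, S, Q = np.empty(T), np.empty(T), np.empty(T)

# Illustrative parameters (assumed, not from the text)
a0, a1 = 10.0, -0.8   # demand: downward sloping in the price
b0, b1 = 2.0, 0.9     # supply: upward sloping in the price
gamma = 0.1           # speed of price adjustment

for t in range(T):
    D[t] = a0 + a1 * p[t] + rng.normal(scale=0.5)
    S[t] = b0 + b1 * p[t] + rng.normal(scale=0.5)
    Q[t] = min(D[t], S[t])                   # min-condition: only the short side is observed
    p[t + 1] = p[t] + gamma * (D[t] - S[t])  # excess demand raises the price
```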
Another example, the translog production function: \[ \ln y = \ln \gamma + \alpha_1 \ln x_1 + \alpha_2 \ln x_2 + \alpha_{11}(\ln x_1)^2 + \alpha_{22} (\ln x_2)^2 + \alpha_{12} (\ln x_1 \ln x_2). \]
The switching regression (SR) model \[ y_t = \sum_{j=1}^r (\phi_j' \boldsymbol{z}_t + \varepsilon_{jt}) I(c_{j-1} < s_t \leq c_j) \] where \(s_t\) is an observable switching variable and \(-\infty = c_0 < c_1 < \ldots < c_r = \infty\).
A special case, the two-regime SR model: \[ y_t = (\phi_1' \boldsymbol{z}_t + \varepsilon_{1t}) I(s_t \leq c_1) + (\phi_2' \boldsymbol{z}_t + \varepsilon_{2t}) I(s_t > c_1) \]
When \(\boldsymbol{z}_t\) only contains the intercept and the lagged \(y_t\), and \(s_t = y_{t-d}\), the model becomes the self‐exciting threshold autoregressive (SETAR, or TAR for short) model.
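A minimal sketch simulating a two-regime SETAR(1) with \(d = 1\) and \(c_1 = 0\); the coefficient values are illustrative, not from the text:

```python
import numpy as np

rng = np.random.default_rng(7)
T, c1 = 500, 0.0

# Illustrative coefficients for a two-regime SETAR(1), d = 1 (assumed values)
phi1 = (0.5, 0.9)    # (intercept, AR coefficient) when y_{t-1} <= c1
phi2 = (-0.5, 0.2)   # (intercept, AR coefficient) when y_{t-1} > c1

y = np.zeros(T)
for t in range(1, T):
    a, b = phi1 if y[t - 1] <= c1 else phi2   # regime selected by the lagged level
    y[t] = a + b * y[t - 1] + rng.standard_normal()
```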
If only the intercept switches between regimes, the model becomes \[ y_t = \sum_{j=1}^r \phi_{0j} I(c_{j-1} < s_t \leq c_j) + \phi' \tilde{\boldsymbol{w}}_t + \varepsilon_t \]
Estimation of SR models can be carried out by conditional least squares.
For the asymptotic distribution of the threshold estimate \(\widehat{c}\), see Chan (1993) and Hansen (2000).
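A sketch of conditional least squares for the two-regime case with \(s_t = y_{t-1}\): grid-search the threshold over interior order statistics of the switching variable and run OLS within each regime. The function name and trimming fraction are illustrative:

```python
import numpy as np

def fit_setar1(y, trim=0.15):
    """Two-regime SETAR(1), d = 1: conditional least squares via grid search over c."""
    Y, ylag = y[1:], y[:-1]
    X = np.column_stack([np.ones_like(ylag), ylag])
    s = ylag                                  # switching variable s_t = y_{t-1}
    order = np.sort(s)
    grid = order[int(trim * len(s)):int((1 - trim) * len(s))]
    best_ssr, best_c = np.inf, None
    for c in grid:
        ssr = 0.0
        for mask in (s <= c, s > c):          # OLS separately in each regime
            b, *_ = np.linalg.lstsq(X[mask], Y[mask], rcond=None)
            ssr += np.sum((Y[mask] - X[mask] @ b) ** 2)
        if ssr < best_ssr:
            best_ssr, best_c = ssr, c
    return best_c                             # CLS estimate of the threshold

# e.g. c_hat = fit_setar1(y) for a series y generated as above
```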
The observable regime indicator \(s_t\) of the SR model is replaced by an unobservable discrete stochastic variable \(\theta_t\).
The sequence \(\{\theta_t\}\) is assumed to be a sequence of iid variables or to follow a Markov chain, typically of order one, with transition probabilities
\[ p_{ij} = \mathsf{Pr} \{ \theta_t = \nu_j | \theta_{t-1} = \nu_i \}, \quad i,j = 1, ..., r. \]
The Markov-switching (MS) or hidden Markov regression model:
\[ y_t = \sum_{j=1}^r (\phi_j' \boldsymbol{z}_t + \varepsilon_{jt}) I(\theta_t = \nu_j) \]
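A minimal sketch simulating a two-state MS autoregression; the transition matrix and regime coefficients below are assumed values:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 500

# Illustrative two-state chain and regime parameters (assumed values)
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])       # P[i, j] = Pr(theta_t = j | theta_{t-1} = i)
phi = [(0.5, 0.8), (-0.5, 0.3)]    # (intercept, AR coefficient) in each regime

y, state = np.zeros(T), 0
for t in range(1, T):
    state = rng.choice(2, p=P[state])   # unobserved regime follows the Markov chain
    a, b = phi[state]
    y[t] = a + b * y[t - 1] + rng.standard_normal()
```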
The SR model has been criticized for the lack of smoothness of its transition mechanism.
Bacon and Watts (1971) considered two regression lines and devised a model in which the transition from one line to the other is smooth.
Goldfeld and Quandt (1972) independently presented an STR model and suggested that the step function \(I\) be replaced by a normal cdf.
Maddala (1977) recommended the logistic function instead of the normal cdf, and this has become the prevailing standard.
The logistic STR (LSTR) model
\[ y_t = \{ \phi + \psi G(\gamma, c, s_t) \}' \boldsymbol{z}_t + \varepsilon_t \] with \[ G(\gamma, c, s_t) = \left( 1 + \exp \left\{ - \gamma \prod_{k=1}^K (s_t - c_k) \right\} \right)^{-1} \] where \(\gamma > 0\) is an identifying restriction. For \(K = 1\) the transition is monotonic in \(s_t\); for \(K = 2\) it is non-monotonic and symmetric about \((c_1 + c_2)/2\).
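A sketch of the transition function in code; the helper name and argument order are illustrative:

```python
import numpy as np

def lstr_G(s, gamma, c):
    """Logistic transition function; c holds the location parameters c_1, ..., c_K."""
    prod = np.prod([s - ck for ck in np.atleast_1d(c)], axis=0)
    return 1.0 / (1.0 + np.exp(-gamma * prod))

s = np.linspace(-3, 3, 201)
G1 = lstr_G(s, gamma=2.0, c=0.0)          # K = 1: monotonic S-shaped transition
G2 = lstr_G(s, gamma=2.0, c=[-1.0, 1.0])  # K = 2: non-monotonic, symmetric about 0
```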
Van Dijk and Franses (1999) introduced the additive STR model \[ y_t = \phi_1' \boldsymbol{z}_t + \sum_{j=2}^n \phi_j' \boldsymbol{z}_t G(\gamma_j, c_j, s_{jt}) + \varepsilon_t \]
They also considered the multiple regime STAR model
\[ y_t = \phi_0' \boldsymbol{w}_t + \phi_1' \boldsymbol{w}_t G(\gamma_1, c_1, s_{1t}) + \phi_2' \boldsymbol{w}_t G(\gamma_2, c_2, s_{2t}) \] \[ + \phi_{12}' \boldsymbol{w}_t G(\gamma_1, c_1, s_{1t}) G(\gamma_2, c_2, s_{2t}) + \varepsilon_t \]
\[ y_t = \sum_{i=0}^\infty \theta_i x_{t-i} + \sum_{i=0}^\infty \sum_{j=i}^\infty \theta_{ij} x_{t-i} x_{t-j} + \sum_{i=0}^\infty \sum_{j=i}^\infty \sum_{k=j}^\infty \theta_{ijk} x_{t-i} x_{t-j} x_{t-k} + ... \]
The RHS is called the Volterra series expansion.
If the lag length, and thus the number of terms in each sum, is finite, the expression is called the Kolmogorov–Gabor polynomial.
The Kolmogorov–Gabor polynomial is a universal approximator.
Granger and Hyung (2006) introduced the min-max model \[ y_{1t} = \max ( \alpha y_{1,t-1} + a, \quad \beta y_{2,t-1} + b ) + \varepsilon_{1t} \] \[ y_{2t} = \min ( \gamma y_{1,t-1} + c, \quad \delta y_{2,t-1} + d ) + \varepsilon_{2t} \]
The authors are particularly interested in the special case \(\alpha=\beta=\gamma=\delta=1\).
They show that when \(a-d<0\), the process \[ u_t = y_{1t} - y_{2t} \] is geometrically ergodic, and thus \((1,-1)\) may be viewed as the cointegration vector.
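A minimal simulation sketch of this special case, with illustrative intercepts chosen so that \(a - d < 0\):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 1000

# alpha = beta = gamma = delta = 1; illustrative intercepts with a - d < 0
a, b, c, d = 0.2, 0.0, 0.0, 0.5
y1, y2 = np.zeros(T), np.zeros(T)
for t in range(1, T):
    y1[t] = max(y1[t - 1] + a, y2[t - 1] + b) + rng.standard_normal()
    y2[t] = min(y1[t - 1] + c, y2[t - 1] + d) + rng.standard_normal()

u = y1 - y2   # per the text, geometrically ergodic when a - d < 0
```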
Application to US interest rates of different frequencies.
Nonlinear moving average models: threshold effects in the parameters of the MA model.
Bilinear models: autoregressive and moving average terms are combined in such a way that the models are nonlinear in variables but linear in parameters.
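For concreteness, a generic bilinear specification; the lag orders and coefficient names here are illustrative, not from the text: \[ y_{t}=\sum_{i=1}^{p}\phi _{i}y_{t-i}+\varepsilon _{t}+\sum_{j=1}^{q}\theta _{j}\varepsilon _{t-j}+\sum_{i=1}^{m}\sum_{j=1}^{k}\beta _{ij}y_{t-i}\varepsilon _{t-j} \] The cross-products \(y_{t-i}\varepsilon_{t-j}\) make the model nonlinear in variables while the coefficients still enter linearly.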
Time-Varying Parameters and State Space Models
Random Coefficient Models
Volatility Models
Consider the following additive nonlinear model \[ y_{t}=\mathbf{\beta }^{\prime }\mathbf{z}_{t}+G(\mathbf{z}_{t};\mathbf{\gamma })+\varepsilon _{t} \]
Assume that \(G(\mathbf{z}_{t};\mathbf{0})=0\) and \(G(\mathbf{z}_{t};\mathbf{\gamma })\neq 0\) for \(\mathbf{\gamma }\neq \mathbf{0}\).
It appears that the best way of testing the hypothesis is to apply the Lagrange multiplier (LM) or score principle, because it only requires estimating the model under the null hypothesis, i.e., the linear model.
The log-likelihood function \[ L_{T}(\mathbf{\theta })=c-(T/2)\ln \sigma ^{2}-(1/2\sigma ^{2})\sum_{t=1}^{T}(y_{t}-\mathbf{\beta }^{\prime }\mathbf{z}_{t}-G(\mathbf{z}_{t};\mathbf{\gamma }))^{2}. \]
The resulting LM statistic for H\(_{0}: \mathbf{\gamma }=\mathbf{0}\) can be written as \[ LM = \widetilde{\sigma }^{-2}\widetilde{\mathbf{\varepsilon }}^{\prime }\mathbf{H}\{\mathbf{H}^{\prime }\mathbf{H}-\mathbf{H}^{\prime }\mathbf{Z}(\mathbf{Z}^{\prime }\mathbf{Z})^{-1}\mathbf{Z}^{\prime }\mathbf{H}\}^{-1}\mathbf{H}^{\prime }\widetilde{\mathbf{\varepsilon }} \] where \(\mathbf{Z}=(\mathbf{z}_{1},...,\mathbf{z}_{T})^{\prime }\), \(\mathbf{H}=(\mathbf{h}_{1}^{0},...,\mathbf{h}_{T}^{0})^{\prime }\) with \(\mathbf{h}_{t}^{0}=\partial G(\mathbf{z}_{t};\mathbf{\gamma })/\partial \mathbf{\gamma }|_{\mathbf{\gamma }=\mathbf{0}}\), \(\widetilde{\mathbf{\varepsilon }}=(\widetilde{\varepsilon }_{1},...,\widetilde{\varepsilon }_{T})^{\prime }\) is the residual vector from the estimated linear model, and \(\widetilde{\sigma }^{2}=T^{-1}\sum_{t=1}^{T}\widetilde{\varepsilon }_{t}^{2}\).
Under H\(_{0},\) the statistic has an asymptotic \(\chi ^{2}\) distribution with \(n\) degrees of freedom.
It is exactly the same statistic as the one obtained for testing the null hypothesis \(\mathbf{\delta =0}\) in the linear model \[ \mathbf{y}=\mathbf{Z\beta }+\mathbf{H\delta }+\mathbf{\varepsilon } \]
Another way of viewing the test is that it has been obtained after a linearization by a Taylor expansion around the null hypothesis. This suggests that there may be several nonlinear models with the same LM test of linearity.
or the F-version \[ LM_{F}=\frac{(SSR_{0}-SSR_{1})/n}{SSR_{1}/\{T-(k+p+1)-n\}}. \label{lmf} \]
Under the null hypothesis the latter statistic has an approximate F-distribution with \(n\) and \(T-(k+p+1)-n\) degrees of freedom.
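A minimal sketch of the two auxiliary regressions behind \(LM\) and \(LM_{F}\); the function name and interface are illustrative, and the matrix \(\mathbf{H}\) of added regressors must be built from the chosen expansion:

```python
import numpy as np

def linearity_lm_test(y, Z, H):
    """LM-type linearity test via auxiliary regressions.

    Z: regressors of the linear null model; H: added regressors.
    Returns the chi-squared version (T * R^2) and the F-version.
    """
    T, n = len(y), H.shape[1]
    b0, *_ = np.linalg.lstsq(Z, y, rcond=None)   # step 1: estimate the null model
    e0 = y - Z @ b0
    ssr0 = e0 @ e0
    ZH = np.column_stack([Z, H])                 # step 2: regress residuals on (Z, H)
    b1, *_ = np.linalg.lstsq(ZH, e0, rcond=None)
    e1 = e0 - ZH @ b1
    ssr1 = e1 @ e1
    lm = T * (ssr0 - ssr1) / ssr0                # asymptotically chi2(n) under H0
    lm_f = ((ssr0 - ssr1) / n) / (ssr1 / (T - ZH.shape[1]))  # approx. F(n, T-(k+p+1)-n)
    return lm, lm_f
```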
The test can be robustified against conditional heteroskedasticity. See Godfrey (1988) or Gouriéroux and Monfort (1990).
Let \[ y_{t}=\mathbf{\beta }^{\prime }\mathbf{z}_{t}+G_{1}(\mathbf{z}_{t};\mathbf{\alpha })+\varepsilon _{1t},\;\{\varepsilon _{1t}\}\sim \text{iid}(0,\sigma ^{2}) \label{nl-1} \] and \[ y_{t}=\mathbf{\beta }^{\prime }\mathbf{z}_{t}+G_{2}(\mathbf{z}_{t};\mathbf{\gamma })+\varepsilon _{2t},\;\{\varepsilon _{2t}\}\sim \text{iid}(0,\sigma ^{2}) \label{nl-2} \] be two additive nonlinear models.
Assume that the two models are linear for \(\mathbf{\alpha }=\mathbf{0}\) and \(\mathbf{\gamma }=\mathbf{0}\), respectively, and that \(\partial G_{1}(\mathbf{z}_{t};\mathbf{\alpha })/\partial \mathbf{\alpha }|_{\mathbf{\alpha }=\mathbf{0}}=\partial G_{2}(\mathbf{z}_{t};\mathbf{\gamma })/\partial \mathbf{\gamma }|_{\mathbf{\gamma }=\mathbf{0}}\).
It follows from the second condition that the LM tests derived for testing H\(_{01}: \mathbf{\alpha }=\mathbf{0}\) and H\(_{02}: \mathbf{\gamma }=\mathbf{0}\) are identical.
LM test based on Taylor expansion.
Consider again the following additive nonlinear model \[ y_{t}=\mathbf{\beta }_{0}^{\prime }\mathbf{z}_{t}+\mathbf{\beta }_{1}^{\prime }\mathbf{z}_{t}G(\mathbf{\gamma };\mathbf{s}_{t})+\varepsilon _{t}=(\mathbf{\beta }_{0}+\mathbf{\beta }_{1}G(\mathbf{\gamma };\mathbf{s}_{t}))^{\prime }\mathbf{z}_{t}+\varepsilon _{t} \label{add-nlmodel} \]
\(\beta_1\) is not identified when \(\gamma=0\).
\(\gamma\) is not identified when \(\beta_1=0\).
The model is only identified under the alternative.
Because the nuisance parameters are unidentified under the null hypothesis, standard asymptotic theory does not apply. Common remedies are the sup test, the average test, and the exponential test (standard forms sketched below).
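As a sketch, let \(LM_{T}(\mathbf{\gamma })\) denote the pointwise LM statistic computed for a fixed value of the unidentified parameter vector \(\mathbf{\gamma }\in \Gamma\) (this notation is introduced here; the functionals are the standard ones of Davies (1977) and Andrews and Ploberger (1994)): \[ \mathrm{supLM}=\sup_{\mathbf{\gamma }\in \Gamma }LM_{T}(\mathbf{\gamma }),\quad \mathrm{aveLM}=\int_{\Gamma }LM_{T}(\mathbf{\gamma })\,dW(\mathbf{\gamma }),\quad \mathrm{expLM}=\ln \int_{\Gamma }\exp \{LM_{T}(\mathbf{\gamma })/2\}\,dW(\mathbf{\gamma }) \] where \(W\) is a weight function (often uniform) over \(\Gamma\).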
Taylor expansion: approximate \(G\) locally around the null hypothesis.
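For example, in the LSTR model with \(K=1\), a third-order Taylor expansion of the logistic function around \(\gamma =0\) (as in Luukkonen, Saikkonen and Teräsvirta (1988)) turns the testing problem into the auxiliary regression \[ y_{t}=\mathbf{\theta }_{0}^{\prime }\mathbf{z}_{t}+\mathbf{\theta }_{1}^{\prime }\mathbf{z}_{t}s_{t}+\mathbf{\theta }_{2}^{\prime }\mathbf{z}_{t}s_{t}^{2}+\mathbf{\theta }_{3}^{\prime }\mathbf{z}_{t}s_{t}^{3}+\varepsilon _{t}^{\ast } \] in which linearity corresponds to H\(_{0}: \mathbf{\theta }_{1}=\mathbf{\theta }_{2}=\mathbf{\theta }_{3}=\mathbf{0}\), a hypothesis free of the identification problem.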
Parameter constancy (stability) is a crucial assumption, which should be tested after a model has been estimated.
The Chow (1960) test, single break
The Bai (1999) test, multiple breaks
Testing against smoothly changing parameters
The idea is to modify the smooth transition regression model to fit this situation.
No need to know where the break-point is in advance.
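A sketch of the resulting model, with (scaled) time as the transition variable, as in Lin and Teräsvirta (1994): \[ y_{t}=\{\mathbf{\phi }+\mathbf{\psi }G(\gamma ,c,t/T)\}^{\prime }\mathbf{z}_{t}+\varepsilon _{t} \] Parameter constancy corresponds to H\(_{0}: \gamma =0\); under the alternative the parameters change smoothly over time, with a single break as the limiting case \(\gamma \rightarrow \infty\).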