Lab 12: Autoregressive Distributed Lag (ARDL) Models

These study notes cover the essential material from Lecture 11: Distributed Lag (DL) and Autoregressive Distributed Lag (ARDL) Models. These models are ==the workhorses of time series analysis==, used to understand how economic variables react to changes over time.

1 Chapter Overview

The central question of this chapter is: How do we model the dynamic relationship between variables?

In basic cross-sectional econometrics, we assume \(x\) affects \(y\) immediately. However, in the real world—especially with time series data—actions have delayed effects. For example, an increase in the money supply today might not affect inflation until several months later. (在基础的横截面计量经济学中，我们假设 \(x\) 对 \(y\) 产生即时影响。然而，在现实世界中——尤其是涉及时间序列数据时——行动往往具有滞后效应。例如，今天的货币供应量增加可能要到几个月后才会对通货膨胀产生影响。)

This chapter introduces:

Distributed Lag (DL) models: Where current \(y\) is explained by current and past values of \(x\).
Autoregressive Distributed Lag (ARDL) models: Where current \(y\) is explained by past values of \(y\) AND current/past values of \(x\).
Long-term Equilibrium: How to calculate the “steady state” where a system settles after a shock.
==分布滞后==（DL）模型： 当前的 \(y\) 由 \(x\) 的当前值和过去值共同解释。
==自回归分布滞后==（ARDL）模型： 当前的 \(y\) 由过去的 \(y\) 值以及当前/过去的 \(x\) 值共同解释。
长期均衡： 如何计算系统在经历冲击后所达到的“稳态”。

2 Core Concepts

2.1 1. Lags and Dynamics

A “lag” is simply the value of a variable from a previous time period (\(t-1, t-2, \text{etc.}\)). Time series models use these to capture persistence (the tendency of a variable to stay where it is) and delayed reactions. 所谓“滞后”，即指前一时间段（\(t-1, t-2, \text{等}\)）中某个变量的数值。时间序列模型利用这些滞后项来捕捉滞后性（变量保持当前水平的倾向）和延迟反应。

2.2 2. Stationarity: The Prerequisite（平稳性：前提条件）

Before estimating these models, variables must be stationary (their mean and variance do not change over time).（在估计这些模型之前，变量必须是平稳的（其均值和方差不随时间变化））

If a series is non-stationary, you must “difference” it (calculate the change from one period to the next) until it becomes stationary.（如果一个序列是非平稳的，你必须”差分”它（计算一个时期与下一个时期的变化），直到它变成平稳的）
If variables are stationary, you can proceed with simple OLS estimation.（如果变量是平稳的，你可以进行简单的OLS估计）

2.3 3. The Steady State (Equilibrium)（稳态（均衡））

This is a theoretical situation where nothing changes—there are no shocks, and variables remain constant (\(y_t = y_{t-1} = y^*\)). It allows economists to see the “long-run” relationship between variables once all temporary fluctuations have died out.（这是一个理论上的情况，其中没有任何变化——没有冲击，变量保持不变。它使经济学家能够看到变量之间的”长期”关系，一旦所有临时波动都消退）

2.4 4. Short-term vs. Long-term Multipliers（短期乘数与长期乘数）

Short-term (Immediate) Multiplier: The impact of a one-unit change in \(x\) on \(y\) in the exact same time period.（短期（即期）乘数： 在完全相同的时间段内，\(x\)增加一个单位对\(y\)的影响）
Long-term (Total) Multiplier: The total cumulative impact of that change on \(y\) after all delayed effects have occurred.（长期（总）乘数： 在所有延迟效应都发生后，该变化对\(y\)的总累积影响）

3 Important Models or Methods（重要的模型或方法）

3.1 1. Distributed Lag (DL) Model（分布滞后（DL）模型）

Formula: \(y_t = \mu + \beta_0 x_t + \beta_1 x_{t-1} + ... + \beta_p x_{t-p} + \epsilon_t\).

Intuition: The current value of \(y\) depends on what \(x\) is doing now and what it did in the past.（直观理解： 当前的\(y\)值取决于\(x\)现在正在做什么以及它过去做过什么）
Limitation: If you need many lags to explain the relationship, you run out of degrees of freedom and encounter multicollinearity between the lags.（局限性： 如果需要许多滞后项来解释关系，你会耗尽自由度，并遇到滞后项之间的多重共线性）

3.2 2. Autoregressive Distributed Lag (ARDL) Model（自回归分布滞后（ARDL）模型）

Formula: \(y_t = \alpha_1 y_{t-1} + ... + \alpha_p y_{t-p} + \mu + \beta_0 x_t + ... + \beta_s x_{t-s} + \epsilon_t\).

Intuition: By adding the past of \(y\) (the Autoregressive part), the model becomes much more efficient. One or two lags of \(y\) can often replace a dozen lags of \(x\), resulting in a better-fitting model with fewer variables.（直观理解： 通过添加\(y\)的过去值（自回归部分），模型变得更加高效。\(y\)的一个或两个滞后项通常可以替代\(x\)的十几个滞后项，从而产生一个拟合更好、变量更少的模型）
Benefit: ARDL models are highly effective at removing autocorrelation in the residuals, which is a major problem in time series.（优势： ARDL模型在消除残差中的自相关方面非常有效，这是时间序列中的主要问题）

4 How to Choose the Right Model（如何选择合适的模型）

In an exam, follow this logic to determine the appropriate model:（在考试中，按照以下逻辑来确定合适的模型）

Check Stationarity: Use tests like ADF (Augmented Dickey-Fuller). If the variables are stationary, you can use OLS for DL or ARDL.（检查平稳性： 使用ADF（增强的迪基-富勒）等测试。如果变量是平稳的，你可以对DL或ARDL使用OLS）
Evaluate the Residuals: Start with a simple model. Run the Breusch-Godfrey test for autocorrelation.（评估残差： 从简单的模型开始。运行Breusch-Godfrey测试来检测自相关）
- If there is no autocorrelation, a simple DL model may be sufficient.（如果没有自相关，简单的DL模型可能就足够了）
- If there is autocorrelation, you must add lags—usually lags of the dependent variable \(y\)—moving you into an ARDL framework.（如果存在自相关，你_必须_添加滞后项——通常是因变量\(y\)的滞后项——将你推入ARDL框架）
Evaluate Sample Size:（评估样本量：）
- Large Sample (Hundreds/Thousands of observations): You are safe using OLS with stationary variables and robust standard errors.（大样本（数百/数千个观测值）： 使用OLS与平稳变量和稳健标准误是安全的）
- Small Sample: You may need to rely on the Strict Exogeneity assumption (where current shocks don’t affect future values of \(x\)), which is harder to prove and has no specific test.（小样本： 你可能需要依赖严格外生性假设（当前冲击不影响\(x\)的未来值），这更难证明，没有特定的检验）

5 Interpreting Model Output (Important)（解释模型输出（重要））

5.1 1. Individual Coefficients（个别系数）

In time series, interpreting a single lagged coefficient (like \(\beta_2\)) in isolation makes “little sense” because a change in \(x_{t-2}\) usually implies \(x\) was also changing in \(t-1\) and \(t\). Instead, focus on the multipliers.（在时间序列中，单独解释单个滞后系数（如\(\beta_2\)）“意义不大”，因为\(x_{t-2}\)的变化通常意味着\(x\)在\(t-1\)和\(t\)也在变化。相反，应该关注乘数）

5.2 2. Short-term (Impact) Multiplier（短期（冲击）乘数）

This is simply the coefficient of the current-period variable (\(\beta_0\)).（这只是当期变量的系数（\(\beta_0\)））
Interpretation: “If \(x\) increases by 1 unit today, \(y\) increases by \(\beta_0\) units today, holding all else constant”.（解释： “如果今天\(x\)增加1个单位，那么今天\(y\)增加\(\beta_0\)个单位，其他条件不变”）

5.3 3. Long-term (Cumulative) Multiplier (LTM)（长期（累积）乘数（LTM））

This is the most likely exam question.（这是最可能出现的考试题目）

For DL Model: \(LTM = \sum \beta_i\) (the sum of all \(x\) coefficients).（对于DL模型： \(LTM = \sum \beta_i\)（所有\(x\)系数的和））
For ARDL Model: \(LTM = \frac{\sum \beta_i}{1 - \sum \alpha_i}\) (sum of \(x\) coefficients divided by 1 minus the sum of \(y\) coefficients).（对于ARDL模型： \(LTM = \frac{\sum \beta_i}{1 - \sum \alpha_i}\)（\(x\)系数的和除以1减去\(y\)系数的和））
Interpretation: This represents the total reaction of the system.（解释： 这表示系统的总反应）
- Scenario A (One-period shock): If \(x\) increases for only one period and then returns to normal, the LTM is the total “excessive” reaction \(y\) will have before it also returns to normal.（场景A（单期冲击）： 如果\(x\)仅增加一个时期然后恢复正常，LTM是\(y\)在也恢复正常之前的总”过度”反应）
- Scenario B (Permanent shock): If \(x\) increases and stays at the new level, the LTM is the difference between the new steady state and the old steady state.（场景B（永久冲击）： 如果\(x\)增加并_保持_在新的水平，LTM是新稳态和旧稳态之间的差异）

5.4 4. Average Reaction Lag（平均反应滞后）

Formula: \(\bar{\omega} = \frac{\sum (i \cdot \beta_i)}{LTM}\).（公式： \(\bar{\omega} = \frac{\sum (i \cdot \beta_i)}{LTM}\)）
Interpretation: Measures how long it takes for \(y\) to react to \(x\). A smaller value means a faster reaction.（解释： 衡量\(y\)对\(x\)反应需要多长时间。较小的值意味着反应更快）

6 Model Comparison（模型比较）

6.1 1. Ramsey RESET Test（Ramsey RESET测试）

Null Hypothesis (\(H_0\)): The functional form is correct (the model reflects reality).（零假设（\(H_0\)）： 函数形式是正确的（模型反映现实））
Decision: If \(p > 0.05\), the model is “okay.” If \(p < 0.05\), the model is invalid, and you should add more lags or variables.（决策： 如果\(p > 0.05\)，模型是”可以的”。如果\(p < 0.05\)，模型无效，你应该添加更多滞后项或变量）

6.2 2. Breusch-Godfrey (BG) Test（Breusch-Godfrey（BG）测试）

Null Hypothesis (\(H_0\)): No autocorrelation in the residuals.（零假设（\(H_0\)）： 残差中没有自相关）
Importance: In ARDL models, autocorrelation leads to endogeneity, meaning your estimates will be biased. You must achieve a non-significant BG test (\(p > 0.05\)).（重要性： 在ARDL模型中，自相关导致内生性，意味着你的估计会产生偏差。你_必须_获得非显著的BG测试（\(p > 0.05\)））

6.3 3. Information Criteria (AIC/BIC)（信息准则（AIC/BIC））

Used to choose the number of lags. Usually, we choose the model that provides the lowest AIC or BIC value.（用于选择滞后项的数量。通常，我们选择提供最低AIC或BIC值的模型）

7 Key Takeaways for Exams（考试的关键要点）

DL vs. ARDL: DL has lags of \(x\); ARDL has lags of \(x\) AND \(y\). ARDL is generally better at handling autocorrelation.（DL对比ARDL： DL有\(x\)的滞后项；ARDL有\(x\)和\(y\)的滞后项。ARDL通常更善于处理自相关）
Autocorrelation is the Enemy: If you have it, add more lags or use robust standard errors.（自相关是敌人： 如果有自相关，添加更多滞后项或使用稳健标准误）
LTM Calculation: Remember the formula difference for ARDL (\(\frac{\sum \beta}{1 - \sum \alpha}\)) vs. DL (\(\sum \beta\)).（LTM计算： 记住ARDL（\(\frac{\sum \beta}{1 - \sum \alpha}\)）与DL（\(\sum \beta\)）的公式差异）
Steady State Logic: To calculate a steady state value (\(y^*\)), assume all \(y_t = y_{t-1} = y^*\) and all \(x_t = x_{t-1} = x^*\), then solve the equation algebraically.（稳态逻辑： 要计算稳态值（\(y^*\)），假设所有\(y_t = y_{t-1} = y^*\)和所有\(x_t = x_{t-1} = x^*\)，然后用代数方法求解方程）
Strict Exogeneity: Only relevant for small samples where you cannot ensure stationarity. It means shocks today don’t affect \(x\) in the future (e.g., rainfall is strictly exogenous; hiring police officers after a crime spike is not).（严格外生性： 仅与小样本相关，在小样本中你无法确保平稳性。它意味着今天的冲击不会影响未来的\(x\)（例如，降雨是严格外生的；犯罪激增后雇用警察则不是））

8 学习笔记：分布滞后 (DL) 与自回归分布滞后 (ARDL) 模型

8.1 第一部分：讲义核心内容回顾

8.1.1 1. 核心概念与动机

动态关系：在时间序列中，变量的影响往往具有滞后效应（例如：货币供应量对通胀的影响）。
滞后项 (Lag)：指前一时间段（\(t-1, t-2\)等）的数值，用于捕捉持久性和延迟反应。
平稳性 (Stationarity)：估计模型的前提。非平稳序列需通过“差分”处理。
稳态 (Steady State)：不存在冲击、变量保持不变的理论状态（\(y_t = y_{t-1} = y^*\)），用于观察长期均衡关系。

8.1.2 2. 模型定义

分布滞后 (DL) 模型：\(y_t = \mu + \beta_0 x_t + \beta_1 x_{t-1} + \dots + \beta_p x_{t-p} + \epsilon_t\) 。
自回归分布滞后 (ARDL) 模型：\(y_t = \alpha_1 y_{t-1} + \dots + \alpha_p y_{t-p} + \mu + \beta_0 x_t + \dots + \beta_s x_{t-s} + \epsilon_t\) 。
- 优势：ARDL通过加入\(y\)的滞后项，能更高效地捕捉动态，减少变量数量并消除残差自相关。

8.1.3 3. 乘数 (Multipliers) 与解释

短期（即期）乘数：当期系数 \(\beta_0\) 。
长期（累积）乘数 (LTM)：
- DL模型：\(LTM = \sum \beta_i\) 。
- ARDL模型：\(LTM = \frac{\sum \beta_i}{1 - \sum \alpha_i}\) 。
平均反应滞后：\(\bar{\omega} = \frac{\sum (i \cdot \beta_i)}{LTM}\)，衡量反应速度。

8.2 第二部分：课堂板书补充内容 (整理版)

本部分根据白板照片整理，对模型估计、诊断检验及复杂场景进行了深入补充。

8.2.1 1. 模型估计的前提条件与路径选择

根据样本量和平稳性，估计路径分为“简单路径”与“困难路径”：

8.2.1.1 A. 简单路径：基于平稳性与大样本 (Whiteboard 1728, 1715)

当所有变量均为平稳 (Stationary) 时，使用 OLS (普通最小二乘法) 进行估计。需满足：

函数形式设定正确：通过 Ramsey RESET 检验。
同方差性 (Homoskedasticity)：通过 Breusch-Pagan (BP) 检验。
无自相关 (Lack of autocorrelation)：
- DL模型使用 Breusch-Godfrey (BG) 检验。
- ARDL模型使用 Ljung-Box (Portmanteau) 检验。
大样本：拥有足够多的观测值。

8.2.1.2 B. 困难路径：小样本或非平稳性 (Whiteboard 1730)

如果无法确保平稳性或处于小样本环境，估计变得更加困难，需依赖严格外生性 (Strict Exogeneity)：

当期外生性 (Contemporaneous Exogeneity)：\(Cov(x_t, \epsilon_t) = 0\)。
强/严格外生性 (Strong/Strict Exogeneity)：\(\epsilon_t\) 与任何时期的 \(x\) 变量都不相关。
适用场景：即使没有平稳性假设或样本量较小，若满足严格外生性，模型逻辑依然成立。

8.2.2 2. 多变量 ARDL 模型与稳态代数求解 (Whiteboard 1727)

在处理多个解释变量（如 \(x, z, w\)）时，模型形式如下：

\[y_t = \beta_0 + (\beta_1 x_t + \beta_2 x_{t-1}) + (\alpha_1 z_t + \alpha_2 z_{t-1} + \alpha_3 z_{t-2}) + (\delta_1 w_{t-1} + \delta_2 w_{t-12}) + (\gamma_1 y_{t-1} + \dots + \gamma_s y_{t-s}) + \epsilon_t\]

稳态 (Steady State) 求解逻辑：

假设系统达到均衡，所有时间下标消失：\(y_t = y_{t-1} = y^*\)，\(x_t = x_{t-1} = x^*\)，以此类推。

\[y^* = \beta_0 + \beta_1 x^* + \beta_2 x^* + \alpha_1 z^* + \alpha_2 z^* + \alpha_3 z^* + \delta_1 w^* + \delta_2 w^* + \gamma_1 y^* + \dots + \gamma_s y^*\]

通过代数移项，即可解出长期均衡值 \(y^*\)。

8.2.3 3. 多变量长期乘数 (LTM) 详尽公式 (Whiteboard 1723)

针对永久冲击 (Permanent Shock)，即变量增加并保持在水平值（如 \(x_{15}=x_{16}=\dots=7\)），各变量的 LTM 计算如下：

模型类型	变量 x 的 LTM	变量 z 的 LTM
DL 模型	\(LTM_x = \beta_1 + \beta_2\)	\(LTM_z = \alpha_1 + \alpha_2 + \alpha_3\)
ARDL 模型	\(LTM_x = \frac{\beta_1 + \beta_2}{1 - (\gamma_1 + \gamma_2 + \dots + \gamma_s)}\)	\(LTM_z = \frac{\alpha_1 + \alpha_2 + \alpha_3}{1 - (\gamma_1 + \dots + \gamma_s)}\)

8.2.4 4. 诊断检验与补救措施 (Whiteboard 1716, 1717)

检验项目	工具	零假设 (H0)	失败后的补救措施
函数形式	Ramsey RESET	模型设定正确/反映现实	增加变量、添加滞后项、取对数
同方差性	Breusch-Pagan	误差项为同方差	使用对数形式、使用稳健标准误
自相关	BG / Ljung-Box	残差项无自相关	增加滞后项（尤其是 \(y\) 的滞后项）

关键结论：

在 ARDL 模型中，自相关会导致内生性偏误，必须通过增加滞后项使 \(p-value > 0.05\)。
如果无法完美解决同方差或自相关问题，应使用 稳健协方差矩阵估计量 (Robust Standard Errors) 来确保推断的有效性。

9 Example Code of US Macro Data Analysis with DL/ARDL Models

###########################################################################
#       Advanced Econometrics                                                 #
#   Spring semester                                                       #
#   dr Marcin Chlebus, dr Rafał Woźniak                                   #
#   University of Warsaw, Faculty of Economic Sciences                    #
#                                                                         #
#                                                                         #
#                 Lab 12: (AR)DL Linear regression                        #
#                                                                         #
###########################################################################

if(!require(lmtest)){install.packages("lmtest")}
library(lmtest) 

if(!require(dynlm)){install.packages("dynlm")}
library(dynlm)

# install.packages("devtools")
# library(devtools)
# install_github("fcbarbi/ardl")
# library(ardl)

if(!require(AER)){install.packages("AER")}
library(AER)

if(!require(zoo)){install.packages("zoo")}
library(zoo)

if(!require(xts)){install.packages("xts")}
library(xts)

if(!require(fBasics)){install.packages("fBasics")}
library(fBasics)

Sys.setenv(LANG = "en")
options(scipen=999)

##########################################################
# Expolratory data analysis
#########################################################

# Quarterly US macroeconomic data from 1950(1) – 2000(4)
# provided by USMacroG, a “ts” time series. Contains 
# dpi --> disposable income
# consumption --> consumption (in billion USD).

data(USMacroG)

data(package = "AER")

USA<-as.zoo(USMacroG)
USA[1:5,]

##            gdp consumption invest government    dpi  cpi     m1 tbill unemp
## 1950 Q1 1610.5      1058.9  198.1      361.0 1186.1 70.6 110.20  1.12   6.4
## 1950 Q2 1658.8      1075.9  220.4      366.4 1178.1 71.4 111.75  1.17   5.6
## 1950 Q3 1723.0      1131.0  239.7      359.6 1196.5 73.2 112.95  1.23   4.6
## 1950 Q4 1753.9      1097.6  271.8      382.5 1210.0 74.9 113.93  1.35   4.2
## 1951 Q1 1773.5      1122.8  242.9      421.9 1207.9 77.3 115.08  1.40   3.5
##         population inflation interest
## 1950 Q1    149.461        NA       NA
## 1950 Q2    150.260    4.5071  -3.3404
## 1950 Q3    151.064    9.9590  -8.7290
## 1950 Q4    151.871    9.1834  -7.8301
## 1951 Q1    152.393   12.6160 -11.2160

#plotting DPI and consuption
colors_ <- c("black", "red")

plot(USA[,c("dpi","consumption")], plot.type = "single",
     col = colors_,   # colors for subsequent lines
     ylab = "dpi/consumption", xlab = "time", # axes labels
     main = "DPI & Consumption in the USA 1950 - 2000") # title above the plot

# separately we add a legend definition
legend("topleft",     # legend position - combination of top, bottom and left, right
       names(USA[,c("dpi","consumption")]), # legend elements (names of the series)
       text.col = colors_)     # colors (the same as in plot() function above)

#Conclusions?
  # 1. Those 2 variables follows probably the same trend, in regression we would find this relation, but probably nothing more
  # What to do?
    # differentiation - is there relation between deviance from the trend?
    # we want to work on stationary TS

#plotting first difference of DPI and consuption
plot(diff(USA[,c("dpi","consumption")]), 
     plot.type = "single",
     col = colors_,   # colors for subsequent lines
     ylab = "dpi/consumtion", xlab = "time", # axes labels
     main = "DPI & Consumption in the USA 1950 - 2000") # title above the plot

# separately we add a legend definition
legend("topleft",     # legend position - combination of top, bottom and left, right
       names(USA[,c("dpi","consumption")]), # legend elements (names of the series)
       text.col = colors_)     # colors (the same as in plot() function above)

参数说明：

plot.type = "single" 核心参数。指定绘图类型为“单个窗口”。这意味着 dpi 和 consumption 的差分曲线将被绘制在同一个坐标系（同一张图）中。如果不设置该参数（默认是 “multiple”），R 会为两个变量分别绘制上下排列的两个独立图表。
USA[,c("dpi","consumption")]：从名为 USA 的数据集（通常是一个时间序列矩阵或数据框）中，筛选出 dpi（可支配个人收入）和 consumption（消费）这两列数据。
diff(...)：对筛选出的两列数据计算一阶差分 (First Difference)。在线性回归和时间序列分析（如讲义中提到的分布式滞后模型 DL/ARDL ）中，差分常用于将非平稳数据转化为平稳数据 (Stationary Data) 。

9.1 First-differences seem to have increasing variance

统计学上的方差（Variance）衡量的是数据偏离其均值的平方和（即波动的剧烈程度）。在这张图里，数据的波动幅度随着时间推移明显“越变越大”，在图形上呈现出向右逐渐张开的态势。这就是典型的方差随时间递增。

为什么取对数（log）能解决这个问题这正好印证了你后面那段代码的逻辑。

数据的原始一阶差分（如上图所示）虽然围绕着一个恒定的均值（大约在 25 左右）上下起伏，看似平稳，但它不满足同方差（Homoscedasticity）假设。
当经济在 1950 到 2000 年间不断总量扩张时，DPI 的绝对数值变大了，因此哪怕只变动 1%，其绝对增加额（一阶差分值）也会在后期变得非常巨大。
如果你对原始数据先取 log() 再做 diff()，图形就会变成“百分比增长率”的波动。取对数能成功压制这种由于总量变大而导致的绝对波动放大，把后期剧烈的震荡“拉平”到和前期差不多的幅度，从而完美修复同方差假设，得到质量更高的平稳序列。

#stationarity testing

source("function_testdf2.R")
options(scipen=999)
testdf2(variable = USA[,c("dpi")], # vector tested
   test.type = "c", # test.type = "c",
   max.augmentations = 5, # maximum number of augmentations added
   max.order=5) # maximum order of residual lags for BG test

##   augmentations      adf p_adf p_bg.p.value.1 p_bg.p.value.2 p_bg.p.value.3
## 1             0 4.520147  0.99      0.4757627      0.1466866      0.2377082
## 2             1 4.417310  0.99      0.9225736      0.1784467      0.2924392
## 3             2 3.639854  0.99      0.9354447      0.9963847      0.7998528
## 4             3 3.342271  0.99      0.9843572      0.9972283      0.9836444
## 5             4 3.069740  0.99      0.9103627      0.9851835      0.9822840
## 6             5 3.715439  0.99      0.9635154      0.9980164      0.9962116
##   p_bg.p.value.4 p_bg.p.value.5
## 1      0.3438192     0.01927990
## 2      0.4280401     0.02312387
## 3      0.8876616     0.08151424
## 4      0.9893585     0.10515041
## 5      0.9961649     0.10115539
## 6      0.9919605     0.99814270

According to the results of the testdf2 function, when 2 or more augmentations are introduced (\(\text{augmentations} \ge 2\)), the p-values of the Breusch–Godfrey (BG) test are all noticeably greater than 0.05 (for instance, when \(\text{augmentations} = 2\) and the residual lag order is 5, the p-value is 0.0815). This indicates that the issue of residual autocorrelation has been effectively eliminated. Under this valid premise, the p-value of the Augmented Dickey-Fuller (ADF) test is 0.99, failing to reject the null hypothesis of the presence of a unit root, which demonstrates that the original time series of the US DPI is non-stationary.

根据 testdf2 的检验结果，当引入 2 阶或以上的增强滞后项（augmentations \(\ge 2\)）时，BG 检验的 p 值均显著大于 0.05（例如在 augmentations = 2 且滞后 5 阶时，p 值为 0.0815），表明残差自相关得到了有效消除。在此有效前提下，ADF 检验的 p 值为 0.99，无法拒绝存在单位根的原假设，表明美国 dpi 原始时间序列是非平稳的。”

testdf2(variable = diff(USA[,c("dpi")]), # vector tested
       test.type = "c", # test.type = "c",
       max.augmentations = 5, # maximum number of augmentations added
       max.order=5) # maximum order of residual lags for BG test

##   augmentations        adf p_adf p_bg.p.value.1 p_bg.p.value.2 p_bg.p.value.3
## 1             0 -13.460411  0.01      0.8720424     0.01570519      0.0149188
## 2             1  -7.813974  0.01      0.7532990     0.89567380      0.3446841
## 3             2  -6.038376  0.01      0.8871046     0.98923951      0.9932058
## 4             3  -5.022296  0.01      0.8511516     0.97342644      0.9908236
## 5             4  -5.491999  0.01      0.9090418     0.99051048      0.9943576
## 6             5  -4.799076  0.01      0.9367024     0.98509353      0.9882262
##   p_bg.p.value.4 p_bg.p.value.5
## 1     0.01496486    0.004641123
## 2     0.32977415    0.113457768
## 3     0.90527668    0.239907589
## 4     0.99687858    0.225223810
## 5     0.99900087    0.998999470
## 6     0.99643513    0.998707633

According to the results of the testdf2 function applied to the first difference of the series, when 1 or more augmentations are introduced (\(\text{augmentations} \ge 1\)), the p-values of the Breusch–Godfrey (BG) test across all tested lag orders are greater than 0.05. This indicates that the issue of residual autocorrelation has been effectively eliminated. Under this valid premise (for instance, at \(\text{augmentations} = 1\)), the Augmented Dickey-Fuller (ADF) test statistic is -7.814, and the corresponding p-value (p_adf) is 0.01. Since the p-value is less than or equal to the 1% significance level, we strongly reject the null hypothesis (\(H_0\)) of a unit root. Consequently, we can conclude that the first-differenced data, diff(USA[,c("dpi")]), is stationary.

testdf2(variable = USA[,c("consumption")], # vector tested
      test.type = "c", # test.type = "c",
      max.augmentations = 5,  # maximum number of augmentations added
      max.order=5) # maximum order of residual lags for BG test

##   augmentations      adf p_adf p_bg.p.value.1 p_bg.p.value.2 p_bg.p.value.3
## 1             0 8.296398  0.99    0.005546307   0.0001103848 0.000008381582
## 2             1 5.771917  0.99    0.485949219   0.0208831243 0.000140733984
## 3             2 4.367355  0.99    0.755707867   0.9043717071 0.006860732824
## 4             3 3.066683  0.99    0.750462543   0.9496740461 0.832868347784
## 5             4 3.252579  0.99    0.991992033   0.9729771490 0.908205859670
## 6             5 3.202110  0.99    0.854199944   0.9826726715 0.998104122232
##   p_bg.p.value.4 p_bg.p.value.5
## 1  0.00001907399  0.00003700224
## 2  0.00041767804  0.00065979348
## 3  0.01362072624  0.02641457522
## 4  0.79058700070  0.83992186062
## 5  0.96783873221  0.97249863759
## 6  0.99938185774  0.99929469234

testdf2(variable = diff(USA[,c("consumption")]), # vector tested
        test.type = "c", # test.type = "c",
        max.augmentations = 5,  # maximum number of augmentations added
        max.order=5) # maximum order of residual lags for BG test

##   augmentations       adf      p_adf p_bg.p.value.1 p_bg.p.value.2
## 1             0 -9.184518 0.01000000     0.04956141    0.003167801
## 2             1 -5.237949 0.01000000     0.30523939    0.246359385
## 3             2 -3.629948 0.01000000     0.87947985    0.926376184
## 4             3 -3.554668 0.01000000     0.97024924    0.906962022
## 5             4 -3.607831 0.01000000     0.86171475    0.984715987
## 6             5 -3.126966 0.02712406     0.98008540    0.998966222
##    p_bg.p.value.3  p_bg.p.value.4  p_bg.p.value.5
## 1 0.0000001672182 0.0000001920137 0.0000006372337
## 2 0.0017024608681 0.0042263851347 0.0089973989528
## 3 0.9548186200858 0.9089565420640 0.9419401977419
## 4 0.9599028563715 0.9609444552864 0.9748349856447
## 5 0.9981523313452 0.9903607166502 0.9970624226973
## 6 0.9995951693166 0.9996078091443 0.9999463124934

10 Use `Log` to fix the problem with increasing variance

# first-differences seem to have increasing variance
# thus, let us analyse the first difference of the log-transformed
# first differences

testdf2(variable = diff(log(USA[,c("dpi")])), # vector tested
        test.type = "c", # test.type = "nc",
        max.augmentations = 5, # maximum number of augmentations added
        max.order=5) # maximum order of residual lags for BG test

##   augmentations        adf p_adf p_bg.p.value.1 p_bg.p.value.2 p_bg.p.value.3
## 1             0 -13.218923  0.01      0.9662664      0.6590377      0.4960841
## 2             1  -8.916091  0.01      0.9100174      0.9923266      0.6236955
## 3             2  -6.824515  0.01      0.9450289      0.9383219      0.9883654
## 4             3  -6.336003  0.01      0.9707415      0.9714525      0.9958945
## 5             4  -7.033210  0.01      0.8707947      0.9863307      0.9962280
## 6             5  -5.947960  0.01      0.9861110      0.9986554      0.9985395
##   p_bg.p.value.4 p_bg.p.value.5
## 1      0.6648265     0.04837356
## 2      0.7793006     0.07764483
## 3      0.9973459     0.14685535
## 4      0.9989003     0.17154874
## 5      0.9994993     0.99966040
## 6      0.9998475     0.99914533

testdf2(variable = diff(log(USA[,c("consumption")])), # vector tested
        test.type = "c", # test.type = "nc",
        max.augmentations = 5,  # maximum number of augmentations added
        max.order=5) # maximum order of residual lags for BG test

##   augmentations        adf p_adf p_bg.p.value.1 p_bg.p.value.2 p_bg.p.value.3
## 1             0 -13.791807  0.01     0.71451406   0.0002341137   0.0007533067
## 2             1  -8.134460  0.01     0.07989654   0.1829103993   0.1138040669
## 3             2  -6.268350  0.01     0.71253996   0.4629639320   0.1105162682
## 4             3  -6.777003  0.01     0.88973894   0.1969371662   0.0370375602
## 5             4  -6.450499  0.01     0.56548398   0.8323252837   0.8093948218
## 6             5  -6.174957  0.01     0.91795679   0.9945301814   0.9996765731
##   p_bg.p.value.4 p_bg.p.value.5
## 1   0.0004893063   0.0008670902
## 2   0.0405561599   0.0464505336
## 3   0.0493164293   0.0478830419
## 4   0.0730829520   0.0966908739
## 5   0.9059957973   0.9334284979
## 6   0.9997289343   0.9857120180

11 Autoregressive Distributed Lagged Model

##########################################################
# Autoregressive Distributed Lagged Model
#########################################################
  
#  Distirbuted Lagged Model
# As regressors we use lags of explanatory variables
# start = c(1960, 1) is set to make comparison between models with different lag order possible

dl_1 <- dynlm(d(consumption) ~ d(dpi) + L(d(dpi)), data = USA, start = c(1960, 1))
summary(dl_1)

## 
## Time series regression with "zooreg" data:
## Start = 1960 Q1, End = 2000 Q4
## 
## Call:
## dynlm(formula = d(consumption) ~ d(dpi) + L(d(dpi)), data = USA, 
##     start = c(1960, 1))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -64.413 -12.797  -1.915  12.594  53.529 
## 
## Coefficients:
##             Estimate Std. Error t value         Pr(>|t|)    
## (Intercept) 12.60067    2.69478   4.676 0.00000615612291 ***
## d(dpi)       0.37915    0.04942   7.673 0.00000000000152 ***
## L(d(dpi))    0.18116    0.04957   3.655         0.000348 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21.06 on 161 degrees of freedom
## Multiple R-squared:  0.3074, Adjusted R-squared:  0.2988 
## F-statistic: 35.73 on 2 and 161 DF,  p-value: 0.0000000000001437

11.1 ❤️ 理论模型公式 (Theoretical Model Formula)

根据代码中的变量及其对应关系：

d(consumption) 对应 \(\Delta Y_t\) 或 \(\Delta \text{consumption}_t\)
d(dpi) 对应 \(\Delta X_t\) 或 \(\Delta \text{dpi}_t\)
L(d(dpi)) 对应 \(\Delta X_{t-1}\) 或 \(\Delta \text{dpi}_{t-1}\)

总体理论公式可以写为：

\[\Delta \text{consumption}_t = \mu + \beta_0 \Delta \text{dpi}_t + \beta_1 \Delta \text{dpi}_{t-1} + \epsilon_t\]

其中：

\(\mu\) 为截距项（Intercept）。
\(\beta_0\) 为当期收入变动对当期消费变动的影响（短期反应/影响乘数）。
\(\beta_1\) 为前一期收入变动对当期消费变动的滞后影响。
\(\epsilon_t\) 为随机误差项（Residuals / Error term）。

11.2 代码与参数解释

11.2.1 核心代码

dl_1 <- dynlm(d(consumption) ~ d(dpi) + L(d(dpi)), data = USA, start = c(1960, 1))
summary(dl_1)

11.2.2 函数与参数详解

dynlm(...)：来自于 dynlm 包的核心函数，专门用于拟合动态线性回归模型（Dynamic Linear Regression Models）。它支持在时间序列公式中直接使用差分和滞后算子。
formula = d(consumption) ~ d(dpi) + L(d(dpi))：定义了回归方程的结构：
- d(consumption)：因变量（Dependent Variable）。d() 代表一阶差分（First Difference），即 \(\Delta \text{consumption}_t\)，表示消费的季度变动量量。
- d(dpi)：自变量/解释变量（Independent/Explanatory Variable） 的当期一阶差分，即 \(\Delta \text{dpi}_t\)，表示可支配个人收入的当期变动量。
- L(d(dpi))：自变量差分后的一阶滞后项（First Lag），即 \(\Delta \text{dpi}_{t-1}\)。L() 代表滞后算子（Lag Operator）。
data = USA：指定使用名为 USA 的时间序列数据集。
start = c(1960, 1)：指定回归估计的起始时间（Start Time）为 1960 年第 1 季度。通过手动固定所有竞争模型的起始时间，可以确保后续在对比不同滞后阶数的模型时，使用的样本观测值数量完全一致。

11.3 代码注释解释

##########################################################
# Autoregressive Distributed Lagged Model
#########################################################
  
  #  Distirbuted Lagged Model
  # As regressors we use lags of explanatory variables
  # start = c(1960, 1) is set to make comparison between models with different lag order possible

# Distirbuted Lagged Model：提示此段代码正在建立一个“分布式滞后模型（DL 模型）”。
# As regressors we use lags of explanatory variables：指出模型的构建特点，即“使用解释变量（自变量）的滞后项作为回归量（Regressors）”。
# start = c(1960, 1) is set to make...：阐述了设置 start 参数的计量经济学目的——“使具有不同滞后阶数的模型之间进行相互比较成为可能”。因为每多增加一阶滞后，序列头部就会缺失一个观测值；统一将起始点固定在 1960 年 Q1，可以确保所有模型基于相同的样本区间进行公平对比（如通过 AIC 或 \(R^2\)）。

11.4 输出结果解释

执行 summary(dl_1) 后得到的回归分析报告，其核心部分的解读如下：

11.4.1 样本基本信息

Start = 1960 Q1, End = 2000 Q4：回归所使用的实际样本区间为 1960 年第 1 季度至 2000 年第 4 季度。
degrees of freedom (DF)：自由度为 161。这意味着有效观测值数量 \(n = 164\)（161 个自由度 + 3 个估计参数）。

11.4.2 残差统计量 (Residuals)

提供了残差的分布情况（最小值 Min、四分位数 1Q/Median/3Q、最大值 Max），用于初步判断残差是否符合正态分布且均值为 0 的假设。

11.4.3 系数表格 (Coefficients)

表格包含四个核心列：估计值（Estimate）、标准误（Std. Error）、t 统计量（t value） 和 P 值（Pr(>|t|)。

(Intercept)（截距项 / 常数项）：
- 估计值为 12.60067。
- P 值极其微小（带有 ***），在 0.1% 的水平下显著（Significant）。
d(dpi)（当期收入变动）：
- 估计值为 0.37915。
- 短期反应 / 影响乘数（==Short-term reaction== / Impact multiplier）：表示在当前季度，可支配收入每增加 1 个单位，消费会立即增加约 0.38 个单位。其 P 值接近 0，极其显著。
L(d(dpi))（前一期收入变动）：
- 估计值为 0.18116。
- 滞后反应（ ==Lagged Effect== ）：表示上一季度的可支配收入每增加 1 个单位，会对当前季度的消费产生约 0.18 个单位的延迟拉动作用。其 P 值为 0.000348，同样极其显著。

延伸结合讲义公式 ：该模型的长期反应 / 长期乘数（Long-term reaction / Long-term multiplier） 为自变量所有系数之和：

\[\beta_{LT} = \beta_0 + \beta_1 = 0.37915 + 0.18116 = 0.56031\]

这意味着，从长期来看，可支配收入每增加 1 个单位，将带来总计约 0.56 个单位的消费增长。

11.4.4 模型整体拟合诊断

Residual standard error: 21.06：残差标准误，衡量模型预测值偏离真实值的平均程度。
Multiple R-squared: 0.3074：判定系数（\(R^2\)） 。表明该模型中的收入变动及其滞后项解释了消费变动中约 30.74% 的方差。
Adjusted R-squared: 0.2988：调整后的判定系数（\(R_{adj}^2\)） 。惩罚了变量个数后的拟合优度，通常用于不同变量数量的模型之间的比较。
F-statistic: 35.73 ... p-value: < 0.0000...：F 统计量整体显著性检验。P 值远小于 0.05，表明联合拒绝“所有自变量系数均为0”的原假设，模型整体非常显著。

12 Alternative model for log-transformed variables

# alternative model for log-transformed variables
USA$ln_consumption = log(USA$consumption)
USA$ln_dpi = log(USA$dpi)
dl_ln1 <- dynlm(d(ln_consumption) ~ d(ln_dpi) + L(d(ln_dpi)), data = USA, start = c(1960, 1))
summary(dl_ln1)

## 
## Time series regression with "zooreg" data:
## Start = 1960 Q1, End = 2000 Q4
## 
## Call:
## dynlm(formula = d(ln_consumption) ~ d(ln_dpi) + L(d(ln_dpi)), 
##     data = USA, start = c(1960, 1))
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0202245 -0.0035871 -0.0002189  0.0040243  0.0132598 
## 
## Coefficients:
##               Estimate Std. Error t value          Pr(>|t|)    
## (Intercept)  0.0038889  0.0007819   4.973 0.000001672323833 ***
## d(ln_dpi)    0.4165784  0.0530758   7.849 0.000000000000554 ***
## L(d(ln_dpi)) 0.1655767  0.0530807   3.119           0.00215 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.005956 on 161 degrees of freedom
## Multiple R-squared:  0.3121, Adjusted R-squared:  0.3035 
## F-statistic: 36.52 on 2 and 161 DF,  p-value: 0.00000000000008359

12.1 理论模型公式 (Theoretical Model Formula)

\[\Delta \ln(\text{consumption}_t) = \mu + \beta_0 \Delta \ln(\text{dpi}_t) + \beta_1 \Delta \ln(\text{dpi}_{t-1}) + \epsilon_t\]

其中：

\(\mu\) 为截距项（Intercept）。
\(\beta_0\) 为当期收入增长率对当期消费增长率的即期影响（短期弹性/影响乘数）。
\(\beta_1\) 为前一期收入增长率对当期消费增长率的滞后影响。
\(\epsilon_t\) 为随机误差项（Error term）。

13 Autocorrelation Function `acf`

acf(dl_1$residuals, type='correlation')

第 0 阶不用看：Lag = 0 的线永远最高（等于1），那是自己和自己比，纯属参考，直接忽略。
完美模型看蓝线：如果模型完美，从 Lag = 1 开始往后的所有黑线，都应该乖乖缩在两条蓝线以内（代表残差纯随机）。
黑线破线要升级：只要有黑线冲破蓝线，说明残差有规律（自相关），模型不合格，必须升级为 ARDL 模型。
在蓝线内，说明bgtest的p-value > 0.05，残差无自相关，模型合格。
如果有黑线突破蓝线，说明bgtest的 p-value < 0.05，残差有自相关，模型不合格。

#in order to draw a conclusions based on the model we would like to sort out a problem with autocorrelation
#has this model sorted out autocorrelation issue? - nope
resids_ <- as.data.frame(dl_1$residuals)
colnames(resids_)<-c("res")
bgtest(res~1,data=resids_, order = 1)

## 
##  Breusch-Godfrey test for serial correlation of order up to 1
## 
## data:  res ~ 1
## LM test = 2.8539, df = 1, p-value = 0.09116

bgtest(res~1,data=resids_, order = 2)

## 
##  Breusch-Godfrey test for serial correlation of order up to 2
## 
## data:  res ~ 1
## LM test = 11.199, df = 2, p-value = 0.0037

bgtest(res~1,data=resids_, order = 3)

## 
##  Breusch-Godfrey test for serial correlation of order up to 3
## 
## data:  res ~ 1
## LM test = 22.658, df = 3, p-value = 0.00004758

bgtest(res~1,data=resids_, order = 4)

## 
##  Breusch-Godfrey test for serial correlation of order up to 4
## 
## data:  res ~ 1
## LM test = 23.717, df = 4, p-value = 0.00009101

bgtest(res~1,data=resids_, order = 5)

## 
##  Breusch-Godfrey test for serial correlation of order up to 5
## 
## data:  res ~ 1
## LM test = 24.111, df = 5, p-value = 0.0002067

使用 c(1:5) 来显示 5 个lags.

dl_5_all <- dynlm(d(consumption) ~ d(dpi) + L(d(dpi),c(1:5)), data = USA,start = c(1960, 1))
summary(dl_5_all)

## 
## Time series regression with "zooreg" data:
## Start = 1960 Q1, End = 2000 Q4
## 
## Call:
## dynlm(formula = d(consumption) ~ d(dpi) + L(d(dpi), c(1:5)), 
##     data = USA, start = c(1960, 1))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -64.047 -11.994  -1.912  12.038  49.236 
## 
## Coefficients:
##                    Estimate Std. Error t value        Pr(>|t|)    
## (Intercept)         5.64149    3.56385   1.583         0.11544    
## d(dpi)              0.36923    0.05050   7.311 0.0000000000127 ***
## L(d(dpi), c(1:5))1  0.16208    0.04976   3.257         0.00138 ** 
## L(d(dpi), c(1:5))2  0.08255    0.05033   1.640         0.10295    
## L(d(dpi), c(1:5))3  0.05801    0.05077   1.143         0.25497    
## L(d(dpi), c(1:5))4  0.04099    0.05024   0.816         0.41585    
## L(d(dpi), c(1:5))5  0.08489    0.05103   1.664         0.09821 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20.74 on 157 degrees of freedom
## Multiple R-squared:  0.3451, Adjusted R-squared:  0.3201 
## F-statistic: 13.79 on 6 and 157 DF,  p-value: 0.000000000001476

令 \(t\) 代表当前季度，\(\Delta\) 代表一阶差分（即代码中的 d()），正确的理论公式应当是：

\[\Delta \text{consumption}_t = \beta_0 + \beta_1 \Delta \text{dpi}_t + \beta_2 \Delta \text{dpi}_{t-1} + \beta_3 \Delta \text{dpi}_{t-2} + \beta_4 \Delta \text{dpi}_{t-3} + \beta_5 \Delta \text{dpi}_{t-4} + \beta_6 \Delta \text{dpi}_{t-5} + \epsilon_t\]

dl_5 <- dynlm(d(consumption) ~ d(dpi) + L(d(dpi),c(1,2,5)), data = USA,start = c(1960, 1))
summary(dl_5)

## 
## Time series regression with "zooreg" data:
## Start = 1960 Q1, End = 2000 Q4
## 
## Call:
## dynlm(formula = d(consumption) ~ d(dpi) + L(d(dpi), c(1, 2, 5)), 
##     data = USA, start = c(1960, 1))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -65.371 -12.682  -1.721  11.804  48.567 
## 
## Coefficients:
##                        Estimate Std. Error t value         Pr(>|t|)    
## (Intercept)             7.58732    3.25397   2.332          0.02097 *  
## d(dpi)                  0.37793    0.05009   7.545 0.00000000000328 ***
## L(d(dpi), c(1, 2, 5))1  0.17365    0.04902   3.543          0.00052 ***
## L(d(dpi), c(1, 2, 5))2  0.08627    0.04967   1.737          0.08436 .  
## L(d(dpi), c(1, 2, 5))5  0.09302    0.05030   1.849          0.06626 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20.73 on 159 degrees of freedom
## Multiple R-squared:  0.3372, Adjusted R-squared:  0.3205 
## F-statistic: 20.22 on 4 and 159 DF,  p-value: 0.0000000000001754

14 Anova Test

#check which model is better:
anova(dl_5_all,dl_5)

## Analysis of Variance Table
## 
## Model 1: d(consumption) ~ d(dpi) + L(d(dpi), c(1:5))
## Model 2: d(consumption) ~ d(dpi) + L(d(dpi), c(1, 2, 5))
##   Res.Df   RSS Df Sum of Sq      F Pr(>F)
## 1    157 67530                           
## 2    159 68350 -2   -819.71 0.9529 0.3879

#conclusion? both are statistically equal so we prefer with more DF (less regressors)

14.1 诊断结论与终极简化总结

我们可以用一套简单的标准来定生死：

14.1.1 统计学假设

原假设 (\(H_0\))：被删掉的变量（第 3 阶和第 4 阶滞后项）的系数联合起来等于 0（即它们是多余的，对预测消费变动没有实质性贡献）。
备择假设 (\(H_1\))：它们不等于 0（它们是有用的，不能删）。

14.1.2 看 P 值做决定（通俗口诀）

如果 P 值 < 0.05：说明误差上升得非常显著！说明删掉的变量很重要，不能删，应该选择大模型（Model 1）。
如果 P 值 > 0.05：说明剔除变量后，误差虽然有一点点上升，但在统计学上完全可以忽略不计（不显著）。说明这两个变量是多余的，应该遵循精简原则选择小模型（Model 2）。

14.1.3 最终结论

在这段输出中，p-value = 0.3879 显著大于 0.05。我们无法拒绝原假设。这证明第 3 阶和第 4 阶的收入滞后项在模型中是联合不显著的。因此，小模型 dl_5（Model 2）更好、更精简，我们应该放心地把第 3 阶和第 4 阶剔除掉。

acf(dl_5$residuals, type='correlation')

#has this model sorted out autocorrelation issue? - nope
resids_ <- as.data.frame(dl_5$residuals)
colnames(resids_)<-c("res")
bgtest(res~1,data=resids_, order = 1)

## 
##  Breusch-Godfrey test for serial correlation of order up to 1
## 
## data:  res ~ 1
## LM test = 1.2403, df = 1, p-value = 0.2654

bgtest(res~1,data=resids_, order = 2)

## 
##  Breusch-Godfrey test for serial correlation of order up to 2
## 
## data:  res ~ 1
## LM test = 8.7087, df = 2, p-value = 0.01285

bgtest(res~1,data=resids_, order = 3)

## 
##  Breusch-Godfrey test for serial correlation of order up to 3
## 
## data:  res ~ 1
## LM test = 16.047, df = 3, p-value = 0.001109

bgtest(res~1,data=resids_, order = 4)

## 
##  Breusch-Godfrey test for serial correlation of order up to 4
## 
## data:  res ~ 1
## LM test = 16.918, df = 4, p-value = 0.002005

bgtest(res~1,data=resids_, order = 5)

## 
##  Breusch-Godfrey test for serial correlation of order up to 5
## 
## data:  res ~ 1
## LM test = 17.682, df = 5, p-value = 0.003373

==TODO==: explain coeftest

15 使用稳健标准误（Robust Standard Errors）来修正统计推断结果

下面这段代码展示了在时间序列分析中，当模型面临残差自相关（Autocorrelation in Residuals）问题，且无法通过调整变量完全消除时，如何使用稳健标准误（Robust Standard Errors）来修正统计推断结果。

# Autocorrelation issue has beed not solved, but in DL models the problem is only in inconsistent Variance - Covariance matrix
# Newey - Davis robust estimator can be used to update the results and draw conclusions

library(lmtest)
library(sandwich)
coeftest(dl_5, vcov=vcovHAC(dl_5))

## 
## t test of coefficients:
## 
##                        Estimate Std. Error t value      Pr(>|t|)    
## (Intercept)            7.587319   3.566781  2.1272     0.0349445 *  
## d(dpi)                 0.377935   0.063942  5.9106 0.00000002012 ***
## L(d(dpi), c(1, 2, 5))1 0.173651   0.046350  3.7465     0.0002504 ***
## L(d(dpi), c(1, 2, 5))2 0.086269   0.051880  1.6629     0.0983093 .  
## L(d(dpi), c(1, 2, 5))5 0.093020   0.061803  1.5051     0.1342819    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

15.0.1 函数与参数详解

library(sandwich)：加载 sandwich 扩展包。该包在计量经济学中专门用于计算各种回归模型的稳健协方差矩阵（Robust Covariance Matrix Estimators）。由于其计算出的矩阵结构常被称为面包夹肉的“三明治”形状而得名。
coeftest(dl_5, ...)：来自于 lmtest 包的函数。它允许我们使用指定的方差-协方差矩阵来对已有模型（这里是小模型 dl_5）的系数进行重新检验（t 检验）。它只改变标准误和 P 值，而不改变已经估计出来的参数系数值（Estimate）。
vcov=vcovHAC(dl_5)（核心参数）：
- vcovHAC() 是计算 HAC（异方差自相关稳健，Heteroscedasticity and Autocorrelation Consistent） 协方差矩阵的函数。
- 它的作用是自动调整传统的 OLS 方差估计。在时间序列中，它通常采用 Newey-West 估计量（纽威-威斯特估计量） 算法（注：代码注释中写为 Newey-Davis，在经济学界通常称为 Newey-West）。
- 传入这个参数后，R 语言在计算标准误时，会同时考虑到数据中存在的异方差问题和残差自相关问题，给系数穿上一层“防弹衣”，使计算出来的标准误更加真实可靠。

16 Autoregressive Distirbuted Lagged Model

# Maybe Autoregressive Distirbuted Lagged Model would help?
# Note: Autocorrelation in residuals for ARDL lead to inconsistency of estimators!

ardl_5_all <- dynlm(d(consumption) ~ L(d(consumption),c(1:5)) + d(dpi) + L(d(dpi),c(1,2,5)), data = USA, start = c(1960, 1))
summary(ardl_5_all)

## 
## Time series regression with "zooreg" data:
## Start = 1960 Q1, End = 2000 Q4
## 
## Call:
## dynlm(formula = d(consumption) ~ L(d(consumption), c(1:5)) + 
##     d(dpi) + L(d(dpi), c(1, 2, 5)), data = USA, start = c(1960, 
##     1))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -69.815 -10.597   0.132  11.497  46.341 
## 
## Coefficients:
##                             Estimate Std. Error t value     Pr(>|t|)    
## (Intercept)                 4.402872   3.170389   1.389     0.166915    
## L(d(consumption), c(1:5))1  0.069549   0.082560   0.842     0.400871    
## L(d(consumption), c(1:5))2  0.240945   0.082132   2.934     0.003863 ** 
## L(d(consumption), c(1:5))3  0.251578   0.074329   3.385     0.000904 ***
## L(d(consumption), c(1:5))4 -0.080380   0.075909  -1.059     0.291300    
## L(d(consumption), c(1:5))5 -0.006211   0.084126  -0.074     0.941237    
## d(dpi)                      0.300907   0.051961   5.791 0.0000000381 ***
## L(d(dpi), c(1, 2, 5))1      0.079843   0.056471   1.414     0.159414    
## L(d(dpi), c(1, 2, 5))2     -0.047063   0.055210  -0.852     0.395299    
## L(d(dpi), c(1, 2, 5))5      0.041920   0.054999   0.762     0.447111    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 19.41 on 154 degrees of freedom
## Multiple R-squared:  0.4371, Adjusted R-squared:  0.4042 
## F-statistic: 13.29 on 9 and 154 DF,  p-value: 0.00000000000000135

16.1 理论模型公式

\[\Delta Y_t = \mu + \gamma_1 \Delta Y_{t-1} + \gamma_2 \Delta Y_{t-2} + \gamma_3 \Delta Y_{t-3} + \gamma_4 \Delta Y_{t-4} + \gamma_5 \Delta Y_{t-5} + \beta_0 \Delta X_t + \beta_1 \Delta X_{t-1} + \beta_2 \Delta X_{t-2} + \beta_5 \Delta X_{t-5} + \epsilon_t\]

L(d(consumption), c(1:5))（自回归部分 / AR部分）：对应公式中的 \(\Delta Y_{t-1}\) 到 \(\Delta Y_{t-5}\)。 通俗解读：用过去 1 到 5 个季度消费自身的变动来作为自变量。这对应了讲义中提到通过引入因变量滞后项来彻底吸收残差“记忆力和惯性”的做法。
d(dpi) + L(d(dpi), c(1,2,5))（分布式滞后部分 / DL部分）：对应公式中的 \(\Delta X_t\)、\(\Delta X_{t-1}\)、\(\Delta X_{t-2}\) 和 \(\Delta X_{t-5}\)。 通俗解读：依然保留了今天、昨天、前天以及大前年同季度的收入变动。

ardl_5 <- dynlm(d(consumption) ~ L(d(consumption),c(2,3)) + d(dpi) + L(d(dpi),c(1)), data = USA, start = c(1960, 1))
summary(ardl_5)

## 
## Time series regression with "zooreg" data:
## Start = 1960 Q1, End = 2000 Q4
## 
## Call:
## dynlm(formula = d(consumption) ~ L(d(consumption), c(2, 3)) + 
##     d(dpi) + L(d(dpi), c(1)), data = USA, start = c(1960, 1))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -70.136 -11.432  -0.785  11.774  44.642 
## 
## Coefficients:
##                             Estimate Std. Error t value      Pr(>|t|)    
## (Intercept)                  4.44202    2.84257   1.563       0.12012    
## L(d(consumption), c(2, 3))2  0.20552    0.06809   3.018       0.00296 ** 
## L(d(consumption), c(2, 3))3  0.25150    0.06820   3.688       0.00031 ***
## d(dpi)                       0.29335    0.04776   6.142 0.00000000627 ***
## L(d(dpi), c(1))              0.10020    0.04790   2.092       0.03806 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 19.27 on 159 degrees of freedom
## Multiple R-squared:  0.4276, Adjusted R-squared:  0.4132 
## F-statistic:  29.7 on 4 and 159 DF,  p-value: < 0.00000000000000022

#check which model is better:
anova(ardl_5_all,ardl_5)

## Analysis of Variance Table
## 
## Model 1: d(consumption) ~ L(d(consumption), c(1:5)) + d(dpi) + L(d(dpi), 
##     c(1, 2, 5))
## Model 2: d(consumption) ~ L(d(consumption), c(2, 3)) + d(dpi) + L(d(dpi), 
##     c(1))
##   Res.Df   RSS Df Sum of Sq      F Pr(>F)
## 1    154 58046                           
## 2    159 59023 -5   -976.74 0.5183 0.7622

#conclusion? both are statistically equal so we prefer with more DF (less regressors)

acf(ardl_5$residuals, type='correlation')

#has this model sorted out autocorrelation issue? - Yes!!
resids_ <- as.data.frame(ardl_5$residuals)
colnames(resids_)<-c("res")
bgtest(res~1,data=resids_, order = 1)

## 
##  Breusch-Godfrey test for serial correlation of order up to 1
## 
## data:  res ~ 1
## LM test = 0.82863, df = 1, p-value = 0.3627

bgtest(res~1,data=resids_, order = 2)

## 
##  Breusch-Godfrey test for serial correlation of order up to 2
## 
## data:  res ~ 1
## LM test = 1.1992, df = 2, p-value = 0.549

bgtest(res~1,data=resids_, order = 3)

## 
##  Breusch-Godfrey test for serial correlation of order up to 3
## 
## data:  res ~ 1
## LM test = 1.2328, df = 3, p-value = 0.7452

bgtest(res~1,data=resids_, order = 4)

## 
##  Breusch-Godfrey test for serial correlation of order up to 4
## 
## data:  res ~ 1
## LM test = 2.5794, df = 4, p-value = 0.6305

bgtest(res~1,data=resids_, order = 5)

## 
##  Breusch-Godfrey test for serial correlation of order up to 5
## 
## data:  res ~ 1
## LM test = 2.7137, df = 5, p-value = 0.744

###################################################################################################      
#Diagnostics other than Autocorrelation
###################################################################################################

# Diagnostics plots    

par(mfrow=c(2,2))
plot(ardl_5)

par(mfrow=c(1,1))

# 1. Residuals vs Fitted
# 
# This plot shows if residuals have non-linear patterns. 
# There could be a non-linear relationship between predictor variables and an outcome variable 
# and the pattern could show up in this plot if the model doesn’t capture the non-linear relationship. 
# If you find equally spread residuals around a horizontal line without distinct patterns,
# that is a good indication you don’t have non-linear relationships.

# 2. Normal Q-Q
# 
# This plot shows if residuals are normally distributed. 
# Do residuals follow a straight line well or do they deviate severely? 
# It’s good if residuals are lined well on the straight dashed line.

#residuals normality test
if(!require(interp)){install.packages("interp")}
library(interp)
jbTest(as.matrix(residuals(ardl_5)))

## 
## Title:
##  Jarque - Bera Normality Test
## 
## Test Results:
##   PARAMETER:
##     Sample Size: 164
##   STATISTIC:
##     LM: 4.04
##     ALM: 4.688
##   P VALUE:
##     LM p-value: 0.096 
##     ALM p-value: 0.084 
##     Asymptotic: 0.133

# 3. Scale-Location
# 
# It’s also called Spread-Location plot. 
# This plot shows if residuals are spread equally along the ranges of predictors. 
# This is how you can check the assumption of equal variance (homoscedasticity). 
# It’s good if you see a horizontal line with equally (randomly) spread points.

# 4. Residuals vs Leverage
# 
# This plot helps us to find influential cases (i.e., subjects) if any. 
# Not all outliers are influential in linear regression analysis (whatever outliers mean). 
# Even though data have extreme values, they might not be influential to determine a regression line. 
# That means, the results wouldn’t be much different if we either include or exclude them from analysis. 
# They follow the trend in the majority of cases and they don’t really matter; they are not influential. 
# On the other hand, some cases could be very influential even if they look to be within a reasonable range of the values. 
# They could be extreme cases against a regression line and can alter the results if we exclude them from analysis. 
# Another way to put it is that they don’t get along with the trend in the majority of the cases.
# 
# Unlike the other plots, this time patterns are not relevant. 
# We watch out for outlying values at the upper right corner or at the lower right corner. 
# Those spots are the places where cases can be influential against a regression line. 
# Look for cases outside of a dashed line, Cook’s distance.
# When cases are outside of the Cook’s distance (meaning they have high Cook’s distance scores), 
# the cases are influential to the regression results. The regression results will be altered if we exclude those cases.
# 

# http://data.library.virginia.edu/diagnostic-plots/

## Breusch-Pagan
bptest(ardl_5,data=USA, studentize=FALSE)

## 
##  Breusch-Pagan test
## 
## data:  ardl_5
## BP = 20.181, df = 4, p-value = 0.00046

bptest(ardl_5,data=USA)

## 
##  studentized Breusch-Pagan test
## 
## data:  ardl_5
## BP = 15.421, df = 4, p-value = 0.003902

## Varaince Inflation Ratio
vif(ardl_5,data=USA)

##                                GVIF Df GVIF^(1/(2*Df))
## L(d(consumption), c(2, 3)) 1.235631  2        1.054319
## d(dpi)                     1.116469  1        1.056631
## L(d(dpi), c(1))            1.116288  1        1.056545

############################################################
    
#    upload a dataset form AER package named USConsump1993
#    in the dataset relation between income and expenditure
#    in the USE years 1950-1993 are stored
    
##########################################################
  
data(USConsump1993)
USA<-as.zoo(USConsump1993)    
    
     
#    Excersie 4
    
    
    

# 4.1
# Asses wheter time series are stationary or not 
# and prepared data to (AR)DL model for expenditure 
  
    
# 4.2 
# Choose the best model from (AR)DL family
    
    
    
    
# 4.3
# Perform diagnostics analysis to check wheter the chosen model fulfills assumptions