This note gives a brief overview of methods suitable for dynamic panel data models with large T. In this setting, the traditional go-to estimators, difference and system GMM, suffer from instrument proliferation, the results become sensitive to the number of lags used, and other problems such as endogeneity and cross-sectional dependence remain.
In dynamic panels, Nickell bias arises because removing fixed effects forces subtraction of time averages that contain past shocks, mechanically correlating lagged dependent variables with transformed errors.
Consider the standard dynamic panel model:
\[ y_{it} = \rho y_{i,t-1} + x_{it}\beta+ \alpha_i + u_{it} \]
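where \(y_{it}\) is the outcome for unit \(i\) in period \(t\), \(x_{it}\) is a vector of regressors, \(\alpha_i\) is the unit fixed effect, \(u_{it}\) is the idiosyncratic error, and \(\rho\) measures persistence.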
The Core Problem
To remove fixed effects, we apply the within (FE) transformation:
\[ y_{it} - \bar y_i = \rho (y_{i,t-1} - \bar y_{i,t-1}) + (x_{it} - \bar x_i)\beta + (u_{it} - \bar u_i) \]
But now the transformed regressor \((y_{i,t-1} - \bar y_{i,t-1})\) becomes correlated with the transformed error term \((u_{it} - \bar u_i )\).
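Why this happens: the time average \(\bar y_{i,t-1}\) contains \(y_{it}\) and later values, which depend on \(u_{it}\), and the time average \(\bar u_i\) contains \(u_{i,t-1}\) and earlier shocks, which are built into \(y_{i,t-1}\).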
Thus even if:
\[ E(y_{i,t-1} u_{it}) = 0 \]
we still have:
\[ E[(y_{i,t-1} - \bar y_{i,t-1})(u_{it} - \bar u_i)] \neq 0 \]
Result: Finite-T Bias
This induces a bias in the FE estimator of \(\rho\) that does not vanish as N grows when T is fixed.
This phenomenon is known as Nickell bias.
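For orientation, the size of the bias is often summarized by a rule of thumb attributed to Nickell (1981). The display below is not derived in this note; it assumes a pure AR(1) model without additional regressors, \(N \to \infty\), and a T that is not too small:

\[ \operatorname*{plim}_{N \to \infty} (\hat \rho_{FE} - \rho) \approx -\frac{1+\rho}{T-1} \]

With \(\rho = 0.5\), the approximation gives about \(-1.5/29 \approx -0.05\) for \(T = 30\). For very small T the approximation is rough, but the sign and order of magnitude are informative: the bias is negative for \(\rho > 0\) and large relative to \(\rho\).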
Large T Solves the Nickell bias asymptotically
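As \(T \to \infty\), each individual shock receives a vanishing weight in the time averages, so the correlation between the transformed lag and the transformed error disappears.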
Formally:
\(\operatorname{Bias}(\hat \rho_{FE}) = O(1/T)\)
(\(O(1/T)\) reads as “shrinks at the rate 1/T”)
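The shrinking bias can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the data-generating process (no regressors, standard normal effects and shocks, \(\rho = 0.5\)), the sample sizes, and all function names are assumptions made for this example.

```python
import random

def simulate_panel(n_units, n_periods, rho, seed=0, burn_in=50):
    """Simulate y_it = rho * y_{i,t-1} + alpha_i + u_it with N(0, 1) effects and shocks.

    Each unit's list holds n_periods + 1 values, i.e. n_periods (lag, current) pairs.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 1.0)
        y = alpha / (1.0 - rho)  # start at the unit's long-run mean
        series = []
        for t in range(burn_in + n_periods + 1):
            y = rho * y + alpha + rng.gauss(0.0, 1.0)
            if t >= burn_in:
                series.append(y)
        panel.append(series)
    return panel

def within_estimator(panel):
    """FE estimate of rho: demean y_it and y_{i,t-1} unit by unit, then pooled OLS."""
    num = 0.0
    den = 0.0
    for series in panel:
        lagged, current = series[:-1], series[1:]
        mean_lag = sum(lagged) / len(lagged)
        mean_cur = sum(current) / len(current)
        for y_lag, y_cur in zip(lagged, current):
            num += (y_lag - mean_lag) * (y_cur - mean_cur)
            den += (y_lag - mean_lag) ** 2
    return num / den

RHO = 0.5
rho_small_T = within_estimator(simulate_panel(2000, 5, RHO, seed=1))
rho_large_T = within_estimator(simulate_panel(2000, 50, RHO, seed=1))
print(f"T = 5:  rho_hat = {rho_small_T:.3f}")
print(f"T = 50: rho_hat = {rho_large_T:.3f}")
```

With N large in both runs, the remaining gap between the estimate and the true value is driven by T, not by sampling noise: the T = 5 estimate sits far below 0.5, the T = 50 estimate only slightly below.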
So with large T the dynamic FE estimator is approximately unbiased.
But other problems (endogeneity, cross-sectional dependence, etc.) remain an issue.
Classic dynamic panel GMM (Arellano–Bond/System GMM) is designed for small-T settings. In macro and regional data where T is large, alternative estimators dominate because:
• Nickell bias vanishes as T grows
• Instrument proliferation becomes severe in GMM and the Hansen statistics become biased, so the results become extremely sensitive to the choice of lags used as internal instruments (Roodman, 2009, highlights the impact of instrument proliferation)
• When variables are highly persistent and shocks serially correlated, lagged variables might remain correlated with current errors and the GMM with internal instruments becomes biased under endogeneity (Hayakawa 2009/2015)
• Cross-sectional dependence (CSD) becomes first-order (Chudik and Pesaran papers showing that common factors are a problem for the GMM estimator)
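To make the instrument-proliferation point concrete, the sketch below counts the lagged-level instruments available to difference GMM. It rests on a stated convention that is an assumption of this example: periods t = 1..T are observed, the differenced equation is used for t = 3..T, only lags of the dependent variable serve as instruments, and `max_lag` is a hypothetical cap on the lag depth.

```python
def diff_gmm_instrument_count(n_periods, max_lag=None):
    """Count lagged-level instruments for the first-differenced equation.

    In period t the levels y_{i,1}, ..., y_{i,t-2} are available, so the
    uncapped total is 1 + 2 + ... + (T - 2) = (T - 2)(T - 1) / 2.
    """
    total = 0
    for t in range(3, n_periods + 1):
        available = t - 2
        total += available if max_lag is None else min(available, max_lag)
    return total

for T in (5, 10, 30):
    full = diff_gmm_instrument_count(T)
    capped = diff_gmm_instrument_count(T, max_lag=2)
    print(f"T = {T:2d}: all lags = {full:3d}, at most 2 lags = {capped:2d}")
```

The uncapped count grows quadratically in T (6, 36, and 406 instruments for T = 5, 10, and 30), which is why the lag choice matters so much once T is large.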
So for macro panels with persistent series, global shocks, and structural endogeneity, one should consider methods other than GMM.
Also, much longer time series are available now than a decade ago, which makes large-T methods feasible.
How large is large T? Ideally T is above 30. GMM works well for small T (5 to 10), and the large-T methods become feasible once T exceeds about 20.
Bias-corrected FE: estimate the dynamic FE model and correct the finite-sample bias analytically, by jackknife, or by bootstrap. The method is an alternative to GMM when instrument proliferation makes the results unstable.
Kiviet (1995); Bruno (2005); Everaert & Pozzi (2007)
✔ Large T, moderate N
✔ Weak cross-sectional dependence (weak forms can be corrected via bootstrap; De Vos, Everaert & Ruyssen, 2015)
✔ Focus on dynamics, not global shocks
✖ Strong common factors or global shocks
✖ Severe endogeneity beyond lag structure
Implemented in Stata add-ons.
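As an illustration of the jackknife route, the sketch below applies a half-panel jackknife (in the spirit of the split-panel jackknife of Dhaene and Jochmans) to a simulated AR(1) panel. It is not the procedure of any of the Stata add-ons; the data-generating process, sample sizes, and function names are assumptions of this example.

```python
import random

def simulate_panel(n_units, n_periods, rho, seed=0, burn_in=50):
    """Simulate y_it = rho * y_{i,t-1} + alpha_i + u_it; n_periods + 1 values per unit."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 1.0)
        y = alpha / (1.0 - rho)
        series = []
        for t in range(burn_in + n_periods + 1):
            y = rho * y + alpha + rng.gauss(0.0, 1.0)
            if t >= burn_in:
                series.append(y)
        panel.append(series)
    return panel

def within_estimator(panel):
    """FE estimate of rho from unit-demeaned (lag, current) pairs."""
    num = 0.0
    den = 0.0
    for series in panel:
        lagged, current = series[:-1], series[1:]
        mean_lag = sum(lagged) / len(lagged)
        mean_cur = sum(current) / len(current)
        for y_lag, y_cur in zip(lagged, current):
            num += (y_lag - mean_lag) * (y_cur - mean_cur)
            den += (y_lag - mean_lag) ** 2
    return num / den

def half_panel_jackknife(panel):
    """2 * full-sample FE estimate minus the mean of the two half-panel FE estimates."""
    full = within_estimator(panel)
    n_pairs = len(panel[0]) - 1                        # assumed even
    half = n_pairs // 2
    first = [series[: half + 1] for series in panel]   # pairs 1..half
    second = [series[half:] for series in panel]       # pairs half+1..n_pairs
    return 2.0 * full - 0.5 * (within_estimator(first) + within_estimator(second))

RHO = 0.5
panel = simulate_panel(1000, 20, RHO, seed=2)
rho_fe = within_estimator(panel)
rho_jk = half_panel_jackknife(panel)
print(f"FE: {rho_fe:.3f}, half-panel jackknife: {rho_jk:.3f}, truth: {RHO}")
```

The logic is that the FE bias is roughly proportional to 1/T, so each half-panel estimate carries about twice the bias of the full-sample estimate; the linear combination cancels the leading term and leaves a remainder of smaller order.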
FE-IV: when IV is needed and T is large, the standard FE model becomes less biased because the Nickell bias converges to zero as T grows. It can therefore be reasonable to treat the panel as a standard FE model with external instruments.
Z_it → X_it → y_it
y_it ← ρ y_it-1 + x_it β + α_i + u_it (T large ⇒ bias ≈ 0)
Hsiao & Zhang (Journal of Econometrics, 2015) show that FE IV is asymptotically unbiased when either N or T or both tend to infinity; see also Wooldridge (panel IV framework)
✔ Strong external instruments available
✔ CSD weak or handled via robust SE
✖ Common-factor driven dependence
✖ Global shocks correlated with regressors
Implemented in the fixest R package.
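The FE-IV idea can be sketched in a few lines. The example below is a simulation under assumed settings, not the fixest implementation: one endogenous regressor `x`, one external instrument `z`, the lagged outcome treated as its own instrument, and a just-identified system solved by hand. All names and parameter values are hypothetical.

```python
import random

def simulate(n_units, n_periods, rho, beta, seed=0, burn_in=50):
    """y_it = rho*y_{i,t-1} + beta*x_it + alpha_i + u_it, with x correlated with u."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 1.0)
        y_prev = 0.0
        unit = []
        for t in range(burn_in + n_periods):
            z = rng.gauss(0.0, 1.0)             # external instrument
            e = rng.gauss(0.0, 1.0)             # shock shared by x and u (endogeneity)
            x = z + e
            u = 0.8 * e + rng.gauss(0.0, 1.0)
            y = rho * y_prev + beta * x + alpha + u
            if t >= burn_in:
                unit.append((y, y_prev, x, z))
            y_prev = y
        panel.append(unit)
    return panel

def demean(panel):
    """Within transformation: subtract unit-specific time means from every column."""
    rows = []
    for unit in panel:
        means = [sum(r[j] for r in unit) / len(unit) for j in range(4)]
        rows.extend(tuple(r[j] - means[j] for j in range(4)) for r in unit)
    return rows

def fe_estimates(panel, use_iv):
    """Solve sum w * (y - rho*y_lag - beta*x) = 0 on demeaned data,
    with w = (y_lag, z) for FE-IV and w = (y_lag, x) for FE-OLS."""
    a11 = a12 = a21 = a22 = b1 = b2 = 0.0
    for y, y_lag, x, z in demean(panel):
        w = z if use_iv else x
        a11 += y_lag * y_lag
        a12 += y_lag * x
        b1 += y_lag * y
        a21 += w * y_lag
        a22 += w * x
        b2 += w * y
    det = a11 * a22 - a12 * a21
    rho_hat = (b1 * a22 - a12 * b2) / det
    beta_hat = (a11 * b2 - a21 * b1) / det
    return rho_hat, beta_hat

RHO, BETA = 0.5, 1.0
panel = simulate(200, 50, RHO, BETA, seed=4)
rho_ols, beta_ols = fe_estimates(panel, use_iv=False)
rho_iv, beta_iv = fe_estimates(panel, use_iv=True)
print(f"FE-OLS: rho = {rho_ols:.3f}, beta = {beta_ols:.3f}")
print(f"FE-IV:  rho = {rho_iv:.3f}, beta = {beta_iv:.3f}")
```

In this design FE-OLS overstates \(\beta\) because `x` shares a shock with the error, while FE-IV recovers it; with T = 50 the remaining Nickell bias in \(\rho\) is small.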
Interactive FE: model the errors explicitly as latent common factors to deal with cross-sectional dependence. A sequential approach is to estimate the common factors (principal components can be used), subtract them from the variables (defactor them), and run FE or FE-IV on the defactored data.
\[ u_{it} = \lambda_i' f_t + \varepsilon_{it} \]
Bai (2009); Moon & Weidner (2015)
✔ Rich latent common shocks
✔ Need to estimate factor structure
✔ Strong cross-sectional dependence
✖ Very small T
✖ When simpler CCE already works
R implementation: the phtt package, now removed from CRAN.
FixedEffectjlr, which calls Julia from R, also exists.
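The principal-components step of the sequential approach described above can be written compactly. The display is a standard textbook formulation, not taken from this note, and it introduces notation of its own: \(r\) is the assumed number of factors, \(X\) is the \(T \times N\) matrix of the variable to be defactored, \(F\) is the \(T \times r\) matrix of factors, and \(\Lambda\) is the \(N \times r\) matrix of loadings, with the normalization \(F'F/T = I_r\) and \(\Lambda'\Lambda\) diagonal.

\[ (\hat F, \hat \Lambda) = \arg\min_{F, \Lambda} \frac{1}{NT} \sum_{i=1}^{N} \sum_{t=1}^{T} (x_{it} - \lambda_i' f_t)^2 \]

\[ \hat F = \sqrt{T} \times (\text{eigenvectors of } XX' \text{ for the } r \text{ largest eigenvalues}), \qquad \hat \Lambda = X' \hat F / T \]

\[ \tilde x_{it} = x_{it} - \hat \lambda_i' \hat f_t \]

The defactored values \(\tilde x_{it}\) are then used in the FE or FE-IV regression. In practice \(r\) has to be chosen or estimated, which is one reason the method is listed as needing a reasonably large T.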
Defactored IV: a method that handles endogeneity and adds common factors to deal with cross-sectional dependence. Both internal instruments (lags) and external instruments are feasible, and the ability to use external instruments makes the method appealing. The procedure: estimate factors → remove them → run IV on the cleaned data.
Norkutė et al. (2021); Kripfganz & Sarafidis (2021)
✔ Endogeneity + global shocks
✔ External instruments available
✔ Large N and T
✖ No credible instruments
✖ Very small T
Implemented in the Stata add-on xtivdfreg. In R, the method can be replicated piece by piece: (i) estimate factors by PC, (ii) defactor covariates/instruments, (iii) run FE-IV with fixest/ivreg.
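The three steps can be sketched on simulated data. This is a simplified illustration, not the xtivdfreg algorithm: it assumes a single common factor, no unit fixed effects, no lagged dependent variable, one endogenous regressor, and one instrument that itself loads on the factor. Function names and parameter values are hypothetical.

```python
import random

def simulate(n_units, n_periods, beta, seed=0):
    """One common factor f_t loads on both the instrument z and the error u."""
    rng = random.Random(seed)
    f = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
    y, x, z = [], [], []
    for _ in range(n_units):
        load_z, load_u = rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)
        yi, xi, zi = [], [], []
        for t in range(n_periods):
            e = rng.gauss(0.0, 1.0)                  # makes x endogenous
            z_it = load_z * f[t] + rng.gauss(0.0, 1.0)
            x_it = z_it + e
            u_it = 0.8 * e + load_u * f[t] + rng.gauss(0.0, 1.0)
            yi.append(beta * x_it + u_it)
            xi.append(x_it)
            zi.append(z_it)
        y.append(yi)
        x.append(xi)
        z.append(zi)
    return y, x, z

def leading_factor(panel, n_iter=50):
    """Step (i): first principal component of an N x T panel by power iteration."""
    n_periods = len(panel[0])
    f = [1.0] * n_periods
    for _ in range(n_iter):
        scores = [sum(v * ft for v, ft in zip(unit, f)) for unit in panel]
        f = [sum(s * unit[t] for s, unit in zip(scores, panel)) for t in range(n_periods)]
        norm = sum(v * v for v in f) ** 0.5
        f = [v / norm for v in f]
    return f

def project_out(panel, f):
    """Step (ii): remove the factor from each unit's series by regressing it on f."""
    ff = sum(v * v for v in f)
    cleaned = []
    for unit in panel:
        loading = sum(v * ft for v, ft in zip(unit, f)) / ff
        cleaned.append([v - loading * ft for v, ft in zip(unit, f)])
    return cleaned

def iv_slope(y, x, z):
    """Step (iii): just-identified IV slope, sum(z*y) / sum(z*x), pooled over units."""
    num = sum(zv * yv for zu, yu in zip(z, y) for zv, yv in zip(zu, yu))
    den = sum(zv * xv for zu, xu in zip(z, x) for zv, xv in zip(zu, xu))
    return num / den

BETA = 1.0
y, x, z = simulate(100, 40, BETA, seed=3)
f_hat = leading_factor(x)
beta_naive = iv_slope(y, x, z)
beta_defactored = iv_slope(project_out(y, f_hat), project_out(x, f_hat), project_out(z, f_hat))
print(f"naive IV: {beta_naive:.3f}, defactored IV: {beta_defactored:.3f}, truth: {BETA}")
```

The naive IV estimate is biased here because the common factor makes the instrument correlated with the error. Once the estimated factor is projected out of all three variables, the instrument is valid again.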
Spatial defactored IV: extend defactored IV to spatial spillovers.
Cui, Sarafidis & Yamagata (Journal of Econometrics 2023); Kripfganz & Sarafidis (Journal of Statistical Software 2025)
✔ Peer effects or regional spillovers
✔ Endogeneity + common shocks
✖ No spatial interaction structure
✖ Overly small panels
Implemented in the Stata add-on spxtivdfreg.
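The "spatial interaction structure" such methods require is usually a weight matrix that turns neighbours' outcomes into a spatial lag. The sketch below shows only that building block, not spxtivdfreg itself; the four-region contiguity pattern and the outcome values are made up for illustration.

```python
def row_normalize(weights):
    """Scale each row of a spatial weight matrix to sum to one (all-zero rows stay zero)."""
    normalized = []
    for row in weights:
        total = sum(row)
        normalized.append([w / total if total else 0.0 for w in row])
    return normalized

def spatial_lag(weights, values):
    """Weighted average of neighbours' values for every unit: (W y)_i = sum_j w_ij * y_j."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]

# Hypothetical regions A-B-C-D arranged in a line; 1 marks a shared border.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W = row_normalize(adjacency)
y_t = [1.0, 2.0, 4.0, 8.0]        # outcomes of the four regions in one period
print(spatial_lag(W, y_t))        # [2.0, 2.5, 5.0, 4.0]
```

Because each region's outcome also enters its neighbours' spatial lags, the spatial lag is determined jointly with the outcome, which is why it is treated as endogenous and instrumented.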
In addition, new panel VAR methods for large T have been developed. The most straightforward implementation is the BEAR Toolbox for MATLAB (https://www.ecb.europa.eu/press/research-publications/working-papers/html/bear-toolbox.en.html).
| Problem | Best Choice |
|---|---|
| Pure dynamics | Bias-corrected FE |
| Strong IV, no CSD | FE-IV |
| Global shocks | CCEMG |
| Complex latent structure | Interactive FE |
| IV + global shocks | Defactored IV |
| Spillovers + IV | Spatial defactored IV |
Difference and system GMM can handle endogenous regressors in theory, but in practice internal lag instruments are often weak, invalid under persistence and common shocks, and unreliable in macro panels.
System GMM improves efficiency but does not solve the fundamental weak-instrument and cross-sectional dependence problems.
Large-T macro panels favor:
• Factor-aware methods
• Heterogeneous dynamics
• External IV when endogeneity is structural
GMM is no longer the default.