18 D3. Panel Data Theory

Coming soon

Detailed notes on panel data theory (Fixed Effects, Random Effects, GMM-based dynamic panels) will be added here. See the Overview page for slides and a summary of topics covered.

18.1 About

Topics covered:

One-way error component model: \(y_{it} = x_{it}'\beta + \alpha_i + u_{it}\)
Fixed Effects (FE / within) estimator: demeaning to eliminate \(\alpha_i\)
Random Effects (RE / GLS) estimator: feasible GLS
Hausman test: FE vs. RE
First-difference estimator
Dynamic panels: Arellano-Bond GMM

18.2 Lecture Notes

Download lecture notes (PDF)

18.3 The One-Way Error Component Model

The standard panel data model for \(i = 1,\ldots,N\) individuals and \(t = 1,\ldots,T\) periods:

\[y_{it} = x_{it}'\beta + \alpha_i + u_{it}\]

where:
- \(\alpha_i\) is an individual effect: time-invariant unobserved heterogeneity
- \(u_{it}\) is an idiosyncratic error satisfying strict exogeneity: \(E[u_{it}\mid x_{i1},\ldots,x_{iT},\alpha_i] = 0\)
- \(x_{it}\) is a \(k \times 1\) vector of covariates

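The key question is whether \(\alpha_i\) is correlated with \(x_{it}\). If it is, only the Fixed Effects estimator is consistent; if it is not, both estimators are consistent and Random Effects is the more efficient one.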

18.4 Fixed Effects (Within) Estimator

Assumption: No restriction is placed on \(E[\alpha_i \mid x_{i1},\ldots,x_{iT}]\); \(\alpha_i\) may be arbitrarily correlated with \(x_{it}\).

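Strategy: Demean to eliminate \(\alpha_i\). Define \(\ddot{y}_{it} = y_{it} - \bar{y}_i\) and \(\ddot{x}_{it} = x_{it} - \bar{x}_i\), where \(\bar{y}_i = T^{-1}\sum_t y_{it}\) and \(\bar{x}_i = T^{-1}\sum_t x_{it}\) are individual time averages. Because \(\alpha_i\) is constant over time, it equals its own time average and drops out: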

\[\ddot{y}_{it} = \ddot{x}_{it}'\beta + \ddot{u}_{it}\]

The FE (within) estimator is OLS on the demeaned data: \[\hat\beta_{FE} = \left(\sum_i \sum_t \ddot{x}_{it}\ddot{x}_{it}'\right)^{-1}\sum_i\sum_t \ddot{x}_{it}\ddot{y}_{it}\]

Note: Time-invariant variables are collinear with \(\alpha_i\) and are not identified under FE.
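The within transformation can be illustrated with a minimal NumPy sketch. It assumes a balanced panel stored as arrays of shape `(N, T)` and `(N, T, k)`; the function name `within_estimator` and the simulated design (regressors shifted by \(\alpha_i\), so that pooled OLS is biased) are illustrative choices, not part of the lecture notes.

```python
import numpy as np

def within_estimator(y, X):
    """FE (within) estimator for a balanced panel.

    y has shape (N, T); X has shape (N, T, k).
    """
    k = X.shape[2]
    # Subtract individual time averages: y_it - ybar_i, x_it - xbar_i
    y_dd = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
    X_dd = (X - X.mean(axis=1, keepdims=True)).reshape(-1, k)
    # OLS on the demeaned data
    return np.linalg.solve(X_dd.T @ X_dd, X_dd.T @ y_dd)

# Simulated example (assumed design): alpha_i enters both regressors,
# so the individual effect is correlated with x_it.
rng = np.random.default_rng(0)
N, T = 2000, 5
beta = np.array([1.0, -0.5])
alpha = rng.normal(size=N)
X = rng.normal(size=(N, T, 2)) + alpha[:, None, None]
y = X @ beta + alpha[:, None] + rng.normal(size=(N, T))

beta_fe = within_estimator(y, X)

# Pooled OLS ignores alpha_i and is inconsistent in this design
Xf, yf = X.reshape(-1, 2), y.reshape(-1)
beta_pols = np.linalg.solve(Xf.T @ Xf, Xf.T @ yf)

print("FE:        ", beta_fe)
print("Pooled OLS:", beta_pols)
```

In this design the FE estimates should sit close to the true \(\beta\), while pooled OLS is pulled away from it by the correlation between \(\alpha_i\) and \(x_{it}\).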

18.5 Random Effects (GLS) Estimator

Assumption: \(E[\alpha_i \mid x_{i1},\ldots,x_{iT}] = 0\) — \(\alpha_i\) is uncorrelated with all regressors.

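Write \(v_{it} = \alpha_i + u_{it}\). If \(u_{it}\) is homoskedastic and serially uncorrelated with variance \(\sigma_u^2\), and \(\alpha_i\) has variance \(\sigma_\alpha^2\) and is uncorrelated with \(u_{it}\), the composite error has a structured covariance within each individual: \[\text{Cov}(v_{it},v_{is}) = \sigma_\alpha^2 + \sigma_u^2\,\mathbf{1}[t=s]\]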
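A short numerical check may help make this structure concrete. The sketch below builds the \(T \times T\) covariance matrix \(\Omega = \sigma_\alpha^2 \mathbf{1}\mathbf{1}' + \sigma_u^2 I_T\) for one individual and verifies that \(\sigma_u\,\Omega^{-1/2} = I_T - \theta\,\mathbf{1}\mathbf{1}'/T\) with \(\theta = 1 - \sqrt{\sigma_u^2/(\sigma_u^2 + T\sigma_\alpha^2)}\), which is why GLS on this structure reduces to OLS on quasi-demeaned data. The variance values are arbitrary illustrative assumptions.

```python
import numpy as np

# Assumed illustrative values for the variance components
T, sigma_a2, sigma_u2 = 4, 2.0, 1.0

# Covariance of the composite error v_i = (v_i1, ..., v_iT)'
Omega = sigma_a2 * np.ones((T, T)) + sigma_u2 * np.eye(T)

# Symmetric inverse square root via the eigendecomposition
w, V = np.linalg.eigh(Omega)
Omega_inv_half = V @ np.diag(w ** -0.5) @ V.T

# Quasi-demeaning matrix: subtracts a fraction theta of the time average
theta = 1.0 - np.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))
quasi_demean = np.eye(T) - theta * np.ones((T, T)) / T

print("theta =", theta)
print(np.allclose(np.sqrt(sigma_u2) * Omega_inv_half, quasi_demean))
```

As \(\theta \to 1\) the transformation approaches full within-demeaning (FE); as \(\theta \to 0\) it approaches pooled OLS.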

The RE estimator is GLS applied to this structure. Feasible GLS estimates \(\sigma_\alpha^2\) and \(\sigma_u^2\) first.

Advantage over FE: Coefficients on time-invariant regressors are identified, and RE is more efficient than FE when the RE assumption holds.
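The feasible GLS steps can be sketched as follows for a balanced panel. This is one common recipe, assumed here for illustration: \(\sigma_u^2\) is estimated from within residuals and \(\sigma_\alpha^2\) from the residual variance of the between regression (a Swamy-Arora style choice), followed by OLS on quasi-demeaned data. The function name `re_fgls` and the simulated design are hypothetical.

```python
import numpy as np

def re_fgls(y, X):
    """Feasible GLS random-effects estimator, balanced panel.

    y has shape (N, T); X has shape (N, T, k). An intercept is added.
    Returns (slopes, theta, sigma_u2, sigma_a2).
    """
    N, T, k = X.shape
    ybar, Xbar = y.mean(axis=1), X.mean(axis=1)

    # Step 1: sigma_u^2 from the within (FE) residuals
    yw = (y - ybar[:, None]).reshape(-1)
    Xw = (X - Xbar[:, None, :]).reshape(-1, k)
    b_fe = np.linalg.lstsq(Xw, yw, rcond=None)[0]
    sigma_u2 = np.sum((yw - Xw @ b_fe) ** 2) / (N * (T - 1) - k)

    # Step 2: between regression residual variance estimates
    # sigma_alpha^2 + sigma_u^2 / T; truncate at zero if negative
    Zb = np.column_stack([np.ones(N), Xbar])
    e_b = ybar - Zb @ np.linalg.lstsq(Zb, ybar, rcond=None)[0]
    sigma_a2 = max(e_b @ e_b / (N - k - 1) - sigma_u2 / T, 0.0)

    # Step 3: OLS on quasi-demeaned data (the intercept column is scaled too)
    theta = 1.0 - np.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))
    ys = (y - theta * ybar[:, None]).reshape(-1)
    Xs = (X - theta * Xbar[:, None, :]).reshape(-1, k)
    Zs = np.column_stack([np.full(N * T, 1.0 - theta), Xs])
    coef = np.linalg.lstsq(Zs, ys, rcond=None)[0]
    return coef[1:], theta, sigma_u2, sigma_a2

# Simulated example (assumed design): alpha_i independent of x_it
rng = np.random.default_rng(1)
N, T = 2000, 5
beta = np.array([1.0, -0.5])
alpha = rng.normal(size=N)
X = rng.normal(size=(N, T, 2))
y = X @ beta + alpha[:, None] + rng.normal(size=(N, T))

beta_re, theta, sigma_u2, sigma_a2 = re_fgls(y, X)
print(beta_re, theta, sigma_u2, sigma_a2)
```

Here both variance components equal one in the simulated design, so the estimated \(\theta\) should land near \(1 - \sqrt{1/6}\).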

18.6 Hausman Test

The Hausman (1978) test checks \(H_0\): RE is consistent (i.e., \(\text{Cov}(\alpha_i, x_{it})=0\)).

\[H = (\hat\beta_{FE}-\hat\beta_{RE})'\left[\widehat{\text{Var}}(\hat\beta_{FE})-\widehat{\text{Var}}(\hat\beta_{RE})\right]^{-1}(\hat\beta_{FE}-\hat\beta_{RE}) \xrightarrow{d} \chi^2_k\]

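Under \(H_0\), both FE and RE are consistent but RE is efficient, so the two estimates should be close. Under \(H_1\), only FE is consistent, so a large value of \(H\) is evidence against RE. The comparison uses only the \(k\) coefficients on time-varying regressors, since these are the only ones FE identifies.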
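As an illustration, the statistic can be computed by hand from the two estimators. The sketch below assumes a balanced panel, classical (homoskedastic) variance estimates that share one \(\hat\sigma_u^2\), and the same variance-component recipe as a standard FGLS implementation; the helper names `hausman_stat` and `make_panel` are hypothetical. With \(k = 2\) the \(\chi^2_2\) p-value has the closed form \(e^{-H/2}\), which avoids a dependency on a statistics library; this shortcut is specific to two degrees of freedom.

```python
import numpy as np

def hausman_stat(y, X):
    """Hausman statistic comparing FE and RE slopes (balanced panel)."""
    N, T, k = X.shape
    ybar, Xbar = y.mean(axis=1), X.mean(axis=1)

    # FE estimate and its classical variance
    yw = (y - ybar[:, None]).reshape(-1)
    Xw = (X - Xbar[:, None, :]).reshape(-1, k)
    XtX_w = Xw.T @ Xw
    b_fe = np.linalg.solve(XtX_w, Xw.T @ yw)
    sigma_u2 = np.sum((yw - Xw @ b_fe) ** 2) / (N * (T - 1) - k)
    V_fe = sigma_u2 * np.linalg.inv(XtX_w)

    # Variance components and quasi-demeaning weight
    Zb = np.column_stack([np.ones(N), Xbar])
    e_b = ybar - Zb @ np.linalg.lstsq(Zb, ybar, rcond=None)[0]
    sigma_a2 = max(e_b @ e_b / (N - k - 1) - sigma_u2 / T, 0.0)
    theta = 1.0 - np.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))

    # RE estimate (with intercept) and the variance of its slopes
    ys = (y - theta * ybar[:, None]).reshape(-1)
    Xs = (X - theta * Xbar[:, None, :]).reshape(-1, k)
    Zs = np.column_stack([np.full(N * T, 1.0 - theta), Xs])
    ZtZ_inv = np.linalg.inv(Zs.T @ Zs)
    b_re = (ZtZ_inv @ (Zs.T @ ys))[1:]
    V_re = sigma_u2 * ZtZ_inv[1:, 1:]

    d = b_fe - b_re
    return float(d @ np.linalg.solve(V_fe - V_re, d))

def make_panel(rng, correlated, N=2000, T=5):
    """Simulate a panel with k = 2; alpha_i enters x_it only if correlated."""
    beta = np.array([1.0, -0.5])
    alpha = rng.normal(size=N)
    c = alpha if correlated else rng.normal(size=N)
    X = rng.normal(size=(N, T, 2)) + c[:, None, None]
    y = X @ beta + alpha[:, None] + rng.normal(size=(N, T))
    return y, X

rng = np.random.default_rng(2)
H_h0 = hausman_stat(*make_panel(rng, correlated=False))
H_h1 = hausman_stat(*make_panel(rng, correlated=True))

# chi-squared(2) survival function is exp(-H/2); valid only because k = 2
print("H0 design: H =", H_h0, " p =", np.exp(-H_h0 / 2))
print("H1 design: H =", H_h1, " p =", np.exp(-H_h1 / 2))
```

Using a common \(\hat\sigma_u^2\) in both variance estimates keeps the difference \(\widehat{\text{Var}}(\hat\beta_{FE})-\widehat{\text{Var}}(\hat\beta_{RE})\) positive definite here, which is not guaranteed when the two variances are estimated separately or robustly.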

18.7 First Difference Estimator

An alternative to within-demeaning: subtract the period \(t-1\) equation from the period \(t\) equation: \[\Delta y_{it} = \Delta x_{it}'\beta + \Delta u_{it}\]

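Differencing eliminates \(\alpha_i\); with \(T=2\) the FD and FE estimators are numerically identical. For \(T>2\) they differ: FE is more efficient when \(u_{it}\) is serially uncorrelated, while FD is more efficient when \(u_{it}\) is strongly persistent (a random walk in the limit), since then \(\Delta u_{it}\) is serially uncorrelated.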
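The equivalence with FE at \(T=2\) is easy to check numerically. The sketch below assumes a balanced panel in arrays of shape `(N, T)` and `(N, T, k)`; the function names and the simulated design are illustrative.

```python
import numpy as np

def fd_estimator(y, X):
    """First-difference estimator: OLS of Delta y_it on Delta x_it."""
    dy = np.diff(y, axis=1).reshape(-1)
    dX = np.diff(X, axis=1).reshape(-1, X.shape[2])
    return np.linalg.solve(dX.T @ dX, dX.T @ dy)

def within_estimator(y, X):
    """FE (within) estimator: OLS on individually demeaned data."""
    yw = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
    Xw = (X - X.mean(axis=1, keepdims=True)).reshape(-1, X.shape[2])
    return np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)

def simulate(rng, N, T, beta):
    """Assumed design: alpha_i shifts every regressor, u_it is i.i.d."""
    alpha = rng.normal(size=N)
    X = rng.normal(size=(N, T, len(beta))) + alpha[:, None, None]
    y = X @ beta + alpha[:, None] + rng.normal(size=(N, T))
    return y, X

rng = np.random.default_rng(3)
beta = np.array([1.0, -0.5])

# T = 2: the two estimators coincide up to floating-point error
y2, X2 = simulate(rng, N=2000, T=2, beta=beta)
fd2, fe2 = fd_estimator(y2, X2), within_estimator(y2, X2)

# T = 5: both are consistent but no longer identical
y5, X5 = simulate(rng, N=2000, T=5, beta=beta)
fd5, fe5 = fd_estimator(y5, X5), within_estimator(y5, X5)

print("T=2:", fd2, fe2)
print("T=5:", fd5, fe5)
```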

18.8 Dynamic Panel Models

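When \(y_{i,t-1}\) is included as a regressor, FE is inconsistent for fixed \(T\) as \(N\to\infty\): the demeaned lag \(\ddot{y}_{i,t-1}\) is correlated with \(\ddot{u}_{it}\) through the time averages (Nickell bias, of order \(1/T\)). The Arellano-Bond (1991) GMM estimator first-differences the model to remove \(\alpha_i\) and uses lagged levels \(y_{i,t-2}, y_{i,t-3},\ldots\) as instruments for \(\Delta y_{i,t-1}\) in the differenced equation.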
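A small simulation can illustrate both the bias and the instrumenting idea. The sketch below assumes a pure AR(1) panel \(y_{it} = \rho\,y_{i,t-1} + \alpha_i + u_{it}\) with i.i.d. errors and no other regressors. Note that the second estimator is the simpler Anderson-Hsiao IV, which uses the single lagged level \(y_{i,t-2}\) as an instrument for \(\Delta y_{i,t-1}\); it is a just-identified precursor of Arellano-Bond, not the full GMM estimator with all available lags.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T, rho, burn = 5000, 6, 0.5, 50   # assumed illustrative values

# Simulate the AR(1) panel, discarding a burn-in so that the
# process is approximately stationary in the retained periods
alpha = rng.normal(size=N)
y_prev = alpha / (1.0 - rho)
path = []
for _ in range(burn + T):
    y_prev = rho * y_prev + alpha + rng.normal(size=N)
    path.append(y_prev)
y = np.column_stack(path[burn:])      # shape (N, T), columns y_0..y_{T-1}

# FE: regress demeaned y_it on demeaned y_{i,t-1}
y_cur, y_lag = y[:, 1:], y[:, :-1]
yc = y_cur - y_cur.mean(axis=1, keepdims=True)
yl = y_lag - y_lag.mean(axis=1, keepdims=True)
rho_fe = np.sum(yl * yc) / np.sum(yl * yl)

# Anderson-Hsiao IV on the differenced equation
#   Delta y_it = rho * Delta y_{i,t-1} + Delta u_it,
# instrumenting Delta y_{i,t-1} with the level y_{i,t-2}
dy = np.diff(y, axis=1)               # columns Delta y_1..Delta y_{T-1}
dy_cur, dy_lag = dy[:, 1:], dy[:, :-1]
z = y[:, :-2]                         # y_{t-2} aligned with dy_cur
rho_ah = np.sum(z * dy_cur) / np.sum(z * dy_lag)

print("true rho:", rho)
print("FE:      ", rho_fe)
print("AH IV:   ", rho_ah)
```

With so few periods, the FE estimate should fall well below the true \(\rho\), while the IV estimate should be close to it. Arellano-Bond improves on this by stacking the moment conditions from all valid lags and weighting them optimally.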

18.9 References

Cameron and Trivedi (2005), chapters 21–22. Hansen (2022), chapter 17.