15  C2. Generalized Method of Moments

15.1 About

Topics covered:

  • Population moment conditions and sample analogues
  • The GMM estimator and optimal weighting matrix
  • Asymptotic distribution of GMM
  • Two-step GMM and efficient GMM
  • J test for overidentification
  • OLS and IV as special cases of GMM

15.2 Lecture Notes

 

15.3 Slides


 

Note: Slides and notes are pending translation to English.

 


15.4 Introduction

Maximum likelihood estimation (MLE) selects \(\hat{\theta}\) by maximizing the likelihood, which requires specifying the full probability distribution (pdf) of the data. GMM (Hansen 1982) is an alternative that requires only the specification of certain moments, not the full pdf.

15.5 Method of Moments (MM)

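Example. Let \(y_i \sim t(\nu)\). For \(\nu>2\), \(\mathbb{E}(y_i^2)=\nu/(\nu-2)\). The sample second moment \(\hat{\mu}_2=(1/N)\sum_i y_i^2\) is a consistent estimator of this population moment, \(\hat{\mu}_2 \rightarrow_p \mathbb{E}(y_i^2)\). Setting \(\hat{\mu}_2=\nu/(\nu-2)\) and solving for \(\nu\) gives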

\[\hat{\nu}_{\text{MM}} = \frac{2\hat{\mu}_2}{\hat{\mu}_2 - 1} \qquad (\text{for }\hat{\mu}_2 > 1)\]
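As a quick numerical check of the formula above, the following sketch simulates draws from a \(t(\nu)\) distribution and recovers \(\nu\) from the sample second moment. The function name `nu_mm`, the seed, the sample size, and the choice \(\nu=10\) are illustrative assumptions, not part of the notes.

```python
import numpy as np

def nu_mm(y):
    """Method-of-moments estimate of nu based on the sample second moment."""
    mu2 = np.mean(y ** 2)  # sample analogue of E(y^2) = nu / (nu - 2)
    if mu2 <= 1.0:
        # The formula is only defined when the sample second moment exceeds 1.
        raise ValueError("sample second moment must exceed 1")
    return 2.0 * mu2 / (mu2 - 1.0)

# Illustrative simulation (assumed values): true nu = 10, large N.
rng = np.random.default_rng(0)
y = rng.standard_t(df=10, size=200_000)
nu_hat = nu_mm(y)
print(f"MM estimate of nu: {nu_hat:.2f}")
```

The estimate is sensitive when \(\hat{\mu}_2\) is close to 1 (large \(\nu\)), because the denominator \(\hat{\mu}_2-1\) is then small.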

General idea. Given unknown \(\theta_{k\times 1}\), suppose we can compute \(k\) population moments as a function of \(\theta\):

\[\mathbb{E}(y_i^j)=\mu_j(\theta) \quad \text{for } j=j_1,\ldots,j_k\]

The MM estimator \(\hat{\theta}_N\) equates population moments to sample moments: \(\mu_j(\hat{\theta}_N)=(1/N)\sum_i y_i^j\) for each \(j=j_1,\ldots,j_k\), a system of \(k\) equations in \(k\) unknowns.
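A two-parameter case may make the general idea concrete. The sketch below assumes, purely for illustration, that \(y_i\) has mean \(\mu\) and variance \(\sigma^2\), so that \(\mathbb{E}(y_i)=\mu\) and \(\mathbb{E}(y_i^2)=\sigma^2+\mu^2\); this example and the name `mm_mean_variance` are not taken from the notes.

```python
import numpy as np

def mm_mean_variance(y):
    """Solve the two moment equations E(y) = mu and E(y^2) = sigma^2 + mu^2."""
    m1 = np.mean(y)        # first sample moment
    m2 = np.mean(y ** 2)   # second sample moment
    mu_hat = m1
    sigma2_hat = m2 - m1 ** 2
    return mu_hat, sigma2_hat

mu_hat, sigma2_hat = mm_mean_variance(np.array([1.0, 2.0, 3.0, 4.0]))
print(mu_hat, sigma2_hat)
```

Here \(k=2\) moments pin down \(k=2\) parameters, so the sample moment equations can be solved exactly.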

15.6 Generalized Method of Moments (GMM)

With more moment conditions than parameters (\(r > k\)), we generally cannot set all \(r\) sample moment conditions exactly to zero at the same time. Instead, we minimize a weighted quadratic criterion:

\[Q(\theta;X_N)=\left[g(\theta,X_N)\right]' W_N \left[g(\theta,X_N)\right]\]

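where \(g(\theta,X_N)=\frac{1}{N}\sum_i h(\theta,w_i)\) is the \(r\times 1\) vector of sample moments computed from the sample \(X_N\), \(W_N\) is an \(r\times r\) positive definite weighting matrix, and the true parameters satisfy \(\mathbb{E}\{h(\theta_0,w_i)\}=0\).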

Hansen’s Formulation

  • \(w_i\): \(h\times 1\) vector of observed variables
  • \(\theta\): \(k\times 1\) vector of unknown parameters
  • \(h(\theta,w_i)\): \(r\times 1\) vector of functions (moment conditions)
  • \(\theta_0\) characterized by \(\mathbb{E}\{h(\theta_0,w_i)\}=0\) (orthogonality conditions)

The GMM estimator \(\hat{\theta}_N\) minimizes \(Q(\theta;X_N)=[g(\theta,X_N)]'W_N[g(\theta,X_N)]\).
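The criterion can be sketched in a few lines for one concrete choice of moment function. The example assumes a linear model with instruments, \(h(\theta,w_i)=z_i(y_i-x_i'\theta)\), which is an illustrative choice rather than the notes' own; the function names and the simulated design are hypothetical.

```python
import numpy as np

def g_bar(theta, y, X, Z):
    """Sample moments g(theta, X_N) = (1/N) sum_i z_i (y_i - x_i' theta), an r-vector."""
    return Z.T @ (y - X @ theta) / len(y)

def gmm_criterion(theta, y, X, Z, W):
    """Quadratic form Q = g' W g for a given r x r weighting matrix W."""
    g = g_bar(theta, y, X, Z)
    return float(g @ W @ g)

# Simulated data (assumed design): r = 3 instruments, k = 1 parameter.
rng = np.random.default_rng(1)
N, theta0 = 5000, np.array([2.0])
Z = rng.normal(size=(N, 3))
u = rng.normal(size=N)
x = Z @ np.array([1.0, 0.5, -0.5]) + 0.5 * u + rng.normal(size=N)  # x is endogenous
X = x[:, None]
y = X @ theta0 + u

W = np.eye(3)
q_true = gmm_criterion(theta0, y, X, Z, W)           # near zero at the true value
q_far = gmm_criterion(np.array([1.0]), y, X, Z, W)   # clearly larger away from it
print(q_true, q_far)
```

Because \(\mathbb{E}\{h(\theta_0,w_i)\}=0\), the sample moments at \(\theta_0\) are close to zero in large samples, so \(Q\) is small there and grows as \(\theta\) moves away.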

Identification

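  • Exact identification (\(k=r\)): set \(g(\hat{\theta}_N;X_N)=0\) and solve the \(r\) equations for the \(k\) unknowns; the choice of \(W_N\) does not affect the estimator.
  • Over-identification (\(r>k\)): more conditions than parameters, so they cannot all hold exactly in the sample; \(W_N\) determines how much weight each condition receives.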

Optimal Weighting Matrix

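The optimal weighting matrix, in the sense of giving the smallest asymptotic variance for \(\hat{\theta}_N\), is \(W_N = \mathbb{S}^{-1}\), where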

\[\mathbb{S}=\lim_{N\rightarrow\infty}N\,\mathbb{E}\{[g(\theta_0;X_N)][g(\theta_0;X_N)]'\}\]

Practical approach (iterative):

  1. Start with \(W^{(0)}=I\).
  2. Obtain \(\hat{\theta}^{(0)}\) by minimizing \(Q\) with \(W^{(0)}\).
  3. Estimate \(\hat{\mathbb{S}}^{(0)}\) using \(\hat{\theta}^{(0)}\).
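  4. Update \(W^{(1)}=(\hat{\mathbb{S}}^{(0)})^{-1}\) and re-estimate to obtain \(\hat{\theta}^{(1)}\) (stopping here gives the two-step GMM estimator).
  5. Repeat steps 3 and 4 until \(\hat{\theta}\) converges (iterated GMM).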
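The steps above can be sketched for the linear instrumental-variables moment function \(h(\theta,w_i)=z_i(y_i-x_i'\theta)\), where the minimizer of \(Q\) has a closed form. This is a minimal sketch under two assumptions not stated in the notes: the moment function is linear, and \(h(\theta_0,w_i)\) is uncorrelated across \(i\), so \(\hat{\mathbb{S}}\) is a simple average of outer products. All function names are hypothetical.

```python
import numpy as np

def linear_gmm(y, X, Z, W):
    """Closed-form minimizer of Q when h = z_i (y_i - x_i' theta)."""
    A = X.T @ Z @ W @ Z.T @ X
    return np.linalg.solve(A, X.T @ Z @ W @ Z.T @ y)

def s_hat(theta, y, X, Z):
    """Estimate S as (1/N) sum_i h_i h_i', assuming no correlation across i."""
    H = Z * (y - X @ theta)[:, None]   # N x r matrix whose rows are h(theta, w_i)'
    return H.T @ H / len(y)

def iterated_gmm(y, X, Z, tol=1e-10, max_iter=100):
    W = np.eye(Z.shape[1])                          # step 1: W = I
    theta = linear_gmm(y, X, Z, W)                  # step 2: first-step estimate
    for _ in range(max_iter):
        W = np.linalg.inv(s_hat(theta, y, X, Z))    # steps 3-4: update W
        theta_new = linear_gmm(y, X, Z, W)
        if np.max(np.abs(theta_new - theta)) < tol: # step 5: check convergence
            return theta_new, W
        theta = theta_new
    return theta, W

# Simulated data (assumed design): true theta = 2, three instruments.
rng = np.random.default_rng(2)
N = 5000
Z = rng.normal(size=(N, 3))
u = rng.normal(size=N)
x = Z @ np.array([1.0, 0.5, -0.5]) + 0.5 * u + rng.normal(size=N)
X, y = x[:, None], 2.0 * x + u

theta_hat, W_hat = iterated_gmm(y, X, Z)
print(theta_hat)
```

For a nonlinear \(h\), the call to `linear_gmm` would be replaced by a numerical minimization of \(Q\); the updating of the weighting matrix is unchanged.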

Asymptotic Distribution

Under regularity conditions (consistency, CLT for \(g(\theta_0,X_N)\), differentiability):

\[\sqrt{N}(\hat{\theta}_N - \theta_0) \xrightarrow{d} \mathcal{N}(0,V)\]

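with \(V=(D\mathbb{S}^{-1}D')^{-1}\) when the optimal weighting matrix is used, where \(D'=\operatorname{plim}\,\partial g(\theta;X_N)/\partial\theta'\,|_{\theta=\hat{\theta}_N}\) is the \(r\times k\) matrix of derivatives of the sample moments.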
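A sample counterpart of \(V\) can be sketched by replacing \(D\) and \(\mathbb{S}\) with estimates. The example again assumes the linear moment function \(h(\theta,w_i)=z_i(y_i-x_i'\theta)\), for which \(\partial g/\partial\theta' = -(1/N)\sum_i z_i x_i'\), and no correlation of \(h\) across observations; neither assumption comes from the notes, and `gmm_avar` is a hypothetical name.

```python
import numpy as np

def gmm_avar(theta_hat, y, X, Z):
    """Estimate V = (D S^{-1} D')^{-1} and standard errors for linear moments."""
    N = len(y)
    H = Z * (y - X @ theta_hat)[:, None]   # rows are h(theta_hat, w_i)'
    S = H.T @ H / N                        # S-hat, assuming no serial correlation
    D_t = -Z.T @ X / N                     # D' = dg/dtheta', an r x k matrix
    V = np.linalg.inv(D_t.T @ np.linalg.solve(S, D_t))
    se = np.sqrt(np.diag(V) / N)           # standard errors of theta_hat
    return V, se

# Simulated data (assumed design): true theta = 2, three instruments.
rng = np.random.default_rng(5)
N = 20_000
Z = rng.normal(size=(N, 3))
u = rng.normal(size=N)
x = Z @ np.array([1.0, 0.5, -0.5]) + 0.5 * u + rng.normal(size=N)
X, y = x[:, None], 2.0 * x + u

# Any consistent estimate can be plugged in; here W = (Z'Z/N)^{-1} is used.
W = np.linalg.inv(Z.T @ Z / N)
theta_hat = np.linalg.solve(X.T @ Z @ W @ Z.T @ X, X.T @ Z @ W @ Z.T @ y)
V_hat, se = gmm_avar(theta_hat, y, X, Z)
print(V_hat, se)
```

In this simulated design the population value is \(V = 1/(1^2+0.5^2+0.5^2)=2/3\), so `V_hat` should be close to that number in large samples.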

Sargan–Hansen \(J\)-test (Over-identification)

Hansen (1982) proposes a test for whether all \(r\) moment conditions hold simultaneously:

\[J_N = N\left[g(\hat{\theta}_N;X_N)\right]'\hat{\mathbb{S}}_N^{-1}\left[g(\hat{\theta}_N;X_N)\right] \xrightarrow{d} \chi^2_{(r-k)}\]

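The null hypothesis is \(H_0\colon \mathbb{E}\{h(\theta_0,w_i)\}=0\). The \(\chi^2_{(r-k)}\) limit holds under \(H_0\) when \(\hat{\theta}_N\) is computed with the optimal weighting matrix, and the test is only available when \(r>k\). Rejecting \(H_0\) indicates that at least one moment condition is violated, in which case the GMM estimator is generally inconsistent for \(\theta_0\); the test does not identify which condition fails.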
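The statistic can be sketched for the linear moment function \(h(\theta,w_i)=z_i(y_i-x_i'\theta)\) using a two-step estimator. The simulated designs, the function names, and the way the invalid instrument is constructed are illustrative assumptions; the critical value 5.991 is the 95th percentile of a \(\chi^2\) distribution with 2 degrees of freedom.

```python
import numpy as np

def j_test(y, X, Z):
    """Two-step GMM for linear moments, returning (J statistic, degrees of freedom)."""
    N, r = Z.shape
    k = X.shape[1]

    def solve(W):
        return np.linalg.solve(X.T @ Z @ W @ Z.T @ X, X.T @ Z @ W @ Z.T @ y)

    theta1 = solve(np.eye(r))               # first step with W = I
    H = Z * (y - X @ theta1)[:, None]
    S = H.T @ H / N                         # S-hat from first-step residuals
    theta2 = solve(np.linalg.inv(S))        # efficient second step
    g = Z.T @ (y - X @ theta2) / N          # sample moments at the estimate
    J = float(N * g @ np.linalg.solve(S, g))
    return J, r - k

def simulate(invalid, seed=3, N=5000):
    """Assumed design: the third instrument enters the error when invalid=True."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(N, 3))
    u = rng.normal(size=N) + (0.5 * Z[:, 2] if invalid else 0.0)
    x = Z @ np.array([1.0, 0.5, -0.5]) + 0.5 * u + rng.normal(size=N)
    return 2.0 * x + u, x[:, None], Z

CRIT_5PCT_DF2 = 5.991  # 5% critical value, chi-square with 2 degrees of freedom
J_valid, df = j_test(*simulate(invalid=False))
J_bad, _ = j_test(*simulate(invalid=True))
print(df, J_valid, J_bad, J_bad > CRIT_5PCT_DF2)
```

With valid instruments, `J_valid` behaves like a draw from \(\chi^2_{(2)}\) and so exceeds the critical value in roughly 5% of samples; with the invalid instrument, `J_bad` grows with \(N\) and the test rejects.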

Special Cases of GMM

Many standard estimators are special cases of GMM:

  • OLS, IV, 2SLS
  • Nonlinear simultaneous equations estimators
  • Many cases of MLE, with the score vector used as the moment function
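The OLS and IV cases can be checked numerically. The sketch below uses the linear moment function \(h(\theta,w_i)=z_i(y_i-x_i'\theta)\): with \(z_i=x_i\) the GMM estimator coincides with OLS, and with as many instruments as regressors it coincides with the simple IV estimator \((Z'X)^{-1}Z'y\), whatever the weighting matrix. The simulated data, the arbitrary \(W\), and the name `linear_gmm` are assumptions for illustration.

```python
import numpy as np

def linear_gmm(y, X, Z, W):
    """Minimizer of g' W g for the moment function h = z_i (y_i - x_i' theta)."""
    A = X.T @ Z @ W @ Z.T @ X
    return np.linalg.solve(A, X.T @ Z @ W @ Z.T @ y)

# Simulated data (assumed design) with a constant and one endogenous regressor.
rng = np.random.default_rng(4)
N = 1000
Z = np.column_stack([np.ones(N), rng.normal(size=N)])   # instruments, r = 2
u = rng.normal(size=N)
x = 0.8 * Z[:, 1] + 0.5 * u + rng.normal(size=N)
X = np.column_stack([np.ones(N), x])                    # regressors, k = 2
y = X @ np.array([1.0, 2.0]) + u

W = np.diag([1.0, 3.0])   # arbitrary positive definite weights

ols_gmm = linear_gmm(y, X, X, W)                 # instruments equal regressors
ols = np.linalg.lstsq(X, y, rcond=None)[0]       # ordinary least squares
iv_gmm = linear_gmm(y, X, Z, W)                  # exactly identified IV
iv = np.linalg.solve(Z.T @ X, Z.T @ y)           # (Z'X)^{-1} Z'y

print(np.allclose(ols_gmm, ols), np.allclose(iv_gmm, iv))
```

In the over-identified case the weighting matrix does matter; choosing \(W_N=(Z'Z/N)^{-1}\) in the same formula yields the 2SLS estimator.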