9 C1. Instrumental Variables

9.1 About

Topics covered:

The endogeneity problem and sources of endogeneity
IV identification conditions: relevance and exogeneity (exclusion restriction)
Two-Stage Least Squares (2SLS) estimator
Weak instruments: F-test and consequences
Sargan-Hansen J test for overidentification

9.2 Lecture Notes

Download lecture notes (PDF)

9.3 Slides

Links: View slides (full screen) · Download slides (PDF)

Note: Slides and notes are pending translation to English.

9.4 Introduction

In the first part of the econometrics sequence, the key assumption is \(\text{Cov}(x,u)=0\). For example, under this assumption OLS consistently estimates the coefficients of the conditional expectation function

\[\mathbb{E}[\log(\text{Wage})_i|X_i]=\beta_0 + \beta_1\,\text{Schooling}_i+\beta_2\,\text{Experience}_i\]

The endogeneity problem arises when \(\text{Cov}(x,u)\neq 0\).

Common sources: omitted variables, measurement error, simultaneity.

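Omitted variable bias. True model: \(Y=X_1\beta_1+X_2\beta_2+u_T\), estimated model: \(Y=X_1\beta_1+u_E\). With a single included regressor \(X_1\) and a single omitted regressor \(X_2\), where \(\text{Cov}\) and \(\text{Var}\) denote sample moments,

\[\mathbb{E}(\hat{\beta}_{1}|X)=\beta_1+\beta_2\frac{\text{Cov}(X_1,X_2)}{\text{Var}(X_1)}\]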
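The bias formula can be checked numerically. The following is a minimal sketch on simulated data; the coefficient values, the sample size, and the helper `ols` are illustrative assumptions, not part of the lecture material. It compares the slope from the short regression (omitting \(X_2\)) with the slope from the long regression plus the bias term.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
beta1, beta2 = 1.0, 2.0                  # hypothetical true coefficients

x2 = rng.normal(size=n)                  # the variable that will be omitted
x1 = 0.5 * x2 + rng.normal(size=n)       # included regressor, correlated with x2
y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)


def ols(y, *cols):
    """OLS with an intercept; returns [const, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]


b_long = ols(y, x1, x2)                  # correctly specified regression
b_short = ols(y, x1)                     # regression that omits x2

# Sample Cov(x1, x2) / Var(x1): the slope from regressing x2 on x1.
delta = np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)

print("short-regression slope:        ", b_short[1])
print("long slope + beta2_hat * delta:", b_long[1] + b_long[2] * delta)
```

In sample, the two printed numbers coincide up to floating-point error, because the decomposition is an algebraic property of least squares. With these assumed parameters the population bias is \(2 \times 0.5/1.25 = 0.8\), so the short-regression slope should be close to 1.8 rather than 1.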

Suppose a randomly assigned variable \(z\) has a causal effect on \(x\). This variable — the instrument — can serve as a natural experiment.

9.5 Conditions for an Instrument

Relevance (\(z\rightarrow x\)): \(\text{Cor}(z,x)\neq 0\).
Exclusion (\(z\rightarrow x \rightarrow y\), \(z\not\rightarrow y\)): \(z\) affects \(y\) only through \(x\), so \(z\) does not appear in the structural equation for \(y\).
Exogeneity (\(u\not\rightarrow z\)): \(\text{Cor}(z,u)=0\).

Examples from the literature:

\(y\)	\(x\)	Unobservable	Instrument \(z\)
wage	schooling	ability	father’s education
wage	schooling	ability	distance to college
wage	schooling	ability	random military assignment
health	smoking	behavior	tobacco tax
armed conflict	GDP growth	simultaneity	rainfall

9.6 Estimation

Exact Identification (\(k = r = 1\): one endogenous regressor, one instrument)

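Taking the covariance of the structural equation \(y=\beta_0+\beta_1x+u\) with \(z\) and using \(\text{Cov}(z,u)=0\) gives \(\text{Cov}(z,y)=\beta_1\text{Cov}(z,x)\), so \(\beta_1=\text{Cov}(z,y)/\text{Cov}(z,x)\). The sample analogue is: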

\[\hat{\beta}_{\text{IV}}=\frac{\sum(z_i-\bar{z})(y_i-\bar{y})}{\sum(z_i-\bar{z})(x_i-\bar{x})}\]

As a method-of-moments estimator: \(\hat{\beta}=(z'x)^{-1}(z'y)\).

Consistency requires: (i) relevance — \(\text{plim}(z'x/N)\neq 0\); (ii) validity — \(\text{plim}(z'u/N)=0\).
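As an illustration of the covariance-ratio estimator, here is a minimal sketch on simulated data. The data-generating process (coefficients 0.8, 0.6, and the structural parameters) is an assumption chosen so that \(x\) is endogenous while \(z\) is relevant and valid by construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
beta0, beta1 = 1.0, 0.5                  # hypothetical structural parameters

z = rng.normal(size=n)                   # instrument, independent of u by construction
u = rng.normal(size=n)                   # structural error
x = 0.8 * z + 0.6 * u + rng.normal(size=n)   # endogenous: Cov(x, u) = 0.6
y = beta0 + beta1 * x + u

# Demean once; the sums of cross-products are then proportional to sample covariances.
zc, xc, yc = z - z.mean(), x - x.mean(), y - y.mean()

beta_ols = (xc @ yc) / (xc @ xc)         # Cov(x, y) / Var(x), inconsistent here
beta_iv = (zc @ yc) / (zc @ xc)          # Cov(z, y) / Cov(z, x)

print("OLS:", beta_ols)
print("IV: ", beta_iv)
```

Under these assumptions the OLS probability limit is \(0.5 + 0.6/2 = 0.8\), while the IV estimate should be close to the true value 0.5.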

2SLS (Two Stage Least Squares)

First stage — recover the exogenous variation in \(x_1\):

\[x_1=\gamma_1 z+x_2\gamma_2+\epsilon \quad\Rightarrow\quad \hat{x}_1\]

Second stage — estimate the structural equation:

\[y=\hat{x}_1\beta_1 + x_2\beta_2+u \quad\Rightarrow\quad \hat{\beta}_1\]

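Notes: (1) The standard errors reported by a manual second-stage OLS are incorrect, because its residuals are computed with \(\hat{x}_1\) instead of the observed \(x_1\); they must be corrected. (2) Check instrument strength via the first-stage \(F\)-statistic on the excluded instruments (rule of thumb: \(F > 10\)), or via the Kleibergen-Paap Wald \(F\) when the errors are not i.i.d.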
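The two stages and both notes can be sketched by hand. This is an illustrative example on simulated data under assumed parameter values, with homoskedastic errors; the first-stage \(F\) computed here is the classical one (the squared \(t\)-statistic of the single excluded instrument), not the Kleibergen-Paap statistic.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000
z = rng.normal(size=n)                   # excluded instrument
x2 = rng.normal(size=n)                  # exogenous control
u = rng.normal(size=n)                   # structural error
x1 = 0.5 * z + 0.3 * x2 + 0.7 * u + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u        # hypothetical structural equation

const = np.ones(n)
Z = np.column_stack([const, z, x2])      # instrument plus exogenous regressors
X = np.column_stack([const, x1, x2])     # structural regressors


def ols(A, b):
    """Least-squares coefficients of b on the columns of A."""
    return np.linalg.lstsq(A, b, rcond=None)[0]


# First stage: regress x1 on the instrument and the control, keep fitted values.
gamma = ols(Z, x1)
x1_hat = Z @ gamma

# First-stage F for the one excluded instrument (squared t-statistic on z).
resid_fs = x1 - x1_hat
sigma2_fs = resid_fs @ resid_fs / (n - Z.shape[1])
se_gamma = np.sqrt(sigma2_fs * np.diag(np.linalg.inv(Z.T @ Z)))
F_first = (gamma[1] / se_gamma[1]) ** 2

# Second stage: OLS of y on the fitted x1 and the control.
X_hat = np.column_stack([const, x1_hat, x2])
beta = ols(X_hat, y)

# Naive residuals use x1_hat; corrected residuals use the observed x1.
XtX_inv = np.linalg.inv(X_hat.T @ X_hat)
res_naive = y - X_hat @ beta
res_corr = y - X @ beta
se_naive = np.sqrt(res_naive @ res_naive / (n - 3) * np.diag(XtX_inv))
se_corr = np.sqrt(res_corr @ res_corr / (n - 3) * np.diag(XtX_inv))

print("2SLS coefficient on x1:", beta[1])
print("first-stage F:", F_first)
print("naive SE:", se_naive[1], "corrected SE:", se_corr[1])
```

The point estimates from the manual second stage are the 2SLS estimates; only the residual variance, and therefore the standard errors, changes once the observed \(x_1\) is used. In applied work a dedicated IV routine would normally handle this correction.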

IV as GMM

The standard IV estimator (\(k\leq r\)):

\[\hat{\beta}_{\text{IV}}=\left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}\left(X'Z(Z'Z)^{-1}Z'y\right)\]

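Asymptotic variance of the efficient GMM estimator (weighting matrix \(\mathbb{S}^{-1}\)): \(V(\hat{\beta})=(1/N)\left((X'Z/N)\,\mathbb{S}^{-1}(Z'X/N)\right)^{-1}\), where \(\mathbb{S}\) is the variance of \(Z'u/\sqrt{N}\). Under homoskedasticity, \(\mathbb{S}=\sigma^2 Z'Z/N\) and this reduces to \(\sigma^2\left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}\), the variance of the estimator above.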
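The matrix formula can be evaluated directly. The sketch below uses simulated data with two instruments for one endogenous regressor (an overidentified case) and assumes homoskedastic errors, so the variance is computed as \(\hat{\sigma}^2\left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}\). The parameter values are illustrative, and the constant is included in both \(X\) and \(Z\).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000
z1 = rng.normal(size=n)                  # first instrument
z2 = rng.normal(size=n)                  # second instrument
u = rng.normal(size=n)                   # structural error
x = 0.6 * z1 + 0.4 * z2 + 0.5 * u + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 1.5 * x + u                    # hypothetical structural equation

X = np.column_stack([np.ones(n), x])         # regressors (constant, x)
Z = np.column_stack([np.ones(n), z1, z2])    # instruments (constant, z1, z2)

# (Z'Z)^{-1} Z'X and (Z'Z)^{-1} Z'y via linear solves instead of explicit inverses.
ZtZ_inv_ZtX = np.linalg.solve(Z.T @ Z, Z.T @ X)
ZtZ_inv_Zty = np.linalg.solve(Z.T @ Z, Z.T @ y)

A = X.T @ Z @ ZtZ_inv_ZtX                # X'Z (Z'Z)^{-1} Z'X
b = X.T @ Z @ ZtZ_inv_Zty                # X'Z (Z'Z)^{-1} Z'y
beta_iv = np.linalg.solve(A, b)

# Homoskedastic variance estimate; residuals use the observed X.
resid = y - X @ beta_iv
sigma2 = resid @ resid / n
V = sigma2 * np.linalg.inv(A)
se = np.sqrt(np.diag(V))

print("beta_IV:", beta_iv)
print("standard errors:", se)
```

With heteroskedastic errors this variance estimate would no longer be appropriate, and a robust estimate of the moment variance would be needed instead.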