9  C1. Instrumental Variables

9.1 About

Topics covered:

  • The endogeneity problem and sources of endogeneity
  • IV identification conditions: relevance and exogeneity (exclusion restriction)
  • Two-Stage Least Squares (2SLS) estimator
  • Weak instruments: F-test and consequences
  • Sargan-Hansen J test for overidentification

9.2 Lecture Notes

 

9.3 Slides

Click on the slide and use the keyboard arrows to navigate.

 

Note: Slides and notes are pending translation to English.

 


9.4 Introduction

In the first part of the econometrics sequence, the key assumption is \(\text{Cov}(x,u)=0\). For example, under this assumption OLS consistently estimates the coefficients of the conditional expectation function

\[\mathbb{E}[\log(\text{Wage})_i|X_i]=\beta_0 + \beta_1\,\text{Schooling}_i+\beta_2\,\text{Experience}_i\]

The endogeneity problem arises when \(\text{Cov}(x,u)\neq 0\).

DAG: endogeneity

Common sources: omitted variables, measurement error, simultaneity.

Omitted variable bias. True model: \(Y=X_1\beta_1+X_2\beta_2+u_T\), estimated model: \(Y=X_1\beta_1+u_E\). Then

\[\mathbb{E}(\hat{\beta}_{1}|X)=\beta_1+\beta_2\frac{\text{Cov}(X_1,X_2)}{\text{Var}(X_1)}\]
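
The bias formula can be checked numerically. The following is a minimal sketch on simulated data, not part of the original notes: the coefficient values, the sample size, and the way \(X_1\) is made to depend on \(X_2\) are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta1, beta2 = 1.0, 2.0                 # assumed true coefficients

x2 = rng.normal(size=n)                 # regressor that will be omitted
x1 = 0.5 * x2 + rng.normal(size=n)      # included regressor, correlated with x2
y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)

def slope(x, y):
    """OLS slope from regressing y on x and a constant."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

# Short regression: y on x1 only, so x2 ends up in the error term
b_short = slope(x1, y)

# Value implied by the omitted variable bias formula
predicted = beta1 + beta2 * np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)

print(f"short-regression slope: {b_short:.3f}")
print(f"formula prediction:     {predicted:.3f}")
```

Under these assumed values the population bias is \(2 \times 0.5/1.25 = 0.8\), so both numbers should be close to 1.8 rather than to the true \(\beta_1 = 1\).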

Suppose a randomly assigned variable \(z\) has a causal effect on \(x\). This variable, the instrument, can serve as a natural experiment: it shifts \(x\) for reasons unrelated to the unobserved determinants of \(y\).

9.5 Conditions for an Instrument

  • Relevance (\(z\rightarrow x\)): \(\text{Cor}(z,x)\neq 0\).
  • Exclusion (\(z\rightarrow x \rightarrow y\), \(z\not\rightarrow y\)): \(\text{Cor}(z,y|x)=0\).
  • Exogeneity (\(u\not\rightarrow z\)): \(\text{Cor}(z,u)=0\).

DAG: IV conditions

Examples from the literature:

| \(y\) | \(x\) | Unobservable | Instrument \(z\) |
|---|---|---|---|
| wage | schooling | ability | father’s education |
| wage | schooling | ability | distance to college |
| wage | schooling | ability | random military assignment |
| health | smoking | behavior | tobacco tax |
| armed conflict | GDP growth | simultaneity | rainfall |

9.6 Estimation

Exact Identification (\(k = r = 1\))

Using \(\text{Cov}(z,y)=\beta_1\text{Cov}(z,x)\):

\[\hat{\beta}_{\text{IV}}=\frac{\sum(z_i-\bar{z})(y_i-\bar{y})}{\sum(z_i-\bar{z})(x_i-\bar{x})}\]
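
The moment condition used above can be derived in two lines. As a sketch, assume the structural equation \(y=\beta_0+\beta_1 x+u\) (the intercept \(\beta_0\) is notation introduced here for illustration):

\[\text{Cov}(z,y)=\text{Cov}(z,\beta_0+\beta_1 x+u)=\beta_1\,\text{Cov}(z,x)+\text{Cov}(z,u)\]

Exogeneity sets \(\text{Cov}(z,u)=0\), and relevance guarantees \(\text{Cov}(z,x)\neq 0\), so

\[\beta_1=\frac{\text{Cov}(z,y)}{\text{Cov}(z,x)}\]

Replacing the population covariances with their sample analogues gives the estimator above.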

As a method-of-moments estimator: \(\hat{\beta}=(z'x)^{-1}(z'y)\).

Consistency requires: (i) relevance, \(\text{plim}(z'x/N)\neq 0\); (ii) validity, \(\text{plim}(z'u/N)=0\).
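
To see the two conditions at work, the sketch below compares OLS with the simple IV estimator on simulated data. The data-generating process is an assumption made for illustration: a hypothetical unobserved `ability` variable enters both \(x\) and the error \(u\), while \(z\) is drawn independently of it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta1 = 1.0                               # assumed true effect of x on y

ability = rng.normal(size=n)              # unobserved confounder
z = rng.normal(size=n)                    # instrument, independent of ability
x = 0.8 * z + ability + rng.normal(size=n)
u = 2.0 * ability + rng.normal(size=n)    # error term, correlated with x
y = 0.5 + beta1 * x + u

def cov(a, b):
    """Sample covariance between two vectors."""
    return np.cov(a, b)[0, 1]

beta_ols = cov(x, y) / np.var(x, ddof=1)  # inconsistent because Cov(x, u) != 0
beta_iv = cov(z, y) / cov(z, x)           # ratio-of-covariances IV estimator

print(f"OLS: {beta_ols:.3f}")
print(f"IV:  {beta_iv:.3f}")
```

In this setup OLS should overstate the effect, since `ability` raises both \(x\) and \(y\), while the IV estimate should land near the assumed value of 1.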

2SLS (Two Stage Least Squares)

First stage — recover the exogenous variation in \(x_1\):

\[x_1=\gamma_1 z+x_2\gamma_2+\epsilon \quad\Rightarrow\quad \hat{x}_1\]

Second stage — estimate the structural equation:

\[y=\hat{x}_1\beta_1 + x_2\beta_2+u \quad\Rightarrow\quad \hat{\beta}_1\]

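Notes: (1) Standard errors from a manually run second stage must be corrected, because its residuals are computed with the fitted values \(\hat{x}_1\) rather than the observed \(x_1\). (2) Check instrument strength via the first-stage \(F\)-statistic on the excluded instrument (rule of thumb: \(F > 10\)) or the Kleibergen-Paap Wald \(F\).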
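
The two stages and both notes can be sketched with plain NumPy. This is an illustrative implementation under assumed homoskedastic errors and simulated data, not the routine used in the course. All variable names and coefficient values are hypothetical. With a single excluded instrument, the first-stage \(F\) equals the squared \(t\)-statistic on \(z\).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
ability = rng.normal(size=n)              # unobserved confounder
z = rng.normal(size=n)                    # excluded instrument
x2 = rng.normal(size=n)                   # exogenous control
x1 = 0.8 * z + 0.5 * x2 + ability + rng.normal(size=n)
y = 1.0 * x1 + 0.3 * x2 + 2.0 * ability + rng.normal(size=n)

const = np.ones(n)
Z = np.column_stack([const, z, x2])       # first-stage regressors
X = np.column_stack([const, x1, x2])      # structural regressors

def ols(A, b):
    """Least-squares coefficients of b on the columns of A."""
    return np.linalg.lstsq(A, b, rcond=None)[0]

# First stage: project x1 on the instrument and the exogenous control
gamma = ols(Z, x1)
x1_hat = Z @ gamma

# First-stage F on the single excluded instrument (F = t^2)
e1 = x1 - x1_hat
s2_first = e1 @ e1 / (n - Z.shape[1])
var_gamma = s2_first * np.linalg.inv(Z.T @ Z)
F_first = gamma[1] ** 2 / var_gamma[1, 1]

# Second stage: replace x1 with its fitted values
X_hat = np.column_stack([const, x1_hat, x2])
beta = ols(X_hat, y)

# Naive residuals use x1_hat; corrected residuals use the observed x1
resid_naive = y - X_hat @ beta
resid_correct = y - X @ beta
dof = n - X.shape[1]
XhXh_inv_diag = np.diag(np.linalg.inv(X_hat.T @ X_hat))
se_naive = np.sqrt(resid_naive @ resid_naive / dof * XhXh_inv_diag)
se_correct = np.sqrt(resid_correct @ resid_correct / dof * XhXh_inv_diag)

print(f"first-stage F: {F_first:.1f}")
print(f"beta_1: {beta[1]:.3f}  naive SE: {se_naive[1]:.4f}  corrected SE: {se_correct[1]:.4f}")
```

The point estimate is the same either way; only the residual variance changes. In applied work a dedicated IV routine would normally handle this correction, along with heteroskedasticity-robust variants.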

IV as GMM

The standard IV estimator (\(k\leq r\)):

\[\hat{\beta}_{\text{IV}}=\left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}\left(X'Z(Z'Z)^{-1}Z'y\right)\]

Asymptotic variance: \(V(\beta)=(1/N)\left(X'Z\mathbb{S}^{-1}Z'X\right)^{-1}\).
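
The matrix formula for \(\hat{\beta}_{\text{IV}}\) can be evaluated directly. The sketch below does so for an overidentified case with one endogenous regressor and two instruments, and checks that the result matches regressing \(y\) on the first-stage fitted values. The simulated data and coefficient values are assumptions for illustration, and the variance formula is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
ability = rng.normal(size=n)              # unobserved confounder
z1 = rng.normal(size=n)                   # first instrument
z2 = rng.normal(size=n)                   # second instrument
x = 0.6 * z1 + 0.4 * z2 + ability + rng.normal(size=n)
y = 2.0 + 1.5 * x + 2.0 * ability + rng.normal(size=n)

const = np.ones(n)
X = np.column_stack([const, x])           # N x 2 regressor matrix
Z = np.column_stack([const, z1, z2])      # N x 3 instrument matrix

# (Z'Z)^{-1} Z'X and (Z'Z)^{-1} Z'y, computed without explicit inverses
ZtZ_inv_ZtX = np.linalg.solve(Z.T @ Z, Z.T @ X)
ZtZ_inv_Zty = np.linalg.solve(Z.T @ Z, Z.T @ y)

# beta = (X'Z (Z'Z)^{-1} Z'X)^{-1} (X'Z (Z'Z)^{-1} Z'y)
A = X.T @ Z @ ZtZ_inv_ZtX
b = X.T @ Z @ ZtZ_inv_Zty
beta_iv = np.linalg.solve(A, b)

# Equivalent two-step route: regress y on the fitted values Z (Z'Z)^{-1} Z'X
X_hat = Z @ ZtZ_inv_ZtX
beta_two_step = np.linalg.lstsq(X_hat, y, rcond=None)[0]

print("matrix formula:", np.round(beta_iv, 3))
print("two-step OLS:  ", np.round(beta_two_step, 3))
```

Using `np.linalg.solve` instead of forming \((Z'Z)^{-1}\) explicitly is a numerical-stability choice and does not change the estimator.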