11 B4. Multinomial Choice Models
11.1 About
Topics covered:
- Multinomial Logit (MNL): setup, MLE, and marginal effects
- Independence of Irrelevant Alternatives (IIA) property and Hausman-McFadden test
- Conditional Logit with alternative-specific attributes
- Multinomial Probit (simulation-based, relaxes IIA)
- Ordered Probit and Ordered Logit
11.2 Lecture Notes
11.3 Overview
When the outcome \(y_i\) takes more than two unordered values, such as transportation mode (car, bus, train), occupational choice, or brand choice, we need multinomial choice models. These generalize binary Logit/Probit from two alternatives to \(J > 2\) alternatives.
11.4 Multinomial Logit (MNL)
The Multinomial Logit model (McFadden, 1974) specifies, for \(j = 0,1,\ldots,J-1\):
\[P(y_i = j \mid x_i) = \frac{\exp(x_i'\beta_j)}{\sum_{k=0}^{J-1}\exp(x_i'\beta_k)}\]
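Adding the same vector to every \(\beta_j\) leaves all probabilities unchanged, so the coefficients are identified only relative to a base/reference category. One category (typically \(j=0\)) is chosen as the base, with \(\beta_0 = 0\). So: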
\[P(y_i = j \mid x_i) = \frac{\exp(x_i'\beta_j)}{1 + \sum_{k=1}^{J-1}\exp(x_i'\beta_k)}, \quad j \geq 1\]
\[P(y_i = 0 \mid x_i) = \frac{1}{1 + \sum_{k=1}^{J-1}\exp(x_i'\beta_k)}\]
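The two formulas above can be computed in one pass by treating the base category as an alternative with index zero. The following is a minimal sketch in Python with NumPy; the function name `mnl_probs` and the example numbers are illustrative assumptions, not part of the notes.

```python
import numpy as np

def mnl_probs(X, B):
    """MNL choice probabilities with base category j = 0.

    X : (n, p) covariates x_i.
    B : (p, J-1) coefficients, column j-1 holding beta_j for j = 1..J-1.
    Returns an (n, J) array whose column j is P(y_i = j | x_i).
    """
    n = X.shape[0]
    # Index x_i'beta_j; the base category has beta_0 = 0, so its index is 0.
    V = np.hstack([np.zeros((n, 1)), X @ B])
    # Subtracting the row maximum avoids overflow and leaves the
    # probabilities unchanged (it cancels in the ratio).
    V = V - V.max(axis=1, keepdims=True)
    expV = np.exp(V)
    return expV / expV.sum(axis=1, keepdims=True)

# Hypothetical example: 2 individuals, intercept plus one regressor, J = 3.
X = np.array([[1.0, 0.5], [1.0, -1.0]])
B = np.array([[0.2, -0.3], [1.0, 0.5]])
P = mnl_probs(X, B)   # each row sums to one
```

Setting \(J = 2\) in this sketch reproduces the binary Logit probabilities.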
Estimation
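The parameters \(\beta = (\beta_1,\ldots,\beta_{J-1})\) are estimated by maximum likelihood, maximizing the log-likelihood, which is globally concave in \(\beta\): \[\ell(\beta) = \sum_{i=1}^n \sum_{j=0}^{J-1} \mathbf{1}[y_i=j]\,\ln P(y_i=j\mid x_i)\]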
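As an illustration of the estimation step, the sketch below simulates data from an MNL and maximizes the log-likelihood by plain gradient ascent on its score, \(\partial \ell / \partial \beta_j = \sum_i x_i\,(\mathbf{1}[y_i=j] - P(y_i=j\mid x_i))\). The simulated design, the true coefficients, the step size, and the iteration count are all assumptions chosen for the example. In applied work a Newton-type optimizer from a statistics package would normally be used instead.

```python
import numpy as np

def mnl_probs(X, B):
    # (n, J) probabilities with beta_0 = 0 for the base category.
    V = np.hstack([np.zeros((X.shape[0], 1)), X @ B])
    V = V - V.max(axis=1, keepdims=True)
    expV = np.exp(V)
    return expV / expV.sum(axis=1, keepdims=True)

def loglik(B, X, y):
    # Sum over i of ln P(y_i = observed category | x_i).
    P = mnl_probs(X, B)
    return np.log(P[np.arange(len(y)), y]).sum()

def score(B, X, y):
    # Gradient of the log-likelihood with respect to (beta_1..beta_{J-1}).
    P = mnl_probs(X, B)
    D = np.eye(P.shape[1])[y]           # (n, J) one-hot indicators 1[y_i = j]
    return X.T @ (D - P)[:, 1:]         # drop the base-category column

# Simulated data (hypothetical design and coefficients).
rng = np.random.default_rng(0)
n, J = 2000, 3
X = np.column_stack([np.ones(n), rng.normal(size=n)])
B_true = np.array([[0.5, -0.5], [1.0, -1.0]])
cum = mnl_probs(X, B_true).cumsum(axis=1)
u = rng.random(n)
y = (u[:, None] > cum[:, :-1]).sum(axis=1)   # draws in {0, ..., J-1}

# Gradient ascent on the average log-likelihood.
B_hat = np.zeros((X.shape[1], J - 1))
for _ in range(5000):
    B_hat += 0.5 * score(B_hat, X, y) / n
```

Because the objective is concave, the starting value does not matter here; `B_hat` should be close to `B_true` up to sampling error.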
Marginal effects
\[\frac{\partial P(y_i=j\mid x_i)}{\partial x_i} = P(y_i=j\mid x_i)\left[\beta_j - \sum_{k=0}^{J-1}P(y_i=k\mid x_i)\beta_k\right]\]
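Note: because the probabilities sum to one, the marginal effects sum to zero across alternatives. The sign of the marginal effect on \(P(y_i=j\mid x_i)\) need not match the sign of the corresponding element of \(\beta_j\), since it depends on \(\beta_j\) relative to the probability-weighted average of all \(\beta_k\).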
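The marginal-effect formula can be checked numerically. The sketch below evaluates it at a single covariate vector; the helper names and the numbers are illustrative assumptions.

```python
import numpy as np

def mnl_probs_one(x, B_full):
    # Probabilities for one individual; B_full is (p, J) including beta_0 = 0.
    v = x @ B_full
    e = np.exp(v - v.max())
    return e / e.sum()

def mnl_marginal_effects(x, B):
    """Return a (J, p) array whose row j is dP(y = j | x)/dx."""
    B_full = np.column_stack([np.zeros(len(x)), B])   # prepend beta_0 = 0
    p = mnl_probs_one(x, B_full)
    beta_bar = B_full @ p            # probability-weighted average of beta_k
    return p[:, None] * (B_full.T - beta_bar)

# Hypothetical example: two continuous regressors, J = 3.
x = np.array([0.5, -1.0])
B = np.array([[0.2, -0.3], [1.0, 0.5]])
ME = mnl_marginal_effects(x, B)
col_sums = ME.sum(axis=0)            # zero for each regressor, up to rounding
```

The effects are evaluated at a particular \(x_i\); averaging them over the sample gives average marginal effects.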
11.5 Independence of Irrelevant Alternatives (IIA)
The MNL satisfies a key restriction: the ratio of probabilities for any two alternatives depends only on their respective utility parameters, not on other alternatives:
\[\frac{P(y_i=j\mid x_i)}{P(y_i=k\mid x_i)} = \frac{\exp(x_i'\beta_j)}{\exp(x_i'\beta_k)} = \exp\left[x_i'(\beta_j-\beta_k)\right]\]
This is the IIA property (Independence of Irrelevant Alternatives).
Implication: Adding or removing an alternative does not change the relative probabilities of the remaining alternatives. This is restrictive when some alternatives are close substitutes, as the classic “red bus/blue bus” counter-example shows.
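A small numerical version of the red bus/blue bus example, as a sketch with assumed utilities: a commuter is indifferent between car and red bus, and a blue bus identical to the red one is then added. Intuition suggests the two buses should split the bus share, but the logit formula keeps the car/red-bus ratio fixed.

```python
import numpy as np

def logit_probs(v):
    # Logit probabilities from a vector of systematic utilities.
    e = np.exp(v - np.max(v))
    return e / e.sum()

v_car, v_bus = 0.0, 0.0   # hypothetical equal utilities

p_before = logit_probs(np.array([v_car, v_bus]))          # car, red bus
p_after = logit_probs(np.array([v_car, v_bus, v_bus]))    # car, red bus, blue bus

ratio_before = p_before[0] / p_before[1]   # car vs. red bus
ratio_after = p_after[0] / p_after[1]      # unchanged by the new alternative

print(p_before)   # [0.5 0.5]
print(p_after)    # one third each, so the car share falls from 1/2 to 1/3
```

A more plausible outcome would leave the car share near 1/2 with each bus near 1/4, which the logit form cannot produce.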
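Testing IIA: the Hausman-McFadden test (1984) estimates the model on all \(J\) alternatives and again on a restricted subset of alternatives, then compares the coefficients common to both. Under IIA the two sets of estimates should differ only by sampling error.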
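In the usual Hausman form, the comparison is based on a quadratic form in the difference between the two estimates. The notation here is an assumption: \(\hat\beta_s\) and \(\hat V_s\) are the estimate and its estimated covariance matrix from the restricted subset, and \(\hat\beta_f\), \(\hat V_f\) are the corresponding elements from the full choice set.

\[HM = (\hat\beta_s - \hat\beta_f)'\left[\hat V_s - \hat V_f\right]^{-1}(\hat\beta_s - \hat\beta_f) \;\overset{a}{\sim}\; \chi^2_q \quad \text{under } H_0\text{: IIA holds}\]

where \(q\) is the number of coefficients compared. Large values of \(HM\) are evidence against IIA. In finite samples \(\hat V_s - \hat V_f\) need not be positive definite, so the statistic can be negative, which is usually read as no evidence against IIA.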
11.6 Conditional Logit
In the Conditional Logit model (McFadden, 1974), covariates vary across alternatives rather than (or in addition to) across individuals. Let \(z_{ij}\) be the attributes of alternative \(j\) for individual \(i\):
\[P(y_i = j \mid z_{i0},\ldots,z_{i,J-1}) = \frac{\exp(z_{ij}'\gamma)}{\sum_{k=0}^{J-1}\exp(z_{ik}'\gamma)}\]
Applications: travel demand (travel time, cost by mode), hedonic models.
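A minimal sketch of the conditional logit probabilities for a travel-mode example. The attribute values, the coefficient vector, and the function name `clogit_probs` are assumptions made for illustration.

```python
import numpy as np

def clogit_probs(Z, gamma):
    """Conditional logit probabilities.

    Z     : (n, J, q) array, Z[i, j] holding the attributes z_ij.
    gamma : (q,) common coefficient vector.
    Returns an (n, J) array whose column j is P(y_i = j | z_i0, ..., z_i,J-1).
    """
    V = Z @ gamma                               # (n, J) indices z_ij'gamma
    V = V - V.max(axis=1, keepdims=True)        # numerical stability
    expV = np.exp(V)
    return expV / expV.sum(axis=1, keepdims=True)

# One traveler, modes (car, bus, train), attributes (time in hours, cost).
Z = np.array([[[0.5, 4.0],
               [1.0, 2.0],
               [0.8, 3.0]]])
gamma = np.array([-2.0, -0.5])                  # hypothetical disutilities
P = clogit_probs(Z, gamma)

# Adding the same amount to every alternative's attributes changes nothing.
P_shifted = clogit_probs(Z + np.array([1.0, 10.0]), gamma)
```

The last line illustrates why a regressor that does not vary across alternatives cannot enter with a common \(\gamma\): it cancels from the ratio. Individual-specific covariates need alternative-specific coefficients, as in the MNL.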
The conditional logit also satisfies IIA. A mixed or nested logit relaxes this.
11.7 Multinomial Probit
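The Multinomial Probit model assumes the utility errors \((\varepsilon_{i0},\ldots,\varepsilon_{i,J-1})\) are jointly normally distributed with covariance matrix \(\Sigma\), left unrestricted apart from the location and scale normalizations needed for identification. This:
- Does not impose IIA
- Allows correlated and heteroskedastic errors across alternatives, so substitution patterns are much more flexible than in MNL
- Requires numerical simulation (e.g., the GHK simulator) for the likelihood, since each choice probability is a \((J-1)\)-dimensional normal integral over utility differences with no closed form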
In practice, MNL or nested logit are often preferred for computational tractability.
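To see how correlated errors relax IIA, the sketch below approximates multinomial probit probabilities by a crude frequency simulator: draw the errors, pick the utility-maximizing alternative, and count. This is not the GHK simulator, which is smoother and more efficient; it is used here only because it is short. The utilities and covariance matrices are assumptions.

```python
import numpy as np

def mnp_probs_sim(v, Sigma, n_draws=200_000, seed=0):
    """Frequency-simulated probit choice probabilities.

    v     : (J,) systematic utilities.
    Sigma : (J, J) covariance matrix of the utility errors.
    """
    rng = np.random.default_rng(seed)
    eps = rng.multivariate_normal(np.zeros(len(v)), Sigma, size=n_draws)
    choice = np.argmax(v + eps, axis=1)          # utility-maximizing alternative
    return np.bincount(choice, minlength=len(v)) / n_draws

v = np.zeros(3)                                  # car, red bus, blue bus

# Independent errors: shares are close to one third each.
p_indep = mnp_probs_sim(v, np.eye(3))

# Highly correlated bus errors: the two buses are near-perfect substitutes.
Sigma_corr = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.99],
                       [0.0, 0.99, 1.0]])
p_corr = mnp_probs_sim(v, Sigma_corr)
```

With the correlated covariance matrix the car share moves toward one half and the two buses split most of the rest, which is the pattern the logit models cannot reproduce.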
11.8 Ordered Choice Models
When alternatives have a natural ordering (e.g., education level, credit rating, survey scale) but the gaps between categories are unknown, ordered models are appropriate.
Define a latent variable: \[y_i^* = x_i'\beta + \varepsilon_i\]
with cutpoints \(-\infty = \mu_0 < \mu_1 < \cdots < \mu_{J-1} < \mu_J = \infty\). The observed outcome \(y_i \in \{1,\ldots,J\}\) is: \[y_i = j \iff \mu_{j-1} < y_i^* \leq \mu_j\]
Ordered Probit / Ordered Logit
\[P(y_i = j \mid x_i) = F(\mu_j - x_i'\beta) - F(\mu_{j-1} - x_i'\beta)\]
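where \(F = \Phi\) (Ordered Probit) or \(F = \Lambda\) (Ordered Logit). Both \(\beta\) and \(\{\mu_j\}\) are estimated jointly by MLE. For identification, \(x_i\) excludes a constant (or, equivalently, one cutpoint is normalized), since an intercept cannot be separated from the cutpoints.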
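The probability formula translates directly into differences of the CDF evaluated at consecutive cutpoints. The sketch below is illustrative; the function names and numbers are assumptions, and \(\Phi\) is built from the standard library `math.erf` to avoid extra dependencies.

```python
import numpy as np
from math import erf, sqrt

def std_normal_cdf(z):
    # Phi(z), applied elementwise; math.erf handles +/- infinity.
    return 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))

def logistic_cdf(z):
    # Lambda(z) = 1 / (1 + exp(-z)).
    return 1.0 / (1.0 + np.exp(-z))

def ordered_probs(X, beta, cuts, F):
    """Ordered-choice probabilities.

    X    : (n, p) covariates without a constant.
    beta : (p,) coefficients.
    cuts : interior cutpoints mu_1 < ... < mu_{J-1}.
    F    : CDF of the latent error (std_normal_cdf or logistic_cdf).
    Returns an (n, J) array; column j-1 is P(y_i = j | x_i), j = 1..J.
    """
    index = X @ beta
    mu = np.concatenate([[-np.inf], cuts, [np.inf]])   # mu_0, ..., mu_J
    cdf = F(mu[None, :] - index[:, None])              # (n, J+1)
    return np.diff(cdf, axis=1)                        # F(mu_j - xb) - F(mu_{j-1} - xb)

# Hypothetical example: one regressor, J = 3 categories.
X = np.array([[0.0], [1.0]])
beta = np.array([1.0])
cuts = [-1.0, 1.0]
P_probit = ordered_probs(X, beta, cuts, std_normal_cdf)
P_logit = ordered_probs(X, beta, cuts, logistic_cdf)
```

Raising \(x_i'\beta\) from 0 to 1 in this example shifts probability mass from the lowest category toward the highest.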
Key constraint: a single coefficient vector \(\beta\) applies to all outcome categories (a single index \(x_i'\beta\) drives the probability of higher outcomes), unlike the MNL, which has a separate \(\beta_j\) for each alternative.
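Differentiating the probability formula with respect to \(x_i\) shows what the single index implies for marginal effects. Here \(f = F'\) denotes the density of \(\varepsilon_i\), a symbol introduced only for this derivation:

\[\frac{\partial P(y_i = j \mid x_i)}{\partial x_i} = \left[f(\mu_{j-1} - x_i'\beta) - f(\mu_j - x_i'\beta)\right]\beta\]

For the lowest category (\(j=1\)) the first density is zero, so the effect has the opposite sign to \(\beta\). For the highest category (\(j=J\)) the second density is zero, so the effect has the same sign as \(\beta\). For intermediate categories the sign depends on which density is larger and can go either way.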
11.9 References
Cameron and Trivedi (2005), chapters 15–16. Davidson and MacKinnon (2004).