11  B4. Multinomial Choice Models

11.1 About

Topics covered:

  • Multinomial Logit (MNL): setup, MLE, and marginal effects
  • Independence of Irrelevant Alternatives (IIA) property and Hausman-McFadden test
  • Conditional Logit with alternative-specific attributes
  • Multinomial Probit (simulation-based, relaxes IIA)
  • Ordered Probit and Ordered Logit

11.2 Lecture Notes

 


11.3 Overview

When the outcome \(y_i\) takes more than two unordered values (e.g., transportation mode such as car, bus, or train; occupational choice; brand choice), we need multinomial choice models. These generalize binary Logit/Probit, which is the \(J = 2\) case, to \(J > 2\) alternatives.


11.4 Multinomial Logit (MNL)

The Multinomial Logit model (McFadden, 1974) specifies, for \(j = 0,1,\ldots,J-1\):

\[P(y_i = j \mid x_i) = \frac{\exp(x_i'\beta_j)}{\sum_{k=0}^{J-1}\exp(x_i'\beta_k)}\]

One category (typically \(j=0\)) is the base/reference category, with \(\beta_0 = 0\) for identification. So:

\[P(y_i = j \mid x_i) = \frac{\exp(x_i'\beta_j)}{1 + \sum_{k=1}^{J-1}\exp(x_i'\beta_k)}, \quad j \geq 1\]

\[P(y_i = 0 \mid x_i) = \frac{1}{1 + \sum_{k=1}^{J-1}\exp(x_i'\beta_k)}\]
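The two formulas above can be evaluated directly. The following is a minimal sketch, not a reference implementation: the function name `mnl_probs` and the numerical values are hypothetical, and the covariate vector is assumed to include a leading 1 for the intercept.

```python
import math

def mnl_probs(x, betas):
    """Return [P(y=0|x), ..., P(y=J-1|x)] for the multinomial logit.

    x     : list of K covariate values (include 1.0 for the intercept)
    betas : list of J-1 coefficient vectors for alternatives 1..J-1;
            the base alternative 0 has beta_0 = 0 by normalization.
    """
    # Linear indices x'beta_j, with 0.0 for the base category.
    v = [0.0] + [sum(xk * bk for xk, bk in zip(x, b)) for b in betas]
    # Subtracting the max before exponentiating avoids overflow and
    # leaves the probabilities unchanged.
    v_max = max(v)
    expv = [math.exp(vj - v_max) for vj in v]
    total = sum(expv)
    return [e / total for e in expv]

# Hypothetical example: intercept plus one covariate, J = 3 alternatives.
x = [1.0, 2.0]
betas = [[0.5, -0.2], [-1.0, 0.4]]
p = mnl_probs(x, betas)
print([round(pj, 4) for pj in p])
```

The probabilities sum to one by construction, and the odds of alternative \(j\) against the base equal \(\exp(x_i'\beta_j)\).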

Estimation

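The parameters \(\beta_1,\ldots,\beta_{J-1}\) are estimated by maximum likelihood, i.e., by maximizing the log-likelihood: \[\ell(\beta) = \sum_{i=1}^n \sum_{j=0}^{J-1} \mathbf{1}[y_i=j]\,\ln P(y_i=j\mid x_i)\]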
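To make the estimation step concrete, here is a rough sketch that maximizes this log-likelihood by plain gradient ascent on simulated data. The data-generating values, step size, and iteration count are all assumptions chosen for illustration; applied work would normally use a packaged Newton-type optimizer. The gradient with respect to \(\beta_j\) is \(\sum_i (\mathbf{1}[y_i=j] - P(y_i=j\mid x_i))\,x_i\).

```python
import math
import random

def mnl_probs(x, betas):
    """MNL probabilities with beta_0 = 0 for the base alternative."""
    v = [0.0] + [sum(a * b for a, b in zip(x, bj)) for bj in betas]
    m = max(v)                                  # guard against overflow
    e = [math.exp(vj - m) for vj in v]
    s = sum(e)
    return [ej / s for ej in e]

def loglik(X, y, betas):
    """Sum over observations of ln P(y_i | x_i)."""
    return sum(math.log(mnl_probs(x, betas)[yi]) for x, yi in zip(X, y))

def fit_mnl(X, y, J, steps=300, lr=0.5):
    """Gradient ascent on the average log-likelihood (illustrative only)."""
    n, K = len(X), len(X[0])
    betas = [[0.0] * K for _ in range(J - 1)]
    for _ in range(steps):
        grad = [[0.0] * K for _ in range(J - 1)]
        for x, yi in zip(X, y):
            p = mnl_probs(x, betas)
            for j in range(1, J):
                # Score contribution: (1[y_i = j] - P_ij) * x_i
                resid = (1.0 if yi == j else 0.0) - p[j]
                for k in range(K):
                    grad[j - 1][k] += resid * x[k]
        betas = [[betas[j][k] + lr * grad[j][k] / n for k in range(K)]
                 for j in range(J - 1)]
    return betas

# Simulated data; every number here is hypothetical.
rng = random.Random(0)
true_betas = [[0.5, 1.0], [-0.5, -1.0]]
X = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(200)]
y = [rng.choices(range(3), weights=mnl_probs(x, true_betas))[0] for x in X]

beta_hat = fit_mnl(X, y, J=3)
ll_zero = loglik(X, y, [[0.0, 0.0], [0.0, 0.0]])
ll_fit = loglik(X, y, beta_hat)
print("estimates:", [[round(b, 2) for b in bj] for bj in beta_hat])
print("log-likelihood at zero / at fit:", round(ll_zero, 2), round(ll_fit, 2))
```

Because the MNL log-likelihood is globally concave, a simple ascent scheme like this one moves toward the unique maximizer whenever one exists.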

Marginal effects

\[\frac{\partial P(y_i=j\mid x_i)}{\partial x_i} = P(y_i=j\mid x_i)\left[\beta_j - \sum_{k=0}^{J-1}P(y_i=k\mid x_i)\beta_k\right]\]

Note: marginal effects sum to zero across alternatives.
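The marginal-effect formula and the sum-to-zero property can be checked numerically. This is a small sketch with hypothetical coefficient values; `betas_full` is an assumed name for the list of all \(J\) coefficient vectors, including \(\beta_0 = 0\).

```python
import math

def mnl_probs(x, betas_full):
    """MNL probabilities; betas_full lists all J vectors, with beta_0 = 0."""
    v = [sum(a * b for a, b in zip(x, bj)) for bj in betas_full]
    m = max(v)
    e = [math.exp(vj - m) for vj in v]
    s = sum(e)
    return [ej / s for ej in e]

def marginal_effects(x, betas_full):
    """Row j holds dP(y=j|x)/dx = P_j * (beta_j - sum_k P_k beta_k)."""
    p = mnl_probs(x, betas_full)
    K = len(x)
    # Probability-weighted average coefficient vector.
    beta_bar = [sum(pj * bj[k] for pj, bj in zip(p, betas_full))
                for k in range(K)]
    return [[pj * (bj[k] - beta_bar[k]) for k in range(K)]
            for pj, bj in zip(p, betas_full)]

# Hypothetical values: intercept plus one covariate, J = 3.
x = [1.0, 2.0]
betas_full = [[0.0, 0.0], [0.5, 0.3], [-1.0, 0.8]]
me = marginal_effects(x, betas_full)

# Summing over alternatives, covariate by covariate, gives zero.
col_sums = [sum(row[k] for row in me) for k in range(len(x))]
print(col_sums)
print("effect of covariate 2 on alternative 1:", round(me[1][1], 4))
```

In this example the second covariate has a positive coefficient for alternative 1 but a negative marginal effect on its probability, because the coefficient lies below the probability-weighted average. The sign of \(\beta_j\) alone therefore does not determine the sign of the marginal effect in the MNL.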


11.5 Independence of Irrelevant Alternatives (IIA)

The MNL satisfies a key restriction: the ratio of probabilities for any two alternatives depends only on their respective utility parameters, not on other alternatives:

\[\frac{P(y_i=j\mid x_i)}{P(y_i=k\mid x_i)} = \frac{\exp(x_i'\beta_j)}{\exp(x_i'\beta_k)} = \exp\left[x_i'(\beta_j-\beta_k)\right]\]

This is the IIA property (Independence of Irrelevant Alternatives).

Implication: Adding or removing an alternative does not change the relative probabilities of the remaining alternatives. This is restrictive — the classic “red bus/blue bus” counter-example shows it can be implausible.
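The red bus/blue bus example can be worked through with a few lines of arithmetic. In this sketch, with hypothetical equal utilities, commuters are indifferent between car and bus; a second bus that differs only in color is then added to the choice set.

```python
import math

def logit_shares(utilities):
    """Logit choice probabilities for a dict of systematic utilities."""
    expv = {name: math.exp(v) for name, v in utilities.items()}
    total = sum(expv.values())
    return {name: e / total for name, e in expv.items()}

# Two alternatives with equal utility: each gets probability 1/2.
two = logit_shares({"car": 0.0, "blue bus": 0.0})

# Add a red bus identical to the blue bus in everything but color.
three = logit_shares({"car": 0.0, "blue bus": 0.0, "red bus": 0.0})

print(two)
print(three)
print("car / blue bus odds:", two["car"] / two["blue bus"],
      three["car"] / three["blue bus"])
```

The logit model keeps the car-to-blue-bus odds fixed, so it assigns one third to each alternative and the car share falls from 1/2 to 1/3. Intuition suggests the two buses should instead split the original bus share, giving roughly 1/2, 1/4, 1/4.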

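Testing IIA: the Hausman-McFadden (1984) test estimates the model twice, once on all \(J\) alternatives and once on a subset of them, and compares the coefficients. Under IIA, the two sets of estimates should differ only by sampling error.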
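The comparison is usually summarized in a Hausman-type quadratic form. The notation below is introduced here for illustration and does not appear elsewhere in these notes: \(\hat\beta_S\) and \(\hat V_S\) are the estimates and their estimated covariance matrix from the subset of alternatives, and \(\hat\beta_F\), \(\hat V_F\) are the corresponding elements from the full choice set.

\[HM = (\hat\beta_S - \hat\beta_F)'\left[\hat V_S - \hat V_F\right]^{-1}(\hat\beta_S - \hat\beta_F) \;\overset{a}{\sim}\; \chi^2(q)\]

Here \(q\) is the number of coefficients being compared, and large values of \(HM\) are evidence against IIA. In finite samples \(\hat V_S - \hat V_F\) need not be positive definite, in which case the statistic can be negative and is hard to interpret.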


11.6 Conditional Logit

In the Conditional Logit model (McFadden, 1974), covariates vary across alternatives rather than (or in addition to) across individuals. Let \(z_{ij}\) be the attributes of alternative \(j\) for individual \(i\):

\[P(y_i = j \mid z_{i0},\ldots,z_{i,J-1}) = \frac{\exp(z_{ij}'\gamma)}{\sum_{k=0}^{J-1}\exp(z_{ik}'\gamma)}\]
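A minimal sketch of this formula for a single individual follows. The attributes (travel time in hours, cost in dollars), the values of \(\gamma\), and the function name `clogit_probs` are all hypothetical.

```python
import math

def clogit_probs(Z, gamma):
    """Conditional logit probabilities for one individual.

    Z     : list of J attribute vectors z_ij, one per alternative
    gamma : single coefficient vector shared by all alternatives
    """
    v = [sum(z * g for z, g in zip(zj, gamma)) for zj in Z]
    m = max(v)                       # stabilize the exponentials
    e = [math.exp(vj - m) for vj in v]
    s = sum(e)
    return [ej / s for ej in e]

# Hypothetical attributes: (travel time in hours, cost in dollars).
Z = [[0.50, 4.0],    # car
     [1.00, 2.0],    # bus
     [0.75, 3.0]]    # train
gamma = [-2.0, -0.3]  # assumed disutility of time and of cost

p = clogit_probs(Z, gamma)
print([round(pj, 4) for pj in p])

# Adding the same amount to an attribute for every alternative
# leaves the probabilities unchanged: only differences matter.
Z_shifted = [[t + 0.25, c] for t, c in Z]
p_shifted = clogit_probs(Z_shifted, gamma)
```

Unlike the MNL, there is one coefficient vector \(\gamma\) rather than one per alternative. Since only differences across alternatives matter, a variable that does not vary over \(j\) drops out unless it is interacted with alternative-specific dummies.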

Applications: travel demand (travel time, cost by mode), hedonic models.

The conditional logit also satisfies IIA. A mixed or nested logit relaxes this.


11.7 Multinomial Probit

The Multinomial Probit model assumes the utility errors \((\varepsilon_{i0},\ldots,\varepsilon_{i,J-1})\) are jointly normally distributed with covariance matrix \(\Sigma\), which is left unrestricted apart from the location and scale normalizations needed for identification. This:

  • Does not impose IIA
  • Is much more flexible than MNL
  • Generally requires numerical simulation (e.g., the GHK simulator) for the likelihood, since each choice probability is a \((J-1)\)-dimensional normal integral over utility differences

In practice, MNL or nested logit are often preferred for computational tractability.
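To illustrate how correlated normal errors break IIA, the sketch below approximates probit choice probabilities with a crude frequency simulator: draw the errors, record which alternative has the highest utility, and average. This is deliberately not the GHK simulator, which is smoother and more efficient. The utilities, the correlation value, and the function name `probit_shares` are assumptions made for the example.

```python
import math
import random

def probit_shares(v, chol, draws=20000, seed=0):
    """Share of draws in which each alternative has the highest utility.

    v    : systematic utilities, one per alternative
    chol : lower-triangular Cholesky factor L of Sigma (Sigma = L L')
    """
    rng = random.Random(seed)
    J = len(v)
    wins = [0] * J
    for _ in range(draws):
        eta = [rng.gauss(0.0, 1.0) for _ in range(J)]
        # Correlated errors: epsilon = L * eta
        u = [v[j] + sum(chol[j][k] * eta[k] for k in range(j + 1))
             for j in range(J)]
        wins[u.index(max(u))] += 1
    return [w / draws for w in wins]

# Alternatives: car, red bus, blue bus, all with equal systematic utility.
v = [0.0, 0.0, 0.0]

# Independent errors: behaves much like the logit, about 1/3 each.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
iid = probit_shares(v, identity)

# Bus errors correlated with rho = 0.99; car error independent.
rho = 0.99
chol = [[1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, rho, math.sqrt(1.0 - rho ** 2)]]
corr = probit_shares(v, chol)

print("independent errors:", iid)
print("correlated buses  :", corr)
```

With highly correlated bus errors the simulated shares come out close to the intuitive 1/2, 1/4, 1/4 split, while independent errors give roughly equal thirds.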


11.8 Ordered Choice Models

When alternatives have a natural ordering (e.g., education level, credit rating, survey scale) but the gaps between categories are unknown, ordered models are appropriate.

Define a latent variable: \[y_i^* = x_i'\beta + \varepsilon_i\]

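with cutpoints \(-\infty = \mu_0 < \mu_1 < \cdots < \mu_{J-1} < \mu_J = \infty\). Because all interior cutpoints are free parameters, \(x_i\) contains no intercept. The observed outcome, for \(j = 1,\ldots,J\), is: \[y_i = j \iff \mu_{j-1} < y_i^* \leq \mu_j\]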

Ordered Probit / Ordered Logit

\[P(y_i = j \mid x_i) = F(\mu_j - x_i'\beta) - F(\mu_{j-1} - x_i'\beta)\]

where \(F = \Phi\) (Ordered Probit) or \(F = \Lambda\) (Ordered Logit). Both \(\beta\) and \(\{\mu_j\}\) are estimated jointly by MLE.
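The probability formula can be sketched as follows for both link functions. The index value and cutpoints are hypothetical, and `ordered_probs` is an assumed helper name; categories are numbered \(1,\ldots,J\) as in the text.

```python
import math

def norm_cdf(z):
    """Standard normal CDF (Ordered Probit)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    """Standard logistic CDF (Ordered Logit)."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_probs(xb, cuts, cdf):
    """Return [P(y=1), ..., P(y=J)] given the index x'beta.

    cuts : interior cutpoints mu_1 < ... < mu_{J-1}
    cdf  : norm_cdf or logistic_cdf
    """
    # F(mu_0 - x'b) = 0 and F(mu_J - x'b) = 1 at the infinite endpoints.
    F = [0.0] + [cdf(c - xb) for c in cuts] + [1.0]
    return [F[j] - F[j - 1] for j in range(1, len(F))]

# Hypothetical index value and cutpoints for J = 4 categories.
cuts = [-1.0, 0.0, 1.5]
p_probit = ordered_probs(0.3, cuts, norm_cdf)
p_logit = ordered_probs(0.3, cuts, logistic_cdf)
print([round(p, 4) for p in p_probit])
print([round(p, 4) for p in p_logit])
```

Raising the index \(x_i'\beta\) shifts probability mass toward higher categories under either link.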

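Key constraint: a single coefficient vector \(\beta\), common to all outcome categories, shifts the latent index; there are no category-specific coefficients \(\beta_j\) as in the MNL. A positive coefficient therefore moves probability mass toward higher outcomes.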
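Differentiating the probability formula makes the single-index structure explicit. Writing \(f = F'\) for the density (a symbol introduced here), the marginal effects are:

\[\frac{\partial P(y_i=j\mid x_i)}{\partial x_i} = \left[f(\mu_{j-1}-x_i'\beta) - f(\mu_j-x_i'\beta)\right]\beta\]

For the lowest category, \(f(\mu_0 - x_i'\beta) = 0\), so the effect has the opposite sign to \(\beta\). For the highest category, \(f(\mu_J - x_i'\beta) = 0\), so the effect has the same sign as \(\beta\). For interior categories the bracket can take either sign, so the direction of the effect is ambiguous and depends on \(x_i\).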


11.9 References

Cameron and Trivedi (2005), chapters 15–16. Davidson and MacKinnon (2004).