
The Model

Generalizing to a multinomial dependent variable requires some notational adaptations. Let $ J$ represent the number of discrete categories of the dependent variable, where $ J \ge 2$. Now consider a random variable $ Z$ that can take on one of $ J$ possible values. If each observation is independent, then each $ Z_i$ is a multinomial random variable. Once again, we aggregate the data into populations, each of which represents one unique combination of independent variable settings. As in the binomial logistic regression model, the column vector $ \boldsymbol{n}$ contains elements $ n_i$, the number of observations in population $ i$, such that $ {\sum_{i=1}^{N} n_i = M}$, the total sample size.

Since each observation records one of $ J$ possible values for the dependent variable, $ Z$, let $ \boldsymbol{y}$ be a matrix with $ N$ rows (one for each population) and $ J-1$ columns. Note that if $ J=2$, this reduces to the column vector used in the binomial logistic regression model. For each population, $ y_{ij}$ represents the observed count of the $ j^{th}$ value of $ Z_i$. Similarly, $ \boldsymbol{\pi}$ is a matrix of the same dimensions as $ \boldsymbol{y}$, where each element $ \pi_{ij}$ is the probability of observing the $ j^{th}$ value of the dependent variable for any given observation in the $ i^{th}$ population.
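For example, with $ J = 3$ response categories, row $ i$ of $ \boldsymbol{y}$ is $ (y_{i1}, y_{i2})$; the count for the third category need not be stored, since it is determined by $ y_{i3} = n_i - y_{i1} - y_{i2}$. In general, the counts for the omitted $ J^{th}$ category follow from each row of $ \boldsymbol{y}$ and $ n_i$, just as $ \pi_{iJ} = 1 - \sum_{j=1}^{J-1}\pi_{ij}$.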

The design matrix of independent variables, $ \boldsymbol{X}$, remains the same: it contains $ N$ rows and $ K+1$ columns, where $ K$ is the number of independent variables and the first element of each row, $ x_{i0} = 1$, is the intercept. Let $ \boldsymbol{\beta}$ be a matrix with $ K+1$ rows and $ J-1$ columns, such that each element $ \beta_{kj}$ contains the parameter estimate for the $ k^{th}$ covariate and the $ j^{th}$ value of the dependent variable.

For the multinomial logistic regression model, we equate the linear component to the log of the odds of observing the $ j^{th}$ category relative to the $ J^{th}$ category. That is, we consider the $ J^{th}$ category to be the omitted or baseline category, and construct the logits of the first $ J-1$ categories with the baseline category in the denominator.

$\displaystyle \log \biggl(\frac{\pi_{ij}}{\pi_{iJ}}\biggr) = \log \biggl(\frac{\pi_{ij}}{1 - \sum_{j=1}^{J-1}\pi_{ij}}\biggr) = \sum_{k=0}^{K} x_{ik}\beta_{kj} \qquad \begin{array}{c} i = 1, 2, \ldots, N \\ j = 1, 2, \ldots, J-1 \end{array}$ (24)
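Exponentiating both sides of (24) expresses each of the first $ J-1$ probabilities in terms of the baseline probability, and the constraint that the $ J$ probabilities in each population sum to one then determines $ \pi_{iJ}$:

$\displaystyle \pi_{ij} = \pi_{iJ}\, e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}, \qquad \sum_{j=1}^{J-1}\pi_{ij} + \pi_{iJ} = \pi_{iJ} \biggl(1 + \sum_{j=1}^{J-1} e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}\biggr) = 1$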

Solving for $ \pi_{ij}$, we have:


$\displaystyle \pi_{ij} = \frac{e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}}{1 + \sum_{j=1}^{J-1} e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}} \qquad j < J$

$\displaystyle \pi_{iJ} = \frac{1}{1 + \sum_{j=1}^{J-1} e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}}$ (25)
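As an illustration outside the original derivation, the following short Python sketch evaluates (25) for an arbitrary design matrix and coefficient matrix; the function name, array names, and example dimensions are assumptions made here for concreteness.

import numpy as np

def multinomial_probs(X, beta):
    # X: N x (K+1) design matrix with a leading column of ones (the intercept).
    # beta: (K+1) x (J-1) matrix of coefficients for the non-baseline categories.
    eta = X @ beta                                      # N x (J-1) linear predictors
    exp_eta = np.exp(eta)                               # e^{sum_k x_ik beta_kj}
    denom = 1.0 + exp_eta.sum(axis=1, keepdims=True)    # 1 + sum_{j<J} e^{eta_ij}
    pi_first = exp_eta / denom                          # pi_ij for j = 1, ..., J-1
    pi_baseline = 1.0 / denom                           # pi_iJ for the baseline category
    return np.hstack([pi_first, pi_baseline])           # N x J matrix of probabilities

# Hypothetical example: N = 4 populations, K = 2 covariates, J = 3 categories.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((4, 1)), rng.normal(size=(4, 2))])
beta = rng.normal(size=(3, 2))
pi = multinomial_probs(X, beta)
assert np.allclose(pi.sum(axis=1), 1.0)                 # each row sums to one

Note that the exponentials in the direct form above can overflow for large linear predictors; numerical implementations typically work with shifted or log-scale quantities, but this sketch mirrors (25) as written.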


