
The Model

Generalizing to a multinomial dependent variable requires some notational adaptations. Let $ J$ represent the number of discrete categories of the dependent variable, where $ J \ge 2$. Now consider a random variable $ Z$ that can take on one of $ J$ possible values. If each observation is independent, then each $ Z_i$ is a multinomial random variable. Once again, we aggregate the data into populations, each of which represents one unique combination of independent variable settings. As in the binomial logistic regression model, the column vector $ \boldsymbol{n}$ contains elements $ n_i$, the number of observations in population $ i$, such that $ {\sum_{i=1}^{N} n_i = M}$, the total sample size.

Since each observation records one of $ J$ possible values for the dependent variable, $ Z$, let $ \boldsymbol{y}$ be a matrix with $ N$ rows (one for each population) and $ J-1$ columns. Note that if $ J=2$, this reduces to the column vector used in the binomial logistic regression model. For each population, $ y_{ij}$ represents the observed count of the $ j^{th}$ value of $ Z_i$. Similarly, $ \boldsymbol{\pi}$ is a matrix of the same dimensions as $ \boldsymbol{y}$, where each element $ \pi_{ij}$ is the probability of observing the $ j^{th}$ value of the dependent variable for any given observation in the $ i^{th}$ population.
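For example, with $ J = 3$ response categories, row $ i$ of $ \boldsymbol{y}$ is $ (y_{i1}, y_{i2})$; the count for the third category need not be stored, since it is determined by $ y_{i3} = n_i - y_{i1} - y_{i2}$. In general, the counts for the omitted $ J^{th}$ category follow from each row of $ \boldsymbol{y}$ and $ n_i$, just as $ \pi_{iJ} = 1 - \sum_{j=1}^{J-1}\pi_{ij}$.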

The design matrix of independent variables, $ \boldsymbol{X}$, remains the same: it contains $ N$ rows and $ K+1$ columns, where $ K$ is the number of independent variables and the first element of each row, $ x_{i0} = 1$, is the intercept. Let $ \boldsymbol{\beta}$ be a matrix with $ K+1$ rows and $ J-1$ columns, such that each element $ \beta_{kj}$ contains the parameter estimate for the $ k^{th}$ covariate and the $ j^{th}$ value of the dependent variable.

For the multinomial logistic regression model, we equate the linear component to the log of the odds of observing the $ j^{th}$ category relative to the $ J^{th}$ category. That is, we consider the $ J^{th}$ category to be the omitted or baseline category, and construct the logits of the first $ J-1$ categories with the baseline category in the denominator.

$\displaystyle \log \biggl(\frac{\pi_{ij}}{\pi_{iJ}}\biggr) = \log \biggl(\frac{\pi_{ij}}{1 - \sum_{j=1}^{J-1}\pi_{ij}}\biggr) = \sum_{k=0}^{K} x_{ik}\beta_{kj} \qquad \begin{array}{c} i = 1, 2, \ldots, N \\ j = 1, 2, \ldots, J-1 \end{array}$ (24)
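Exponentiating both sides of (24) expresses each of the first $ J-1$ probabilities in terms of the baseline probability, and the constraint that the $ J$ probabilities in each population sum to one then determines $ \pi_{iJ}$:

$\displaystyle \pi_{ij} = \pi_{iJ}\, e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}, \qquad \sum_{j=1}^{J-1}\pi_{ij} + \pi_{iJ} = \pi_{iJ} \biggl(1 + \sum_{j=1}^{J-1} e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}\biggr) = 1$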

Solving for $ \pi_{ij}$, we have:


$\displaystyle \pi_{ij} = \frac{e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}}{1 + \sum_{j=1}^{J-1} e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}} \qquad j < J$

$\displaystyle \pi_{iJ} = \frac{1}{1 + \sum_{j=1}^{J-1} e^{\sum_{k=0}^{K} x_{ik}\beta_{kj}}}$ (25)
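As an illustration outside the original derivation, the following short Python sketch evaluates (25) for an arbitrary design matrix and coefficient matrix; the function name, array names, and example dimensions are assumptions made here for concreteness.

import numpy as np

def multinomial_probs(X, beta):
    # X: N x (K+1) design matrix with a leading column of ones (the intercept).
    # beta: (K+1) x (J-1) matrix of coefficients for the non-baseline categories.
    eta = X @ beta                                      # N x (J-1) linear predictors
    exp_eta = np.exp(eta)                               # e^{sum_k x_ik beta_kj}
    denom = 1.0 + exp_eta.sum(axis=1, keepdims=True)    # 1 + sum_{j<J} e^{eta_ij}
    pi_first = exp_eta / denom                          # pi_ij for j = 1, ..., J-1
    pi_baseline = 1.0 / denom                           # pi_iJ for the baseline category
    return np.hstack([pi_first, pi_baseline])           # N x J matrix of probabilities

# Hypothetical example: N = 4 populations, K = 2 covariates, J = 3 categories.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((4, 1)), rng.normal(size=(4, 2))])
beta = rng.normal(size=(3, 2))
pi = multinomial_probs(X, beta)
assert np.allclose(pi.sum(axis=1), 1.0)                 # each row sums to one

Note that the exponentials in the direct form above can overflow for large linear predictors; numerical implementations typically work with shifted or log-scale quantities, but this sketch mirrors (25) as written.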


