multinomial               package:VGAM               R Documentation

_M_u_l_t_i_n_o_m_i_a_l _L_o_g_i_t _M_o_d_e_l

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a multinomial logit model to an unordered factor response.

_U_s_a_g_e:

     multinomial(zero = NULL, parallel = FALSE, nointercept = NULL)

_A_r_g_u_m_e_n_t_s:

     In the following, the response Y is assumed to be a factor with
     unordered values 1,2,...,M+1, so that M is the number of
     linear/additive predictors eta_j.

    zero: An integer-valued vector specifying which linear/additive
          predictors are modelled as intercepts only. The values must
          be from the set {1,2,...,M}. The default value means none are
          modelled as intercept-only terms.

parallel: A logical, or formula specifying which terms have
          equal/unequal coefficients.

nointercept: An integer-valued vector specifying which linear/additive
          predictors have no intercepts. The values must be from the
          set {1,2,...,M}.

_D_e_t_a_i_l_s:

     The model can be written

                    eta_j =  log(P[Y=j]/ P[Y=M+1])

     where eta_j is the jth linear/additive predictor. Here, j=1,...,M
     and eta_{M+1} is 0 by definition.  That is, the last level of the
     factor, or last column of the response matrix, is taken as the
     reference level or baseline; this is for identifiability of the
     parameters.
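
     As a small worked illustration of the formula above (not part of
     the package itself), the probabilities can be recovered from the
     linear predictors by appending eta_{M+1} = 0 and normalising. The
     values of 'eta' below are made up, and only base R is used.

```r
# Hypothetical linear predictors for M = 2 (three response levels);
# the numbers are invented for illustration only.
eta <- c(0.5, -1.0)

# Append eta_{M+1} = 0 for the baseline (last) level, then normalise.
expeta <- exp(c(eta, 0))
prob <- expeta / sum(expeta)

prob                        # P[Y=1], P[Y=2], P[Y=3]
log(prob[1:2] / prob[3])    # recovers eta_1 and eta_2
```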

     In almost all the literature, the constraint matrices associated
     with this family of models are known. For example, setting
     'parallel=TRUE' will make all constraint matrices (except for the
     intercept) equal to a vector of M 1's.  If the constraint matrices
     are unknown and to be estimated, then this can be achieved by
     fitting the model as a reduced-rank vector generalized linear
     model (RR-VGLM; see 'rrvglm').  In particular, a multinomial logit
     model with unknown constraint matrices is known as a stereotype
     model (Anderson, 1984), and can be fitted with 'rrvglm'.
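
     The role of a known constraint matrix can be sketched with plain
     matrix arithmetic. In the VGLM framework the coefficients of one
     term across the M linear predictors are H times a shorter vector
     of free parameters, where H is that term's constraint matrix. The
     names 'H_parallel', 'H_default' and the parameter values below
     are hypothetical; this is base R only and does not call VGAM.

```r
M <- 3

# Constraint matrix implied by parallel = TRUE for a non-intercept term:
# a single column of M ones.
H_parallel <- matrix(1, nrow = M, ncol = 1)

# Unconstrained case: the M x M identity, one free coefficient per eta_j.
H_default <- diag(M)

beta_star_parallel <- 0.7               # one free parameter (made up)
beta_star_default  <- c(0.7, -0.2, 1.1) # M free parameters (made up)

# Coefficient of the term in eta_1, eta_2, eta_3 under each constraint.
coef_parallel <- drop(H_parallel %*% beta_star_parallel)
coef_default  <- drop(H_default %*% beta_star_default)

coef_parallel   # the same value repeated M times
coef_default    # three different values
```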

_V_a_l_u_e:

     An object of class '"vglmff"' (see 'vglmff-class'). The object is
     used by modelling functions such as 'vglm', 'rrvglm' and 'vgam'.

_W_a_r_n_i_n_g:

     Some combinations of values for 'zero' and 'nointercept' cause the
     fit to fail. For example, 'multinomial(zero=2, nointercept=1:3)'
     makes the second linear/additive predictor intercept-only and
     also removes its intercept, so it is identically zero, which
     will cause a failure.

     Be careful about the use of other potentially contradictory
     constraints, e.g., 'multinomial(zero=2, parallel = TRUE ~ x3)'. 
     If in doubt, apply 'constraints()' to the fitted object to check.
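
     A minimal sketch of such a check, assuming the 'VGAM' package is
     installed and using the 'iris' data with a single covariate
     (chosen here only to keep the example small):

```r
library(VGAM)

# zero = 2 asks for the second linear predictor to be intercept-only.
fit <- vglm(Species ~ Sepal.Length, multinomial(zero = 2), data = iris)

# One constraint matrix per model term. Each has M = 2 rows, and a row
# of zeros means the term does not enter that linear predictor.
cm <- constraints(fit)
cm
```

     With 'zero = 2' one would expect the matrix for 'Sepal.Length' to
     have a zero in its second row.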

     No check is made to verify that the response is nominal.

_N_o_t_e:

     The response should be either a matrix of counts (with row sums
     that are all positive), or a factor. In both cases, the 'y' slot
     returned by 'vglm'/'vgam'/'rrvglm' is the matrix of counts.
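
     To picture how a factor corresponds to a matrix of counts, the
     base R sketch below builds one row per observation with a single
     1 in the column of the observed level. The data are made up, and
     this is only an illustration of the correspondence, not the code
     used inside VGAM.

```r
# A small made-up factor response with three levels.
f <- factor(c("bus", "car", "car", "walk", "bus"),
            levels = c("bus", "car", "walk"))

# Indicator (count) matrix: entry (i, j) is 1 if observation i has level j.
ymat <- outer(as.integer(f), seq_len(nlevels(f)), "==") * 1
colnames(ymat) <- levels(f)

ymat
rowSums(ymat)   # each row sums to 1: one multinomial trial per observation
```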

     The multinomial logit model is more appropriate for a nominal
     (unordered) factor response. For an ordinal (ordered) factor
     response, models such as those based on cumulative probabilities
     (see 'cumulative') are more suited.

     'multinomial' is prone to numerical difficulties if the groups are
     separable and/or the fitted probabilities are close to 0 or 1. The
     fitted values returned are estimates of the probabilities P[Y=j]
     for j=1,...,M+1.

     Here is an example of the usage of the 'parallel' argument. If
     there are covariates 'x1', 'x2' and 'x3', then 'parallel = TRUE ~
     x1 + x2 -1' and 'parallel = FALSE ~ x3' are equivalent. This would
     constrain the regression coefficient of 'x1' to be the same in
     every linear/additive predictor, and likewise for 'x2'; those of
     the intercepts and 'x3' would be different.
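
     A hedged sketch of this usage, assuming the 'VGAM' package is
     installed. The data frame 'd' and its contents are simulated
     purely for illustration, so the estimates carry no meaning; the
     point is only to inspect the resulting constraint matrices.

```r
library(VGAM)

set.seed(1)
n <- 200
# Simulated covariates and a three-level response unrelated to them.
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n),
                y  = factor(sample(c("a", "b", "c"), n, replace = TRUE)))

# Parallelism requested for x1 and x2 only.
fit <- vglm(y ~ x1 + x2 + x3,
            multinomial(parallel = TRUE ~ x1 + x2 - 1), data = d)

cm <- constraints(fit)
cm                        # compare the matrices for x1, x2 and x3
coef(fit, matrix = TRUE)  # coefficients laid out by linear predictor
```

     If the constraints were applied as described, the matrices for
     'x1' and 'x2' should each be a single column of ones, while those
     for the intercept and 'x3' should be identity matrices.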

     In Example 4 below, a conditional logit model is fitted to an
     artificial data set that explores how cost and travel time affect
     people's decision about how to travel to work.  Walking is the
     baseline group. The variable 'Cost.car' is the difference between
     the cost of travel to work by car and walking, etc.  The variable
     'Durn.car' is the difference between the travel duration/time to
     work by car and walking, etc.  For other details about the 'xij'
     argument see 'vglm.control' and 'fill'.

     The 'multinom' function in the 'nnet' package uses the first level
     of the factor as baseline, whereas the last level of the factor is
     used here. Consequently the estimated regression coefficients
     differ.
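
     The two parameterisations describe the same probabilities, and
     the coefficients can be converted by simple subtraction because
     log(P[Y=j]/P[Y=1]) = eta_j - eta_1 with eta_{M+1} = 0. The sketch
     below uses made-up coefficients for a three-level response and
     base R only; the object names are hypothetical.

```r
# Made-up coefficients with the LAST level (3) as baseline:
# columns are eta_1 = log(P1/P3) and eta_2 = log(P2/P3).
beta_last <- cbind(eta1 = c(0.4, 1.2), eta2 = c(-0.3, 0.5))
rownames(beta_last) <- c("(Intercept)", "x")

# Same model with the FIRST level as baseline:
# log(P2/P1) = eta_2 - eta_1 and log(P3/P1) = 0 - eta_1.
beta_first <- cbind(level2 = beta_last[, "eta2"] - beta_last[, "eta1"],
                    level3 = 0 - beta_last[, "eta1"])
beta_first
```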

_A_u_t_h_o_r(_s):

     Thomas W. Yee

_R_e_f_e_r_e_n_c_e_s:

     Yee, T. W. and Hastie, T. J. (2003) Reduced-rank vector
     generalized linear models. _Statistical Modelling_, *3*, 15-41.

     McCullagh, P. and Nelder, J. A. (1989) _Generalized Linear
     Models_, 2nd ed. London: Chapman & Hall.

     Agresti, A. (2002) _Categorical Data Analysis_, 2nd ed. New York:
     Wiley.

     Simonoff, J. S. (2003) _Analyzing Categorical Data_, New York:
     Springer-Verlag.

     Anderson, J. A. (1984) Regression and ordered categorical
     variables.  _Journal of the Royal Statistical Society, Series B,
     Methodological_, *46*, 1-30.

     Documentation accompanying the 'VGAM' package at <URL:
     http://www.stat.auckland.ac.nz/~yee> contains further information
     and examples.

_S_e_e _A_l_s_o:

     'acat', 'cumulative', 'cratio', 'sratio', 'dirichlet',
     'dirmultinomial', 'rrvglm', 'Multinomial', 'iris'.

_E_x_a_m_p_l_e_s:

     # Example 1: fit a multinomial logit model to Edgar Anderson's iris data
     data(iris)
     ## Not run: 
     fit = vglm(Species ~ ., multinomial, iris)
     coef(fit, matrix=TRUE) 
     ## End(Not run)

     # Example 2a: a simple example 
     # Note: prob sums to 1.1; rmultinom() rescales it internally to sum to 1
     y = t(rmultinom(10, size = 20, prob=c(0.1,0.2,0.8))) # Counts
     fit = vglm(y ~ 1, multinomial)
     fitted(fit)[1:4,]   # Proportions
     fit@prior.weights # Not recommended for extraction of prior weights
     weights(fit, type="prior", matrix=FALSE) # The better method
     fit@y   # Sample proportions
     constraints(fit)   # Constraint matrices

     # Example 2b: Different input to Example 2a but same result
     w = apply(y, 1, sum) # Prior weights
     yprop = y / w    # Sample proportions
     fitprop = vglm(yprop ~ 1, multinomial, weights=w)
     fitted(fitprop)[1:4,]   # Proportions
     weights(fitprop, type="prior", matrix=FALSE)
     fitprop@y # Same as the input

     # Example 3: Fit a rank-1 stereotype model 
     data(car.all)
     fit = rrvglm(Country ~ Width + Height + HP, multinomial, car.all, Rank=1)
     coef(fit)   # Contains the C matrix
     constraints(fit)$HP     # The A matrix 
     coef(fit, matrix=TRUE)  # The B matrix
     Coef(fit)@C             # The C matrix 
     ccoef(fit)              # Better to get the C matrix this way
     Coef(fit)@A             # The A matrix 
     svd(coef(fit, matrix=TRUE)[-1,])$d    # This has rank 1; = C %*% t(A)

     # Example 4: The use of the xij argument (conditional logit model)
     set.seed(111)
     n = 100  # Number of people who travel to work
     M = 3  # There are M+1 modes of transport
     ymat = matrix(0, n, M+1)
     ymat[cbind(1:n, sample(x=M+1, size=n, replace=TRUE))] = 1
     dimnames(ymat) = list(NULL, c("bus","train","car","walk"))
     transport = data.frame(cost.bus=runif(n), cost.train=runif(n),
                            cost.car=runif(n), cost.walk=runif(n),
                            durn.bus=runif(n), durn.train=runif(n),
                            durn.car=runif(n), durn.walk=runif(n))
     transport = round(transport, digits=2) # For convenience
     transport = transform(transport,
                           Cost.bus   = cost.bus   - cost.walk,
                           Cost.car   = cost.car   - cost.walk,
                           Cost.train = cost.train - cost.walk,
                           Durn.bus   = durn.bus   - durn.walk,
                           Durn.car   = durn.car   - durn.walk,
                           Durn.train = durn.train - durn.walk)
     fit = vglm(ymat ~ Cost.bus + Cost.train + Cost.car + 
                       Durn.bus + Durn.train + Durn.car,
                family = multinomial,
                xij = list(Cost ~ Cost.bus + Cost.train + Cost.car,
                           Durn ~ Durn.bus + Durn.train + Durn.car),
                data=transport)
     model.matrix(fit, type="lm")[1:7,]   # LM model matrix
     model.matrix(fit, type="vlm")[1:7,]  # Big VLM model matrix
     coef(fit)
     coef(fit, matrix=TRUE)
     coef(fit, matrix=TRUE, compress=FALSE)
     summary(fit)

