cumulative               package:VGAM               R Documentation

_O_r_d_i_n_a_l _R_e_g_r_e_s_s_i_o_n _w_i_t_h _C_u_m_u_l_a_t_i_v_e _P_r_o_b_a_b_i_l_i_t_i_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a cumulative logit/probit/cloglog/cauchit/... regression
     model to an ordered (preferably) factor response.

_U_s_a_g_e:

     cumulative(link = "logit", parallel = FALSE, reverse = FALSE,
                earg = list(), mv = FALSE, intercept.apply = FALSE)

_A_r_g_u_m_e_n_t_s:

     In the following, the response Y is assumed to be a factor with
     ordered values 1,2,...,M+1, so that M is the number of
     linear/additive predictors eta_j.

    link: Link function applied to the M cumulative probabilities.  See
          'Links' for more choices.

parallel: A logical, or formula specifying which terms have
          equal/unequal coefficients.

 reverse: Logical. By default, the cumulative probabilities used are
          P(Y<=1), P(Y<=2), ..., P(Y<=M). If 'reverse' is 'TRUE', then 
          P(Y>=2), P(Y>=3), ..., P(Y>=M+1) will be used.

          This should be set to 'TRUE' for 'link=' 'golf', 'polf',
          'nbolf'. For these links the cutpoints must be an increasing
          sequence; if 'reverse=FALSE' for then the cutpoints must be
          an decreasing sequence.

    earg: List. Extra argument for the link function. See 'earg' in
          'Links' for general information.

      mv: Logical. Multivariate response? If 'TRUE' then the input
          should be a matrix with values 1,2,...,L, where L is the
          number of levels. Each column of the matrix is a response,
          i.e., multivariate response. A suitable matrix can be
          obtained from 'Cut'.

intercept.apply: Logical. Whether the 'parallel' argument should be
          applied to the intercept term. This should be set to 'TRUE'
          for 'link=' 'golf', 'polf', 'nbolf'.

_D_e_t_a_i_l_s:

     By default, the non-parallel cumulative logit model is fitted,
     i.e.,

                        eta_j = logit(P[Y<=j])

     where j=1,2,...,M and the eta_j are not constrained to be
     parallel. This is also known as the _non-proportional odds model_.
     If the logit link is replaced by a complementary log-log link 
     ('cloglog') then this is known as the _proportional-hazards
     model_.

     In almost all the literature, the constraint matrices associated
     with this family of models are known. For example, setting
     'parallel=TRUE' will make all constraint matrices (except for the
     intercept) equal to a vector of M 1's. If the constraint matrices
     are equal, unknown and to be estimated, then this can be achieved
     by fitting the model as a reduced-rank vector generalized linear
     model (RR-VGLM; see 'rrvglm'). Currently, reduced-rank vector
     generalized additive models (RR-VGAMs) have not been implemented
     here.

_V_a_l_u_e:

     An object of class '"vglmff"' (see 'vglmff-class'). The object is
     used by modelling functions such as 'vglm', 'rrvglm' and 'vgam'.

_W_a_r_n_i_n_g:

     No check is made to verify that the response is ordinal.

_N_o_t_e:

     The response should be either a matrix of counts (with row sums
     that are all positive), or a factor. In both cases, the 'y' slot
     returned by 'vglm'/'vgam'/'rrvglm' is the matrix of counts.

     For a nominal (unordered) factor response, the multinomial logit
     model ('multinomial') is more appropriate.

     With the logit link, setting 'parallel=TRUE' will fit a
     proportional odds model. Note that the 'TRUE' here does not apply
     to the intercept term.  In practice, the validity of the
     proportional odds assumption needs to be checked, e.g., by a
     likelihood ratio test. If acceptable on the data, then numerical
     problems are less likely to occur during the fitting, and there
     are less parameters. Numerical problems occur when the
     linear/additive predictors cross, which results in probabilities
     outside of (0,1); setting 'parallel=TRUE' will help avoid this
     problem.

     Here is an example of the usage of the 'parallel' argument. If
     there are covariates 'x1', 'x2' and 'x3', then 'parallel = TRUE ~
     x1 + x2 -1' and 'parallel = FALSE ~ x3' are equivalent. This would
     constrain the regression coefficients for 'x1' and 'x2' to be
     equal; those of the intercepts and 'x3' would be different.

     In the future, this family function may be renamed to ``'cups'''
     (for *cu*mulative *p*robabilitie*s*) or ``'cute''' (for
     *cu*mulative probabili*t*i*e*s).

_A_u_t_h_o_r(_s):

     Thomas W. Yee

_R_e_f_e_r_e_n_c_e_s:

     Agresti, A. (2002) _Categorical Data Analysis_, 2nd ed. New York:
     Wiley.

     Dobson, A. J. (2001) _An Introduction to Generalized Linear
     Models_, 2nd ed. Boca Raton: Chapman & Hall/CRC Press.

     McCullagh, P. and Nelder, J. A. (1989) _Generalized Linear
     Models_, 2nd ed. London: Chapman & Hall.

     Simonoff, J. S. (2003) _Analyzing Categorical Data_, New York:
     Springer-Verlag.

     Yee, T. W. and Wild, C. J. (1996) Vector generalized additive
     models. _Journal of the Royal Statistical Society, Series B,
     Methodological_, *58*, 481-493.

     Documentation accompanying the 'VGAM' package at <URL:
     http://www.stat.auckland.ac.nz/~yee> contains further information
     and examples.

_S_e_e _A_l_s_o:

     'acat', 'cratio', 'sratio', 'multinomial', 'pneumo', 'logit',
     'probit', 'cloglog', 'cauchit', 'golf', 'polf', 'nbolf'.

_E_x_a_m_p_l_e_s:

     # Fit the proportional odds model, p.179, in McCullagh and Nelder (1989)
     data(pneumo)
     pneumo = transform(pneumo, let=log(exposure.time))
     (fit = vglm(cbind(normal, mild, severe) ~ let,
                 cumulative(parallel=TRUE, reverse=TRUE), pneumo))
     fit@y   # Sample proportions
     weights(fit, type="prior")   # Number of observations
     coef(fit, matrix=TRUE)
     constraints(fit)   # Constraint matrices

     # Check that the model is linear in let
     fit2 = vgam(cbind(normal, mild, severe) ~ s(let, df=2),
                 cumulative(reverse=TRUE), pneumo)
     ## Not run: 
     plot(fit2, se=TRUE, overlay=TRUE, lcol=1:2, scol=1:2)
     ## End(Not run)

