rrvglm                 package:VGAM                 R Documentation

_F_i_t_t_i_n_g _R_e_d_u_c_e_d-_R_a_n_k _V_e_c_t_o_r _G_e_n_e_r_a_l_i_z_e_d _L_i_n_e_a_r _M_o_d_e_l_s (_R_R-_V_G_L_M_s)

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a _reduced-rank vector generalized linear model_ (RR-VGLM).
     An RR-VGLM is a VGLM in which some of the constraint matrices are
     estimated from the data rather than specified in advance. In this
     documentation, M is the number of linear predictors.

_U_s_a_g_e:

     rrvglm(formula, family, data = list(), weights = NULL, subset = NULL,
            na.action = na.fail, etastart = NULL, mustart = NULL,
            coefstart = NULL, control = rrvglm.control(...), offset = NULL,
            method = "rrvglm.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE,
            contrasts = NULL, constraints = NULL, extra = NULL,
            qr.arg = FALSE, smart = TRUE, ...)

_A_r_g_u_m_e_n_t_s:

 formula: a symbolic description of the model to be fit. The RHS of the
          formula is applied to each linear predictor. Different
          variables in each linear predictor can be chosen by
          specifying constraint matrices. 

  family: a function of class '"vglmff"' describing what statistical
          model is to be fitted.

    data: an optional data frame containing the variables in the model.
          By default the variables are taken from
          'environment(formula)', typically the environment from which
          'rrvglm' is called.

 weights: an optional vector or matrix of (prior) weights to be used
          in the fitting process. If 'weights' is a matrix, then it
          must be in _matrix-band_ form, whereby the first M columns
          of the matrix are the diagonals, followed by the
          upper-diagonal band, followed by the band above that, etc. In
          this case, there can be up to M(M+1)/2 columns, with the last
          column corresponding to the (1,M) elements of the weight
          matrices.

  subset: an optional logical vector specifying a subset of
          observations to be used in the fitting process.

na.action: a function which indicates what should happen when the data
          contain 'NA's. The default is set by the 'na.action' setting
          of 'options', and is 'na.fail' if that is unset. The
          "factory-fresh" default is 'na.omit'.

etastart: starting values for the linear predictors. It is an M-column
          matrix. If M=1 then it may be a vector.

 mustart: starting values for the fitted values. It can be a vector or
          a matrix. Some family functions do not make use of this
          argument.

coefstart: starting values for the coefficient vector.

 control: a list of parameters for controlling the fitting process. 
          See 'rrvglm.control' for details.

  offset: a vector or M-column matrix of offset values. These are _a
          priori_ known and are added to the linear predictors during
          fitting.

  method: the method to be used in fitting the model. The default (and
          presently only) method 'rrvglm.fit' uses iteratively
          reweighted least squares (IRLS).

   model: a logical value indicating whether the _model frame_ should
          be assigned in the 'model' slot.

x.arg, y.arg: logical values indicating whether the model matrix and
          response vector/matrix used in the fitting process should be
          assigned in the 'x' and 'y' slots. Note the model matrix is
          the LM model matrix; to get the VGLM model matrix type
          'model.matrix(vglmfit)' where 'vglmfit' is a 'vglm' object. 

contrasts: an optional list. See the 'contrasts.arg' of
          'model.matrix.default'.

constraints: an optional list of constraint matrices. Each component
          of the list must be named after the term it corresponds to,
          and the name must match the term label exactly as a character
          string. Each constraint matrix must have M rows and be of
          full column rank. By default, each constraint matrix is the
          M by M identity matrix unless arguments in the family
          function itself override this. If 'constraints' is used it
          must contain _all_ the terms; an incomplete list is not
          accepted.

   extra: an optional list with any extra information that might be
          needed by the family function.

  qr.arg: logical value indicating whether the QR decomposition of the
          VLM model matrix is returned in the 'qr' slot of the object.

   smart: logical value indicating whether smart prediction
          ('smartpred') will be used.

     ...: further arguments passed into 'rrvglm.control'.
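
     The matrix-band form described under 'weights' can be illustrated
     with a small base-R sketch. The helper 'to_matrix_band' below is
     hypothetical (it is not a 'VGAM' function); it only shows the
     column ordering for a single symmetric M by M weight matrix, here
     with M = 3, so the row has 3 + 2 + 1 = 6 entries.

```r
# Hypothetical helper, not part of VGAM: flatten one symmetric M x M
# weight matrix into a single row of matrix-band form.
to_matrix_band <- function(W) {
  M <- nrow(W)
  stopifnot(ncol(W) == M, isSymmetric(W))
  out <- numeric(0)
  for (band in 0:(M - 1)) {              # band 0 is the main diagonal
    rows <- seq_len(M - band)
    # elements (1, 1 + band), (2, 2 + band), ... of the current band
    out <- c(out, W[cbind(rows, rows + band)])
  }
  out
}

W <- matrix(c(4,   1,   0.5,
              1,   3,   0.2,
              0.5, 0.2, 2), nrow = 3, byrow = TRUE)

wband <- to_matrix_band(W)
wband   # diagonal first, then the (1,2),(2,3) band, then the (1,3) element
```

     A 'weights' matrix in this form would have one such row per
     observation.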

_D_e_t_a_i_l_s:

     The central formula is given by

                        eta = B_1^T x_1 + A nu

     where x_1 is a vector (usually just a 1 for an intercept), x_2 is
     another vector of explanatory variables, and nu = C^T x_2 is an
     R-vector of latent variables. Here, eta is a vector of linear
     predictors, e.g., the mth element is eta_m = log(E[Y_m]) for the
     mth Poisson response.  The matrices B_1, A and C are estimated
     from the data, i.e., contain the regression coefficients.  For
     ecologists, the central formula represents a _constrained linear
     ordination_ (CLO) since it is linear in the latent variables. It
     means that the response is a monotonically increasing or
     decreasing function of the latent variables.
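
     The structure of the central formula can be sketched numerically
     in base R. All dimensions and the randomly generated matrices
     below are assumptions chosen for illustration; nothing here is
     estimated from data or taken from 'VGAM'.

```r
set.seed(1)
M  <- 4   # number of linear predictors (assumed)
R  <- 2   # rank, i.e. number of latent variables (assumed)
p1 <- 1   # length of x_1 (intercept only)
p2 <- 5   # length of x_2 (assumed)

B1 <- matrix(rnorm(p1 * M), p1, M)   # p1 x M
A  <- matrix(rnorm(M * R),  M,  R)   # M  x R
C  <- matrix(rnorm(p2 * R), p2, R)   # p2 x R

x1 <- 1            # intercept
x2 <- rnorm(p2)    # one observation of the explanatory variables

nu  <- drop(t(C) %*% x2)               # R-vector of latent variables
eta <- drop(t(B1) %*% x1 + A %*% nu)   # M-vector of linear predictors

# The coefficient matrix acting on x_2 is A C^T, which has rank R < M.
qr(A %*% t(C))$rank
```

     Because x_2 enters only through the R latent variables, the M by
     p2 coefficient matrix A C^T needs fewer free parameters than an
     unrestricted matrix of the same size.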

     The underlying algorithm of RR-VGLMs is iteratively reweighted
     least squares (IRLS) with an optimizing algorithm applied within
     each IRLS iteration (e.g., an alternating algorithm).
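
     The idea of an alternating algorithm can be sketched for the
     simplest case: a Gaussian reduced-rank regression with unit
     weights and no x_1 term. This is an illustrative assumption, not
     the code used by 'rrvglm.fit'. With C fixed, A is an ordinary
     least-squares solution, and vice versa, so the residual sum of
     squares cannot increase from one sweep to the next. All data
     below are simulated.

```r
set.seed(123)
n <- 200; p2 <- 5; M <- 4; R <- 2

# Simulated data with a true rank-R coefficient matrix (assumption).
X2     <- matrix(rnorm(n * p2), n, p2)
A_true <- matrix(rnorm(M * R),  M,  R)
C_true <- matrix(rnorm(p2 * R), p2, R)
Y <- X2 %*% C_true %*% t(A_true) + matrix(rnorm(n * M, sd = 0.1), n, M)

rss <- function(A, C) sum((Y - X2 %*% C %*% t(A))^2)

C <- matrix(rnorm(p2 * R), p2, R)   # random starting value
rss_path <- numeric(0)
for (iter in 1:100) {
  nu <- X2 %*% C                    # n x R latent variable scores
  A  <- t(qr.solve(nu, Y))          # least squares for A given C
  # least squares for C given A: (X2'X2)^{-1} X2' Y A (A'A)^{-1}
  C  <- qr.solve(X2, Y %*% A) %*% solve(crossprod(A))
  rss_path <- c(rss_path, rss(A, C))
  if (iter > 1 &&
      rss_path[iter - 1] - rss_path[iter] < 1e-10 * rss_path[iter]) break
}
rss_path   # non-increasing sequence of residual sums of squares
```

     In an RR-VGLM the same alternation would take place on the
     working responses and working weights of each IRLS iteration
     rather than on the raw data.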

     In theory, any 'VGAM' family function that works for 'vglm' and
     'vgam' should work for 'rrvglm' too.

     'rrvglm.fit' is the function that actually does the work. It is
     'vglm.fit' with some extra code.

_V_a_l_u_e:

     An object of class '"rrvglm"', which has the same slots as a
     '"vglm"' object. The only difference is that some of the
     constraint matrices are estimated rather than known. 'VGAM'
     nevertheless stores both kinds of model the same way internally.
     The slots of '"vglm"' objects are described in 'vglm-class'.

_N_o_t_e:

     The smart prediction ('smartpred') library is bundled with the
     'VGAM' package.

     The arguments of 'rrvglm' are the same as those of 'vglm' but with
     some extras in 'rrvglm.control'.

     In the example below, a rank-2 stereotype model of Anderson (1984)
     is fitted to some car data. The reduced-rank regression is
     performed, adjusting for two covariates (Width and Weight). Setting
     a trivial constraint matrix for the variables in x_2 that form
     the latent variables avoids a warning message when it is
     overwritten by a (common) estimated constraint matrix. The fit
     suggests that German cars tend to be more expensive than American
     cars, given a car of fixed weight and width.
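
     The constraint list used in the Examples section can be checked
     against the requirements stated for 'constraints' before fitting.
     The following base-R sketch assumes M = 3, as in that example;
     'check_cm' is a hypothetical helper, not a 'VGAM' function.

```r
M <- 3
ones <- matrix(1, M, 1)
cms <- list("(Intercept)" = diag(M), Width = ones, Weight = ones,
            Disp. = diag(M), Tank = diag(M), Price = diag(M),
            Frt.Leg.Room = diag(M))

# Hypothetical helper: M rows and full column rank.
check_cm <- function(cm, M) nrow(cm) == M && qr(cm)$rank == ncol(cm)
ok <- vapply(cms, check_cm, logical(1), M = M)
ok

# Every term of the formula, including the intercept, needs an entry.
f <- Country ~ Width + Weight + Disp. + Tank + Price + Frt.Leg.Room
needed <- c("(Intercept)", attr(terms(f), "term.labels"))
complete <- setequal(names(cms), needed)
complete
```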

     If 'fit <- rrvglm(..., data=mydata)' then 'summary(fit)' requires
     corner constraints and no missing values in 'mydata'. Often the
     estimated variance-covariance matrix of the parameters is not
     positive-definite; if this occurs, try refitting the model with a
     different value for 'Index.corner'.

     For _constrained quadratic ordination_ (CQO), see 'cqo', which
     gives more details about QRR-VGLMs.

     With multivariate binary responses, one must use
     'binomialff(mv=TRUE)' to indicate that the response (matrix) is
     multivariate. Otherwise, it is interpreted as a single binary
     response variable.

_A_u_t_h_o_r(_s):

     Thomas W. Yee

_R_e_f_e_r_e_n_c_e_s:

     Yee, T. W. and Hastie, T. J. (2003) Reduced-rank vector
     generalized linear models. _Statistical Modelling_, *3*, 15-41.

     Yee, T. W. (2004) A new technique for maximum-likelihood canonical
     Gaussian ordination. _Ecological Monographs_, *74*, 685-701.

     Anderson, J. A. (1984) Regression and ordered categorical
     variables. _Journal of the Royal Statistical Society, Series B,
     Methodological_, *46*, 1-30.

_S_e_e _A_l_s_o:

     'rrvglm.control', 'lvplot.rrvglm' (same as 'biplot.rrvglm'),
     'rrvglm-class', 'grc', 'cqo', 'vglmff-class', 'vglm',
     'vglm-class', 'smartpred', 'rrvglm.fit'. Method functions include
     'Coef.rrvglm', 'summary.rrvglm', etc.

_E_x_a_m_p_l_e_s:

     data(car.all)
     # Keep only cars from the four countries named in the biplot title
     index = with(car.all, Country == "Germany" | Country == "USA" |
                           Country == "Japan" | Country == "Korea")
     scar = car.all[index, ]  # subset of the car data, standardized below
     fcols = c(13,14,18:20,22:26,29:31,33,34,36)  # These are factors
     scar[,-fcols] = scale(scar[,-fcols]) # Standardize all numerical vars
     ones = matrix(1, 3, 1)
     cms = list("(Intercept)"=diag(3), Width=ones, Weight=ones,
                Disp.=diag(3), Tank=diag(3), Price=diag(3), 
                Frt.Leg.Room=diag(3))
     set.seed(111)
     fit = rrvglm(Country ~ Width + Weight + Disp. + Tank + Price + Frt.Leg.Room,
                  multinomial, data =  scar, Rank = 2, trace = TRUE,
                  constraints=cms, Norrr = ~ 1 + Width + Weight,
                  Uncor=TRUE, Corner=FALSE, Bestof=2)
     fit@misc$deviance  # A history of the fits
     Coef(fit)
     ## Not run: 
     biplot(fit, chull=TRUE, scores=TRUE, clty=2, ccol="blue", scol="red",
            Ccol="darkgreen", Clwd=2, Ccex=2,
            main="1=Germany, 2=Japan, 3=Korea, 4=USA")
     ## End(Not run)

