fill                  package:VGAM                  R Documentation

_C_r_e_a_t_e_s _a _M_a_t_r_i_x _o_f _A_p_p_r_o_p_r_i_a_t_e _D_i_m_e_n_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     A support function for the argument 'xij', it generates a matrix
     of an appropriate dimension.

_U_s_a_g_e:

     fill(x, values = 0, ncolx = ncol(x))

_A_r_g_u_m_e_n_t_s:

       x: A vector or matrix which is used to determine the dimension
          of the answer, in particular, the number of rows.  After
          converting 'x' to a matrix if necessary, the answer is a
          matrix of 'values' values, of dimension 'nrow(x)' by 'ncolx'.

  values: Numeric. The answer contains these values which are recycled
          if necessary.

   ncolx: The number of columns of the returned matrix. The default is
          the number of columns of 'x'.

_D_e_t_a_i_l_s:

     The 'xij' argument for 'vglm' allows the user to input variables
     specific to each linear predictor.  For example, consider the
     bivariate logit model where the first/second linear/additive
     predictor is the logistic regression of the first/second binary
     response respectively. The third linear/additive predictor is
     'log(OR) = eta3', where 'OR' is the odds ratio.  If one has ocular
     pressure as a covariate in this model then 'xij' is required to
     handle the ocular pressure for each eye, since these will be
     different in general. [This contrasts with a variable such as
     'age', the age of the person, which has a common value for both
     eyes.]  In order to input these data into 'vglm' one often finds
     that functions 'fill', 'fill1', etc. are useful.

     All terms in the 'xij' argument must appear in the main 'formula'
     argument in 'vglm'.

_V_a_l_u_e:

     'matrix(values, nrow=nrow(x), ncol=ncolx)', i.e., a matrix
     consisting of values 'values', with the number of rows matching
     'x', and the default number of columns is the number of columns of
     'x'.

_W_a_r_n_i_n_g:

     The use of the 'xij' argument overrides other arguments such as
     'exchangeable' and 'zero'.  Care is needed in such cases. See the
     examples below.

_N_o_t_e:

     Additionally, there are currently 3 other identical 'fill'
     functions, called 'fill1', 'fill2' and 'fill3'; if you need more
     then assign 'fill4 = fill5 = fill1' etc. The reason for this is
     that if more than one 'fill' function is needed then they must be
     unique. For example, if M=4 then  'xij = op ~ lop + rop +
     fill(mop) + fill(mop)' would reduce to  'xij = op ~ lop + rop +
     fill(mop)', whereas 'xij = op ~ lop + rop + fill1(mop) +
     fill2(mop)' would retain M terms, which is needed.

     The constraint matrices, as returned by 'constraints', have a
     different meaning when 'xij' is used.

     In Examples 1 to 3 below, the 'xij' argument illustrates
     covariates that are specific to a linear predictor. Here,
     'lop'/'rop' are the ocular pressures of the left/right eye in an
     artificial dataset, and 'mop' is their mean.  Variables 'leye' and
     'reye' might be the presence/absence of a particular disease on
     the LHS/RHS eye respectively.  Examples 1 and 2 are deliberately
     misspecified. The output from, e.g., 'coef(fit, matrix=TRUE)',
     looks wrong but is correct because the coefficients are multiplied
     by the zeros  produced from 'fill'.

     In Example 4, the 'xij' argument illustrates fitting the model
     where there is a common smooth function of the ocular pressure. 
     One should use regression splines since 's' in 'vgam' does not
     handle the 'xij' argument.  However, regression splines such as
     'bs' and 'ns' need to have the same knots here for both functions,
     and Example 4 illustrates a trick involving a function 'BS' to
     obtain this. Although regression splines create more than a single
     column per term in the model matrix, 'fill(BS(lop,rop,mop))'
     creates the required (same) number of columns.

_A_u_t_h_o_r(_s):

     T. W. Yee

_R_e_f_e_r_e_n_c_e_s:

     More information can be found at <URL:
     http://www.stat.auckland.ac.nz/~yee>.

_S_e_e _A_l_s_o:

     'vglm', 'vglm.control'.

_E_x_a_m_p_l_e_s:

     fill(runif(5))
     fill(runif(5), ncol=3)
     fill(runif(5), val=1, ncol=3)

     # Generate eyes data for the examples below. Eyes are independent (OR=1).
     set.seed(123)
     n = 2000  # Number of people
     eyes = data.frame(lop = round(runif(n), 2),
                       rop = round(runif(n), 2),
                       age = round(rnorm(n, 40, 10)))
     eyes = transform(eyes,
         mop = (lop + rop) / 2, # mean ocular pressure
         eta1 = 0 - 2*lop + 0.04*age, # Linear predictor for left eye
         eta2 = 0 - 2*rop + 0.04*age) # Linear predictor for right eye
     eyes = transform(eyes,
         leye = rbinom(n, size=1, prob=exp(eta1)/(1+exp(eta1))),
         reye = rbinom(n, size=1, prob=exp(eta2)/(1+exp(eta2))))

     # Example 1
     # Non-exchangeable errors (misspecified model)
     fit1 = vglm(cbind(leye,reye) ~ lop + rop + fill(lop) + age,
                 family = binom2.or(exchangeable=FALSE, zero=NULL),
                 xij = op ~ lop + rop + fill(lop), data=eyes)
     model.matrix(fit1, type="lm")[1:7,]   # LM model matrix
     model.matrix(fit1, type="vlm")[1:7,]  # Big VLM model matrix
     coef(fit1)
     coef(fit1, matrix=TRUE)  # Looks wrong but is correct
     coef(fit1, matrix=TRUE, compress=FALSE)  # Looks wrong but is correct
     constraints(fit1)
     max(abs(predict(fit1)-predict(fit1, new=eyes))) # Predicts correctly
     summary(fit1)

     # Example 2
     # Nonexchangeable errors (misspecified model), OR is a function of mop
     fit2 = vglm(cbind(leye,reye) ~ lop + rop + mop + age,
                 family = binom2.or(exchangeable=FALSE, zero=NULL),
                 xij = op ~ lop + rop + mop, data=eyes)
     model.matrix(fit2, type="lm")[1:7,]   # LM model matrix
     model.matrix(fit2, type="vlm")[1:7,]  # Big VLM model matrix
     coef(fit2)
     coef(fit2, matrix=TRUE)  # correct
     coef(fit2, matrix=TRUE, compress=FALSE)  # correct
     max(abs(predict(fit2)-predict(fit2, new=eyes))) # Predicts correctly
     summary(fit2)

     # Example 3. This model is correctly specified.
     # Exchangeable errors
     fit3 = vglm(cbind(leye,reye) ~ lop + rop + fill(lop) + age,
                 family = binom2.or(exchangeable=TRUE, zero=3),
                 xij = op ~ lop + rop + fill(lop), data=eyes)
     model.matrix(fit3, type="lm")[1:7,]   # LM model matrix
     model.matrix(fit3, type="vlm")[1:7,]  # Big VLM model matrix
     coef(fit3)
     coef(fit3, matrix=TRUE) # Looks wrong but is correct
     coef(fit3, matrix=TRUE, compress=FALSE) # Looks wrong but is correct
     predict(fit3, new=eyes[1:4,])  # Note the 'scalar' OR, i.e., zero=3
     max(abs(predict(fit3)-predict(fit3, new=eyes))) # Predicts correctly
     summary(fit3)

     # Example 4. This model uses regression splines on ocular pressure.
     # It assumes exchangeable errors.
     BS = function(x, ...) bs(c(x,...), df=3)[1:length(x),]
     fit4 = vglm(cbind(leye,reye) ~ BS(lop,rop,mop) + BS(rop,lop,mop) +
                 fill(BS(lop,rop,mop)) + age,
                 family = binom2.or(exchangeable=TRUE, zero=3),
                 xij = BS(op) ~ BS(lop,rop,mop) + BS(rop,lop,mop) +
                       fill(BS(lop,rop,mop)), data=eyes)
     model.matrix(fit4, type="lm")[1:7,]   # LM model matrix
     model.matrix(fit4, type="vlm")[1:7,]  # Big VLM model matrix
     coef(fit4)
     coef(fit4, matrix=TRUE) # Looks wrong but is correct
     coef(fit4, matrix=TRUE, compress=FALSE) # Looks wrong but is correct
     predict(fit4, new=eyes[1:4,])  # Note the 'scalar' OR, i.e., zero=3
     max(abs(predict(fit4)-predict(fit4, new=eyes))) # Predicts correctly
     summary(fit4)

