negbinomial               package:VGAM               R Documentation

_N_e_g_a_t_i_v_e _B_i_n_o_m_i_a_l _D_i_s_t_r_i_b_u_t_i_o_n _F_a_m_i_l_y _F_u_n_c_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Maximum likelihood estimation of the two parameters of a negative
     binomial distribution.

_U_s_a_g_e:

     negbinomial(lmu = "loge", lk = "loge",
                 ik = NULL, cutoff = 0.995, Maxiter=5000, 
                 deviance.arg = FALSE, method.init=1, zero = -2)

_A_r_g_u_m_e_n_t_s:

 lmu, lk: Link functions applied to the mu and k parameters. See
          'Links' for more choices.

      ik: Optional initial values for k. If convergence fails, try
          different values (and/or use 'method.init'). For an S-column
          response, 'ik' can be of length S. A value of 'NULL' means
          that an initial value for each response is computed
          internally using a range of values. This argument is ignored
          if used within 'cqo'; see the 'iKvector' argument of
          'qrrvglm.control' instead.

  cutoff: A numeric which is close to 1 but never exactly 1. Used to
          specify how many terms of the infinite series for computing
          the second diagonal element of the expected information
          matrix are actually used. The probabilities are summed until
          the total reaches this value or more (but no more than
          'Maxiter' terms are allowed). It is like specifying 'p' in an
          imaginary function 'qnegbin(p)'.

 Maxiter: Integer. The maximum number of terms allowed when computing
          the second diagonal element of the expected information
          matrix. In theory, the value involves an infinite series. If
          this argument is too small then the value may be inaccurate.

deviance.arg: Logical. If 'TRUE', the deviance function is attached to
          the object. Under ordinary circumstances, it should be left
          alone because it really assumes the index parameter is at the
          maximum likelihood estimate. Consequently, one cannot use
          that criterion to minimize within the IRLS algorithm. It
          should be set to 'TRUE' only when used with 'cqo' under the
          fast algorithm.

method.init: An integer with value '1', '2' or '3' which specifies the
          initialization method for the mu parameter. If convergence
          fails, try another value (and/or specify a value for 'ik').

    zero: Integer-valued vector, usually assigned -2 or 2 if used at
          all.  Specifies which of the two linear/additive predictors
          are modelled as an intercept only. By default, the k
          parameter (after 'lk' is applied) is modelled as a single
          unknown number that is estimated.  It can be modelled as a
          function of the explanatory variables by setting 'zero=NULL'.
          A negative value means that the value is recycled, so setting
          -2 means all k are intercept-only.
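
     The roles of 'cutoff' and 'Maxiter' can be illustrated with base
     R's 'qnbinom', which plays the part of the imaginary 'qnegbin(p)'.
     This is only a sketch of the idea, not the code used inside
     'negbinomial'; the values of mu and k are arbitrary assumptions.

```r
# Illustrative values only (assumed, not taken from VGAM internals).
mu <- 5
k <- 2
cutoff <- 0.995
Maxiter <- 5000

# Smallest y whose cumulative probability reaches 'cutoff'.
ymax <- qnbinom(cutoff, size = k, mu = mu)

# Terms y = 0, ..., ymax would be summed, subject to the 'Maxiter' cap.
nterms <- min(ymax + 1, Maxiter)

# Probability mass carried by the retained terms.
covered <- pnbinom(ymax, size = k, mu = mu)
c(ymax = ymax, nterms = nterms, covered = covered)
```

     A 'cutoff' closer to 1 retains more terms, which is why 'Maxiter'
     is needed as an upper bound.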

_D_e_t_a_i_l_s:

     The negative binomial distribution can be motivated in several
     ways, e.g., as a Poisson distribution with a mean that is gamma
     distributed. There are several common parametrizations of the
     negative binomial distribution. The one used here uses the mean mu
     and an _index_ parameter k, both of which are positive. Specifically,
     the density of a random variable Y is 

       f(y;mu,k) = C_{y}^{y + k - 1} [mu/(mu+k)]^y [k/(k+mu)]^k

     where y=0,1,2,..., and mu > 0 and k > 0. Note that the dispersion
     parameter is 1/k, so that as k approaches infinity the negative
     binomial distribution approaches a Poisson distribution. The
     response has variance Var(Y)=mu*(1+mu/k). When fitted, the
     'fitted.values' slot of the object contains the estimated value of
     the mu parameter, i.e., of the mean E(Y).
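
     The density, the variance formula and the Poisson limit above can
     be checked numerically with base R, whose 'dnbinom' uses the same
     parametrization when 'size' is k and 'mu' is mu. The values of mu
     and k below are arbitrary assumptions chosen for illustration.

```r
mu <- 3
k <- 1.5    # arbitrary illustrative values
y <- 0:10

# Density as written above; choose() accepts a non-integer first argument.
f <- choose(y + k - 1, y) * (mu / (mu + k))^y * (k / (k + mu))^k

# Agreement with base R's negative binomial density.
same <- all.equal(f, dnbinom(y, size = k, mu = mu))

# Mean and variance computed from the density over a long range of y.
yy <- 0:2000
p <- dnbinom(yy, size = k, mu = mu)
m <- sum(yy * p)
v <- sum((yy - m)^2 * p)
c(mean = m, variance = v, formula = mu * (1 + mu / k))

# For very large k the density is close to the Poisson density with mean mu.
gap <- max(abs(dnbinom(y, size = 1e8, mu = mu) - dpois(y, mu)))
```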

     The negative binomial distribution can be coerced into the
     classical GLM framework, with one of the parameters being of
     interest and the other treated as a nuisance/scale parameter (this
     approach is implemented in the MASS package). This 'VGAM' family
     function 'negbinomial' treats both parameters on the same footing,
     and estimates them both by full maximum likelihood estimation.

     The parameters mu and k are orthogonal (the expected information
     matrix is diagonal), and the confidence region for k is extremely
     skewed so that its standard error is often of no practical use.
     The parameter 1/k has been used as a measure of aggregation.

     This 'VGAM' function handles _multivariate_ responses, so that a
     matrix can be used as the response. The number of columns is the
     number of species, say, and setting 'zero=-2' means that _all_
     species have a k equalling a (different) intercept only.

_V_a_l_u_e:

     An object of class '"vglmff"' (see 'vglmff-class'). The object is
     used by modelling functions such as 'vglm' and 'vgam'.

_W_a_r_n_i_n_g:

     The Poisson model corresponds to k equalling infinity. If the data
     are Poisson or close to Poisson, numerical problems will occur.
     Choosing a log-log link may help in such cases; otherwise use
     'poissonff'.

     This function is fragile; maximum likelihood estimation of the
     index parameter is fraught with difficulty (see Lawless, 1987). In
     general, 'quasipoissonff' is more robust than this function. Assigning
     values to the 'ik' argument may lead to a local solution, and
     smaller values are preferred over large values when using this
     argument.

     Yet to do: write a family function which uses the method of
     moments estimator for k.
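
     For reference, a moment estimator of k follows from the variance
     formula Var(Y)=mu*(1+mu/k) given in Details: solving for k gives
     k = mu^2/(Var(Y) - mu). The sketch below applies this to the apple
     tree data of Example 1 using base R only. It is not a 'VGAM'
     family function, and the use of the n-1 denominator for the sample
     variance is an assumption made here.

```r
y <- 0:7                             # mite counts (Example 1)
w <- c(70, 38, 17, 10, 9, 3, 2, 1)   # number of leaves with each count
n <- sum(w)

# Weighted sample mean and variance of the counts.
ybar <- sum(w * y) / n
s2 <- sum(w * (y - ybar)^2) / (n - 1)

# Moment estimator; only defined when the data are overdispersed (s2 > ybar).
k_mom <- if (s2 > ybar) ybar^2 / (s2 - ybar) else NA
c(mean = ybar, variance = s2, k = k_mom)
```

     Such a value could also be tried as a starting value via 'ik' if
     the default initialization fails to converge.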

_N_o_t_e:

     This function can be used by the fast algorithm in 'cqo'; however,
     setting 'EqualTolerances=TRUE' and 'ITolerances=FALSE' is
     recommended.

     In the first example below (Bliss and Fisher, 1953), from each of
     6 McIntosh apple trees in an orchard that had been sprayed, 25
     leaves were randomly selected. On each of the leaves, the number
     of adult female European red mites was counted.

_A_u_t_h_o_r(_s):

     Thomas W. Yee

_R_e_f_e_r_e_n_c_e_s:

     Lawless, J. F. (1987) Negative binomial and mixed Poisson
     regression. _The Canadian Journal of Statistics_ *15*, 209-225.

     Bliss, C. and Fisher, R. A. (1953) Fitting the negative binomial
     distribution to biological data. _Biometrics_ *9*, 174-200.

_S_e_e _A_l_s_o:

     'quasipoissonff', 'poissonff', 'cao', 'cqo', 'posnegbinomial',
     'rnbinom', 'nbolf'.

_E_x_a_m_p_l_e_s:

     y = 0:7  # Example 1: apple tree data
     w = c(70, 38, 17, 10, 9, 3, 2, 1)
     fit = vglm(y ~ 1, negbinomial, weights=w)
     summary(fit)
     coef(fit, matrix=TRUE)
     Coef(fit)

     ## Not run: 
     n = 500  # Example 2: simulated data
     x = runif(n)
     y1 = rnbinom(n, mu=exp(3+x), size=exp(1)) # k is size
     y2 = rnbinom(n, mu=exp(2-x), size=exp(0))
     fit = vglm(cbind(y1,y2) ~ x, negbinomial, trace=TRUE) # multivariate response
     coef(fit, matrix=TRUE)
     ## End(Not run)

