dexpbinomial              package:VGAM              R Documentation

_D_o_u_b_l_e _E_x_p_o_n_e_n_t_i_a_l _B_i_n_o_m_i_a_l _D_i_s_t_r_i_b_u_t_i_o_n _F_a_m_i_l_y _F_u_n_c_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a double exponential binomial distribution by maximum
     likelihood estimation. The two parameters here are the mean and
     dispersion parameter.

_U_s_a_g_e:

     dexpbinomial(lmean="logit", ldispersion="logit",
                  emean=list(),  edispersion=list(),
                  idispersion=0.25, zero=2)

_A_r_g_u_m_e_n_t_s:

lmean, ldispersion: Link functions applied to the two parameters,
          called mu and theta respectively below. See 'Links' for more
          choices. The defaults cause the parameters to be restricted
          to (0,1). 

emean, edispersion: List. Extra argument for each of the links. See
          'earg' in 'Links' for general information.

idispersion: Initial value for the dispersion parameter. If given, it
          must be in range, and is recyled to the necessary length. Use
          this argument if convergence failure occurs.

    zero: An integer specifying which linear/additive predictor is to
          be modelled as an intercept only. If assigned, the single
          value should be either '1' or '2'. The default is to have a
          single dispersion parameter value. To model both parameters
          as functions of the covariates assign 'zero=NULL'.

_D_e_t_a_i_l_s:

     This distribution provides a way for handling overdispersion in a
     binary response. The double exponential binomial distribution
     belongs the family of double exponential distributions proposed by
     Efron (1986). Below, equation numbers refer to that original
     article. Briefly, the idea is that an ordinary one-parameter
     exponential family allows the addition of a second parameter theta
     which varies the dispersion of the family without changing the
     mean. The extended family behaves like the original family with
     sample size changed from n to n*theta. The extended family is an
     exponential family in mu when n and theta are fixed, and an
     exponential family in theta when n and mu are fixed. Having 0 <
     theta < 1 corresponds to overdispersion with respect to the
     binomial distribution. See Efron (1986) for full details.

     This 'VGAM' family function implements an _approximation_ (2.10)
     to the exact density (2.4). It replaces the normalizing constant
     by unity since the true value nearly equals 1. The default model
     fitted is eta1 =logit(mu) and eta2 = logit(theta). This restricts
     both parameters to lie between 0 and 1, although the dispersion
     parameter can be modelled over a larger parameter space by
     assigning the arguments 'ldispersion' and 'edispersion'.

     Approximately, the mean (of Y) is mu. The _effective sample size_
     is the dispersion parameter multiplied by the original sample
     size, i.e., n*theta. This family function uses Fisher scoring, and
     the two estimates are asymptotically independent because the
     expected information matrix is diagonal.

_V_a_l_u_e:

     An object of class '"vglmff"' (see 'vglmff-class'). The object is
     used by modelling functions such as 'vglm'.

_W_a_r_n_i_n_g:

     Numerical difficulties can occur; if so, try using 'idispersion'.

_N_o_t_e:

     This function processes the input in the same way as 'binomialff',
     however multivariate responses are not allowed
     ('binomialff(mv=FALSE)').

_A_u_t_h_o_r(_s):

     T. W. Yee

_R_e_f_e_r_e_n_c_e_s:

     Efron, B. (1986) Double exponential families and their use in
     generalized linear regression. _Journal of the American
     Statistical Association_, *81*, 709-721.

_S_e_e _A_l_s_o:

     'binomialff', 'toxop'.

_E_x_a_m_p_l_e_s:

     # This example mimics the example in Efron (1986). The results here
     # differ slightly.
     data(toxop)

     # Scale the variables
     toxop = transform(toxop,
                       phat = positive / ssize,
                       srainfall = scale(rainfall),  # (6.1)
                       sN = scale(ssize))            # (6.2)

     # A fit similar (should be identical) to Section 6 of Efron (1986).
     # But does not use poly(), and M=1.25 here, as in (5.3)
     cmlist = list("(Intercept)"=diag(2),
                   "I(srainfall)"=rbind(1,0),
                   "I(srainfall^2)"=rbind(1,0),
                   "I(srainfall^3)"=rbind(1,0),
                   "I(sN)"=rbind(0,1),
                   "I(sN^2)"=rbind(0,1))
     elist = list(min=0, max=1.25)
     fit = vglm(phat ~ I(srainfall) + I(srainfall^2) + I(srainfall^3) +
                       I(sN) + I(sN^2),
                fam = dexpbinomial(ldisp="elogit", idisp=0.2,
                                   edisp=elist, zero=NULL),
                data=toxop, weight=ssize, trace=TRUE, constraints=cmlist)

     # Now look at the results
     coef(fit)
     coef(fit, matrix=TRUE)
     fitted(fit)[1:4,]
     summary(fit)
     vcov(fit)
     sqrt(diag(vcov(fit)))   # Standard errors

     # Effective sample size (not quite the last column of Table 1)
     predict(fit)[1:4,]
     Dispersion = elogit(predict(fit)[,2], earg=elist, inverse=TRUE)
     c(round(weights(fit, type="prior") * Dispersion, dig=1))

     # Ordinary logistic regression (gives same results as (6.5))
     ofit = vglm(phat ~ I(srainfall) + I(srainfall^2) + I(srainfall^3),
                 fam = binomialff, data=toxop, weight=ssize, trace=TRUE)

     # Same as fit but it uses poly(), and can be plotted (cf. Figure 1)
     cmlist2 = list("(Intercept)"=diag(2),
                    "poly(srainfall, 3)"=rbind(1,0),
                    "poly(sN, 2)"=rbind(0,1))
     fit2 = vglm(phat ~ poly(srainfall, 3) + poly(sN, 2),
                 fam = dexpbinomial(ldisp="elogit", idisp=0.2,
                                    edisp=list(min=0, max=1.25), zero=NULL),
                 data=toxop, weight=ssize, trace=TRUE, constraints=cmlist2)
     ## Not run: 
     par(mfrow=c(1,2))
     plotvgam(fit2, se=TRUE, lcol="blue", scol="red")  # Cf. Figure 1

     # Cf. Figure 1(a)
     par(mfrow=c(1,2))
     o = with(toxop, sort.list(rainfall))
     with(toxop, plot(rainfall[o], fitted(fit2)[o], type="l", col="blue",
                      las=1, ylim=c(0.3, 0.65)))
     with(toxop, points(rainfall[o], fitted(ofit)[o], col="red", type="b",
                        pch=19))

     # Cf. Figure 1(b)
     o = with(toxop, sort.list(ssize))
     with(toxop, plot(ssize[o], Dispersion[o], type="l", col="blue", las=1,
                      xlim=c(0, 100)))
     ## End(Not run)

