dirichlet                package:VGAM                R Documentation

_F_i_t_t_i_n_g _a _D_i_r_i_c_h_l_e_t _D_i_s_t_r_i_b_u_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a Dirichlet distribution to a matrix of compositions.

_U_s_a_g_e:

     dirichlet(link = "loge", earg=list(), zero=NULL)

_A_r_g_u_m_e_n_t_s:

     In the following, the response is assumed to be an M-column matrix
     of positive values whose rows each sum to unity. Such data can be
     thought of as compositional data. There are M linear/additive
     predictors eta_j.

    link: Link function applied to each of the M (positive) shape
          parameters alpha_j. See 'Links' for more choices. The default
          gives eta_j=log(alpha_j).

    earg: List. Extra argument for the link. See 'earg' in 'Links' for
          general information.

    zero: An integer-valued vector specifying which linear/additive
          predictors are modelled as intercepts only. The default is
          none of them. If used, choose values from the set
          {1,2,...,M}.
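
     As a hedged illustration of the 'zero' argument, the sketch below
     fits a model in which only eta_1 may depend on a covariate, while
     eta_2 and eta_3 are intercept-only. The covariate 'x2', the sample
     size, and the seed are assumptions made up for this example; the
     simulated response does not actually depend on 'x2'.

```r
library(VGAM)

set.seed(1)
n  <- 500
x2 <- runif(n)                           # hypothetical covariate
y  <- rdiric(n = n, shape = c(3, 1, 4))  # simulated compositions, M = 3

# eta_1 is modelled with x2; eta_2 and eta_3 are intercept-only
fit <- vglm(y ~ x2, dirichlet(zero = 2:3))

# One column per linear predictor; the x2 row should be zero
# in columns 2 and 3
coef(fit, matrix = TRUE)
```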

_D_e_t_a_i_l_s:

     The Dirichlet distribution is commonly used to model compositional
     data, including applications in genetics. Suppose (Y_1,...,Y_M)^T
     is the response. Then it has a Dirichlet distribution if
     (Y_1,...,Y_{M-1})^T has density

 (Gamma(alpha_+) / prod_{j=1}^M Gamma(alpha_j)) prod_{j=1}^M y_j^(alpha_j - 1)

     where alpha_+= alpha_1 + ... + alpha_M, alpha_j > 0, and the
     density is defined on the unit simplex

 Delta_M = { (y_1,...,y_M)^T : y_1 > 0, ..., y_M > 0, sum_{j=1}^M y_j = 1 }.

     One has E(Y_j) = alpha_j / alpha_{+}, which are returned as the
     fitted values. For this distribution Fisher scoring corresponds to
     Newton-Raphson.
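
     The density and mean above can be sketched in base R as follows.
     The helper 'ddirichlet_sketch' is hypothetical (it is not a VGAM
     function) and evaluates the density at a single composition; it
     works on the log scale with 'lgamma' for numerical stability.

```r
# Hypothetical helper, not part of VGAM: Dirichlet density at one
# composition y (positive, summing to 1) with shape vector alpha.
ddirichlet_sketch <- function(y, alpha, log_scale = FALSE) {
  stopifnot(length(y) == length(alpha),
            all(y > 0), all(alpha > 0),
            isTRUE(all.equal(sum(y), 1)))
  logdens <- lgamma(sum(alpha)) - sum(lgamma(alpha)) +
    sum((alpha - 1) * log(y))
  if (log_scale) logdens else exp(logdens)
}

alpha <- c(3, 1, 4)
ddirichlet_sketch(c(0.3, 0.2, 0.5), alpha)  # density at one point

# E(Y_j) = alpha_j / alpha_+
alpha / sum(alpha)                          # 0.375 0.125 0.500
```

     For M = 2 the density reduces to that of a beta distribution, so
     'ddirichlet_sketch(c(0.3, 0.7), c(2, 5))' can be compared with
     'dbeta(0.3, 2, 5)' as a sanity check.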

     The Dirichlet distribution can be motivated by considering
     independent random variables G_1,...,G_M, where G_j has a gamma
     distribution with shape alpha_j and unit scale, i.e., density
     f(g_j) = g_j^(alpha_j - 1) e^(-g_j) / Gamma(alpha_j). Then the
     Dirichlet distribution arises when Y_j = G_j / (G_1 + ... + G_M).
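
     The gamma construction can be checked by simulation in base R. In
     this minimal sketch the shape values, sample size, and seed are
     arbitrary assumptions; the column means of the normalised gamma
     variates should be close to alpha_j / alpha_+.

```r
set.seed(123)
alpha <- c(3, 1, 4)   # assumed shape parameters
n <- 10000
M <- length(alpha)

# Column j holds n independent gamma(shape = alpha_j, rate = 1) draws
G <- matrix(rgamma(n * M, shape = rep(alpha, each = n), rate = 1),
            nrow = n, ncol = M)

# Normalise each row so that it sums to one
Y <- G / rowSums(G)

range(rowSums(Y))     # all rows sum to 1 up to rounding error
rbind(simulated   = colMeans(Y),
      theoretical = alpha / sum(alpha))
```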

_V_a_l_u_e:

     An object of class '"vglmff"' (see 'vglmff-class'). The object is
     used by modelling functions such as 'vglm', 'rrvglm' and 'vgam'.

     When fitted, the 'fitted.values' slot of the object contains the
     M-column matrix of means.

_N_o_t_e:

     The response should be a matrix of positive values whose rows each
     sum to unity. For the related case of count data, a multinomial
     logit model ('multinomial') may be more appropriate. Another
     distribution similar to the Dirichlet is the
     Dirichlet-multinomial (see 'dirmultinomial').

_A_u_t_h_o_r(_s):

     Thomas W. Yee

_R_e_f_e_r_e_n_c_e_s:

     Lange, K. (2002) _Mathematical and Statistical Methods for Genetic
     Analysis_, 2nd ed. New York: Springer-Verlag.

     Evans, M., Hastings, N. and Peacock, B. (2000) _Statistical
     Distributions_, 3rd ed. New York: Wiley-Interscience.

_S_e_e _A_l_s_o:

     'rdiric', 'dirmultinomial', 'multinomial'.

_E_x_a_m_p_l_e_s:

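     library(VGAM)
     set.seed(1)  # make the simulated data reproducible
     # Simulate 1000 compositions from a Dirichlet with shapes (3, 1, 4)
     y <- rdiric(n = 1000, shape = c(3, 1, 4))
     # Intercept-only fit, printing the iteration history
     fit <- vglm(y ~ 1, dirichlet, trace = TRUE, crit = "c")
     Coef(fit)                 # estimated shape parameters alpha_j
     coef(fit, matrix = TRUE)  # coefficients on the eta_j scale
     fitted(fit)[1:2, ]        # fitted means alpha_j / alpha_+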

