covOGK              package:robustbase              R Documentation

_O_r_t_h_o_g_o_n_a_l_i_z_e_d _G_n_a_n_a_d_e_s_i_k_a_n-_K_e_t_t_e_n_r_i_n_g (_O_G_K) _C_o_v_a_r_i_a_n_c_e _M_a_t_r_i_x _E_s_t_i_m_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Computes the orthogonalized pairwise covariance matrix estimate
     described in Maronna and Zamar (2002).  The pairwise proposal
     goes back to Gnanadesikan and Kettenring (1972).

_U_s_a_g_e:

     covOGK(X, n.iter = 2, sigmamu, rcov = covGK, weight.fn = hard.rejection,
            keep.data = FALSE, ...)

     covGK (x, y, scalefn = scaleTau2, ...)
     s_mad(x, mu.too = FALSE, na.rm = FALSE)
     s_IQR(x, mu.too = FALSE, na.rm = FALSE)

_A_r_g_u_m_e_n_t_s:

       X: data in something that can be coerced into a numeric matrix.

  n.iter: number of orthogonalization iterations.  Usually 1 or 2;
          values greater than 2 are unlikely to have any significant
          effect on the estimate (other than increasing the computing
          time).

sigmamu, scalefn: a function that computes univariate robust location
          and scale estimates.  By default it should return a single
          numeric value containing the robust scale (standard
          deviation) estimate.  When called with 'mu.too = TRUE',
          'sigmamu()' should return a numeric vector of length 2
          containing the robust location and scale estimates.  See
          'scaleTau2', 's_Qn', 's_Sn', 's_mad' or 's_IQR' for
          functions that can be used as the 'sigmamu' argument.

    rcov: function that computes a robust covariance estimate between
          two vectors.  The default, Gnanadesikan-Kettenring's 'covGK',
          is simply (s^2(X+Y) - s^2(X-Y))/4 where s() is the scale
          estimate 'sigmamu()'.

weight.fn: a function of the robust distances and the number of
          variables p to compute the weights used in the reweighting
          step.

keep.data: logical indicating if the (untransformed) data matrix 'X'
          should be kept as part of the result.

     ...: additional arguments; for 'covOGK' to be passed to
          'sigmamu()' and 'weight.fn()'; for 'covGK' passed to
          'scalefn'.

     x,y: numeric vectors of the same length, the covariance of which
          is sought in 'covGK' (or the scale, in 's_mad' or 's_IQR').

  mu.too: logical indicating if both location and scale should be
          returned or just the scale (when 'mu.too=FALSE' as by
          default).

   na.rm: if 'TRUE' then 'NA' values are stripped from 'x' before
          computation takes place.
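
     The role of 'n.iter', 'sigmamu' and 'rcov' can be illustrated by a
     single orthogonalization pass, following the construction in
     Maronna and Zamar (2002).  This is a simplified sketch, not the
     code used by 'covOGK()': the name 'ogk_pass' is hypothetical, the
     location estimate and the reweighting step are left out, and
     'mad' from package 'stats' stands in for the scale function.

```r
## Hypothetical helper: one orthogonalization pass (sketch only).
## 'scalefn' must return a single robust scale value for a numeric vector.
ogk_pass <- function(X, scalefn = stats::mad) {
  X <- as.matrix(X)
  p <- ncol(X)
  stopifnot(p >= 2)

  d <- apply(X, 2, scalefn)        # robust scale of every column
  Y <- sweep(X, 2, d, "/")         # columns rescaled to unit robust scale

  ## pairwise Gnanadesikan-Kettenring "correlations" of the scaled columns
  U <- diag(p)
  for (i in seq_len(p)[-1]) {
    for (j in seq_len(i - 1)) {
      U[i, j] <- U[j, i] <-
        (scalefn(Y[, i] + Y[, j])^2 - scalefn(Y[, i] - Y[, j])^2) / 4
    }
  }

  E <- eigen(U, symmetric = TRUE)$vectors  # orthogonal directions
  Z <- Y %*% E                             # data projected on them
  g <- apply(Z, 2, scalefn)                # robust scales of the projections

  A <- d * E                               # same as diag(d) %*% E
  A %*% diag(g^2, nrow = p) %*% t(A)       # back-transformed covariance
}
```

     With 'scalefn = sd' this pass reproduces the classical 'cov(X)',
     which gives a quick check of the algebra.  Repeating the pass on
     the projected data corresponds to a larger 'n.iter'.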

_D_e_t_a_i_l_s:

     Typical default values for the _function_ arguments 'sigmamu',
     'rcov', and 'weight.fn' are available as well (see the _Examples_
     below), *but* their names and calling sequences are still subject
     to discussion and may change in the future.
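
     As an illustration of the calling sequence expected of a
     'weight.fn', the sketch below defines a hypothetical 0/1 weight
     function, 'chisq_weights'.  It assumes that the distances are on
     the squared Mahalanobis scale, so that a chi-squared quantile is a
     sensible cutoff; the argument 'prob' is an assumption of this
     sketch, and the code does not describe what 'hard.rejection' does.

```r
## Hypothetical weight function (sketch only): weight 1 for observations
## whose distance is at most the chi-squared cutoff, weight 0 otherwise.
## '...' absorbs extra arguments that are meant for other functions.
chisq_weights <- function(distances, p, prob = 0.975, ...) {
  cutoff <- stats::qchisq(prob, df = p)
  as.numeric(distances <= cutoff)
}

chisq_weights(c(0.5, 2, 50), p = 3)
## [1] 1 1 0
```

     Because '...' is passed on to both 'sigmamu()' and 'weight.fn()',
     a user-supplied function should accept and ignore arguments it
     does not know.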

_V_a_l_u_e:

     'covOGK()' currently returns a list with components 

  center: robust location: numeric vector of length p.

     cov: robust covariance matrix estimate: p x p matrix.

wcenter, wcov: re-*w*eighted versions of 'center' and 'cov'.

 weights: the robustness weights used.

distances: the Mahalanobis distances computed using 'center' and 'cov'.

     ......
     *but note that this might change radically, to return an S4
     classed object!*
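
     The components 'center' and 'cov' can be used like any other
     location and scatter pair.  The sketch below shows this with base
     R only; the list 'fit' is a stand-in built from classical
     estimates so that the code runs without the package, and is not
     output of 'covOGK()'.

```r
## Stand-in for a fitted object: a list with 'center' and 'cov'
## (classical estimates here, purely for illustration).
set.seed(42)
X   <- matrix(rnorm(200), ncol = 2)
fit <- list(center = colMeans(X), cov = cov(X))

## Squared Mahalanobis distances of every row from the center
d2 <- stats::mahalanobis(X, center = fit$center, cov = fit$cov)

## Correlation matrix implied by the covariance estimate
R <- stats::cov2cor(fit$cov)
```

     Note that 'mahalanobis()' returns squared distances; whether that
     matches the scale of the 'distances' component is not stated on
     this page and should be checked before comparing the two.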

     'covGK()' is a trivial 1-line function returning the covariance
     estimate

                 c^(x,y) = [s^(x+y)^2 - s^(x-y)^2]/4,

     where s^(u) is the scale estimate of u specified by 'scalefn'.
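
     The formula can be written out directly.  The function below is a
     sketch with the hypothetical name 'gk_cov'; it uses 'mad' from
     package 'stats' as a stand-in scale estimate and is not the
     package's own 'covGK()'.

```r
## Hypothetical helper (sketch only): Gnanadesikan-Kettenring covariance,
## [s(x + y)^2 - s(x - y)^2] / 4, for an arbitrary scale function s().
gk_cov <- function(x, y, scalefn = stats::mad, ...) {
  (scalefn(x + y, ...)^2 - scalefn(x - y, ...)^2) / 4
}

set.seed(7)
x <- rnorm(50)
y <- x + rnorm(50)
gk_cov(x, y)                 # robust version, based on the MAD
gk_cov(x, y, scalefn = sd)   # identical to the classical cov(x, y)
```

     With 'scalefn = sd' the identity var(x+y) - var(x-y) = 4 cov(x,y)
     makes the result equal to 'cov(x, y)'; replacing 'sd' by a robust
     scale yields a robust analogue.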

     's_mad()' and 's_IQR()' return the scale estimates 'mad' or 'IQR',
     respectively.  When 'mu.too = TRUE', the 's_*' functions return a
     length-2 vector (mu, sig) instead; see also 'scaleTau2'.
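
     A user-written scale function following the same convention might
     look as follows.  The name 's_medmad' is hypothetical, and the
     code is a sketch of the 'mu.too' contract described above, not the
     definition of 's_mad()'.

```r
## Hypothetical scale function (sketch only): median and MAD.
## Returns the scale alone, or c(location, scale) when mu.too = TRUE.
s_medmad <- function(x, mu.too = FALSE, na.rm = FALSE, ...) {
  if (na.rm) x <- x[!is.na(x)]
  mu  <- stats::median(x)
  sig <- stats::mad(x, center = mu)
  if (mu.too) c(mu, sig) else sig
}

s_medmad(c(1, 2, 3, 4, 100))
## [1] 1.4826
s_medmad(c(1, 2, 3, 4, 100), mu.too = TRUE)
## [1] 3.0000 1.4826
```

     A function of this shape could then be supplied as
     'covOGK(X, sigmamu = s_medmad)'.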

_A_u_t_h_o_r(_s):

     Kjell Konis konis@stats.ox.ac.uk, with modifications by Martin
     Maechler.

_R_e_f_e_r_e_n_c_e_s:

     Maronna, R.A. and Zamar, R.H. (2002) Robust estimates of location
     and dispersion of high-dimensional datasets; _Technometrics_
     *44*(4), 307-317.

     Gnanadesikan, R. and Kettenring, J.R. (1972) Robust estimates,
     residuals, and outlier detection with multiresponse data;
     _Biometrics_ *28*, 81-124.

_S_e_e _A_l_s_o:

     'scaleTau2', 'covMcd', 'cov.rob' (from package 'MASS').

_E_x_a_m_p_l_e_s:

     data(hbk)
     ## use the three explanatory variables only
     hbk.x <- data.matrix(hbk[, 1:3])

     ## OGK estimates based on different univariate scale estimators
     cO1 <- covOGK(hbk.x, sigmamu = scaleTau2)
     cO2 <- covOGK(hbk.x, sigmamu = s_Qn)
     cO3 <- covOGK(hbk.x, sigmamu = s_Sn)
     cO4 <- covOGK(hbk.x, sigmamu = s_mad)
     cO5 <- covOGK(hbk.x, sigmamu = s_IQR)

     data(toxicity)
     cO1tox <- covOGK(toxicity, sigmamu = scaleTau2)
     cO2tox <- covOGK(toxicity, sigmamu = s_Qn)

     ## nice formatting of correlation matrices:
     as.dist(round(cov2cor(cO1tox$cov), 2))
     as.dist(round(cov2cor(cO2tox$cov), 2))

     ## "graphical"
     symnum(cov2cor(cO1tox$cov))
     symnum(cov2cor(cO2tox$cov), legend=FALSE)

