biVar                 package:Hmisc                 R Documentation

_B_i_v_a_r_i_a_t_e _S_u_m_m_a_r_i_e_s _C_o_m_p_u_t_e_d _S_e_p_a_r_a_t_e_l_y _b_y _a _S_e_r_i_e_s _o_f _P_r_e_d_i_c_t_o_r_s

_D_e_s_c_r_i_p_t_i_o_n:

     'biVar' is a generic function that accepts a formula and the usual
     'data', 'subset', and 'na.action' parameters plus a list
     'statinfo' that specifies a function of two variables to compute
     along with information about labeling results for printing and
     plotting.  The function is called separately with each right hand
     side variable and the same left hand variable.  The result is a
     matrix of bivariate statistics and the 'statinfo' list that drives
     printing and plotting.  The plot method draws a dot plot with
     x-axis values by default sorted in order of one of the statistics
     computed by the function.
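
     The looping described above can be sketched in base R.  This is
     an illustration only, not the Hmisc implementation: the helper
     name 'bivar_sketch' and the layout of its result are assumptions
     made for this example.

```r
# Hypothetical helper (not part of Hmisc): apply a two-variable statistic
# to each right-hand-side variable paired with the single left-hand-side
# variable, deleting NAs pairwise.
bivar_sketch <- function(formula, data, fun) {
  vars   <- all.vars(formula)
  yname  <- vars[1]                # left-hand-side variable
  xnames <- vars[-1]               # right-hand-side variables
  y <- data[[yname]]
  rows <- lapply(xnames, function(nm) {
    x  <- data[[nm]]
    ok <- !is.na(x) & !is.na(y)    # keep pairs complete for this predictor
    c(n = sum(ok), fun(x[ok], y[ok]))
  })
  out <- do.call(rbind, rows)      # one row per predictor
  rownames(out) <- xnames
  out
}

d <- data.frame(z = c(1, 2, 3, 4, NA),
                x = c(-2, -1, 0, 1, 2),
                v = c(1, 2, NA, 4, 5))
res <- bivar_sketch(z ~ x + v, d,
                    function(x, y) c(rho2 = cor(x, y, method = "spearman")^2))
res
```

     Each predictor gets its own count 'n' because missing values are
     removed one pair at a time rather than listwise.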

     'spearman2' computes the square of Spearman's rho rank correlation
     and a generalization of it in which 'x' can relate
     non-monotonically to 'y'.  This is done by computing the Spearman
     multiple rho-squared between '(rank(x), rank(x)^2)' and 'y'. When
     'x' is categorical, a different kind of Spearman correlation used
     in the Kruskal-Wallis test is computed (and 'spearman2' can do the
     Kruskal-Wallis test).  This is done by computing the ordinary
     multiple 'R^2' between 'k-1' dummy variables and 'rank(y)', where
     'x' has 'k' categories.  'x' can also be a formula, in which case
     each predictor is correlated separately with 'y', using
     non-missing observations for that predictor. 'biVar' is used to do
     the looping and bookkeeping.  By default the plot shows the
     adjusted 'rho^2', using the same formula used for the ordinary
     adjusted 'R^2'.  The 'F' test uses the unadjusted 'R^2'.
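
     The quantities described above can be reproduced approximately
     with base R alone.  The sketch below is not the code used by
     'spearman2'; it assumes the rank regressions can be fitted with
     'lm', and the helper 'adjust_r2' is a name introduced here for
     illustration.

```r
x <- c(-2, -1, 0, 1, 2)
y <- c(4, 1, 0, 1, 4)          # y = x^2: a non-monotonic relationship

rx <- rank(x)                  # rank() assigns midranks to ties by default
ry <- rank(y)

# Ordinary Spearman rho^2 (p = 1)
rho2_linear <- cor(rx, ry)^2

# Quadratic rank generalization (p = 2): multiple R^2 from regressing
# rank(y) on rank(x) and rank(x)^2
rho2_quad <- summary(lm(ry ~ rx + I(rx^2)))$r.squared

# Categorical predictor: ordinary R^2 between k - 1 dummy variables
# (created by lm for the factor g) and rank(w)
g <- factor(c("a", "a", "b", "b", "c", "c"))
w <- c(1, 2, 3, 4, 5, 6)
r2_cat <- summary(lm(rank(w) ~ g))$r.squared

# Hypothetical helper: the usual adjusted R^2 formula with p predictors
adjust_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

c(linear        = rho2_linear,
  quadratic     = rho2_quad,
  quadratic_adj = adjust_r2(rho2_quad, length(x), 2),
  categorical   = r2_cat)
```

     For these data the ordinary 'rho^2' is essentially zero while the
     quadratic version is close to one, which is the situation the
     generalization is meant to detect.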

     'spearman' computes Spearman's rho on non-missing values of two
     variables.  'spearman.test' is a simple version of
     'spearman2.default'.

     'chiSquare' is set up like 'spearman2' except it is intended for a
     categorical response variable.  Separate Pearson chi-square tests
     are done for each predictor, with optional collapsing of
     infrequent categories.  Numeric predictors having more than 'g'
     levels are categorized into 'g' quantile groups.  'chiSquare' uses
     'biVar'.
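
     The grouping step for a numeric predictor can be sketched with
     base R.  This is an assumption-laden stand-in: it uses 'cut' with
     'quantile' breaks instead of 'cut2', simulated data, and a direct
     call to 'chisq.test', and it does not reproduce the output format
     of 'chiSquare'.

```r
set.seed(1)
n <- 200
x <- rnorm(n)                                        # numeric predictor
y <- factor(ifelse(x + rnorm(n) > 0, "yes", "no"))   # categorical response

g <- 4                                               # number of quantile groups
breaks <- quantile(x, probs = seq(0, 1, length.out = g + 1))
xg <- cut(x, breaks = breaks, include.lowest = TRUE) # quantile-group factor

tab  <- table(xg, y)                                 # g x 2 contingency table
test <- chisq.test(tab)                              # Pearson chi-square test
c(chisq = unname(test$statistic),
  df    = unname(test$parameter),
  P     = test$p.value)
```

     With 'g' groups and a two-level response the test has 'g - 1'
     degrees of freedom.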

_U_s_a_g_e:

     biVar(formula, statinfo, data=NULL, subset=NULL,
           na.action=na.retain, exclude.imputed=TRUE, ...)

     ## S3 method for class 'biVar':
     print(x, ...)

     ## S3 method for class 'biVar':
     plot(x, what=info$defaultwhat,
                            sort.=TRUE,
                            main, xlab, ...)

     spearman2(x, ...)

     ## Default S3 method:
     spearman2(x, y, p=1, minlev=0, na.rm=TRUE, exclude.imputed=na.rm, ...)

     ## S3 method for class 'formula':
     spearman2(formula, data=NULL,
               subset, na.action=na.retain, exclude.imputed=TRUE, ...)

     spearman(x, y)

     spearman.test(x, y, p=1)

     chiSquare(formula, data=NULL, subset=NULL, na.action=na.retain,
               exclude.imputed=TRUE, ...)

_A_r_g_u_m_e_n_t_s:

 formula: a formula with a single left side variable

statinfo: see 'spearman2.formula' or 'chiSquare' code

data, subset, na.action: the usual options for models.  Default for
          'na.action' is to retain all values, NA or not, so that NAs
          can be deleted in only a pairwise fashion. 

exclude.imputed: set to 'FALSE' to include imputed values (created by
          'impute') in the calculations. 

     ...: other arguments that are passed to the function used to
          compute the bivariate statistics or to 'dotchart2' for
          'plot'. 

   na.rm: logical; delete NA values?

       x: a numeric matrix with at least 5 rows and at least 2 columns
          (if 'y' is absent).  For 'spearman2', the first argument may
          be a vector of any type, including character or factor, or a
          formula, in which case 'spearman2.formula' is invoked and
          each predictor on the right hand side is correlated
          separately with the response variable.  For 'print' or
          'plot', 'x' is an object produced by 'biVar'.  For 'spearman'
          and 'spearman.test', 'x' is a numeric vector, as is 'y'.  For
          'chiSquare', 'x' is a formula.

       y: a numeric vector 

       p: for numeric variables, specifies the order of the Spearman
          'rho^2' to use.  The default is 'p=1' to compute the ordinary
          'rho^2'.  Use 'p=2' to compute the quadratic rank
          generalization to allow non-monotonicity.  'p' is ignored for
          categorical predictors. 

  minlev: minimum relative frequency that a level of a categorical
          predictor must have to avoid being pooled with other
          categories (see 'combine.levels') in 'spearman2' and
          'chiSquare' (in which case it also applies to the response).
          The default, 'minlev=0', causes no pooling.

    what: specifies which statistic to plot.  Possibilities include the
          column names that appear when the print method is used.

   sort.: set 'sort.=FALSE' to suppress sorting variables by the
          statistic being plotted 

    main: main title for plot.  Default title shows the name of the
          response variable. 

    xlab: x-axis label.  Default constructed from 'what'. 
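
     The pooling controlled by 'minlev' can be illustrated with a small
     base R sketch.  The helper 'pool_levels' and the label "OTHER" are
     assumptions made for this example; see 'combine.levels' for the
     function Hmisc actually provides.

```r
# Hypothetical helper (not part of Hmisc): merge every level whose relative
# frequency is below minlev into a single pooled level.
pool_levels <- function(f, minlev = 0.05, other = "OTHER") {
  f    <- as.factor(f)
  freq <- table(f) / length(f)            # relative frequency of each level
  rare <- names(freq)[freq < minlev]      # levels that are too infrequent
  if (length(rare) > 0) {
    levels(f)[levels(f) %in% rare] <- other
  }
  f
}

f <- c(rep("a", 10), rep("b", 8), "c", "d")
pooled <- pool_levels(f, minlev = 0.1)
table(pooled)
```

     Setting 'minlev = 0' in this sketch leaves every level unchanged,
     matching the documented default of no pooling.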

_D_e_t_a_i_l_s:

     Uses midranks in case of ties, as described by Hollander and
     Wolfe. P-values for Spearman, Wilcoxon, or Kruskal-Wallis tests
     are approximated by using the 't' or 'F' distributions.
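
     A minimal base R sketch of midranks and the two approximations
     follows.  The formulas are the standard textbook forms and are
     assumed, not confirmed, to match what the package computes; the
     data are made up for illustration.

```r
x <- c(1.2, 3.4, 2.2, 5.1, 4.8, 2.2, 6.0, 3.9)   # contains one tie (2.2)
y <- c(2.0, 3.1, 2.5, 4.9, 5.5, 1.8, 5.2, 4.0)
n <- length(x)

rx <- rank(x, ties.method = "average")   # midranks: ties share the mean rank
ry <- rank(y, ties.method = "average")
rho <- cor(rx, ry)                       # Spearman's rho from the ranks

# t approximation with n - 2 degrees of freedom
tstat <- rho * sqrt((n - 2) / (1 - rho^2))
p_t   <- 2 * pt(-abs(tstat), df = n - 2)

# F approximation for a rank regression with p terms (p = 1 here)
p     <- 1
r2    <- rho^2
fstat <- (r2 / p) / ((1 - r2) / (n - p - 1))
p_f   <- pf(fstat, p, n - p - 1, lower.tail = FALSE)

c(rho = rho, P_t = p_t, P_F = p_f)
```

     With a single term the 'F' statistic is the square of the 't'
     statistic, so the two P-values agree.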

_V_a_l_u_e:

     'spearman2.default' (the function that is called for a single 'x',
     i.e., when there is no formula) returns a vector of statistics for
     the variable. 'biVar', 'spearman2.formula', and 'chiSquare' return
     a matrix with rows corresponding to predictors.

_A_u_t_h_o_r(_s):

     Frank Harrell 
      Department of Biostatistics 
      Vanderbilt University 
      f.harrell@vanderbilt.edu

_R_e_f_e_r_e_n_c_e_s:

     Hollander M. and Wolfe D.A. (1973).  Nonparametric Statistical
     Methods. New York: Wiley.

     Press W.H., Flannery B.P., Teukolsky S.A. and Vetterling W.T.
     (1988).  Numerical Recipes in C.  Cambridge: Cambridge University
     Press.

_S_e_e _A_l_s_o:

     'combine.levels', 'varclus', 'dotchart2', 'impute', 'chisq.test',
     'cut2'.

_E_x_a_m_p_l_e_s:

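     x <- c(-2, -1, 0, 1, 2)
     y <- c(4,   1, 0, 1, 4)
     z <- c(1,   2, 3, 4, NA)
     v <- c(1,   2, 3, 4, 5)

     # Default method: ordinary rho^2 (p=1) between x and y
     spearman2(x, y)
     # Formula method: each predictor of z analyzed separately, with p=2
     plot(spearman2(z ~ x + y + v, p=2))

     # Separate Pearson chi-square tests of z against each predictor
     f <- chiSquare(z ~ x + y + v)
     f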

