matchit               package:MatchIt               R Documentation

_M_a_t_c_h_I_t: _M_a_t_c_h_i_n_g _S_o_f_t_w_a_r_e _f_o_r _C_a_u_s_a_l _I_n_f_e_r_e_n_c_e

_D_e_s_c_r_i_p_t_i_o_n:

     'matchit' is the main command of the package _MatchIt_, which
     enables parametric models for causal inference to work better by
     selecting well-matched subsets of the original treated and control
     groups.  MatchIt implements the suggestions of Ho, Imai, King, and
     Stuart (2004) for improving parametric statistical models by
     preprocessing data with nonparametric matching methods.  MatchIt
     implements a wide range of sophisticated matching methods, making
     it possible to greatly reduce the dependence of causal inferences
     on hard-to-justify, but commonly made, statistical modeling
     assumptions. The software also easily fits into existing research
     practices since, after preprocessing with MatchIt, researchers can
     use whatever parametric model they would have used without
     MatchIt, but produce inferences with substantially more robustness
     and less sensitivity to modeling assumptions.  Matched data sets
     created by MatchIt can be entered easily in Zelig (<URL:
     http://gking.harvard.edu/zelig>) for subsequent parametric
     analyses. Full documentation is available online at <URL:
     http://gking.harvard.edu/matchit>, and help for specific commands
     is available through 'help.matchit'.

_U_s_a_g_e:

     matchit(formula, data, method = "nearest", distance = "logit",
                    distance.options = list(), discard = "none",
                    reestimate = FALSE, ...)
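
     A minimal sketch of a call with this signature, using simulated
     data. The data frame 'dat' and its columns are made up for
     illustration, and the call is skipped when the _MatchIt_ package is
     not installed. Argument defaults may differ between package
     versions, so only 'method' is set explicitly here.

```r
# Simulated data; all variable names here are illustrative assumptions.
set.seed(1)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
# Binary treatment whose probability depends on the covariates.
dat$treat <- rbinom(n, 1, plogis(-1 + 0.5 * dat$x1 + 0.25 * dat$x2))

# Formula with a first-order interaction and a squared term.
f <- treat ~ x1 + x2 + x1:x2 + I(x1^2)

# Only run the matching step if MatchIt is available.
if (requireNamespace("MatchIt", quietly = TRUE)) {
  m.out <- MatchIt::matchit(f, data = dat, method = "nearest")
  print(m.out)
}
```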

_A_r_g_u_m_e_n_t_s:

 formula: This argument takes the usual R formula syntax, 'treat ~ x1
          + x2', where 'treat' is a binary treatment indicator and
          'x1' and 'x2' are the pre-treatment covariates. Both the
          treatment indicator and the pre-treatment covariates must be
          contained in the same data frame, which is specified as
          'data' (see below).  All of the usual R formula syntax
          works. For example, 'x1:x2' represents the first-order
          interaction term between 'x1' and 'x2', and 'I(x1^2)'
          represents the squared term of 'x1'. See 'help(formula)' for
          details. 

    data: This argument specifies the data frame containing the
          variables called in 'formula'. 

  method: This argument specifies a matching method. Currently,
          '"exact"' (exact matching), '"full"' (full matching),
          '"genetic"' (genetic matching), '"nearest"' (nearest neighbor
          matching), '"optimal"' (optimal matching), and '"subclass"'
          (subclassification) are available. The default is
          '"nearest"'. Note that within each of these matching methods,
          _MatchIt_ offers a variety of options. See <URL:
          http://gking.harvard.edu/matchit/docs/Inputs.html> for the
          complete list of supported options. 

distance: This argument specifies the method used to estimate the
          distance measure. The default is logistic regression,
          '"logit"'. A variety of other methods are available. See
          <URL:
          http://gking.harvard.edu/matchit/docs/All_Matching_Methods.html>
          for the complete list of supported methods. 

distance.options: This optional argument is a list of additional
          arguments that are passed to the model used for estimating
          the distance measure. 

 discard: This argument specifies whether to discard units that fall
          outside some measure of support of the distance score before
          matching, and not allow them to be used at all in the
          matching procedure.  Note that discarding units may change
          the quantity of interest being estimated.

     _n_o_n_e (default) discards no units before matching. 

     _b_o_t_h discards all units (treated and control) that are outside the
          support of the distance measure.  

     _c_o_n_t_r_o_l discards only control units outside the support of the
          distance measure of the treated units.

     _t_r_e_a_t discards only treated units outside the support of the
          distance measure of the control units.  

reestimate: This argument specifies whether the model for the distance
          measure should be re-estimated after units are discarded. The
          input must be a logical value. The default is 'FALSE'. 

     ...: Additional arguments to be passed to a variety of matching
          methods. See <URL: http://gking.harvard.edu/matchit/??> for
          the complete list of options. 
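
     The four 'discard' settings described above can be sketched in base
     R as follows. This is an illustration only, not the package's
     implementation: it assumes that "support" means the range of the
     estimated distance measure within a group, and all object names are
     hypothetical.

```r
set.seed(2)
n     <- 300
x     <- rnorm(n)
treat <- rbinom(n, 1, plogis(x))

# Distance measure: fitted probabilities from a logistic regression.
fit      <- glm(treat ~ x, family = binomial)
distance <- unname(fitted(fit))

# Assumed measure of support: the range of the distance within each group.
rng.t <- range(distance[treat == 1])
rng.c <- range(distance[treat == 0])

# TRUE where a distance value falls outside the given range.
outside <- function(d, rng) d < rng[1] | d > rng[2]

# "control": discard control units outside the range of the treated units.
discard.control <- treat == 0 & outside(distance, rng.t)
# "treat": discard treated units outside the range of the control units.
discard.treat   <- treat == 1 & outside(distance, rng.c)
# "both": discard units from either group that fall outside the other's range.
discard.both    <- discard.control | discard.treat
# "none": discard nothing.
discard.none    <- rep(FALSE, n)

# Number of units each rule would discard.
c(none = sum(discard.none), control = sum(discard.control),
  treat = sum(discard.treat), both = sum(discard.both))
```

     Under this assumption, '"both"' is equivalent to keeping only the
     units whose distance lies in the overlap of the two group ranges.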

_D_e_t_a_i_l_s:

     The matching is done using the 'matchit(treat ~ X, ...)' command,
     where 'treat' is the vector of treatment assignments and 'X' are
     the covariates to be used in the matching.  There are a number of
     matching options, detailed below.  The full syntax is
     'matchit(formula, data=NULL, discard=0, exact=FALSE,
     replace=FALSE, ratio=1, model="logit", reestimate=FALSE,
     nearest=TRUE, m.order=2, caliper=0, calclosest=FALSE,
     mahvars=NULL, subclass=0, sub.by="treat", counter=TRUE,
     full=FALSE, full.options=list(), ...)'.  The results can be
     summarized graphically using 'plot(matchitobject)' or numerically
     using 'summary(matchitobject)'. 'print(matchitobject)' prints a
     brief description of the output.
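
     As a hedged sketch of the inspection commands mentioned above, the
     following applies 'print' and 'summary' to a fitted object. The
     simulated data frame 'dat' is hypothetical, and the block does
     nothing beyond creating the data when _MatchIt_ is not installed.

```r
# Simulated data; names are illustrative assumptions.
set.seed(3)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$treat <- rbinom(n, 1, plogis(-1 + 0.5 * dat$x1))

if (requireNamespace("MatchIt", quietly = TRUE)) {
  m.out <- MatchIt::matchit(treat ~ x1 + x2, data = dat)

  print(m.out)           # brief description of the matching output
  print(summary(m.out))  # numerical summary of the matched sample
  # plot(m.out)          # graphical summary; opens a graphics device
}
```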

_V_a_l_u_e:

    call: The original 'matchit' call. 

 formula: The formula used to specify the model for estimating the
          distance measure. 

   model: The output of the model used to estimate the distance
          measure.  'summary(m.out$model)' will give the summary of the
          model where 'm.out' is the output object from 'matchit'. 

match.matrix: An n_1 by 'ratio' matrix, where n_1 is the number of
          treatment units:

             * The row names, which can be obtained through
               'row.names(match.matrix)', are the names of the
               treatment units, taken from the data frame specified
               in 'data'.

             * Each column stores the name(s) of the control unit(s)
               matched to the treatment unit of that row. For example,
               when the 'ratio' input for nearest neighbor or optimal
               matching is specified as 3, the three columns of
               'match.matrix' represent the three control units
               matched to one treatment unit.

             * 'NA' indicates that the treatment unit was not matched. 

discarded: A vector of length n that indicates whether each unit was
          ineligible for matching due to common support restrictions. 
          It equals 'TRUE' if unit i was discarded, and 'FALSE'
          otherwise. 

distance: A vector of length n with the estimated distance measure for
          each unit. 

 weights: A vector of length n that provides the weights assigned to
          each unit in the matching process.  Unmatched units have
          weights equal to '0'. Matched treated units have weight '1'. 
          Each matched control unit has weight proportional to the
          number of treatment units to which it was matched, and the
          sum of the control weights is equal to the number of uniquely
          matched control units. See <URL:
          http://gking.harvard.edu/matchit/docs/How_Exactly_are.html>
          for more details. 

subclass: The subclass index in an ordinal scale from 1 to the total
          number of subclasses as specified in 'subclass' (or the total
          number of subclasses from full or exact matching).  Unmatched
          units have 'NA'. 

   q.cut: The subclass cut-points that classify the distance measure. 

   treat: The treatment indicator from 'data' (the left-hand side of
          'formula'). 

       X: The covariates used for estimating the distance measure (the
          right-hand side of 'formula'). 
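
     The relationship between 'match.matrix' and 'weights' described
     above can be illustrated with a small hand-built example in base R.
     This is a simplified sketch, not the package's own computation: the
     matrix, the unit names, and the fixed 'ratio' of 2 are all made up,
     and the weights _MatchIt_ actually returns may be computed
     differently in other settings.

```r
# Hypothetical match.matrix: 4 treated units, ratio = 2, with controls reused.
# Row names are treated units; entries are names of matched control units.
match.matrix <- matrix(
  c("c1", "c2",
    "c1", "c3",
    "c2", "c1",
    NA,   NA),      # t4 was not matched
  ncol = 2, byrow = TRUE,
  dimnames = list(c("t1", "t2", "t3", "t4"), NULL)
)

units <- c("t1", "t2", "t3", "t4", "c1", "c2", "c3", "c4")
treat <- setNames(c(1, 1, 1, 1, 0, 0, 0, 0), units)

# Start with weight 0 for every unit (the value for unmatched units).
weights <- setNames(numeric(length(units)), units)

# Matched treated units get weight 1.
matched.t <- rownames(match.matrix)[rowSums(!is.na(match.matrix)) > 0]
weights[matched.t] <- 1

# Count how many treated units each control was matched to.
counts <- table(match.matrix[!is.na(match.matrix)])

# Control weights are proportional to those counts and rescaled so that
# they sum to the number of distinct matched controls.
weights[names(counts)] <- as.numeric(counts) / sum(counts) * length(counts)

weights[treat == 0]
# c1 = 1.5, c2 = 1.0, c3 = 0.5, c4 = 0.0
```

     Here 'c1' is matched to three treated units, 'c2' to two, and 'c3'
     to one, so the control weights are in the ratio 3:2:1 and sum to 3,
     the number of distinct matched controls.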

_A_u_t_h_o_r(_s):

     Daniel Ho <daniel.ho@yale.edu>; Kosuke Imai
     <kimai@princeton.edu>; Gary King <king@harvard.edu>; Elizabeth
     Stuart <stuart@stat.harvard.edu>

_R_e_f_e_r_e_n_c_e_s:

     Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart (2004)
     "Matching as Nonparametric Preprocessing for Improving Parametric
     Causal Inference," preprint available at <URL:
     http://gking.harvard.edu/files/abs/matchp-abs.shtml>

_S_e_e _A_l_s_o:

     Please use 'help.matchit' to access the matchit reference manual. 
     The complete document is available online at <URL:
     http://gking.harvard.edu/matchit>.

