

   lqs {lqs}                                    R Documentation

   RReessiissttaanntt RReeggrreessssiioonn

   DDeessccrriippttiioonn::

        Fit a regression to the `good' points in the dataset,
        thereby achieving a regression estimator with a high
        breakdown point.  `lmsreg' and `ltsreg' are
        compatibility wrappers.

   UUssaaggee::

        lqs(x, ...)
        lqs.formula(formula, data = NULL, ...,
                    method = c("lts", "lqs", "lms", "S", "model.frame"),
                    subset, na.action = na.fail, model = TRUE,
                    x = FALSE, y = FALSE, contrasts = NULL)
        lqs.default(x, y, intercept, method = c("lts", "lqs", "lms", "S"),
                    quantile, control = lqs.control(...), k0 = 1.548, seed, ...)
        lmsreg(...)
        ltsreg(...)

   AArrgguummeennttss::

    formula: a formula of the form `y ~ x1 + x2 + ...'.

       data: data frame from which variables specified in
             `formula' are preferentially to be taken.

     subset: An index vector specifying the cases to be used in
             fitting. (NOTE: If given, this argument must be
             named exactly.)

   na.action: A function to specify the action to be taken if
             `NA's are found. The default action is for the
             procedure to fail. An alternative is `na.omit',
             which leads to omission of cases with missing
             values on any required variable.  (NOTE: If given,
             this argument must be named exactly.)

          x: a matrix or data frame containing the explanatory
             variables.

          y: the response: a vector of length the number of
             rows of `x'.

   intercept: should the model include an intercept?

     method: the method to be used. `model.frame' returns the
             model frame: for the others see the `Details'
             section. Using `lmsreg' or `ltsreg' forces `"lms"'
             and `"lts"' respectively.

   quantile: the quantile to be used: see `Details'. This is
             overridden if `method = "lms"'.

    control: additional control items: see `Details'.

       seed: the seed to be used for random sampling: see
             `.Random.seed'. The current value of
             `.Random.seed' will be preserved if it is set.

        ...: arguments to be passed to `lqs.default' or
             `lqs.control'.

   DDeettaaiillss::

        Suppose there are `n' data points and `p' regressors,
        including any intercept.

        The first three methods minimize some function of the
        sorted squared residuals. For methods `"lqs"' and
        `"lms"' it is the `quantile' squared residual, and for
        `"lts"' it is the sum of the `quantile' smallest
        squared residuals. `"lqs"' and `"lms"' differ in the
        defaults for `quantile', which are `floor((n+p+1)/2)'
        and `floor((n+1)/2)' respectively.  For `"lts"' the
        default is `floor(n/2) + floor((p+1)/2)'.

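        As a concrete illustration of these defaults (an added
        sketch, not part of the original page): for the
        `stackloss' data used in the `Examples' section, with
        n = 21 observations and p = 4 regressors including the
        intercept, the default quantiles work out as follows.

```r
## Default `quantile' for each method, using the formulas
## quoted above; n = 21, p = 4 matches the stackloss example.
n <- 21; p <- 4
q.lqs <- floor((n + p + 1)/2)             # 13
q.lms <- floor((n + 1)/2)                 # 11
q.lts <- floor(n/2) + floor((p + 1)/2)    # 12
c(lqs = q.lqs, lms = q.lms, lts = q.lts)
```
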
        The `"S"' estimation method solves for the scale `s'
        such that the average of a function chi of the
        residuals divided by `s' is equal to a given constant.

        The `control' argument is a list with components:

        `psamp': the size of each sample. Defaults to `p'.

        `nsamp': the number of samples or `"best"' or `"exact"'
        or `"sample"'. If `"sample"' the number chosen is
        `min(5*p, 3000)', taken from Rousseeuw and Hubert
        (1997).  If `"best"' exhaustive enumeration is done up
        to 5000 samples: if `"exact"' exhaustive enumeration
        will be attempted however many samples are needed.

        `adjust': should the intercept be optimized for each
        sample?
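
        Since `control = lqs.control(...)' in the default
        method, these items can also be supplied directly in a
        call to `lqs' via `...'.  A hedged sketch (assuming the
        `lqs' function is available, as in package `MASS'):

```r
library(MASS)    # lqs() is provided by package MASS

set.seed(123)
## Pass control items through `...': 500 random samples, with
## the intercept adjusted for each sample.
fit <- lqs(stack.loss ~ ., data = stackloss,
           method = "lts", nsamp = 500, adjust = TRUE)
coef(fit)
```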

   VVaalluuee::

        An object of class `"lqs"'.

   NNoottee::

        There seems no reason other than historical to use the
        `lms' and `lqs' options.  LMS estimation is of low
        efficiency (converging at rate n^{-1/3}) whereas LTS
        has the same asymptotic efficiency as an M estimator
        with trimming at the quartiles (Marazzi, 1993, p.201).
        LQS and LTS have the same maximal breakdown value of
        `(floor((n-p)/2) + 1)/n', attained if `floor((n+p)/2) <=
        quantile <= floor((n+p+1)/2)'.  The only drawback
        mentioned for LTS is greater computation, as a sort was
        thought to be required (Marazzi, 1993, p.201), but this
        is not true as a partial sort can be used (and is used
        in this implementation).
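
        For concreteness (an added illustration, not in the
        original page): with n = 21 and p = 4, the maximal
        breakdown value and the range of `quantile' attaining
        it can be evaluated directly from the formulas above.

```r
## Maximal breakdown value and the attaining quantile range,
## using the formulas quoted above; n = 21, p = 4.
n <- 21; p <- 4
bdp <- (floor((n - p)/2) + 1)/n                      # 9/21
q.range <- c(floor((n + p)/2), floor((n + p + 1)/2)) # 12, 13
bdp
```

        Note that the default LTS quantile (12) and LQS
        quantile (13) both lie in this range, while the LMS
        default (11) does not.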

        Adjusting the intercept for each trial fit does need
        the residuals to be sorted, and may involve significant
        extra computation if `n' is large and `p' small.

        Opinions differ over the choice of `psamp'. Rousseeuw
        and Hubert (1997) only consider `p'; Marazzi (1993)
        recommends `p+1' and suggests that more samples are
        better than adjustment for a given computational limit.

        The computations are exact for a model with just an
        intercept and adjustment, and for LQS for a model with
        an intercept plus one regressor and exhaustive search
        with adjustment. For all other cases the minimization
        is only known to be approximate.

   AAuutthhoorr((ss))::

        B.D. Ripley

   RReeffeerreenncceess::

        P. J. Rousseeuw and A. M. Leroy (1987) Robust
        Regression and Outlier Detection.  Wiley.

        A. Marazzi (1993) Algorithms, Routines and S Functions
        for Robust Statistics.  Wadsworth and Brooks/Cole.

        P. Rousseeuw and M. Hubert (1997) Recent developments
        in PROGRESS. In L1-Statistical Procedures and Related
        Topics ed Y. Dodge, IMS Lecture Notes volume 31, pp.
        201-214.

   SSeeee AAllssoo::

        `predict.lqs'

   EExxaammpplleess::

        data(stackloss)
        set.seed(123)  # or pass the `seed' argument to lqs()
        lqs(stack.loss ~ ., data = stackloss)
        lqs(stack.loss ~ ., data = stackloss, method = "S", nsamp = "exact")

