

   factor {base}                                R Documentation

   FFaaccttoorrss

   DDeessccrriippttiioonn::

        The function `factor' is used to encode a vector as a
        factor (the names category and enumerated type are also
        used for factors).  If `ordered' is `TRUE', the factor
        levels are assumed to be ordered.  For compatibility
        with S there is also a function `ordered'.

        `is.factor', `is.ordered', `as.factor' and `as.ordered'
        are the membership and coercion functions for these
        classes.

   UUssaaggee::

        factor(x, levels = sort(unique(x), na.last = TRUE), labels,
               exclude = NA, ordered = FALSE)
        ordered(x, ...)

        is.factor(x)
        is.ordered(x)

        as.factor(x)
        as.ordered(x)

   AArrgguummeennttss::

          x: a vector of data, usually taking a small number of
             distinct values

     levels: an optional vector of the values that `x' might
             have taken. The default is the set of values taken
             by `x', sorted into increasing order.

     labels: either an optional vector of labels for the levels
             (in the same order as `levels' after removing
             those in `exclude'), or a character string of
             length 1.

    exclude: a vector of values to be excluded when forming the
             set of levels. This should be of the same type as
             `x', and will be coerced if necessary.

    ordered: logical flag to determine if the levels should be
             regarded as ordered (in the order given).

        ...: (in `ordered(.)'): any of the above, apart from
             `ordered' itself.

   DDeettaaiillss::

        The type of the vector `x' is not restricted.

        Ordered factors differ from factors only in their
        class, but methods and the model-fitting functions
        treat the two classes quite differently.
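
        As a small illustration of the point above, the sketch below
        builds the same data once as a factor and once as an ordered
        factor. The variable names (`sizes', `lv', `f', `o') are
        made up for this example. The integer codes agree; only the
        class differs, and with it the behaviour of methods such as
        the comparison operators.

```r
# Same data, once as a plain factor and once as an ordered factor.
sizes <- c("small", "large", "medium", "small")
lv <- c("small", "medium", "large")      # hypothetical level order

f <- factor(sizes, levels = lv)
o <- factor(sizes, levels = lv, ordered = TRUE)

class(f)                                  # "factor"
class(o)                                  # "ordered" "factor"
identical(as.integer(f), as.integer(o))   # TRUE: the codes are the same
o[1] < o[2]                               # TRUE: "small" precedes "large"
```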

        The encoding of the vector happens as follows. First
        all the values in `exclude' are removed from `levels'.
        If `x[i]' equals `levels[j]', then the `i'-th element
        of the result is `j'.  If no match is found for `x[i]'
        in `levels', then the `i'-th element of the result is
        set to `NA'.
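
        The encoding just described can be sketched by hand with
        `match'. This is only an illustration of the description
        above, not the actual implementation of `factor'; the
        names `lev' and `manual' are chosen for this example.

```r
x <- c("b", "a", "c", "a", "d")
exclude <- "d"

# Step 1: default levels, with the excluded values removed.
lev <- sort(unique(x))
lev <- lev[is.na(match(lev, exclude))]

# Step 2: element i gets code j when x[i] equals lev[j], NA otherwise.
manual <- match(x, lev)
manual                               # 2 1 3 1 NA

# The codes stored in the factor agree with the manual encoding.
f <- factor(x, exclude = exclude)
identical(manual, as.integer(f))     # TRUE
```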

        Normally the `levels' used as an attribute of the
        result are the reduced set of levels after removing
        those in `exclude', but this can be altered by
        supplying `labels'. This should either be a set of new
        labels for the levels, or a character string, in which
        case the levels are that character string with a
        sequence number appended.
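
        The two forms of `labels' can be compared in a short
        sketch. The data and the label strings below are invented
        for illustration.

```r
x <- c(3, 1, 2, 1)

# One label per level, in the same order as `levels'.
f1 <- factor(x, levels = 1:3, labels = c("low", "mid", "high"))
levels(f1)          # "low" "mid" "high"
as.character(f1)    # "high" "low" "mid" "low"

# A single string: a sequence number is appended for each level.
f2 <- factor(x, levels = 1:3, labels = "grp")
levels(f2)          # "grp1" "grp2" "grp3"
```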

        `factor(x)' applied to a factor is a no-operation
        unless there are unused levels: in that case, a factor
        with the reduced level set is returned. If `exclude' is
        used it should also be a factor with the same level set
        as `x' or a set of codes for the levels to be excluded.
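
        A brief sketch of the level reduction described above:
        subsetting a factor keeps the full level set, and calling
        `factor' on the result drops the levels that no longer
        occur. The names `f' and `sub' are illustrative only.

```r
f <- factor(c("a", "b", "c", "a"))
sub <- f[f != "c"]       # the value "c" is gone, its level is not

levels(sub)              # "a" "b" "c"
levels(factor(sub))      # "a" "b"
```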

        The codes of a factor may contain `NA'. For a numeric
        `x', set `exclude=NULL' to make `NA' an extra level
        (`"NA"'), by default the last level.
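
        The effect of `exclude = NULL' on a numeric vector can be
        checked as follows. The sketch only counts the levels and
        inspects the codes, since the way the extra level is
        printed may differ between versions of R.

```r
x <- c(1, 2, NA, 2)

f <- factor(x)                   # default: NA is excluded from the levels
nlevels(f)                       # 2
is.na(f)[3]                      # TRUE: the third element is missing

g <- factor(x, exclude = NULL)   # NA becomes an extra, last level
nlevels(g)                       # 3
as.integer(g)                    # 1 2 3 2
```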

   VVaalluuee::

        `factor' returns an object of class `"factor"' which
        has a set of numeric codes the length of `x' with a
        `"levels"' attribute of mode `character'.  If `ordered'
        is true (or `ordered' is used) the result has class
        `c("ordered", "factor")'.

        `is.factor' returns `TRUE' or `FALSE' depending on
        whether its argument is a factor or not.
        Correspondingly, `is.ordered' returns `TRUE' when its
        argument is an ordered factor and `FALSE' otherwise.

        `as.factor' coerces its argument to a factor.  It is an
        abbreviated form of `factor'.

        `as.ordered(x)' returns `x' if this is ordered, and
        `ordered(x)' otherwise.
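
        The membership and coercion functions described above can
        be exercised together in a short sketch; the vector `v' is
        invented for the example.

```r
v <- c("x", "y", "x")
is.factor(v)                   # FALSE: a plain character vector

f <- as.factor(v)
is.factor(f)                   # TRUE
is.ordered(f)                  # FALSE

o <- as.ordered(f)
is.ordered(o)                  # TRUE
identical(as.ordered(o), o)    # TRUE: already ordered, returned as is
```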

   WWaarrnniinngg::

        The interpretation of a factor depends on both the
        codes and the `"levels"' attribute. Be careful to
        compare only factors with the same set of levels (in
        the same order).  In particular, `as.numeric' applied
        to a factor returns the internal codes rather than the
        original values, which is rarely meaningful, and this
        may happen by implicit coercion.
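
        The pitfall with `as.numeric' is easiest to see on a
        factor built from numbers. In the sketch below, going
        through `as.character' first is one common way to recover
        the original values; the data are invented.

```r
f <- factor(c(10, 5, 20))
levels(f)                      # "5" "10" "20"

as.numeric(f)                  # 2 1 3: the codes, not the data
as.numeric(as.character(f))    # 10 5 20: the original values
```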

        The levels of a factor are by default sorted, but the
        sort order may well depend on the locale at the time of
        creation, and should not be assumed to be ASCII.
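
        One way to avoid depending on the locale's sort order is
        to pass `levels' explicitly, as in this sketch. The
        particular order chosen here is arbitrary.

```r
x <- c("b", "A", "a", "B")

# Explicit levels: the order is fixed regardless of the locale.
f <- factor(x, levels = c("A", "B", "a", "b"))
levels(f)        # "A" "B" "a" "b"
as.integer(f)    # 4 1 3 2
```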

   SSeeee AAllssoo::

        `gl' for construction of ``balanced'' factors and `C'
        for factors with specified contrasts.  `levels' and
        `nlevels' for accessing the levels, and `codes' to get
        integer codes.

   EExxaammpplleess::

        # encode the letters of "statistics" against all 26 letters
        ff <- factor(substring("statistics", 1:10, 1:10), levels = letters)
        ff
        codes(ff)
        factor(ff)  # drops the levels that do not occur
        factor(factor(letters[7:10])[2:3])  # exercise indexing and reduction
        factor(letters[1:20], labels = "letter")  # single label plus sequence number

        class(ordered(4:1))  # "ordered", inheriting from "factor"

