

   HHiieerraarrcchhiiccaall CClluusstteerriinngg

        hclust(d, method="complete")

        plot.hclust(hclust.obj, hang=0.1, ...)

   AArrgguummeennttss::

          d: a dissimilarity structure as produced by `dist'.

     method: the agglomeration method to be used. This should
             be (an unambiguous abbreviation of) one of
             `"ward"', `"single"', `"complete"', `"average"',
             `"mcquitty"', `"median"' or `"centroid"'.

   hclust.obj: an object of the type produced by `hclust'.

       hang: the fraction of the plot height by which labels
             should hang below the rest of the plot.  A negative
             value will cause the labels to hang down from 0.
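
        As noted above, `method' may be given as an unambiguous
        abbreviation.  A minimal sketch, using a small random
        data set purely for illustration:

        d <- dist(matrix(rnorm(20), ncol = 2))  # toy dissimilarities
        hc1 <- hclust(d, method = "average")
        hc2 <- hclust(d, "ave")                 # unambiguous abbreviation
        identical(hc1$merge, hc2$merge)         # TRUE: same method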

   DDeessccrriippttiioonn::

        This function performs a hierarchical cluster analysis
        using a set of dissimilarities for the n objects being
        clustered.  Initially, each object is assigned to its
        own cluster and then the algorithm proceeds iteratively,
        at each stage joining the two most similar clusters,
        continuing until there is just a single cluster.  At
        each stage distances between clusters are recomputed by
        the Lance-Williams dissimilarity update formula
        according to the particular clustering method being
        used.
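
        As a concrete illustration (a sketch, not the internal
        Fortran code): for complete linkage the Lance-Williams
        coefficients are alpha.i = alpha.j = 1/2, beta = 0 and
        gamma = 1/2, so the updated distance reduces to a
        maximum:

        # distance from cluster k to the merged cluster (i,j):
        #   0.5*d(k,i) + 0.5*d(k,j) + 0.5*|d(k,i) - d(k,j)|
        # which equals max(d(k,i), d(k,j))
        lw.complete <- function(dki, dkj)
            0.5*dki + 0.5*dkj + 0.5*abs(dki - dkj)
        lw.complete(2, 5)   # 5
        max(2, 5)           # agrees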

        A number of different clustering methods are provided.
        Ward's minimum variance method aims at finding compact,
        spherical clusters.  The complete linkage method finds
        similar clusters.  The single linkage method (which is
        closely related to the minimal spanning tree) adopts a
        `friends of friends' clustering strategy.  The other
        methods can be regarded as aiming for clusters with
        characteristics somewhere between the single and
        complete link methods.
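
        The differences are easiest to see side by side; a small
        sketch comparing three methods on the same
        dissimilarities (random data, for illustration only):

        set.seed(1)
        d <- dist(matrix(rnorm(40), ncol = 2))   # 20 random points
        opar <- par(mfrow = c(1, 3))             # three panels
        for (m in c("single", "average", "complete"))
            plot(hclust(d, method = m), main = m)
        par(opar)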

        In hierarchical cluster displays, a decision is needed
        at each merge to specify which subtree should go on the
        left and which on the right.  Since, for n observations,
        there are n-1 merges, there are 2^(n-1) possible
        orderings for the leaves in a cluster tree, or
        dendrogram.  The algorithm in `hclust' is to order the
        subtrees so that the tighter cluster is on the left (the
        last, i.e. most recent, merge of the left subtree is at
        a lower value than the last merge of the right subtree).
        Observations are the tightest clusters possible, and
        merges involving two observations place them in order by
        their observation sequence number.
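
        The resulting leaf ordering is returned in the `order'
        component (see Value below); a small sketch:

        # four labelled points on a line; `order' gives the
        # left-to-right leaf sequence used by plot
        x <- matrix(c(1, 1.1, 4, 10), ncol = 1,
                    dimnames = list(c("a", "b", "c", "d"), NULL))
        hc <- hclust(dist(x))
        hc$order               # a permutation of 1:4
        hc$labels[hc$order]    # leaf labels as drawn, left to right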

   VVaalluuee::

        An object of class `hclust' which describes the tree
        produced by the clustering process.  The object is a
        list with components:

      merge: an n-1 by 2 matrix.  Row i of `merge' describes
             the merging of clusters at step i of the
             clustering.  If an element j in the row is
             negative, then observation -j was merged at this
             stage.  If j is positive then the merge was with
             the cluster formed at the (earlier) stage j of the
             algorithm.  Thus negative entries in `merge'
             indicate agglomerations of singletons, and positive
             entries indicate agglomerations of non-singletons
             (see the sketch after this list).

     height: a set of n-1 non-decreasing real values.  The
             clustering height: that is, the value of the
             criterion associated with the clustering `method'
             for the particular agglomeration.

      order: a vector giving the permutation of the original
             observations suitable for plotting, in the sense
             that a cluster plot using this ordering and matrix
             `merge' will not have crossings of the branches.

     labels: labels for each of the objects being clustered.
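
        A sketch of these components for four points on a line
        at 1, 2, 6 and 10, clustered with complete linkage:

        hc <- hclust(dist(c(1, 2, 6, 10)), method = "complete")
        hc$merge    # rows (-1,-2), (-3,-4), then (1,2): the
                    # first two merges join singletons, the last
                    # joins the two earlier clusters
        hc$height   # 1, 4, 9: non-decreasing merge criteria
        hc$order    # leaf permutation used for plotting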

   AAuutthhoorr((ss))::

        The `hclust' function is based on Fortran code
        contributed to STATLIB by F. Murtagh.

   RReeffeerreenncceess::

        Everitt, B. (1974).  Cluster Analysis.  London:
        Heinemann Educ. Books.

        Hartigan, J. A. (1975).  Clustering Algorithms.  New
        York: Wiley.

        Sneath, P. H. A. and R. R. Sokal (1973).  Numerical
        Taxonomy.  San Francisco: Freeman.

        Anderberg, M. R. (1973).  Cluster Analysis for
        Applications.  New York: Academic Press.

        Gordon, A. D. (1981).  Classification.  London: Chapman
        and Hall.

        Murtagh, F. (1985).  ``Multidimensional Clustering
        Algorithms'', in COMPSTAT Lectures 4.  Wuerzburg:
        Physica-Verlag (for algorithmic details of the
        algorithms used).

   SSeeee AAllssoo::

        `kmeans'.

   EExxaammpplleess::

        library(mva)
        data(crimes)
        # average-linkage clustering ("ave" abbreviates "average")
        hc <- hclust(dist(crimes), "ave")
        # widen the bottom margin to create space for labels
        mar <- mar.save <- par("mar")
        mar[1] <- mar[1] + 4
        par(mar=mar)
        plot(hc, hang=-1)   # hang=-1: all labels hang from height 0
        par(mar=mar.save)

