Package edu.uci.ics.jung.algorithms.util
Class DiscreteDistribution
- java.lang.Object
  - edu.uci.ics.jung.algorithms.util.DiscreteDistribution
public class DiscreteDistribution extends java.lang.Object
A utility class for calculating properties of discrete distributions. Generally, these distributions are represented as arrays of double values, which are assumed to be normalized such that the entries in a single array sum to 1.
-
-
Constructor Summary
Constructor Description
DiscreteDistribution()
-
Method Summary
Modifier and Type, Method, Description
protected static void checkLengths(double[] dist, double[] reference)
    Throws an IllegalArgumentException if the two arrays are not of the same length.
static double cosine(double[] dist, double[] reference)
    Returns the cosine distance between the two specified distributions, which must have the same number of elements.
static double entropy(double[] dist)
    Returns the entropy of this distribution.
static double KullbackLeibler(double[] dist, double[] reference)
    Returns the Kullback-Leibler divergence between the two specified distributions, which must have the same number of elements.
static double[] mean(double[][] distributions)
    Returns the mean of the specified array of distributions, represented as normalized arrays of double values.
static double[] mean(java.util.Collection<double[]> distributions)
    Returns the mean of the specified Collection of distributions, which are assumed to be normalized arrays of double values.
static void normalize(double[] counts, double alpha)
    Normalizes, with Lagrangian smoothing, the specified double array, so that the values sum to 1 (i.e., can be treated as probabilities).
static double squaredError(double[] dist, double[] reference)
    Returns the squared difference between the two specified distributions, which must have the same number of elements.
static double symmetricKL(double[] dist, double[] reference)
    Returns KullbackLeibler(dist, reference) + KullbackLeibler(reference, dist).
-
-
-
Method Detail
-
KullbackLeibler
public static double KullbackLeibler(double[] dist, double[] reference)
Returns the Kullback-Leibler divergence between the two specified distributions, which must have the same number of elements. This is defined as the sum over all i of dist[i] * Math.log(dist[i] / reference[i]). Note that this value is not symmetric; see symmetricKL for a symmetric variant.
See Also:
symmetricKL(double[], double[])
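The definition above can be sketched in a few lines. This is a standalone illustration of the stated formula, not the JUNG source; the class name KlSketch and its method are hypothetical, and the sketch assumes every entry of both arrays is strictly positive (for example, after smoothing), since a zero entry makes the ratio or the logarithm undefined.

```java
class KlSketch {
    // Sum over all i of dist[i] * Math.log(dist[i] / reference[i]).
    // Assumption: equal lengths and strictly positive entries in both arrays.
    static double kullbackLeibler(double[] dist, double[] reference) {
        if (dist.length != reference.length) {
            throw new IllegalArgumentException("Arrays must be of the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < dist.length; i++) {
            sum += dist[i] * Math.log(dist[i] / reference[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.25, 0.75};
        System.out.println(kullbackLeibler(p, q)); // divergence of p from q
        System.out.println(kullbackLeibler(q, p)); // a different value: not symmetric
        System.out.println(kullbackLeibler(p, p)); // 0.0 for identical distributions
    }
}
```

Swapping the arguments changes the result, which is why the symmetric variant exists.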
-
symmetricKL
public static double symmetricKL(double[] dist, double[] reference)
Returns KullbackLeibler(dist, reference) + KullbackLeibler(reference, dist).
See Also:
KullbackLeibler(double[], double[])
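A minimal sketch of the symmetric variant, assuming strictly positive entries in both arrays. The class SymmetricKlSketch and its helper kl are hypothetical stand-ins written for this example; they are not the library's code.

```java
class SymmetricKlSketch {
    // Helper: sum over all i of a[i] * Math.log(a[i] / b[i]).
    // Assumption: equal lengths and strictly positive entries.
    static double kl(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * Math.log(a[i] / b[i]);
        }
        return sum;
    }

    // The divergence in both directions, added together.
    static double symmetricKL(double[] dist, double[] reference) {
        return kl(dist, reference) + kl(reference, dist);
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.25, 0.75};
        // Both calls print the same value because the sum is order-independent.
        System.out.println(symmetricKL(p, q));
        System.out.println(symmetricKL(q, p));
    }
}
```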
-
squaredError
public static double squaredError(double[] dist, double[] reference)
Returns the squared difference between the two specified distributions, which must have the same number of elements. This is defined as the sum over all i of the square of (dist[i] - reference[i]).
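As an illustration of that definition, the following standalone sketch accumulates the squared element-wise differences. The class name SquaredErrorSketch is hypothetical, and the code follows the description rather than the library source.

```java
class SquaredErrorSketch {
    // Sum over all i of (dist[i] - reference[i]) squared.
    static double squaredError(double[] dist, double[] reference) {
        if (dist.length != reference.length) {
            throw new IllegalArgumentException("Arrays must be of the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < dist.length; i++) {
            double diff = dist[i] - reference[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.25, 0.75};
        // Each difference is 0.25 in magnitude, so the sum is 0.0625 + 0.0625.
        System.out.println(squaredError(p, q)); // prints 0.125
    }
}
```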
-
cosine
public static double cosine(double[] dist, double[] reference)
Returns the cosine distance between the two specified distributions, which must have the same number of elements. The distributions are treated as vectors in dist.length-dimensional space. Given the following definitions
- v = the sum over all i of dist[i] * dist[i]
- w = the sum over all i of reference[i] * reference[i]
- vw = the sum over all i of dist[i] * reference[i]
the value returned is vw / (Math.sqrt(v) * Math.sqrt(w)).
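The three sums can be computed in a single pass, as in the sketch below. It is a standalone illustration of the stated formula under the assumption that neither array is all zeros; the class name CosineSketch is hypothetical. Note that, as the formula is written, identical distributions give a value of 1 and distributions with no overlapping mass give 0, so larger values indicate more similar distributions.

```java
class CosineSketch {
    // vw / (Math.sqrt(v) * Math.sqrt(w)), using the definitions of v, w, and vw above.
    // Assumption: equal lengths and at least one nonzero entry in each array.
    static double cosine(double[] dist, double[] reference) {
        if (dist.length != reference.length) {
            throw new IllegalArgumentException("Arrays must be of the same length");
        }
        double v = 0.0;
        double w = 0.0;
        double vw = 0.0;
        for (int i = 0; i < dist.length; i++) {
            v += dist[i] * dist[i];
            w += reference[i] * reference[i];
            vw += dist[i] * reference[i];
        }
        return vw / (Math.sqrt(v) * Math.sqrt(w));
    }

    public static void main(String[] args) {
        // All mass on different outcomes: the dot product is zero.
        System.out.println(cosine(new double[] {1.0, 0.0}, new double[] {0.0, 1.0})); // prints 0.0
        // Identical distributions: the result is 1 up to rounding.
        System.out.println(cosine(new double[] {0.5, 0.5}, new double[] {0.5, 0.5}));
    }
}
```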
-
entropy
public static double entropy(double[] dist)
Returns the entropy of this distribution. High entropy indicates that the distribution is close to uniform; low entropy indicates that the distribution is close to a Dirac delta (i.e., if the probability mass is concentrated at a single point, this method returns 0). Entropy is defined as the sum over all i of -(dist[i] * Math.log(dist[i])).
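The sketch below illustrates the definition. One assumption is made explicit: entries equal to zero are skipped, following the usual convention that 0 * log(0) contributes nothing, which is consistent with the stated result of 0 for a point mass. How the library itself treats zero entries is not stated here, and the class name EntropySketch is hypothetical.

```java
class EntropySketch {
    // Sum over all i of -(dist[i] * Math.log(dist[i])), in nats.
    // Assumption: zero entries are skipped (0 * log 0 is taken to be 0).
    static double entropy(double[] dist) {
        double sum = 0.0;
        for (double p : dist) {
            if (p > 0.0) {
                sum -= p * Math.log(p);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Uniform over four outcomes: the entropy equals Math.log(4).
        System.out.println(entropy(new double[] {0.25, 0.25, 0.25, 0.25}));
        // All mass on one outcome: the entropy is zero.
        System.out.println(entropy(new double[] {1.0, 0.0, 0.0})); // prints 0.0
    }
}
```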
-
checkLengths
protected static void checkLengths(double[] dist, double[] reference)
Throws an IllegalArgumentException if the two arrays are not of the same length.
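A guard of this kind might look like the following sketch. The class name LengthCheckSketch and the exception message are assumptions made for illustration; only the exception type comes from the description above.

```java
class LengthCheckSketch {
    // Rejects array pairs of different lengths before any element-wise work.
    static void checkLengths(double[] dist, double[] reference) {
        if (dist.length != reference.length) {
            throw new IllegalArgumentException(
                "Arrays must be of the same length: "
                    + dist.length + " vs " + reference.length);
        }
    }

    public static void main(String[] args) {
        checkLengths(new double[2], new double[2]); // passes silently
        try {
            checkLengths(new double[2], new double[3]);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```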
-
normalize
public static void normalize(double[] counts, double alpha)
Normalizes, with Lagrangian smoothing, the specified double array, so that the values sum to 1 (i.e., can be treated as probabilities). The effect of the Lagrangian smoothing is to ensure that all entries are nonzero; effectively, a value of alpha is added to each entry in the original array prior to normalization.
Parameters:
counts - the array to normalize; it is modified in place
alpha - the value added to each entry before normalization
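The smoothing step can be illustrated with a short sketch that adds alpha to every entry and then divides by the smoothed total. This follows the description above rather than the library source; the class name NormalizeSketch is hypothetical, and the sketch assumes non-negative counts and a positive alpha.

```java
class NormalizeSketch {
    // Adds alpha to each entry, then rescales in place so the entries sum to 1.
    // Assumption: counts are non-negative and alpha is positive.
    static void normalize(double[] counts, double alpha) {
        double total = 0.0;
        for (double c : counts) {
            total += c + alpha;
        }
        for (int i = 0; i < counts.length; i++) {
            counts[i] = (counts[i] + alpha) / total;
        }
    }

    public static void main(String[] args) {
        double[] counts = {3.0, 0.0, 1.0};
        normalize(counts, 1.0);
        // The smoothed counts are 4, 1, 2 (total 7), so the zero entry becomes 1/7.
        System.out.println(java.util.Arrays.toString(counts));
    }
}
```

With alpha set to 1.0, the entry that started at zero ends up with probability 1/7 instead of 0, which keeps later logarithms and ratios well defined.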
-
mean
public static double[] mean(java.util.Collection<double[]> distributions)
Returns the mean of the specified Collection of distributions, which are assumed to be normalized arrays of double values.
See Also:
mean(double[][])
-
mean
public static double[] mean(double[][] distributions)
Returns the mean of the specified array of distributions, represented as normalized arrays of double values. Will throw an "index out of bounds" exception if the distribution arrays are not all of the same length.
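Both overloads amount to an element-wise average, which the sketch below illustrates. It is not the library source: the class name MeanSketch is hypothetical, the input is assumed to be non-empty, and the result length is taken from the first distribution. In this sketch a later array that is shorter than the first triggers an ArrayIndexOutOfBoundsException, while a longer one has its extra entries ignored; the library's exact behavior on ragged input may differ.

```java
import java.util.Arrays;
import java.util.Collection;

class MeanSketch {
    // Element-wise average of the given distributions.
    // Assumption: at least one distribution, all of the same length.
    static double[] mean(double[][] distributions) {
        double[] result = new double[distributions[0].length];
        for (double[] d : distributions) {
            for (int j = 0; j < result.length; j++) {
                result[j] += d[j]; // throws if d is shorter than the first array
            }
        }
        for (int j = 0; j < result.length; j++) {
            result[j] /= distributions.length;
        }
        return result;
    }

    // Collection overload: copy into an array and delegate.
    static double[] mean(Collection<double[]> distributions) {
        return mean(distributions.toArray(new double[0][]));
    }

    public static void main(String[] args) {
        double[][] dists = {{0.5, 0.5}, {0.25, 0.75}};
        System.out.println(Arrays.toString(mean(dists))); // prints [0.375, 0.625]
        System.out.println(Arrays.toString(mean(Arrays.asList(dists)))); // same result
    }
}
```

The mean of normalized distributions is itself normalized, so the result can be passed to the other methods without rescaling.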
-
-