neurospin.clustering.gmm

Module: neurospin.clustering.gmm

Inheritance diagram for nipy.neurospin.clustering.gmm:

Gaussian Mixture Model class: contains the basic fields and methods of GMMs. The class GMM_old uses C bindings, which are computationally and memory efficient.

Author : Bertrand Thirion, 2006-2009

Classes

GMM

class nipy.neurospin.clustering.gmm.GMM(k=1, dim=1, prec_type='full', means=None, precisions=None, weights=None)

Standard GMM.

This class contains the following fields:

k (int): the number of components in the mixture
dim (int): the dimension of the data
prec_type = 'full' (string): the parameterization of the precisions/covariance matrices, either 'full' or 'diagonal'
means: array of shape (k,dim): all the means (mean parameters) of the components
precisions: array of shape (k,dim,dim): the precisions (inverse covariance matrices) of the components
weights: array of shape (k): the weights of the mixture

fixme: no copy method

Methods

average_log_like
bic
check
check_x
estimate
evidence
guess_regularizing
initialize
initialize_and_estimate
likelihood
map_label
mixture_likelihood
plugin
pop
show
show_components
test
train
unweighted_likelihood
update
__init__(k=1, dim=1, prec_type='full', means=None, precisions=None, weights=None)

Initialize the structure, at least with the dimensions of the problem

Parameters:

k (int): the number of classes of the model

dim (int): the dimension of the problem

prec_type = 'full': covariance/precision parameterization

(diagonal 'diag' or full 'full').

means = None: array of shape (self.k,self.dim)

precisions = None: array of shape (self.k,self.dim,self.dim)

or (self.k, self.dim)

weights = None: array of shape (self.k)

By default, means, precisions and weights are set to zeros(), eye() and 1/k ones() respectively, with the correct dimensions.
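The default values described above can be sketched with NumPy. This is an illustration rather than the library's code: the helper name default_gmm_parameters is hypothetical, and the all-ones default for the diagonal case is an assumption, since the text only mentions eye().

```python
import numpy as np

def default_gmm_parameters(k=1, dim=1, prec_type="full"):
    """Return (means, precisions, weights) filled with the documented defaults."""
    means = np.zeros((k, dim))
    if prec_type == "full":
        # one identity matrix per component, shape (k, dim, dim)
        precisions = np.tile(np.eye(dim), (k, 1, 1))
    else:
        # assumption: the diagonal case stores only the diagonal, shape (k, dim)
        precisions = np.ones((k, dim))
    weights = np.ones(k) / k
    return means, precisions, weights

means, precisions, weights = default_gmm_parameters(k=3, dim=2)
```

With k=3 and dim=2 the three arrays have shapes (3, 2), (3, 2, 2) and (3,), and the weights sum to one.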

average_log_like(x, tiny=1e-15)

Returns the averaged log-likelihood of the model for the dataset x.

Parameters:

x: array of shape (nbitems,self.dim)

the data used in the estimation process

tiny = 1.e-15: a small constant to avoid numerical singularities
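A minimal sketch of how an averaged log-likelihood with a small floor could be computed from per-item mixture likelihoods. The function name is hypothetical, and using tiny as a lower bound (rather than, say, adding it) is an assumption about how the constant is applied.

```python
import numpy as np

def average_log_like_sketch(mixture_like, tiny=1e-15):
    """Mean over items of log(max(mixture likelihood, tiny))."""
    mixture_like = np.asarray(mixture_like, dtype=float)
    return float(np.mean(np.log(np.maximum(mixture_like, tiny))))

# two items with mixture likelihoods 1 and exp(-2): their logs are 0 and -2
avg = average_log_like_sketch([1.0, np.exp(-2.0)])

# a zero likelihood is floored at tiny, so the result stays finite
floored = average_log_like_sketch([0.0])
```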

bic(like, tiny=1e-15)

Computes the BIC approximation of the evidence.

Parameters:

like: array of shape (nbitem,self.k)

component-wise likelihood

tiny = 1.e-15: a small constant to avoid numerical singularities

Returns:

the BIC value
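For orientation, one common form of the BIC approximation is the data log-likelihood minus half the number of free parameters times log(n). The sketch below follows that convention; the function names, the parameter count and the 1/2 factor are assumptions, and the library's exact penalty may differ.

```python
import numpy as np

def n_free_parameters(k, dim, prec_type="full"):
    """Free parameters of a k-component GMM: means, precisions, weights."""
    if prec_type == "full":
        per_component = dim + dim * (dim + 1) // 2   # mean + symmetric matrix
    else:
        per_component = 2 * dim                      # mean + diagonal
    return k * per_component + (k - 1)               # weights sum to one

def bic_sketch(like, dim, prec_type="full", tiny=1e-15):
    """like: weighted component-wise likelihood, shape (n, k)."""
    n, k = like.shape
    log_lik = np.sum(np.log(np.maximum(like.sum(axis=1), tiny)))
    penalty = 0.5 * n_free_parameters(k, dim, prec_type) * np.log(n)
    return log_lik - penalty

# four items, one component, likelihood 1 everywhere: only the penalty remains
value = bic_sketch(np.ones((4, 1)), dim=1)
```

Under this convention larger values indicate a better trade-off between fit and model size.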

check()
Checks the shape of the different matrices involved in the model.
check_x(x)

Essentially checks that x.shape[1]==self.dim.

x is returned, possibly reshaped.
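A sketch of the kind of shape check described here. The function name is hypothetical, and treating a flat array as a column of one-dimensional items is an assumption about what "possibly reshaped" means.

```python
import numpy as np

def check_x_sketch(x, dim):
    """Return x as a float array of shape (nbitems, dim), or raise ValueError."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        # assumption: a flat array is read as nbitems one-dimensional items
        x = x.reshape(-1, 1)
    if x.ndim != 2 or x.shape[1] != dim:
        raise ValueError("expected an array of shape (nbitems, %d)" % dim)
    return x

x = check_x_sketch([1.0, 2.0, 3.0], dim=1)
```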

estimate(x, niter=100, delta=0.0001, verbose=0)

Estimates the model (self) from a dataset x.

Parameters:

x: array of shape (nbitem,dim)

the data from which the model is estimated

niter = 100: maximal number of iterations in the estimation process

delta = 1.e-4: increment of data likelihood at which convergence is declared

verbose = 0: verbosity mode

Returns:

bic : an asymptotic approximation of model evidence
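The roles of niter and delta can be illustrated with a plain EM loop for a full-precision GMM. This is a sketch under stated assumptions, not the library's implementation: all function names are hypothetical, the small ridge term is an addition of this sketch, and it returns the average log-likelihood rather than the BIC.

```python
import numpy as np

def e_step(x, means, precisions, weights):
    """Weighted component-wise likelihood, shape (n, k)."""
    n, dim = x.shape
    like = np.empty((n, len(weights)))
    for j in range(len(weights)):
        diff = x - means[j]
        maha = np.einsum("ni,ij,nj->n", diff, precisions[j], diff)
        norm = np.sqrt(np.linalg.det(precisions[j]) / (2 * np.pi) ** dim)
        like[:, j] = weights[j] * norm * np.exp(-0.5 * maha)
    return like

def m_step(x, like, tiny=1e-15, ridge=1e-6):
    """Re-estimate means, precisions and weights from responsibilities."""
    n, dim = x.shape
    resp = like / np.maximum(like.sum(axis=1, keepdims=True), tiny)
    pop = resp.sum(axis=0)                      # soft counts per component
    weights = pop / n
    means = (resp.T @ x) / np.maximum(pop, tiny)[:, None]
    precisions = np.empty((len(pop), dim, dim))
    for j in range(len(pop)):
        diff = x - means[j]
        cov = (resp[:, j, None] * diff).T @ diff / max(pop[j], tiny)
        # assumption of this sketch: a small ridge keeps cov invertible
        precisions[j] = np.linalg.inv(cov + ridge * np.eye(dim))
    return means, precisions, weights

def estimate_sketch(x, means, precisions, weights,
                    niter=100, delta=1e-4, tiny=1e-15):
    """Iterate E and M steps until the average log-likelihood gain < delta."""
    previous = current = -np.inf
    for _ in range(niter):
        like = e_step(x, means, precisions, weights)
        current = np.mean(np.log(np.maximum(like.sum(axis=1), tiny)))
        if current - previous < delta:
            break
        previous = current
        means, precisions, weights = m_step(x, like)
    return means, precisions, weights, current

# two well-separated 1D clusters around -5 and +5
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-5, 1, (200, 1)), rng.normal(5, 1, (200, 1))])
means, precisions, weights, avg_ll = estimate_sketch(
    x, np.array([[-1.0], [1.0]]), np.tile(np.eye(1), (2, 1, 1)), np.ones(2) / 2)
```

On this toy data the two estimated means settle near -5 and +5 with roughly equal weights.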

evidence(x)

Computes the BIC approximation of the evidence.

Parameters:

x: array of shape (nbitems,dim)

the data from which the BIC is computed

Returns:

the BIC value

guess_regularizing(x, bcheck=1)

Sets the regularizing priors as weakly informative, according to Fraley and Raftery, Journal of Classification 24:155-181 (2007).

Parameters:

x: array of shape (nbitems,dim)

the data used in the estimation process

initialize(x)

Initializes self according to a dataset x:

1. sets the regularizing hyper-parameters
2. initializes z using a k-means algorithm
3. updates the parameters

Parameters:

x: array of shape (nbitems,self.dim)

the data used in the estimation process
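The k-means labelling step can be sketched with a few iterations of Lloyd's algorithm, followed by a conversion of the hard labels z into a one-hot array of shape (nbitems, k) that a parameter update could consume. Both function names are hypothetical, and the random choice of initial centers is an assumption; the library's own k-means routine is not reproduced here.

```python
import numpy as np

def kmeans_labels(x, k, niter=20, seed=0):
    """Hard labels from a basic Lloyd's k-means on x of shape (n, dim)."""
    rng = np.random.default_rng(seed)
    # fancy indexing copies, so updating centers never modifies x
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(niter):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        z = dist.argmin(axis=1)
        for j in range(k):
            if np.any(z == j):          # empty clusters keep their center
                centers[j] = x[z == j].mean(axis=0)
    return z

def labels_to_likelihood(z, k):
    """One-hot array of shape (n, k): item i fully belongs to class z[i]."""
    like = np.zeros((len(z), k))
    like[np.arange(len(z)), z] = 1.0
    return like

x = np.array([[0.0], [0.1], [10.0], [10.1]])
z = kmeans_labels(x, k=2)
like = labels_to_likelihood(z, k=2)
```

The two points near 0 share one label and the two points near 10 share the other; which cluster gets label 0 depends on the random initial centers.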

initialize_and_estimate(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)

Estimates the model (self) from x.

Parameters:

x: array of shape (nbitem,dim)

the data from which the model is estimated

z = None: array of shape (nbitem)

a prior labelling of the data to initialize the computation

niter = 100: maximal number of iterations in the estimation process

delta = 1.e-4: increment of data likelihood at which convergence is declared

ninit = 1: number of initializations performed to reach a good solution

verbose = 0: verbosity mode

Returns:

the best model is returned

likelihood(x)

Returns the likelihood of the model for the data x. The values are weighted by the component weights.

Parameters:

x: array of shape (nbitems,self.dim)

the data used in the estimation process

Returns:

like: array of shape (nbitem,self.k)

component-wise likelihood

map_label(x, like=None)

Returns the MAP (maximum a posteriori) labelling of x.

Parameters:

x: array of shape (nbitem,dim)

the data under study

like = None: array of shape (nbitem,self.k)

component-wise likelihood; if like is None, it is recomputed

Returns:

z: array of shape (nbitem): the resulting MAP labelling of the rows of x
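Given a weighted component-wise likelihood array, a MAP labelling amounts to picking, for each row, the component with the largest value. A minimal sketch with a hypothetical function name:

```python
import numpy as np

def map_label_sketch(like):
    """Index of the most likely component for each row of like, shape (n, k)."""
    return np.argmax(like, axis=1)

like = np.array([[0.10, 0.20],
                 [0.05, 0.01],
                 [0.30, 0.30]])
z = map_label_sketch(like)  # ties go to the lowest index
```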

mixture_likelihood(x)

Returns the likelihood of the mixture for x.

Parameters:

x: array of shape (nbitems,self.dim)

the data used in the estimation process
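The relation between the unweighted component likelihood, the weighted one and the mixture likelihood can be sketched as follows. The function names are hypothetical, and taking the mixture likelihood to be the row sum of the weighted values is the standard definition rather than something stated on this page.

```python
import numpy as np

def weighted_likelihood_sketch(unweighted, weights):
    """Scale each component column by its mixture weight; shape stays (n, k)."""
    return np.asarray(unweighted) * np.asarray(weights)

def mixture_likelihood_sketch(unweighted, weights):
    """Sum the weighted component likelihoods over components; shape (n,)."""
    return weighted_likelihood_sketch(unweighted, weights).sum(axis=1)

unweighted = np.array([[0.2, 0.4],
                       [0.1, 0.3]])
weights = np.array([0.5, 0.5])
weighted = weighted_likelihood_sketch(unweighted, weights)
mixture = mixture_likelihood_sketch(unweighted, weights)
```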

plugin(means, precisions, weights)

Manually sets the weights, means and precisions of the model.

Parameters:

means: array of shape (self.k,self.dim)

precisions: array of shape (self.k,self.dim,self.dim)

or (self.k, self.dim)

weights: array of shape (self.k)

pop(l, tiny=1e-15)

Computes the population, i.e. the statistics of allocation.

Parameters:

l: array of shape (nbitem,self.k)

the likelihood of each item being in each class
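One plausible reading of "statistics of allocation" is the soft count of items per component: each row of the likelihood array is normalized to sum to one, and the columns are then summed. The sketch below implements that reading; it is an assumption, and the function name is hypothetical.

```python
import numpy as np

def population_sketch(like, tiny=1e-15):
    """Soft number of items allocated to each component, shape (k,)."""
    like = np.asarray(like, dtype=float)
    total = np.maximum(like.sum(axis=1), tiny)   # guard against empty rows
    resp = like / total[:, None]                 # rows now sum to one
    return resp.sum(axis=0)

pop = population_sketch([[0.1, 0.3],
                         [0.2, 0.2]])
```

Here the normalized rows are (0.25, 0.75) and (0.5, 0.5), so the populations are 0.75 and 1.25 and add up to the number of items.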

show(x, gd, density=None, nbf=-1)
Function to plot a GMM (work in progress). Currently works only in 1D and 2D.
show_components(x, gd, density=None, mpaxes=None)

Function to plot a GMM. Currently works only in 1D.

Parameters:

x: array of shape (nbitems,dim)

the data under study, used to draw a histogram

gd: grid descriptor structure

density = None:

density of the model on the discrete grid implied by gd

mpaxes = None: axes handle to make the figure

if None, a new figure is created

test(x, tiny=1e-15)

Returns the log-likelihood of the mixture for x.

Parameters:

x: array of shape (nbitems,self.dim)

the data used in the estimation process

Returns:

ll: array of shape (nbitems)

the log-likelihood of the rows of x

train(x, z=None, niter=100, delta=0.0001, ninit=1, verbose=0)
Same as initialize_and_estimate.
unweighted_likelihood(x)

Returns the likelihood of each data item for each component. The values are not weighted by the component weights.

Parameters:

x: array of shape (nbitems,self.dim)

the data used in the estimation process

Returns:

like: array of shape (nbitem,self.k)

unweighted component-wise likelihood
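For the 'full' parameterization, the unweighted likelihood of an item under a component is the Gaussian density written in terms of the precision matrix P: sqrt(det(P) / (2 pi)^dim) * exp(-0.5 * (x - m)^T P (x - m)). A minimal NumPy sketch, with a hypothetical function name:

```python
import numpy as np

def unweighted_likelihood_sketch(x, means, precisions):
    """Gaussian density of each row of x under each component.

    x: (n, dim), means: (k, dim), precisions: (k, dim, dim).
    Returns an array of shape (n, k).
    """
    n, dim = x.shape
    k = means.shape[0]
    like = np.zeros((n, k))
    for j in range(k):
        diff = x - means[j]                                   # (n, dim)
        # quadratic form diff^T P diff, computed for every row at once
        maha = np.einsum("ni,ij,nj->n", diff, precisions[j], diff)
        # normalizing constant sqrt(det(P) / (2 pi)^dim)
        norm = np.sqrt(np.linalg.det(precisions[j]) / (2 * np.pi) ** dim)
        like[:, j] = norm * np.exp(-0.5 * maha)
    return like

# standard normal density evaluated at its mean: 1 / sqrt(2 pi)
like = unweighted_likelihood_sketch(
    np.zeros((1, 1)), np.zeros((1, 1)), np.ones((1, 1, 1)))
```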

update(x, l)
Identical to self._Mstep(x,l)

grid_descriptor

class nipy.neurospin.clustering.gmm.grid_descriptor(dim=1)

Bases: object

A tiny class to handle cartesian grids

Methods

getinfo
make_grid
__init__(dim=1)
getinfo(lim, nbs)
make_grid()
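The arguments of getinfo and the output of make_grid are not documented on this page. As a generic illustration of a Cartesian grid only, the sketch below assumes that lim lists the lower and upper bound of each axis in turn and that nbs gives the number of points per axis; both the function name and these conventions are assumptions.

```python
import numpy as np

def make_cartesian_grid(lim, nbs):
    """Grid points as rows of an array of shape (prod(nbs), dim).

    lim: [min_0, max_0, min_1, max_1, ...]; nbs: number of points per axis.
    """
    dim = len(nbs)
    axes = [np.linspace(lim[2 * d], lim[2 * d + 1], nbs[d]) for d in range(dim)]
    mesh = np.meshgrid(*axes, indexing="ij")
    return np.stack([m.ravel() for m in mesh], axis=1)

# 3 points on [0, 1] times 2 points on [-1, 1]
grid = make_cartesian_grid([0.0, 1.0, -1.0, 1.0], [3, 2])
```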

Functions

nipy.neurospin.clustering.gmm.best_fitting_GMM(x, krange, prec_type='full', niter=100, delta=0.0001, ninit=1, verbose=0)

Given a dataset x, finds the best-fitting GMM within a range of component numbers given by krange.

Parameters:

x: array of shape (nbitem,dim)

the data from which the model is estimated

krange (list of floats): the range of values to test for k

prec_type = 'full': string (to be chosen among 'full', 'diag')

the covariance parameterization

niter = 100: int, maximal number of iterations in the estimation process

delta = 1.e-4: float, increment of data likelihood at which convergence is declared

ninit = 1: int, number of initializations performed to reach a good solution

verbose = 0: verbosity mode

Returns:

mg : the best-fitting GMM
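The selection over krange can be pictured as a loop that fits one model per candidate k and keeps the one with the highest score. The sketch below assumes that the score is a "larger is better" criterion such as the BIC value returned by estimate; best_by_score and fit_and_score are hypothetical names, and the toy scoring function stands in for an actual fit.

```python
def best_by_score(krange, fit_and_score):
    """Return (model, score) with the highest score over the candidate k values.

    fit_and_score: callable taking k and returning (model, score).
    """
    best_model, best_score = None, -float("inf")
    for k in krange:
        model, score = fit_and_score(k)
        if score > best_score:
            best_model, best_score = model, score
    return best_model, best_score

# toy stand-in for "fit a GMM with k components and return its BIC"
def toy_fit_and_score(k):
    return {"k": k}, -(k - 3) ** 2

best_model, best_score = best_by_score([1, 2, 3, 4, 5], toy_fit_and_score)
```

With the toy score, which peaks at k = 3, the loop returns the model built for k = 3.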

nipy.neurospin.clustering.gmm.plot2D(x, my_gmm, z=None, show=0, verbose=0, withDots=True, logScale=False, mpaxes=None)

Given a set of points in a plane and a GMM, plot them.

Parameters:

x: array of shape (npoints,dim=2)

my_gmm: a GMM whose density has to be plotted

z = None: array of shape (npoints)

that gives a labelling of the points in x; by default, it is not taken into account

show = 0: whether to show the image

verbose = 0: verbosity mode

withDots = True: bool

whether to plot the dots

logScale = False: bool

whether to plot the likelihood in log scale

mpaxes = None: int

if not None, axes handle for plotting

Returns:

gd: grid_descriptor instance

that represents the grid used in the function

ax: handle to the figure axes