

.. _example_applications_topics_extraction_with_nmf.py:


========================================================
Topics extraction with Non-Negative Matrix Factorization
========================================================

This is a proof of concept application of Non Negative Matrix
Factorization of the term frequency matrix of a corpus of documents so
as to extract an additive model of the topic structure of the corpus.

The default parameters (n_samples / n_features / n_topics) should make
the example runnable in a couple of tens of seconds. You can try to
increase the dimensions of the problem be ware than the time complexity
is polynomial.

Here are some sample extracted topics that look quite good:

Topic #0:
god people bible israel jesus christian true moral think christians
believe don say human israeli church life children jewish

Topic #1:
drive windows card drivers video scsi software pc thanks vga
graphics help disk uni dos file ide controller work

Topic #2:
game team nhl games ca hockey players buffalo edu cc year play
university teams baseball columbia league player toronto

Topic #3:
window manager application mit motif size display widget program
xlib windows user color event information use events x11r5 values

Topic #4:
pitt gordon banks cs science pittsburgh univ computer soon disease
edu reply pain health david article medical medicine 16


**Python source code:** :download:`topics_extraction_with_nmf.py <topics_extraction_with_nmf.py>`

.. literalinclude:: topics_extraction_with_nmf.py
    :lines: 37-
    