#include <CAcInvertedFile.h>

Public Member Functions | |
| virtual bool | operator() () const =0 |
| for testing if the inverted file is correctly constructed | |
| virtual string | IDToURL (TID inID) const =0 |
| Translate a DocumentID to a URL (for output). | |
| virtual pair< bool, TID > | URLToID (const string &inURL) const =0 |
| Translate a URL to its document ID. | |
| virtual list< TID > * | getAllFeatureIDs () const =0 |
| Getting a list of all features contained in this. | |
| bool | operator() () const |
| for testing if the inverted file is correctly constructed | |
| CAcInvertedFile (const CXMLElement &inCollectionElement) | |
| This opens an existing inverted file, and then inits this structure. | |
| bool | init (bool) |
| called by constructors | |
| ~CAcInvertedFile () | |
| Destructor. | |
| string | IDToURL (TID inID) const |
| Translate a DocumentID to a URL (for output). | |
| TID | URLToID (const string &inURL) const |
| Translate a URL to its document ID. | |
| TID | getMaximumFeatureID () const |
| This is interesting for browsing. | |
| list< TID > * | getAllFeatureIDs () const |
| Getting a list of all features contained in this. | |
The proper inverted file access | |
| virtual CDocumentFrequencyList * | FeatureToList (TFeatureID inFID) const =0 |
| Give the List of documents containing the feature inFID. | |
| virtual CDocumentFrequencyList * | URLToFeatureList (string inURL) const =0 |
| List of features contained by a document with URL inURL. | |
| virtual CDocumentFrequencyList * | DIDToFeatureList (TID inDID) const =0 |
| List of features contained by a document with ID inDID. | |
Accessing information about features | |
| virtual double | FeatureToCollectionFrequency (TFeatureID) const =0 |
| Collection frequency for a given feature. | |
| virtual unsigned int | getFeatureDescription (TID inFeatureID) const =0 |
| What kind of feature is the feature with ID inFeatureID? | |
Accessing additional document information | |
| virtual double | DIDToMaxDocumentFrequency (TID) const =0 |
| returns the maximum document frequency for one document ID | |
| virtual double | DIDToDFSquareSum (TID) const =0 |
| Returns the document-frequency square sum for a given document ID. | |
| virtual double | DIDToSquareDFLogICFSum (TID) const =0 |
| Returns the value this function is named for (the square DF log ICF sum) for a given document ID. | |
| virtual bool | generateInvertedFile ()=0 |
| Generating an inverted File, if there is none. | |
| virtual bool | checkConsistency ()=0 |
| Check the consistency of the inverted file system accessed by this accessor. | |
The proper inverted file access | |
| CDocumentFrequencyList * | FeatureToList (TFeatureID) const |
| List of documents containing the feature. | |
| CDocumentFrequencyList * | URLToFeatureList (string inURL) const |
| List of features contained by a document. | |
| CDocumentFrequencyList * | DIDToFeatureList (TID inDID) const |
| List of features contained by a document with ID inDID. | |
Accessing information about features | |
| double | FeatureToCollectionFrequency (TFeatureID) const |
| Collection frequency for a given feature. | |
| unsigned int | getFeatureDescription (TID inFeatureID) const |
| What kind of feature is the feature with ID inFeatureID? | |
Accessing additional document information | |
| double | DIDToMaxDocumentFrequency (TID) const |
| returns the maximum document frequency for one document ID | |
| double | DIDToDFSquareSum (TID) const |
| Returns the document-frequency square sum for a given document ID. | |
| double | DIDToSquareDFLogICFSum (TID) const |
| Returns the value this function is named for (the square DF log ICF sum) for a given document ID. | |
| bool | generateInvertedFile () |
| Generating an inverted File, if there is none. | |
| bool | newGenerateInvertedFile () |
| Generating an inverted File, if there is none. | |
| bool | checkConsistency () |
| Check the consistency of the inverted file system accessed by this accessor. | |
| bool | findWithinStream (TID inFeatureID, TID inDocumentID, double inDocumentFrequency) const |
| Is the Document with inDocumentID contained in the document frequency list of the feature inFeatureID and is the associated document frequency the same? | |
Protected Types | |
|
typedef hash_map< TID, unsigned int > | CIDToOffset |
| map from feature id to the offset for this feature | |
Protected Member Functions | |
| void | writeOffsetFileElement (TID inFeatureID, int inPosition, ostream &inOpenOffsetFile) |
| add a pair of FeatureID,Offset to the open offset file (helper function for inverted file construction) | |
| CDocumentFrequencyList * | getFeatureFile (string inFileName) const |
| loads a *.fts file. | |
Protected Attributes | |
| TID | mMaximumFeatureID |
| the maximum feature ID arising in this file | |
|
CArraySelfDestroyPointer< char > | mInvertedFileBuffer |
| A buffer, if the inverted file is to be held in ram. | |
| CSelfDestroyPointer< istream > | mInvertedFile |
| The inverted file. | |
| ifstream | mOffsetFile |
| Feature -> Offset in inverted file. | |
| ifstream | mFeatureDescriptionFile |
| File of feature descriptions. | |
| string | mInvertedFileName |
| Name of the inverted file. | |
| string | mOffsetFileName |
| Name of the Offset file. | |
| string | mFeatureDescriptionFileName |
| Name for the file with the feature description. | |
| CIDToOffset | mIDToOffset |
| map from feature id to the offset for this feature | |
| hash_map< TID, double > | mFeatureToCollectionFrequency |
| map from feature to the collection frequency, for fast access | |
| hash_map< TID, unsigned int > | mFeatureDescription |
| map from the feature ID to the feature description | |
| CADIHash | mDocumentInformation |
| additional information about the document, e.g. the euclidean length of the feature list | |
This access is done "by hand" at present. This is not really efficient; however, we plan to move to memory-mapped files.
|
|
This opens an existing inverted file and then initializes this structure. After that it is fully usable. As a parameter it takes an XMLElement which contains a "collection" element and its content. If the attribute vi-generate-inverted-file is true, then a new inverted file will be generated using the parameters given in inCollectionElement. In that case you will NOT be able to use *this afterwards. The REAL constructor. |
|
|
Give the list of documents containing the feature inFID. CORNELIA: CDocumentFrequencyList is nothing more than a list of int,float pairs: struct{ int mID; float mFrequency; } Implemented in CAcIFFileSystem.
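The note above describes the list as a plain sequence of ID/frequency pairs. A minimal sketch of such a structure follows. The names DocumentFrequencyElement, DocumentFrequencyListSketch and frequencyOf are hypothetical and only illustrate the idea; the real CDocumentFrequencyList may be laid out differently.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for one entry of a CDocumentFrequencyList:
// an ID paired with a frequency, as described in the note above.
struct DocumentFrequencyElement {
    int mID;           // document ID (or feature ID, depending on the list)
    float mFrequency;  // frequency associated with that ID
};

// Assumed container type; the real class is not shown on this page.
using DocumentFrequencyListSketch = std::list<DocumentFrequencyElement>;

// Return the frequency stored for inID, or inFallback if inID is absent.
float frequencyOf(const DocumentFrequencyListSketch& inList, int inID,
                  float inFallback = 0.0f) {
    for (const DocumentFrequencyElement& element : inList) {
        if (element.mID == inID) return element.mFrequency;
    }
    return inFallback;
}

// Usage example, written as a small self-check.
void exampleFrequencyLookup() {
    DocumentFrequencyListSketch documents = {{3, 0.5f}, {7, 0.25f}};
    assert(frequencyOf(documents, 7) == 0.25f);
    assert(frequencyOf(documents, 9) == 0.0f);  // absent ID yields the fallback
}
```

A linear scan like this is roughly the kind of check that findWithinStream describes (is a document in the list of a feature, with the expected frequency), although how that member is actually implemented is not stated here.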
|
|
|
Generating an inverted file, if there is none. Fast but stupid in-memory method. This method is very fast if the whole inverted file (and a bit more) can be kept in memory at runtime. If this is not the case, extensive swapping is the result, virtually halting the inverted file creation. Reimplemented in CAcIFFileSystem.
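To make the memory trade-off concrete, here is a minimal sketch of an in-memory inversion. It is not the code of generateInvertedFile; the type names and the function invertInMemory are hypothetical, and the input format (a map from document ID to feature list) is an assumption.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical input: for each document ID, its (feature ID, frequency) pairs.
using FeatureListSketch = std::vector<std::pair<int, float>>;
using DocumentToFeatures = std::map<int, FeatureListSketch>;

// Output: for each feature ID, the (document ID, frequency) pairs containing it.
using InvertedIndexSketch = std::map<int, std::vector<std::pair<int, float>>>;

// Build the whole inverted index in RAM. Every posting is held in memory
// until the end, which is why this approach only works for small collections.
InvertedIndexSketch invertInMemory(const DocumentToFeatures& inDocuments) {
    InvertedIndexSketch index;
    for (const auto& document : inDocuments) {
        for (const auto& feature : document.second) {
            // feature.first is the feature ID, feature.second its frequency.
            index[feature.first].emplace_back(document.first, feature.second);
        }
    }
    return index;
}

// Usage example, written as a small self-check.
void exampleInvertInMemory() {
    DocumentToFeatures documents;
    documents[1] = {{10, 0.5f}, {11, 0.25f}};
    documents[2] = {{10, 0.75f}};
    InvertedIndexSketch index = invertInMemory(documents);
    assert(index[10].size() == 2);  // feature 10 occurs in documents 1 and 2
    assert(index[11].size() == 1);  // feature 11 occurs in document 1 only
}
```

Because the outer map is ordered by document ID, each posting list comes out sorted by document ID without an extra sort step.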
|
|
|
Getting a list of all features contained in this. This function is necessary, because in the present system only about 50 percent of the features are really used. A feature is considered used if it arises in mIDToOffset. Reimplemented in CAcIFFileSystem.
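Since a feature counts as used when it appears in mIDToOffset, collecting all feature IDs amounts to collecting the keys of that map. The sketch below assumes TID is an unsigned integer and uses std::unordered_map in place of hash_map. The function collectUsedFeatureIDs is hypothetical, and it returns the list by value, whereas the documented member returns a pointer.

```cpp
#include <cassert>
#include <list>
#include <unordered_map>

using TID = unsigned int;  // assumption: the real typedef is not shown here

// Stand-in for CIDToOffset (feature ID -> offset in the inverted file).
using IDToOffsetSketch = std::unordered_map<TID, unsigned int>;

// Collect every feature ID that has an offset entry, in ascending order.
std::list<TID> collectUsedFeatureIDs(const IDToOffsetSketch& inIDToOffset) {
    std::list<TID> featureIDs;
    for (const auto& entry : inIDToOffset) {
        featureIDs.push_back(entry.first);
    }
    featureIDs.sort();  // hash maps iterate in unspecified order
    return featureIDs;
}

// Usage example, written as a small self-check.
void exampleCollectUsedFeatureIDs() {
    IDToOffsetSketch offsets;
    offsets[5] = 100;
    offsets[2] = 0;
    std::list<TID> featureIDs = collectUsedFeatureIDs(offsets);
    assert(featureIDs.size() == 2);
    assert(featureIDs.front() == 2 && featureIDs.back() == 5);
}
```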
|
|
|
Getting a list of all features contained in this. This function is necessary, because in the present system only about 50 percent of the features are really used. A feature is considered used if it arises in at least one image. Implemented in CAcIFFileSystem.
|
|
|
Loads a *.fts file and returns the feature list. Reimplemented in CAcIFFileSystem.
|
|
|
Generating an inverted file, if there is none. Employs the two-way merge method described in "Managing Gigabytes", chapter 5.2, "Sort-based inversion" (page 181). Reimplemented in CAcIFFileSystem.
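The core step of a sort-based inversion is merging two already sorted runs of postings into one. The sketch below shows only that merge step, in memory, under the assumption that postings are ordered by feature ID and then by document ID. The Posting struct and the function names are hypothetical; the actual implementation presumably merges runs stored in files.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical posting record: one (feature, document, frequency) triple.
struct Posting {
    int featureID;
    int documentID;
    float frequency;
};

// Order postings by feature ID first, then by document ID.
bool postingLess(const Posting& a, const Posting& b) {
    return std::tie(a.featureID, a.documentID) <
           std::tie(b.featureID, b.documentID);
}

// Merge two runs that are each sorted by postingLess into one sorted run.
std::vector<Posting> mergeRuns(const std::vector<Posting>& inLeft,
                               const std::vector<Posting>& inRight) {
    std::vector<Posting> merged;
    merged.reserve(inLeft.size() + inRight.size());
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < inLeft.size() && j < inRight.size()) {
        // Take from the right run only if it is strictly smaller,
        // so equal keys keep their left-before-right order.
        if (postingLess(inRight[j], inLeft[i])) {
            merged.push_back(inRight[j++]);
        } else {
            merged.push_back(inLeft[i++]);
        }
    }
    while (i < inLeft.size()) merged.push_back(inLeft[i++]);
    while (j < inRight.size()) merged.push_back(inRight[j++]);
    return merged;
}

// Usage example, written as a small self-check.
void exampleMergeRuns() {
    std::vector<Posting> left = {{1, 1, 0.5f}, {3, 2, 0.1f}};
    std::vector<Posting> right = {{1, 2, 0.2f}, {2, 1, 0.3f}};
    std::vector<Posting> merged = mergeRuns(left, right);
    assert(merged.size() == 4);
    assert(merged[0].featureID == 1 && merged[3].featureID == 3);
}
```

Repeating this merge over pairs of runs until a single run remains yields all postings grouped by feature, which is the layout an inverted file needs.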
|
|
|
Additional information about the document, e.g. the euclidean length of the feature list. Reimplemented in CAcIFFileSystem.
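As an illustration of the kind of per-document value mentioned above, the sketch below computes the euclidean length of a feature list from its frequencies. The function names are hypothetical, and representing a feature list as a plain vector of frequencies is an assumption. The sum of squares is presumably related to what DIDToDFSquareSum returns, but this page does not spell that out.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum of squared frequencies of one document's feature list.
double squareSum(const std::vector<double>& inFrequencies) {
    double sum = 0.0;
    for (double frequency : inFrequencies) {
        sum += frequency * frequency;
    }
    return sum;
}

// Euclidean length: the square root of the sum of squares.
double euclideanLength(const std::vector<double>& inFrequencies) {
    return std::sqrt(squareSum(inFrequencies));
}

// Usage example, written as a small self-check.
void exampleEuclideanLength() {
    std::vector<double> frequencies = {3.0, 4.0};
    assert(squareSum(frequencies) == 25.0);
    assert(euclideanLength(frequencies) == 5.0);
}
```

Storing such values once per document avoids rescanning the document's feature list every time a score needs to be normalized.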
|