Classifier Class Reference

Classify a dataset.

#include <Classifier.h>

Public Member Functions

 Classifier (nor_utils::Args &args, int verbose=1)
 The constructor.
void run (const string &dataFileName, const string &shypFileName, const int numRanksEnclosed=2)
 Starts the classification process.

Protected Member Functions

InputData * loadInputData (const string &dataFileName, const string &shypFileName)
 Loads the data.
void computeResults (InputData *pData, vector< BaseLearner * > &weakHypotheses, vector< ExampleResults > &results)
 Compute the results using the weak hypotheses.
double getOverallError (InputData *pData, const vector< ExampleResults > &results, int atLeastRank=0)
 Compute the overall error on the data.
void getClassError (InputData *pData, const vector< ExampleResults > &results, vector< double > &classError, int atLeastRank=0)
 Compute the error per class.

Protected Attributes

int _verbose
 Defines the level of verbosity:
  • 0 = no messages
  • 1 = basic messages
  • 2 = show all messages.

nor_utils::Args & _args
 The arguments defined by the user.
string _outputInfoFile
 The filename of the step-by-step information file that will be updated.

Classes

class  ExampleResults
 Holds the results per example.


Detailed Description

Classify a dataset.

Using the strong hypothesis file (shyp.xml by default), it builds the list of weak hypotheses (or weak learners) and uses them to classify the given data set. The strong hypothesis is the linear combination of the weak hypotheses weighted by their confidences alpha, and is defined as:

\[ {\bf g}(x) = \sum_{t=1}^T \alpha^{(t)} {\bf h}^{(t)}(x), \]

where bold denotes a vector-valued quantity. To obtain a single class, we simply take the winning class, the one that receives the most votes, that is:

\[ f(x) = \mathop{\rm arg\, max}_{\ell} g_\ell(x). \]
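The two formulas above can be illustrated with a minimal sketch. It is not the MultiBoost implementation: `WeakVote`, `strongHypothesis`, and `winningClass` are hypothetical names, and each weak hypothesis is assumed to have already been evaluated on a single example, yielding one vote per class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one weak hypothesis h^(t) evaluated on a single
// example x: its confidence alpha and its vote for each class.
struct WeakVote {
    double alpha;
    std::vector<double> votes;  // one entry per class, at least numClasses long
};

// g(x) = sum_t alpha^(t) * h^(t)(x), computed for one example.
std::vector<double> strongHypothesis(const std::vector<WeakVote>& weak,
                                     std::size_t numClasses) {
    std::vector<double> g(numClasses, 0.0);
    for (const WeakVote& w : weak) {
        for (std::size_t l = 0; l < numClasses; ++l) {
            g[l] += w.alpha * w.votes[l];
        }
    }
    return g;
}

// f(x) = argmax_l g_l(x). Ties resolve to the lowest class index
// (an assumption of this sketch). g must not be empty.
std::size_t winningClass(const std::vector<double>& g) {
    std::size_t best = 0;
    for (std::size_t l = 1; l < g.size(); ++l) {
        if (g[l] > g[best]) {
            best = l;
        }
    }
    return best;
}
```

For example, with three weak hypotheses of confidence 0.5, 1.0 and 0.25 voting (+1, -1, -1), (-1, +1, -1) and (-1, +1, -1), the second class wins.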

Date:
15/11/2005

Definition at line 62 of file Classifier.h.


Constructor & Destructor Documentation

Classifier (nor_utils::Args &args, int verbose=1)
 

The constructor.

It initializes the member variables and sets them using the information provided by the arguments passed. The arguments are parsed using the helpers provided by class Args.

Parameters:
args The arguments defined by the user in the command line.
verbose The level of verbosity.
See also:
_verbose
Date:
16/11/2005

Definition at line 32 of file Classifier.cpp.

References Classifier::_outputInfoFile, Args::getValue(), and Args::hasArgument().


Member Function Documentation

void computeResults (InputData *pData, vector< BaseLearner * > &weakHypotheses, vector< ExampleResults > &results) [protected]
 

Compute the results using the weak hypotheses.

This method is the one that actually computes ${\bf g}(x)$.

Parameters:
pData A pointer to the data to be classified.
weakHypotheses The list of weak hypotheses.
results The vector where the results will be stored.
See also:
ExampleResults
Date:
16/11/2005

Definition at line 178 of file Classifier.cpp.

References Classifier::_outputInfoFile, OutputInfo::endLine(), ClassMappings::getNumClasses(), InputData::getNumExamples(), OutputInfo::outputError(), and OutputInfo::outputIteration().

Referenced by Classifier::ExampleResults::getRankedList(), and Classifier::run().

void getClassError (InputData *pData, const vector< ExampleResults > &results, vector< double > &classError, int atLeastRank=0) [protected]
 

Compute the error per class.

Parameters:
pData A pointer to the data. Needed to get the real class of the example.
results The vector where the results are held.
classError The returned per-class errors.
atLeastRank The maximum rank at which a classification is not considered an error. If atLeastRank = 0, only the first guess counts, so any example whose top-ranked class is wrong is an error. If it is 1, the second "guess" is accepted as well as the first, and so on.
See also:
ExampleResults
Date:
16/11/2005

Definition at line 260 of file Classifier.cpp.

References ClassMappings::getNumClasses().

Referenced by Classifier::ExampleResults::getRankedList(), and Classifier::run().
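A per-class error with a rank tolerance could be sketched as follows. This is an illustration, not the code in Classifier.cpp: `RankedExample` and `perClassError` are hypothetical names, and the sketch assumes that the error of a class is the fraction of its examples whose real class is ranked below atLeastRank.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-example summary: the real class of the example and the
// rank (0 = best guess) at which that class appears in the sorted scores.
struct RankedExample {
    std::size_t trueClass;
    std::size_t rankOfTrueClass;
};

// For each class, the fraction of its examples whose real class is ranked
// worse than atLeastRank. Classes with no examples get an error of 0.
std::vector<double> perClassError(const std::vector<RankedExample>& examples,
                                  std::size_t numClasses,
                                  std::size_t atLeastRank) {
    std::vector<std::size_t> errors(numClasses, 0);
    std::vector<std::size_t> totals(numClasses, 0);
    for (const RankedExample& ex : examples) {
        ++totals[ex.trueClass];
        if (ex.rankOfTrueClass > atLeastRank) {
            ++errors[ex.trueClass];
        }
    }
    std::vector<double> classError(numClasses, 0.0);
    for (std::size_t l = 0; l < numClasses; ++l) {
        if (totals[l] > 0) {
            classError[l] = static_cast<double>(errors[l]) /
                            static_cast<double>(totals[l]);
        }
    }
    return classError;
}
```

Raising atLeastRank can only lower (or keep) each per-class error, since every additional accepted rank turns some errors into correct classifications.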

double getOverallError (InputData *pData, const vector< ExampleResults > &results, int atLeastRank=0) [protected]
 

Compute the overall error on the data.

Parameters:
pData A pointer to the data. Needed to get the real class of the example.
results The vector where the results are held.
atLeastRank The maximum rank at which a classification is not considered an error. If atLeastRank = 0, only the first guess counts, so any example whose top-ranked class is wrong is an error. If it is 1, the second "guess" is accepted as well as the first, and so on.
Returns:
The error.
See also:
ExampleResults
Date:
16/11/2005

Definition at line 238 of file Classifier.cpp.

References InputData::getClass(), and InputData::getNumExamples().

Referenced by Classifier::ExampleResults::getRankedList(), and Classifier::run().
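The role of atLeastRank in the overall error can be illustrated with a small, self-contained sketch. It does not reproduce the code in Classifier.cpp: `rankClasses`, `isCorrectAtRank`, and `overallError` are hypothetical names, and the per-example scores are assumed to be the components of ${\bf g}(x)$.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Class indices sorted by decreasing score g_l(x); rank 0 is the best guess.
// Ties keep the lower class index first (an assumption of this sketch).
std::vector<std::size_t> rankClasses(const std::vector<double>& g) {
    std::vector<std::size_t> order(g.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&g](std::size_t a, std::size_t b) { return g[a] > g[b]; });
    return order;
}

// True when trueClass appears among ranks 0..atLeastRank.
bool isCorrectAtRank(const std::vector<double>& g, std::size_t trueClass,
                     std::size_t atLeastRank) {
    const std::vector<std::size_t> order = rankClasses(g);
    const std::size_t count = std::min(atLeastRank + 1, order.size());
    const auto last = order.begin() + static_cast<std::ptrdiff_t>(count);
    return std::find(order.begin(), last, trueClass) != last;
}

// Fraction of examples whose real class is not among ranks 0..atLeastRank.
double overallError(const std::vector<std::vector<double>>& scores,
                    const std::vector<std::size_t>& trueClasses,
                    std::size_t atLeastRank) {
    if (scores.empty()) {
        return 0.0;
    }
    std::size_t errors = 0;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        if (!isCorrectAtRank(scores[i], trueClasses[i], atLeastRank)) {
            ++errors;
        }
    }
    return static_cast<double>(errors) / static_cast<double>(scores.size());
}
```

With atLeastRank = 0 this reduces to the usual misclassification rate of the arg max rule.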

InputData * loadInputData (const string &dataFileName, const string &shypFileName) [protected]
 

Loads the data.

It needs the strong hypothesis file because that file identifies the weak learner used to generate it. The weak learner may have an associated InputData-derived class, which is returned by BaseLearner::createInputData() once the weak learner has been identified.

Parameters:
dataFileName The file name of the data to be classified.
shypFileName The strong hypothesis filename. It is the xml file containing the list of weak hypotheses that form the strong hypothesis.
Warning:
The returned object must be destroyed by the caller.
Date:
21/11/2005

Definition at line 133 of file Classifier.cpp.

Referenced by Classifier::run().
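The ownership rule in the warning above can be honored in more than one way. The sketch below shows one option under stated assumptions: it uses `std::unique_ptr` (C++11, newer than this code base), and `FakeInputData`, `loadFakeInputData`, and `classifyOnce` are hypothetical stand-ins rather than MultiBoost names.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for InputData that counts live instances, so the
// effect of destruction is observable.
struct FakeInputData {
    static int liveObjects;
    FakeInputData() { ++liveObjects; }
    virtual ~FakeInputData() { --liveObjects; }
};
int FakeInputData::liveObjects = 0;

// Hypothetical factory with the same ownership contract as loadInputData():
// it returns a raw pointer that the caller must destroy.
FakeInputData* loadFakeInputData(const std::string& /*dataFileName*/,
                                 const std::string& /*shypFileName*/) {
    return new FakeInputData();
}

// Takes ownership immediately, so the object is destroyed on every exit
// path. Returns the number of live objects while the data is in use.
int classifyOnce(const std::string& dataFileName,
                 const std::string& shypFileName) {
    std::unique_ptr<FakeInputData> pData(
        loadFakeInputData(dataFileName, shypFileName));
    return FakeInputData::liveObjects;
}  // pData goes out of scope here and deletes the object
```

The virtual destructor matters because the real function may return an object of a class derived from InputData, which is then deleted through the base pointer.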

void run (const string &dataFileName, const string &shypFileName, const int numRanksEnclosed=2)
 

Starts the classification process.

Parameters:
dataFileName The file name of the data to be classified.
shypFileName The strong hypothesis filename. It is the xml file containing the list of weak hypotheses that form the strong hypothesis.
numRanksEnclosed This parameter defines the number of ranks to be printed.
Remarks:
If numRanksEnclosed=1, the only error displayed is the one in which $\mathop{\rm arg\, max}_{\ell} g_\ell(x)$ is not the correct class. If numRanksEnclosed=2, in addition to this standard error, the error in which the real class is neither the largest nor the second-largest component of ${\bf g}(x)$ is also displayed. Larger values of numRanksEnclosed display the errors for further ranks in the same way. This is useful for multi-class problems, where a "second guess" is allowed if the "first guess" was wrong.
Date:
16/11/2005

Definition at line 42 of file Classifier.cpp.

References Classifier::_verbose, Classifier::computeResults(), Classifier::getClassError(), ClassMappings::getClassNameFromIdx(), ClassMappings::getNumClasses(), InputData::getNumExamples(), Classifier::getOverallError(), UnSerialization::loadHypotheses(), and Classifier::loadInputData().
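The remark on numRanksEnclosed can be read as a list of errors, one per accepted rank. The sketch below follows that reading; `errorsPerRank` is a hypothetical helper, not part of the class, and the rank of the real class of each example is assumed to be already known.

```cpp
#include <cstddef>
#include <vector>

// Entry r of the result is the fraction of examples whose real class is
// ranked worse than r, for r = 0 .. numRanksEnclosed - 1.
// rankOfTrueClass[i] is the rank (0 = best guess) of the real class of
// example i among the sorted components of g(x).
std::vector<double> errorsPerRank(const std::vector<std::size_t>& rankOfTrueClass,
                                  std::size_t numRanksEnclosed) {
    std::vector<double> errors(numRanksEnclosed, 0.0);
    if (rankOfTrueClass.empty()) {
        return errors;
    }
    for (std::size_t r = 0; r < numRanksEnclosed; ++r) {
        std::size_t wrong = 0;
        for (std::size_t rank : rankOfTrueClass) {
            if (rank > r) {
                ++wrong;
            }
        }
        errors[r] = static_cast<double>(wrong) /
                    static_cast<double>(rankOfTrueClass.size());
    }
    return errors;
}
```

With the default numRanksEnclosed=2, the result holds two values: the standard error and the error that also accepts the second guess.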


The documentation for this class was generated from the following files:
  • Classifier.h
  • Classifier.cpp
Generated on Mon Nov 28 21:43:47 2005 for MultiBoost by  doxygen 1.4.5