StumpLearner Class Reference

A generic decision stump learner.

#include <StumpLearner.h>

Inherits BaseLearner.

Inherited by MultiStumpLearner and SingleStumpLearner.

Public Member Functions

 StumpLearner ()
 The constructor.
virtual void initOptions (nor_utils::Args &args)
 Set the arguments of the algorithm using the standard interface of the arguments.
virtual void declareArguments (nor_utils::Args &args)
 Declare weak-learner-specific arguments.
virtual InputData * createInputData ()
 Creates an InputData object suitable for the weak learner.
virtual char classify (InputData *pData, const int idx, const int classIdx)
 Return +1 or -1 for the given class and example using the learned classifier, or 0 if the classifier abstains.
virtual void save (ofstream &outputStream, const int numTabs=0)
 Save the current object information needed for classification, that is: _v, the alignment vector, and _selectedColumn, the column of the data that yielded the lowest error.
virtual void load (nor_utils::StreamTokenizer &st)
 Load the XML file that contains the serialized information needed for the classification and that belongs to this class.

Protected Types

enum  eAbstType { ABST_NO_ABSTENTION, ABST_GREEDY, ABST_FULL }
 The type of abstention.

Protected Member Functions

virtual char phi (double val, int classIdx)=0
 A discriminative function.
virtual double getEnergy (vector< sRates > &mu, double &alpha, vector< char > &v)
 Return the energy of the current learner.
virtual double doGreedyAbstention (vector< sRates > &mu, double currEnergy, sRates &eps, double &alpha, vector< char > &v)
 Updates the v vector (alignment vector) using a greedy abstention algorithm.
virtual double doFullAbstention (const vector< sRates > &mu, double currEnergy, sRates &eps, double &alpha, vector< char > &v)
 Updates the v vector (alignment vector) evaluating all the possible combinations.

Protected Attributes

vector< char > _v
 The class-wise abstention/alignment vector.
int _selectedColumn
 The column of the training data with the lowest error.
double _theta
 The value of theta. Default = 0.
eAbstType _abstention
 Activate abstention. Default = 0 (no abstention).
vector< double > _rightErrors
 The class-wise errors on the right side of the threshold.
vector< double > _leftErrors
 The class-wise errors on the left side of the threshold.
vector< double > _bestErrors
 The errors of the best threshold found.
vector< double > _weightsPerClass
 The total weight per class.
vector< double > _halfWeightsPerClass
 Half of the total weight per class.

Classes

struct  sRates
 The per-class rates.

Detailed Description

A generic decision stump learner.

Date:
05/11/05

Definition at line 50 of file StumpLearner.h.


Member Enumeration Documentation

enum eAbstType [protected]
 

The type of abstention.

Date:
28/11/2005
See also:
doGreedyAbstention
doFullAbstention

Enumerator:
ABST_NO_ABSTENTION  No abstention is performed.
ABST_GREEDY  Greedy abstention, whose complexity is O(k^2), where k is the number of classes.
ABST_FULL  Full abstention, whose complexity is O(2^k).

Definition at line 255 of file StumpLearner.h.


Constructor & Destructor Documentation

StumpLearner ()  [inline]
 

The constructor.

It initializes theta to zero and disables abstention (that is, it sets both to their default values).

Date:
11/11/2005

Definition at line 59 of file StumpLearner.h.


Member Function Documentation

char classify (InputData * pData, const int idx, const int classIdx)  [virtual]
 

Return +1 or -1 for the given class and example using the learned classifier, or 0 if the classifier abstains.

Parameters:
pData The pointer to the data
idx The index of the example to classify
classIdx The index of the class
Remarks:
Passing the data and the index of the example is not nice at all. This will soon be replaced with the passing of the example itself in some form (probably a structure describing the example).
Returns:
+1 if the classifier thinks that the example belongs to class classIdx, -1 if it does not, and 0 if it abstains.
Date:
13/11/2005

Implements BaseLearner.

Definition at line 81 of file StumpLearner.cpp.

References StumpLearner::_selectedColumn, StumpLearner::_v, InputData::getValue(), and StumpLearner::phi().
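
The reference list above suggests that classify() combines _v, _selectedColumn, InputData::getValue() and phi(). The following is a minimal sketch of one plausible combination, not the actual implementation: it assumes the result is the product of the alignment entry for the class and phi() evaluated on the selected column. The names phiThreshold and classifySketch, the nested-vector data layout, and the single shared threshold are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the pure virtual phi(): one threshold shared by
// all classes. signed char is used so that -1 is portable.
signed char phiThreshold(double val, double threshold) {
    return val > threshold ? +1 : -1;
}

// data[idx][column] plays the role of InputData::getValue() (assumed layout).
// v is the alignment vector: +1, -1, or 0 (abstain) per class.
signed char classifySketch(const std::vector<std::vector<double>>& data,
                           std::size_t idx, std::size_t classIdx,
                           std::size_t selectedColumn,
                           const std::vector<signed char>& v,
                           double threshold) {
    assert(idx < data.size() && selectedColumn < data[idx].size());
    assert(classIdx < v.size());
    const double val = data[idx][selectedColumn];
    // An entry of 0 in v forces the product to 0, that is, an abstention.
    return static_cast<signed char>(v[classIdx] * phiThreshold(val, threshold));
}
```

Under this assumption, an abstention never needs a special case: a zero entry in the alignment vector cancels whatever phi() returns.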

InputData * createInputData ()  [virtual]
 

Creates an InputData object suitable for the weak learner.

Overridden to return SortedData.

Warning:
The object must be destroyed by the caller.
See also:
InputData
BaseLearner::createInputData()
SortedData

Date:
21/11/2005

Reimplemented from BaseLearner.

Definition at line 74 of file StumpLearner.cpp.

void declareArguments (nor_utils::Args & args)  [virtual]
 

Declare weak-learner-specific arguments.

These arguments will be added to the list of arguments under the group specific to the weak learner. This method is called automatically in main, when the list of arguments is built up. Use it to declare the arguments that belong to the weak learner only.

This class declares the argument "-abstention" only.

Parameters:
args The Args class reference which can be used to declare additional arguments.
Date:
28/11/2005

Implements BaseLearner.

Definition at line 64 of file StumpLearner.cpp.

References Args::declareArgument().

double doFullAbstention (const vector< sRates > & mu, double currEnergy, sRates & eps, double & alpha, vector< char > & v)  [protected, virtual]
 

Updates the v vector (alignment vector) evaluating all the possible combinations.

Again, we do not leave the abstention to the weak learner; instead we set elements of the alignment vector v to 0, choosing the combination that decreases the energy function the most.

Parameters:
mu The class rates.
currEnergy The current energy value, obtained in getEnergy().
eps The current epsilons, that is, the overall error, correct, and abstention rates.
alpha The value of alpha that will be updated by this function minimizing the energy and using the helper function provided by BaseLearner.
v The alignment vector that will be updated in the case of abstention.
Remarks:
The complexity of this algorithm is O(2^k).
Date:
28/11/2005

Definition at line 210 of file StumpLearner.cpp.

References BaseLearner::_smallVal, BaseLearner::getAlpha(), ClassMappings::getNumClasses(), nor_utils::is_zero(), StumpLearner::sRates::rMin, StumpLearner::sRates::rPls, and StumpLearner::sRates::rZero.

Referenced by StumpLearner::getEnergy().
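
To illustrate the exhaustive search described above, here is a minimal sketch under stated assumptions, not the library's code. It assumes that mu[l] holds the rates of class l under the current alignment and that a class set to 0 contributes all of its weight to the abstention rate. The names Rates, energy and fullAbstention are hypothetical, and alpha is left out for brevity. Each of the 2^k subsets is evaluated in O(k) here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for StumpLearner::sRates (field meanings assumed).
struct Rates { double rPls; double rMin; double rZero; };

double energy(const Rates& eps) {
    return 2.0 * std::sqrt(eps.rPls * eps.rMin) + eps.rZero;
}

// Tries every subset of classes to abstain on and keeps the best one.
// Returns the lowest energy found and zeroes the chosen entries of v.
double fullAbstention(const std::vector<Rates>& mu, std::vector<signed char>& v) {
    const std::size_t k = mu.size();
    assert(v.size() == k && k < 20);  // 2^k subsets: only viable for small k

    double best = std::numeric_limits<double>::infinity();
    unsigned long bestMask = 0;
    for (unsigned long mask = 0; mask < (1UL << k); ++mask) {
        Rates eps = {0.0, 0.0, 0.0};
        for (std::size_t l = 0; l < k; ++l) {
            const bool abstain = ((mask >> l) & 1UL) != 0 || v[l] == 0;
            if (abstain) {
                eps.rZero += mu[l].rPls + mu[l].rMin + mu[l].rZero;
            } else {
                eps.rPls += mu[l].rPls;
                eps.rMin += mu[l].rMin;
                eps.rZero += mu[l].rZero;
            }
        }
        const double e = energy(eps);
        if (e < best) { best = e; bestMask = mask; }  // mask 0 wins ties
    }
    for (std::size_t l = 0; l < k; ++l) {
        if ((bestMask >> l) & 1UL) v[l] = 0;
    }
    return best;
}
```

With three classes whose rates are (0.30, 0.02), (0.25, 0.03) and (0.21, 0.19), this sketch abstains only on the third class, the one whose correct and error rates are nearly equal.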

double doGreedyAbstention (vector< sRates > & mu, double currEnergy, sRates & eps, double & alpha, vector< char > & v)  [protected, virtual]
 

Updates the v vector (alignment vector) using a greedy abstention algorithm.

We do not leave the decision to the weak learner as usual; instead we set elements of the alignment vector v to 0. This is done by optimizing the energy value with a greedy algorithm (which, for the time being, has not been proved to be optimal). We first get v from one of the stump algorithms (see for instance SingleStumpLearner::findThreshold()). Then, in an iteration over the classes, we select the "best" element of v to set to 0, that is, the one that decreases the energy the most.

Parameters:
mu The class rates. It is not const because sort() is called.
currEnergy The current energy value, obtained in getEnergy().
eps The current epsilons, that is, the overall error, correct, and abstention rates.
alpha The value of alpha that will be updated by this function minimizing the energy and using the helper function provided by BaseLearner.
v The alignment vector that will be updated in the case of abstention.
Remarks:
The complexity of this algorithm is O(k^2).
Date:
28/11/2005

Definition at line 142 of file StumpLearner.cpp.

References BaseLearner::_smallVal, BaseLearner::getAlpha(), ClassMappings::getNumClasses(), nor_utils::is_zero(), StumpLearner::sRates::rMin, StumpLearner::sRates::rPls, and StumpLearner::sRates::rZero.

Referenced by StumpLearner::getEnergy().
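
The greedy procedure described above can be sketched as follows. This is an illustration under assumptions rather than the actual implementation (which, according to the parameter notes, also calls sort() on mu). It assumes that mu[l] holds the rates of class l under the current alignment and that abstaining on a class moves its weight into the abstention rate. The names Rates, energy and greedyAbstention are hypothetical, and alpha is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for StumpLearner::sRates (field meanings assumed).
struct Rates { double rPls; double rMin; double rZero; };

double energy(const Rates& eps) {
    // The clamp guards against tiny negative products caused by rounding.
    return 2.0 * std::sqrt(std::max(0.0, eps.rPls * eps.rMin)) + eps.rZero;
}

// Returns the final energy and zeroes the chosen entries of v.
double greedyAbstention(const std::vector<Rates>& mu, std::vector<signed char>& v) {
    assert(mu.size() == v.size());
    const std::size_t k = mu.size();

    // Overall rates for the starting v; abstaining classes count as rZero.
    Rates eps = {0.0, 0.0, 0.0};
    for (std::size_t l = 0; l < k; ++l) {
        if (v[l] == 0) {
            eps.rZero += mu[l].rPls + mu[l].rMin + mu[l].rZero;
        } else {
            eps.rPls += mu[l].rPls;
            eps.rMin += mu[l].rMin;
            eps.rZero += mu[l].rZero;
        }
    }
    double best = energy(eps);

    for (std::size_t round = 0; round < k; ++round) {  // at most k rounds
        std::size_t bestClass = k;                      // k means "none"
        Rates bestEps = eps;
        for (std::size_t l = 0; l < k; ++l) {           // up to k trials per round
            if (v[l] == 0) continue;
            Rates trial = eps;                          // move class l to rZero
            trial.rPls -= mu[l].rPls;
            trial.rMin -= mu[l].rMin;
            trial.rZero += mu[l].rPls + mu[l].rMin;
            const double e = energy(trial);
            if (e < best) { best = e; bestClass = l; bestEps = trial; }
        }
        if (bestClass == k) break;                      // no class improves the energy
        v[bestClass] = 0;
        eps = bestEps;
    }
    return best;
}
```

Updating the overall rates incrementally keeps each trial at constant cost, so the two nested loops give the O(k^2) bound quoted in the remarks.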

double getEnergy (vector< sRates > & mu, double & alpha, vector< char > & v)  [protected, virtual]
 

Return the energy of the current learner.

The energy is defined as

\[ 2 \sqrt{\epsilon_+ \epsilon_-} + \epsilon_0 \]

and it is the value to minimize.

Parameters:
alpha The value of alpha that will be updated by this function minimizing the energy and using the helper function provided by BaseLearner.
mu The class rates.
v The alignment vector that will be updated in the case of abstention.
Returns:
The energy value that we want to minimize.
Date:
12/11/2005

Definition at line 88 of file StumpLearner.cpp.

References StumpLearner::_abstention, BaseLearner::_smallVal, StumpLearner::_theta, StumpLearner::ABST_FULL, StumpLearner::ABST_GREEDY, StumpLearner::ABST_NO_ABSTENTION, StumpLearner::doFullAbstention(), StumpLearner::doGreedyAbstention(), BaseLearner::getAlpha(), ClassMappings::getNumClasses(), nor_utils::is_zero(), StumpLearner::sRates::rMin, and StumpLearner::sRates::rPls.
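
As a small worked illustration of the energy formula above, the following sketch evaluates it for a given set of overall rates. The struct Rates is a hypothetical stand-in for sRates: the field names rPls, rMin and rZero appear in the reference lists of this page, but mapping them to epsilon_+, epsilon_- and epsilon_0 is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for StumpLearner::sRates (field meanings assumed).
struct Rates {
    double rPls;   // epsilon_+ (assumed: weighted rate of correct decisions)
    double rMin;   // epsilon_- (assumed: weighted rate of wrong decisions)
    double rZero;  // epsilon_0 (assumed: weighted rate of abstentions)
};

// Energy 2 * sqrt(eps_+ * eps_-) + eps_0 of a set of overall rates.
double energy(const Rates& eps) {
    assert(eps.rPls >= 0.0 && eps.rMin >= 0.0 && eps.rZero >= 0.0);
    return 2.0 * std::sqrt(eps.rPls * eps.rMin) + eps.rZero;
}
```

For example, rates of 0.64, 0.16 and 0.20 give 2 * sqrt(0.1024) + 0.20 = 0.84, while a learner that is always correct and never abstains has energy 0.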

void initOptions (nor_utils::Args & args)  [virtual]
 

Set the arguments of the algorithm using the standard interface of the arguments.

Call this to set the arguments asked by the user.

Parameters:
args The arguments defined by the user in the command line.
Date:
14/11/2005

Reimplemented from BaseLearner.

Definition at line 38 of file StumpLearner.cpp.

References StumpLearner::_abstention, StumpLearner::_theta, StumpLearner::ABST_FULL, StumpLearner::ABST_GREEDY, Args::getValue(), and Args::hasArgument().

void load (nor_utils::StreamTokenizer & st)  [virtual]
 

Load the XML file that contains the serialized information needed for the classification and that belongs to this class.

Parameters:
st The stream tokenizer that returns tags and values as tokens
See also:
save()
Date:
13/11/2005

Reimplemented from BaseLearner.

Reimplemented in MultiStumpLearner, and SingleStumpLearner.

Definition at line 302 of file StumpLearner.cpp.

References StumpLearner::_selectedColumn, StumpLearner::_v, BaseLearner::load(), and UnSerialization::seekAndParseVectorTag().

Referenced by SingleStumpLearner::load(), and MultiStumpLearner::load().

virtual char phi (double val, int classIdx)  [protected, pure virtual]
 

A discriminative function.

Remarks:
Positive or negative do NOT refer to positive or negative classification. This function is equivalent to the phi function in my thesis.
Parameters:
val The value to discriminate
classIdx The index of the class
Returns:
+1 if val is on one side of the border for classIdx and -1 otherwise
Date:
11/11/2005

Implemented in MultiStumpLearner, and SingleStumpLearner.

Referenced by StumpLearner::classify().

void save (ofstream & outputStream, const int numTabs = 0)  [virtual]
 

Save the current object information needed for classification, that is: _v, the alignment vector, and _selectedColumn, the column of the data that yielded the lowest error.

Parameters:
outputStream The stream where the data will be saved
numTabs The number of tabs before the tag. Useful for indentation
Remarks:
To fully save the object, it is very important to also call the super-class method.
See also:
BaseLearner::save()
Date:
13/11/2005

Reimplemented from BaseLearner.

Reimplemented in MultiStumpLearner, and SingleStumpLearner.

Definition at line 286 of file StumpLearner.cpp.

References StumpLearner::_selectedColumn, StumpLearner::_v, BaseLearner::save(), Serialization::standardTag(), and Serialization::vectorTag().

Referenced by SingleStumpLearner::save(), and MultiStumpLearner::save().


Member Data Documentation

vector<char> _v [protected]
 

The class-wise abstention/alignment vector.

It is obtained simply with

\[ v_\ell = \begin{cases} +1 & \mbox{ if } \mu_{\ell+} > \mu_{\ell-}\\ -1 & \mbox{ otherwise.} \end{cases} \]

where $\mu$ are defined in sMu.

See also:
sMu
Date:
11/11/2005

Definition at line 244 of file StumpLearner.h.

Referenced by StumpLearner::classify(), StumpLearner::load(), and StumpLearner::save().
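
The rule for the alignment vector given above can be sketched as follows. This is only an illustration: ClassRates and alignmentVector are hypothetical names, with rPls playing the role of mu_{l+} and rMin the role of mu_{l-}. The sketch uses signed char rather than char so that -1 is represented portably.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-class rates: rPls stands for mu_{l+}, rMin for mu_{l-}.
struct ClassRates { double rPls; double rMin; };

// v[l] = +1 if mu_{l+} > mu_{l-}, and -1 otherwise (ties map to -1).
std::vector<signed char> alignmentVector(const std::vector<ClassRates>& mu) {
    std::vector<signed char> v(mu.size());
    for (std::size_t l = 0; l < mu.size(); ++l) {
        assert(mu[l].rPls >= 0.0 && mu[l].rMin >= 0.0);  // rates are non-negative
        v[l] = static_cast<signed char>(mu[l].rPls > mu[l].rMin ? +1 : -1);
    }
    return v;
}
```

Entries equal to 0 never come out of this rule; under the descriptions on this page they are introduced afterwards by the abstention steps.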


The documentation for this class was generated from the following files:
StumpLearner.h
StumpLearner.cpp
Generated on Mon Nov 28 21:43:48 2005 for MultiBoost by  doxygen 1.4.5