StumpLearner Class Reference

A generic decision stump learner.

#include <StumpLearner.h>


Public Member Functions
	StumpLearner ()
	The constructor.
virtual void	initOptions (nor_utils::Args &args)
	Set the arguments of the algorithm using the standard interface of the arguments.
virtual void	declareArguments (nor_utils::Args &args)
	Declare weak-learner-specific arguments.
virtual InputData *	createInputData ()
	Creates an InputData object suitable for the weak learner.
virtual char	classify (InputData *pData, const int idx, const int classIdx)
	Return +1, -1, or 0 (abstention) for the given class and example using the learned classifier.
virtual void	save (ofstream &outputStream, const int numTabs=0)
	Save the information needed for classification: _v, the alignment vector, and _selectedColumn, the column of the data that yielded the lowest error.
virtual void	load (nor_utils::StreamTokenizer &st)
	Load the xml file that contains the serialized information needed for the classification and that belongs to this class.
Protected Types
enum	eAbstType { ABST_NO_ABSTENTION, ABST_GREEDY, ABST_FULL }
	The type of abstention.
Protected Member Functions
virtual char	phi (double val, int classIdx)=0
	A discriminative function.
virtual double	getEnergy (vector< sRates > &mu, double &alpha, vector< char > &v)
	Return the energy of the current learner.
virtual double	doGreedyAbstention (vector< sRates > &mu, double currEnergy, sRates &eps, double &alpha, vector< char > &v)
	Updates the v vector (alignment vector) using a greedy abstention algorithm.
virtual double	doFullAbstention (const vector< sRates > &mu, double currEnergy, sRates &eps, double &alpha, vector< char > &v)
	Updates the v vector (alignment vector) evaluating all the possible combinations.
Protected Attributes
vector< char >	_v
	The class-wise abstention/alignment vector.
int	_selectedColumn
	The column of the training data with the lowest error.
double	_theta
	The value of theta. Default = 0.
eAbstType	_abstention
	The type of abstention to apply. Default = ABST_NO_ABSTENTION (0, no abstention).
vector< double >	_rightErrors
	The class-wise errors on the right side of the threshold.
vector< double >	_leftErrors
	The class-wise errors on the left side of the threshold.
vector< double >	_bestErrors
	The errors of the best threshold found.
vector< double >	_weightsPerClass
	The total weight per class.
vector< double >	_halfWeightsPerClass
	Half of the total weight per class.
Classes
struct	sRates
	The per-class rates.

Detailed Description

A generic decision stump learner.

Date: 05/11/05

Definition at line 50 of file StumpLearner.h.

Member Enumeration Documentation

enum eAbstType [protected]

The type of abstention.

Date:
28/11/2005

See also:
doGreedyAbstention
doFullAbstention

Enumerator:

ABST_NO_ABSTENTION No abstention is performed.

ABST_GREEDY Greedy abstention, whose complexity is O(k^2), where k is the number of classes.

ABST_FULL Full abstention, whose complexity is O(2^k), where k is the number of classes.

Definition at line 255 of file StumpLearner.h.

Constructor & Destructor Documentation

StumpLearner ( ) [inline]

The constructor.
It initializes theta to zero and the abstention type to ABST_NO_ABSTENTION (that is, their default values).
Date:
11/11/2005

Definition at line 59 of file StumpLearner.h.

Member Function Documentation

char classify ( InputData * pData,

const int idx,

const int classIdx

) [virtual]

Return +1, -1, or 0 (abstention) for the given class and example using the learned classifier.

Parameters:

pData The pointer to the data

idx The index of the example to classify

classIdx The index of the class

Remarks:
Passing the data together with the index of the example is awkward. This will soon be replaced by passing the example itself in some form (probably a structure describing the example).

Returns:
+1 if the classifier decides that the example belongs to class classIdx, -1 if it does not, and 0 if it abstains.

Date:
13/11/2005

Implements BaseLearner.
Definition at line 81 of file StumpLearner.cpp.
References StumpLearner::_selectedColumn, StumpLearner::_v, InputData::getValue(), and StumpLearner::phi().
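The references above suggest how the three possible return values arise. The following is a minimal sketch, not the actual implementation: it assumes that classify() multiplies the alignment entry _v[classIdx] by phi() evaluated on the selected column. StumpSketch, ThresholdStump, the threshold member, and the plain nested vector standing in for InputData are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for StumpLearner; only the members named in the
// reference list above (_v, _selectedColumn, phi) are modelled.
class StumpSketch {
public:
    StumpSketch(std::vector<char> v, int selectedColumn)
        : _v(std::move(v)), _selectedColumn(selectedColumn) {}
    virtual ~StumpSketch() = default;

    // Assumption: the decision is the alignment entry times phi(), so a 0
    // in _v produces an abstention. The nested vector replaces InputData.
    char classify(const std::vector<std::vector<double>>& data,
                  int idx, int classIdx) const {
        assert(static_cast<std::size_t>(classIdx) < _v.size());
        const double val = data[idx][_selectedColumn];
        return static_cast<char>(_v[classIdx] * phi(val, classIdx));
    }

protected:
    virtual char phi(double val, int classIdx) const = 0;

    std::vector<char> _v;
    int _selectedColumn;
};

// Hypothetical concrete stump: one threshold shared by all classes.
class ThresholdStump : public StumpSketch {
public:
    ThresholdStump(std::vector<char> v, int selectedColumn, double threshold)
        : StumpSketch(std::move(v), selectedColumn), _threshold(threshold) {}

protected:
    char phi(double val, int /*classIdx*/) const override {
        return val > _threshold ? 1 : -1;
    }

private:
    double _threshold;
};
```

With v = {+1, -1, 0}, the third class always yields 0, while the first two classes yield phi() and its negation respectively.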

InputData * createInputData ( ) [virtual]

Creates an InputData object suitable for the weak learner.
Overridden to return SortedData.
Warning:
The object must be destroyed by the caller.

See also:
InputData
BaseLearner::createInputData()
SortedData

Date:
21/11/2005

Reimplemented from BaseLearner.
Definition at line 74 of file StumpLearner.cpp.

void declareArguments ( nor_utils::Args & args ) [virtual]

Declare weak-learner-specific arguments.
These arguments are added to the list of arguments under the group specific to the weak learner. The method is called automatically in main when the list of arguments is built. Use it to declare the arguments that belong to the weak learner only.
This class declares the argument "-abstention" only.
Parameters:

args The Args class reference which can be used to declare additional arguments.

Date:
28/11/2005

Implements BaseLearner.
Definition at line 64 of file StumpLearner.cpp.
References Args::declareArgument().

double doFullAbstention ( const vector< sRates > & mu,

double currEnergy,

sRates & eps,

double & alpha,

vector< char > & v

) [protected, virtual]

Updates the v vector (alignment vector) evaluating all the possible combinations.
As with doGreedyAbstention(), abstention is not left to the weak learner: entries of the alignment vector v are set to 0, choosing the combination that decreases the energy function the most.
Parameters:

mu The class rates.

currEnergy The current energy value, obtained in getEnergy().

eps The current epsilons, that is, the overall error, correct, and abstention rates.

alpha The value of alpha that will be updated by this function minimizing the energy and using the helper function provided by BaseLearner.

v The alignment vector that will be updated in the case of abstention.

Remarks:
The complexity of this algorithm is O(2^k).

Date:
28/11/2005

Definition at line 210 of file StumpLearner.cpp.
References BaseLearner::_smallVal, BaseLearner::getAlpha(), ClassMappings::getNumClasses(), nor_utils::is_zero(), StumpLearner::sRates::rMin, StumpLearner::sRates::rPls, and StumpLearner::sRates::rZero.
Referenced by StumpLearner::getEnergy().
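The exhaustive search can be pictured as an enumeration of bitmasks, one bit per class. This is a simplified sketch under stated assumptions, not the library code: Rates is a hypothetical two-field stand-in for sRates, the update of alpha is omitted, and abstaining on a class is assumed to move both its correct and error weight into the abstention rate.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical per-class rates: weight classified correctly (rPls) and
// incorrectly (rMin) once the class is aligned with v.
struct Rates {
    double rPls;
    double rMin;
};

// Energy as documented for getEnergy(): 2 * sqrt(eps+ * eps-) + eps0.
double energy(double epsPls, double epsMin, double epsZero) {
    return 2.0 * std::sqrt(std::max(0.0, epsPls * epsMin)) + epsZero;
}

// Tries every subset of classes to abstain on and keeps the best one.
// Returns the lowest energy; sets v[l] = 0 for each class in that subset.
double fullAbstention(const std::vector<Rates>& mu, std::vector<char>& v) {
    assert(mu.size() == v.size());
    assert(mu.size() < 63);  // the subset must fit in a 64-bit mask
    const std::size_t k = mu.size();
    double bestEnergy = std::numeric_limits<double>::infinity();
    unsigned long long bestMask = 0;

    for (unsigned long long mask = 0; mask < (1ULL << k); ++mask) {
        double epsPls = 0.0, epsMin = 0.0, epsZero = 0.0;
        for (std::size_t l = 0; l < k; ++l) {
            if (mask & (1ULL << l)) {
                epsZero += mu[l].rPls + mu[l].rMin;  // abstain on class l
            } else {
                epsPls += mu[l].rPls;
                epsMin += mu[l].rMin;
            }
        }
        const double e = energy(epsPls, epsMin, epsZero);
        if (e < bestEnergy) {
            bestEnergy = e;
            bestMask = mask;
        }
    }
    for (std::size_t l = 0; l < k; ++l) {
        if (bestMask & (1ULL << l)) v[l] = 0;
    }
    return bestEnergy;
}
```

Because this naive form re-sums the rates for each subset, it does O(k) work per combination; it is meant to show the search space of 2^k combinations rather than to match the stated complexity exactly.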

double doGreedyAbstention ( vector< sRates > & mu,

double currEnergy,

sRates & eps,

double & alpha,

vector< char > & v

) [protected, virtual]

Updates the v vector (alignment vector) using a greedy abstention algorithm.
The abstention decision is not left to the weak learner as usual; instead, entries of the alignment vector v are set to 0. This is done by optimizing the energy value with a greedy algorithm (which, for the time being, has not been proved to be optimal). We first get v from one of the stump algorithms (see for instance SingleStumpLearner::findThreshold()). Then, iterating over the classes, we select the "best" element of v to set to 0, that is, the one that decreases the energy the most.
Parameters:

mu The class rates. It is not const because sort() is called.

currEnergy The current energy value, obtained in getEnergy().

eps The current epsilons, that is, the overall error, correct, and abstention rates.

alpha The value of alpha that will be updated by this function minimizing the energy and using the helper function provided by BaseLearner.

v The alignment vector that will be updated in the case of abstention.

Remarks:
The complexity of this algorithm is O(k^2).

Date:
28/11/2005

Definition at line 142 of file StumpLearner.cpp.
References BaseLearner::_smallVal, BaseLearner::getAlpha(), ClassMappings::getNumClasses(), nor_utils::is_zero(), StumpLearner::sRates::rMin, StumpLearner::sRates::rPls, and StumpLearner::sRates::rZero.
Referenced by StumpLearner::getEnergy().
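The greedy procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the library code: Rates is a hypothetical two-field stand-in for sRates, v is assumed to contain only +1 and -1 on entry, the update of alpha and the sort() call mentioned for mu are omitted, and abstaining on a class is assumed to move both its correct and error weight into the abstention rate.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-class rates: weight classified correctly (rPls) and
// incorrectly (rMin) once the class is aligned with v.
struct Rates {
    double rPls;
    double rMin;
};

// Energy as documented for getEnergy(): 2 * sqrt(eps+ * eps-) + eps0.
double energy(double epsPls, double epsMin, double epsZero) {
    return 2.0 * std::sqrt(std::max(0.0, epsPls * epsMin)) + epsZero;
}

// Greedy abstention sketch. Precondition: v holds only +1 / -1 entries.
// Returns the final energy; sets v[l] = 0 for every class abstained on.
double greedyAbstention(const std::vector<Rates>& mu, std::vector<char>& v) {
    assert(mu.size() == v.size());
    const std::size_t k = mu.size();
    double epsPls = 0.0, epsMin = 0.0, epsZero = 0.0;
    for (const Rates& r : mu) {
        epsPls += r.rPls;
        epsMin += r.rMin;
    }
    double currEnergy = energy(epsPls, epsMin, epsZero);

    // At most k rounds, each scanning k classes: O(k^2) overall.
    for (std::size_t round = 0; round < k; ++round) {
        std::size_t bestIdx = k;  // k means "no improving class found"
        double bestEnergy = currEnergy;
        for (std::size_t l = 0; l < k; ++l) {
            if (v[l] == 0) continue;  // already abstaining on this class
            const double e = energy(epsPls - mu[l].rPls,
                                    epsMin - mu[l].rMin,
                                    epsZero + mu[l].rPls + mu[l].rMin);
            if (e < bestEnergy) {
                bestEnergy = e;
                bestIdx = l;
            }
        }
        if (bestIdx == k) break;  // no single abstention lowers the energy
        epsPls -= mu[bestIdx].rPls;
        epsMin -= mu[bestIdx].rMin;
        epsZero += mu[bestIdx].rPls + mu[bestIdx].rMin;
        v[bestIdx] = 0;
        currEnergy = bestEnergy;
    }
    return currEnergy;
}
```

For two classes with rates (0.5, 0.0) and (0.25, 0.25), the sketch abstains on the second class only: that class carries no information, and dropping it lowers the energy from about 0.866 to 0.5.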

double getEnergy ( vector< sRates > & mu,

double & alpha,

vector< char > & v

) [protected, virtual]

Return the energy of the current learner.
The energy is defined as
$2 \sqrt{\epsilon_+ \epsilon_-} + \epsilon_0$
and it is the value to minimize.
Parameters:

alpha The value of alpha that will be updated by this function minimizing the energy and using the helper function provided by BaseLearner.

mu The class rates.

v The alignment vector that will be updated in the case of abstention.

Returns:
The energy value to be minimized.

Date:
12/11/2005

Definition at line 88 of file StumpLearner.cpp.
References StumpLearner::_abstention, BaseLearner::_smallVal, StumpLearner::_theta, StumpLearner::ABST_FULL, StumpLearner::ABST_GREEDY, StumpLearner::ABST_NO_ABSTENTION, StumpLearner::doFullAbstention(), StumpLearner::doGreedyAbstention(), BaseLearner::getAlpha(), ClassMappings::getNumClasses(), nor_utils::is_zero(), StumpLearner::sRates::rMin, and StumpLearner::sRates::rPls.
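As a quick numeric check of the energy formula, the sketch below evaluates it directly from the three overall rates. The free function energy() and its parameter names are hypothetical; the real method derives the rates from mu and also updates alpha, which is not modelled here.

```cpp
#include <cassert>
#include <cmath>

// Energy as defined above: 2 * sqrt(eps+ * eps-) + eps0, where eps+ is the
// overall correct rate, eps- the error rate and eps0 the abstention rate.
double energy(double epsPls, double epsMin, double epsZero) {
    assert(epsPls >= 0.0 && epsMin >= 0.0 && epsZero >= 0.0);
    return 2.0 * std::sqrt(epsPls * epsMin) + epsZero;
}
```

For example, with eps+ = 0.64, eps- = 0.16, and eps0 = 0.2, the energy is 2 * 0.32 + 0.2 = 0.84. A learner no better than chance without abstention (eps+ = eps- = 0.5) has energy 1, and a perfect one (eps- = 0, eps0 = 0) has energy 0.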

void initOptions ( nor_utils::Args & args ) [virtual]

Set the arguments of the algorithm using the standard interface of the arguments.
Call this to set the arguments asked by the user.
Parameters:

args The arguments defined by the user in the command line.

Date:
14/11/2005

Reimplemented from BaseLearner.
Definition at line 38 of file StumpLearner.cpp.
References StumpLearner::_abstention, StumpLearner::_theta, StumpLearner::ABST_FULL, StumpLearner::ABST_GREEDY, Args::getValue(), and Args::hasArgument().

void load ( nor_utils::StreamTokenizer & st ) [virtual]

Load the xml file that contains the serialized information needed for the classification and that belongs to this class.

Parameters:

st The stream tokenizer that returns tags and values as tokens

See also:
save()

Date:
13/11/2005

Reimplemented from BaseLearner.
Reimplemented in MultiStumpLearner, and SingleStumpLearner.
Definition at line 302 of file StumpLearner.cpp.
References StumpLearner::_selectedColumn, StumpLearner::_v, BaseLearner::load(), and UnSerialization::seekAndParseVectorTag().
Referenced by SingleStumpLearner::load(), and MultiStumpLearner::load().

virtual char phi ( double val,

int classIdx

) [protected, pure virtual]

A discriminative function.

Remarks:
Positive or negative do NOT refer to positive or negative classification. This function is equivalent to the phi function in my thesis.

Parameters:

val The value to discriminate

classIdx The index of the class

Returns:
+1 if val is on one side of the border for classIdx and -1 otherwise

Date:
11/11/2005

Implemented in MultiStumpLearner, and SingleStumpLearner.
Referenced by StumpLearner::classify().

void save ( ofstream & outputStream,

const int numTabs = 0

) [virtual]

Save the information needed for classification: _v, the alignment vector, and _selectedColumn, the column of the data that yielded the lowest error.

Parameters:

outputStream The stream where the data will be saved

numTabs The number of tabs before the tag. Useful for indentation

Remarks:
To fully save the object, the superclass method must also be called.

See also:
BaseLearner::save()

Date:
13/11/2005

Reimplemented from BaseLearner.
Reimplemented in MultiStumpLearner, and SingleStumpLearner.
Definition at line 286 of file StumpLearner.cpp.
References StumpLearner::_selectedColumn, StumpLearner::_v, BaseLearner::save(), Serialization::standardTag(), and Serialization::vectorTag().
Referenced by SingleStumpLearner::save(), and MultiStumpLearner::save().

Member Data Documentation

vector<char> _v [protected]

The class-wise abstention/alignment vector.
It is obtained simply with
$v_\ell = \begin{cases} +1 & \mbox{ if } \mu_{\ell+} > \mu_{\ell-}\\ -1 & \mbox{ otherwise.} \end{cases}$
where the $\mu$ values are defined in sRates.
See also:
sRates

Date:
11/11/2005

Definition at line 244 of file StumpLearner.h.
Referenced by StumpLearner::classify(), StumpLearner::load(), and StumpLearner::save().
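The rule for v translates directly into code. In this minimal sketch, Rates is a hypothetical stand-in for the per-class rates structure, keeping only the two fields the formula needs (named after sRates::rPls and sRates::rMin), and alignmentVector() is an illustrative helper, not a member of the class.

```cpp
#include <vector>

// Hypothetical stand-in for the per-class rates.
struct Rates {
    double rPls;  // mu_{l+}
    double rMin;  // mu_{l-}
};

// Builds v with v[l] = +1 if mu_{l+} > mu_{l-}, and -1 otherwise.
// Ties therefore map to -1, exactly as the case distinction states.
std::vector<char> alignmentVector(const std::vector<Rates>& mu) {
    std::vector<char> v;
    v.reserve(mu.size());
    for (const Rates& r : mu) {
        v.push_back(r.rPls > r.rMin ? 1 : -1);
    }
    return v;
}
```

Entries equal to 0 never come out of this rule; they appear only later, when one of the abstention procedures overwrites an entry.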

The documentation for this class was generated from the following files:

src/WeakLearners/StumpLearner.h
src/WeakLearners/StumpLearner.cpp

Generated on Mon Nov 28 21:43:48 2005 for MultiBoost by Doxygen 1.4.5
