BaseLearner Class Reference

Generic base learner.

#include <BaseLearner.h>

Inheritance diagram for BaseLearner:

StumpLearner, MultiStumpLearner, SingleStumpLearner

Public Member Functions

 BaseLearner ()
 The constructor.
virtual void initOptions (nor_utils::Args &args)
 Set the arguments of the algorithm using the standard interface of the arguments.
virtual void declareArguments (nor_utils::Args &args)=0
 Declare weak-learner-specific arguments.
virtual BaseLearner * create ()=0
 Returns a new object of the derived type.
virtual InputData * createInputData ()
 Creates an InputData object suitable for the weak learner.
virtual void run (InputData *pData)=0
 Run the learner to build the classifier on the given data.
virtual char classify (InputData *pData, const int idx, const int classIdx)=0
 Classify the data on the given example index and class using the learned classifier.
const double getAlpha () const
 Get the value of alpha.
virtual void save (ofstream &outputStream, const int numTabs=0)
 Serialize the object.
virtual void load (nor_utils::StreamTokenizer &st)
 Unserialize the object.

Static Public Member Functions

static LearnersRegs & RegisteredLearners ()
 Map the registered basic learners.

Protected Member Functions

virtual void setSmoothingVal (const double smoothingVal)
 Set the smoothing value for alpha.
virtual double getAlpha (const double error)
 Compute alpha using the error.
virtual double getAlpha (const double eps_min, const double eps_pls)
 Compute alpha with abstention.
virtual double getAlpha (const double eps_min, const double eps_pls, double theta)
 Compute alpha with abstention and theta.

Protected Attributes

double _smoothingVal
 The smoothing value for alpha.
double _alpha
 The confidence in the current learner.
const double _smallVal
 A small value.

Classes

class  LearnersRegs
 Holds the information about the registered learners.

Detailed Description

Generic base learner.

All the weak learners used by AdaBoost should inherit from this one.

Todo:
Add a getAlpha for non-binary (ternary) base-classifiers, using line-search.

Definition at line 51 of file BaseLearner.h.


Constructor & Destructor Documentation

BaseLearner () [inline]
 

The constructor.

It initializes _smallVal to 1E-10 and _alpha to 0.

See also:
_alpha
Date:
11/11/2005

Definition at line 150 of file BaseLearner.h.


Member Function Documentation

virtual char classify (InputData *pData, const int idx, const int classIdx) [pure virtual]
 

Classify the data on the given example index and class using the learned classifier.

Parameters:
pData The pointer to the data.
idx The index of the example to classify.
classIdx The index of the class.
Remarks:
Passing the data together with the index of the example is not ideal. This will soon be replaced by passing the example itself in some form (probably a structure describing the example).
Returns:
+1 if the classifier predicts that the example belongs to class classIdx, -1 if it predicts that it does not, and 0 if it abstains.

Implemented in StumpLearner.

Referenced by OutputInfo::outputEdge(), and AdaBoostLearner::updateWeights().
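
A minimal sketch of how a caller might consume the three-valued result of classify() in an AdaBoost-style weight update. The helper name updatedWeight, the +1/-1 label encoding, and the unnormalized exponential update are assumptions made for illustration; they are not taken from AdaBoostLearner::updateWeights().

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of BaseLearner: returns the unnormalized
// weight of one (example, class) pair after a boosting round.
//   weight     - current weight of the pair
//   alpha      - confidence of the weak learner (see getAlpha())
//   prediction - value returned by classify(): +1, -1, or 0 (abstain)
//   label      - true label, assumed here to be encoded as +1 or -1
double updatedWeight(double weight, double alpha, char prediction, int label)
{
    assert(prediction >= -1 && prediction <= 1);
    assert(label == 1 || label == -1);

    // Correct vote: prediction * label == +1, so the weight shrinks.
    // Wrong vote:   prediction * label == -1, so the weight grows.
    // Abstention:   prediction == 0, so the weight is unchanged.
    return weight * std::exp(-alpha * prediction * label);
}
```

With a positive alpha, an abstaining learner leaves the weight untouched, which is why the 0 return value needs its own treatment when alpha is computed.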

virtual BaseLearner* create () [pure virtual]
 

Returns a new object of the derived type.

For instance, the override of this method in SingleStumpLearner is:

 return new SingleStumpLearner();
For that reason every learner must have a default (no-argument) constructor. Use setArguments() if you must define some parameters of the learner.
Remarks:
It uses the trick described in http://www.parashift.com/c++-faq-lite/serialization.html#faq-36.8 for the auto-registering classes.
See also:
SingleStumpLearner::create()
Date:
14/11/2005

Implemented in MultiStumpLearner, and SingleStumpLearner.
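
The following sketch illustrates the "virtual constructor" pattern that create() relies on. Learner, StumpLike, name() and makeLike() are hypothetical stand-ins introduced so that the example compiles on its own; only the shape of create() mirrors the method documented above.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for BaseLearner, reduced to the factory method (hypothetical).
class Learner {
public:
    virtual ~Learner() {}
    virtual Learner* create() = 0;         // same shape as BaseLearner::create()
    virtual std::string name() const = 0;  // added only for this illustration
};

// Stand-in for a concrete learner such as SingleStumpLearner.
class StumpLike : public Learner {
public:
    StumpLike() {}  // the no-argument constructor that create() depends on
    virtual Learner* create() { return new StumpLike(); }
    virtual std::string name() const { return "StumpLike"; }
};

// Builds a fresh learner of the same dynamic type as the prototype,
// without the caller naming the concrete class.
std::unique_ptr<Learner> makeLike(Learner& prototype)
{
    Learner* fresh = prototype.create();
    assert(fresh != nullptr);  // create() is expected to return a new object
    return std::unique_ptr<Learner>(fresh);
}
```

The caller only holds a reference to the base type, yet obtains a new object of the derived type, which is what lets a boosting loop spawn one fresh weak learner per iteration from a registered prototype.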

InputData * createInputData () [virtual]
 

Creates an InputData object suitable for the weak learner.

Override it if the weak learner requires another type of data to be loaded (which must be an extension of InputData).

Warning:
The object must be destroyed by the caller.
See also:
InputData
Date:
21/11/2005

Reimplemented in StumpLearner.

Definition at line 34 of file BaseLearner.cpp.

Referenced by AdaBoostLearner::run().

virtual void declareArguments (nor_utils::Args &args) [pure virtual]
 

Declare weak-learner-specific arguments.

These arguments will be added to the list of arguments under the group specific to the weak learner. The method is called automatically in main, when the list of arguments is built. Use it to declare the arguments that belong to the weak learner only.

Parameters:
args The Args class reference which can be used to declare additional arguments.
Date:
28/11/2005

Implemented in StumpLearner.

double getAlpha (const double eps_min, const double eps_pls, double theta) [protected, virtual]
 

Compute alpha with abstention and theta.

A helper function to compute the alpha for AdaBoost with abstention and with theta. The formula is:

\[ \alpha = \begin{cases} \ln\left( - \frac{\theta\epsilon^{(t)}_{0}}{2(1+\theta)\epsilon^{(t)}_{-}} + \sqrt{\left(\frac{\theta\epsilon^{(t)}_{0}}{2(1+\theta)\epsilon^{(t)}_{-}}\right)^2 + \frac{(1 - \theta)\epsilon^{(t)}_{+}}{(1 + \theta)\epsilon^{(t)}_{-}}}\right) & \mbox{ if } \epsilon^{(t)}_{-} > 0,\\ \ln\left( \frac{(1-\theta)\epsilon^{(t)}_{+}}{\theta\epsilon^{(t)}_{0}}\right) & \mbox{ if } \epsilon^{(t)}_{-} = 0. \end{cases} \]

Parameters:
eps_min The error rate of the weak learner.
eps_pls The correct rate of the weak learner.
theta The value of theta.
Remarks:
Use this function to update _alpha.

eps_min + eps_pls + eps_zero = 1!

See also:
_alpha
Date:
11/11/2005

Definition at line 55 of file BaseLearner.cpp.

References BaseLearner::_smallVal, BaseLearner::getAlpha(), and nor_utils::is_zero().
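
The two cases of the formula can be sketched as a free function. This is an illustration under stated assumptions, not the code in BaseLearner.cpp: the name alphaWithTheta and the tolerance parameter are hypothetical (the real method is a protected member that refers to _smallVal and nor_utils::is_zero()), and the abstention rate is derived from the constraint that the three rates sum to 1.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical free-function version of the formula above.
double alphaWithTheta(double epsMin, double epsPls, double theta,
                      double tolerance = 1e-10)
{
    assert(epsMin >= 0.0 && epsPls >= 0.0);
    assert(theta >= 0.0 && theta < 1.0);

    // The three rates sum to 1, so the abstention rate is implied.
    const double epsZero = 1.0 - epsMin - epsPls;

    if (epsMin < tolerance) {
        // Second case (no errors). The result is only finite when both
        // theta and epsZero are strictly positive.
        return std::log(((1.0 - theta) * epsPls) / (theta * epsZero));
    }

    // First case (at least some errors).
    const double half = (theta * epsZero) / (2.0 * (1.0 + theta) * epsMin);
    const double ratio = ((1.0 - theta) * epsPls) / ((1.0 + theta) * epsMin);
    return std::log(-half + std::sqrt(half * half + ratio));
}
```

As a sanity check, setting theta to 0 in the first case removes the abstention term and leaves ln(sqrt(eps_pls / eps_min)), which equals the unsmoothed two-argument formula (1/2) log(eps_pls / eps_min).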

double getAlpha (const double eps_min, const double eps_pls) [protected, virtual]
 

Compute alpha with abstention.

A helper function to compute the alpha for AdaBoost with abstention but no theta. The formula is:

\[ \alpha = \frac{1}{2} \log \left( \frac{\epsilon_+ + \delta}{\epsilon_- + \delta} \right) \]

where $\delta$ is a smoothing value that prevents a zero denominator when the weak learner makes no errors (eps_min == 0).

Parameters:
eps_min The error rate of the weak learner.
eps_pls The correct rate of the weak learner.
Remarks:
Use this function to update _alpha.

eps_min + eps_pls + eps_zero = 1!

See also:
_alpha

setSmoothingVal

Date:
11/11/2005

Definition at line 48 of file BaseLearner.cpp.

References BaseLearner::_smoothingVal.
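
A minimal sketch of the smoothed formula. The function name alphaWithAbstention is hypothetical, and the smoothing value is passed as a parameter here, whereas the documented method reads it from the _smoothingVal member.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical free-function version of the smoothed formula above.
//   epsMin    - error rate of the weak learner
//   epsPls    - correct rate of the weak learner
//   smoothing - the delta term; stands in for _smoothingVal
double alphaWithAbstention(double epsMin, double epsPls, double smoothing)
{
    assert(epsMin >= 0.0 && epsPls >= 0.0 && smoothing >= 0.0);
    return 0.5 * std::log((epsPls + smoothing) / (epsMin + smoothing));
}
```

For a learner that makes no errors (eps_min = 0, eps_pls = 1), a smoothing value of 0.01 keeps the result finite at (1/2) ln(101), about 2.31, instead of dividing by zero.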

double getAlpha (const double error) [protected, virtual]
 

Compute alpha using the error.

A helper function to compute the alpha for the basic AdaBoost with no abstention and no theta. The formula is:

\[ \alpha = \frac{1}{2} \log \left( \frac{1-error}{error} \right) \]

Parameters:
error The error of the weak learner.
Remarks:
Use this function to update _alpha.
See also:
_alpha
Date:
11/11/2005

Definition at line 41 of file BaseLearner.cpp.
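
As a worked instance of the formula above (assuming the natural logarithm, as in the variant with theta), a weak learner with an error of 0.2 gets

\[ \alpha = \frac{1}{2} \log \left( \frac{1-0.2}{0.2} \right) = \frac{1}{2} \log 4 = \log 2 \approx 0.693 \]

An error of 0.5, which is no better than chance for a binary decision, gives $\alpha = \frac{1}{2} \log 1 = 0$, and alpha grows without bound as the error approaches 0.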

const double getAlpha () const [inline]
 

Get the value of alpha.

This must be computed by the algorithm in run()!

Returns:
The value of alpha.
Remarks:
You can use one of the helper functions listed under "See also" to update _alpha.
Date:
11/11/2005
See also:
getAlpha(double)

getAlpha(double, double)

getAlpha(double, double, double)

Definition at line 236 of file BaseLearner.h.

References BaseLearner::_alpha.

Referenced by StumpLearner::doFullAbstention(), StumpLearner::doGreedyAbstention(), BaseLearner::getAlpha(), StumpLearner::getEnergy(), and AdaBoostLearner::updateWeights().

virtual void initOptions (nor_utils::Args &args) [inline, virtual]
 

Set the arguments of the algorithm using the standard interface of the arguments.

Call this to set the arguments asked by the user.

Parameters:
args The arguments defined by the user in the command line.
Remarks:
At this level the method does nothing. It is overridden (if necessary) in the derived classes.
Date:
14/11/2005

Reimplemented in StumpLearner.

Definition at line 160 of file BaseLearner.h.

void load (nor_utils::StreamTokenizer &st) [virtual]
 

Unserialize the object.

This method loads the information needed for classification from the XML file that has been read into a StreamTokenizer object.

Parameters:
st The stream tokenizer that returns tags and values as tokens.
See also:
save
Date:
13/11/2005

Reimplemented in MultiStumpLearner, SingleStumpLearner, and StumpLearner.

Definition at line 91 of file BaseLearner.cpp.

References BaseLearner::_alpha.

Referenced by StumpLearner::load().

static LearnersRegs& RegisteredLearners () [inline, static]
 

Map the registered basic learners.

This data is updated statically just by adding the macro REGISTER_LEARNER(X) where X is the name of the learner (which must match the class name) in the .cpp file. Example (in file StumpLearner.cpp):

 REGISTER_LEARNER(SingleStumpLearner)
Remarks:
Only non-abstract classes must be registered!

To prevent the "static initialization order fiasco" I am using the trick described in http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.13

See also:
create()

LearnersRegs

Date:
14/11/2005

Definition at line 134 of file BaseLearner.h.

Referenced by AdaBoostLearner::AdaBoostLearner(), and AdaBoostLearner::run().
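
A sketch of how static registration can be combined with the "construct on first use" idiom mentioned in the remarks. Everything here is a hypothetical stand-in: the real registry type is LearnersRegs and the real macro is REGISTER_LEARNER, and their internals are not shown on this page. The names Learner, Registry, registeredLearners, Registrar, REGISTER_SKETCH and DemoLearner are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for BaseLearner, reduced to the factory method.
class Learner {
public:
    virtual ~Learner() {}
    virtual Learner* create() = 0;
};

typedef std::map<std::string, Learner*> Registry;

// Construct on first use: the map is created the first time this function
// is called, so it already exists whenever a static registrar runs.
inline Registry& registeredLearners()
{
    static Registry* registry = new Registry();
    return *registry;
}

// A static object of this type registers one prototype before main() starts.
struct Registrar {
    Registrar(const std::string& name, Learner* prototype)
    {
        assert(prototype != 0);
        registeredLearners()[name] = prototype;
    }
};

// One possible expansion of a registration macro (assumption).
#define REGISTER_SKETCH(X) \
    static Registrar registrar_##X(#X, new X());

// Only concrete (non-abstract) classes can be instantiated as prototypes.
class DemoLearner : public Learner {
public:
    virtual Learner* create() { return new DemoLearner(); }
};

REGISTER_SKETCH(DemoLearner)
```

Because the registry lives behind a function rather than in a namespace-scope object, the order in which translation units initialize their statics no longer matters, and a learner can later be looked up by name and cloned through create().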

virtual void run (InputData *pData) [pure virtual]
 

Run the learner to build the classifier on the given data.

Parameters:
pData The pointer to the data.
Warning:
This function must update _alpha too! You can use the helper functions (the getAlpha with parameters) to update it.
See also:
getAlpha(double)

getAlpha(double, double)

getAlpha(double, double, double)

Implemented in MultiStumpLearner, and SingleStumpLearner.

void save (ofstream &outputStream, const int numTabs=0) [virtual]
 

Serialize the object.

The object information needed for classification is saved in XML format. Derived classes should override this method, call the superclass implementation first, and then serialize their own data.

Parameters:
outputStream The stream where the data will be saved.
numTabs The number of tabs before the tag. Useful for indentation.
Remarks:
At this level only _alpha is saved.
See also:
load
Date:
13/11/2005

Reimplemented in MultiStumpLearner, SingleStumpLearner, and StumpLearner.

Definition at line 83 of file BaseLearner.cpp.

References BaseLearner::_alpha, and Serialization::standardTag().

Referenced by Serialization::appendHypothesis(), and StumpLearner::save().
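
The "superclass first" convention can be sketched as follows. Learner and ThresholdLearner are hypothetical stand-ins, the tag names and layout are invented for illustration (the real method delegates to Serialization::standardTag(), whose output is not shown on this page), and the sketch takes std::ostream rather than ofstream so that it can also write to an in-memory stream.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for BaseLearner: at this level only alpha is saved.
class Learner {
public:
    Learner() : _alpha(0.0) {}
    virtual ~Learner() {}
    virtual void save(std::ostream& out, int numTabs = 0)
    {
        assert(numTabs >= 0);
        // Hypothetical tag format, indented by numTabs tab characters.
        out << std::string(numTabs, '\t') << "<alpha>" << _alpha << "</alpha>\n";
    }
protected:
    double _alpha;
};

// Stand-in for a derived learner with one extra field to serialize.
class ThresholdLearner : public Learner {
public:
    ThresholdLearner(double alpha, double threshold) : _threshold(threshold)
    {
        _alpha = alpha;
    }
    virtual void save(std::ostream& out, int numTabs = 0)
    {
        Learner::save(out, numTabs);  // superclass data first
        out << std::string(numTabs, '\t')
            << "<threshold>" << _threshold << "</threshold>\n";
    }
private:
    double _threshold;
};
```

Writing the base-class fields first means a matching load() can read them in the same order before handing over to the derived class.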

virtual void setSmoothingVal (const double smoothingVal) [inline, protected, virtual]
 

Set the smoothing value for alpha.

It is used with the formula that computes alpha without regularization. To apply smoothing as described in the paper "Improved Boosting Algorithms using Confidence-rated Predictions", page 11 (http://www.cs.princeton.edu/~schapire/uncompress-papers.cgi/SchapireSi98.ps), the value should be set to 1/n, where n is the number of examples.

Parameters:
smoothingVal The new smoothing value.
See also:
getAlpha(double, double)
Date:
22/11/2005

Definition at line 274 of file BaseLearner.h.

References BaseLearner::_smoothingVal.


Member Data Documentation

double _smoothingVal [protected]
 

The smoothing value for alpha.

See also:
setSmoothingVal
Date:
22/11/2005

Definition at line 347 of file BaseLearner.h.

Referenced by BaseLearner::getAlpha(), and BaseLearner::setSmoothingVal().


The documentation for this class was generated from the following files:
BaseLearner.h
BaseLearner.cpp
Generated on Mon Nov 28 21:43:48 2005 for MultiBoost by  doxygen 1.4.5