InputData Class Reference

Handles the data.

#include <InputData.h>

Inherited by SortedData.

Public Member Functions

 InputData ()
 The constructor.
virtual ~InputData ()
 Destructor of the class InputData.
virtual void initOptions (nor_utils::Args &args)
 Set the arguments of the algorithm using the standard interface of the arguments.
virtual void load (const string &fileName, const eInputType inputType=IT_TRAIN, const int verboseLevel=1)
 Load the given file.
virtual const int getClass (const int idx) const
 Gets the label of the given example.
virtual const double getValue (const int idx, const int columnIdx) const
 Get the value of the example idx and column columnIdx.
virtual const int getBinaryClass (const int idx, const int classIdx) const
 Gets the binary label of the given example.
virtual double getWeight (const int idx, const int classIdx) const
 Return the weight of class classIdx and example idx.
virtual void setWeight (const int idx, const int classIdx, const double value)
 Set the weight of class classIdx for example idx.
int getNumColumns () const
 Returns the number of columns.
int getNumExamples () const
 Returns the number of examples.
int getNumExamplesPerClass (const int classIdx) const
 Get the number of examples per class.

Protected Member Functions

virtual void initWeights ()
 Initialize the weights.

Protected Attributes

int _numColumns
 The number of columns (dimensions).
int _numExamples
 The number of examples.
vector< int > _nExamplesPerClass
 The number of examples per class.
bool _hasFileName
 true if each example has a filename as the very first column
bool _classInLastColumn
 true if the class is in the last column instead of the first
vector< Example > _data
 The data of the examples.

Classes

struct  Example
 Holds the data of a single example.

Detailed Description

Handles the data.

This class holds not only the data but also the weights on examples and labels. If necessary, it also stores the sorted data (for decision stump algorithms).

Here is an example of valid data (note: in this case the argument -hasfilename has been provided!):

/home/music/classical/classical.00078.au	classical	5.72939e+01	2.95128e+02	6.43395e+00
/home/music/disco/disco.00078.au	disco	1.98315e+02	1.31341e+03	-6.15398e+00
/home/music/reggae/reggae.00022.au	reggae	2.51418e+02	7.68241e+02	-5.66704e+00
/home/music/hiphop/hiphop.00080.au	hiphop	2.62773e+02	4.83971e+02	8.80924e-01
/home/music/rock/rock.00015.au	rock	2.03546e+02	9.31192e+02	-7.56387e+00	1.15847e+02
/home/music/hiphop/hiphop.00027.au	hiphop	2.37860e+02	1.03110e+03	2.50052e-01
/home/music/rock/rock.00094.au	rock	2.48359e+02	1.69432e+02	-1.66508e+01
Remarks:
Important: The columns must be separated by tabs, not by arbitrary whitespace, because spaces are allowed inside the class name and the filename.
Date:
05/11/2005

Definition at line 81 of file InputData.h.
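
The tab-separated layout described in the remarks above can be illustrated with a minimal sketch. It is not the code used by load(): the names ParsedLine, splitOnTabs and parseLine are hypothetical, and the sketch assumes the class sits in the first column after the optional filename (the _classInLastColumn case is not handled).

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for this sketch; it is not the library's Example struct.
struct ParsedLine {
    std::string fileName;        // stays empty when there is no filename column
    std::string className;
    std::vector<double> values;
};

// Split a line on tab characters only, so spaces inside a field are preserved.
std::vector<std::string> splitOnTabs(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream stream(line);
    while (std::getline(stream, field, '\t')) {
        fields.push_back(field);
    }
    return fields;
}

// Parse one line laid out as: [filename] TAB class TAB value TAB value ...
// std::stod throws std::invalid_argument or std::out_of_range on a bad number.
ParsedLine parseLine(const std::string& line, bool hasFileName) {
    const std::vector<std::string> fields = splitOnTabs(line);
    ParsedLine parsed;
    std::size_t next = 0;
    if (hasFileName && next < fields.size()) {
        parsed.fileName = fields[next++];
    }
    if (next < fields.size()) {
        parsed.className = fields[next++];
    }
    for (; next < fields.size(); ++next) {
        parsed.values.push_back(std::stod(fields[next]));
    }
    return parsed;
}
```

Splitting on '\t' alone is what lets a class name such as "hip hop" or a path containing spaces survive as a single field.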


Constructor & Destructor Documentation

InputData () [inline]
 

The constructor.

It does nothing but initialize some variables.

Date:
12/11/2005

Definition at line 89 of file InputData.h.

~InputData () [virtual]
 

Destructor of the class InputData.

It frees the memory allocated for the Example structures.

See also:
Example::pValues
Date:
12/11/2005

Definition at line 50 of file InputData.cpp.

References InputData::_data, and InputData::_numExamples.


Member Function Documentation

virtual const int getBinaryClass (const int idx, const int classIdx) const [inline, virtual]
 

Gets the binary label of the given example.

It is defined as

\[ y_{i,\ell} = \begin{cases} +1 & \mbox{ if $x_i$ belongs to class $\ell$} \\ -1 & \mbox{ otherwise}. \end{cases} \]

Parameters:
idx The index of the example
classIdx The index of the class
Returns:
The binary label (+1 or -1) of the example idx with respect to the class classIdx
Date:
10/11/2005

Definition at line 147 of file InputData.h.

References InputData::_data.

Referenced by OutputInfo::outputEdge(), and AdaBoostLearner::updateWeights().
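
The definition of the binary label above can be sketched as a small free function. This is an illustration only, not the body of getBinaryClass(): the function name binaryLabel is hypothetical, and the vector of class indices stands in for whatever the class stores in _data.

```cpp
#include <vector>

// Return +1 if example idx belongs to class classIdx, and -1 otherwise.
// classes[i] is assumed to hold the class index of example i;
// at() throws std::out_of_range for an invalid idx.
int binaryLabel(const std::vector<int>& classes, int idx, int classIdx) {
    return classes.at(idx) == classIdx ? +1 : -1;
}
```

For every example, exactly one class yields +1 and all other classes yield -1.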

virtual const int getClass (const int idx) const [inline, virtual]
 

Gets the label of the given example.

Parameters:
idx The index of the example
Returns:
The class of the example idx
Date:
10/11/2005

Definition at line 125 of file InputData.h.

References InputData::_data.

Referenced by SingleStumpLearner::findThreshold(), MultiStumpLearner::findThresholds(), and Classifier::getOverallError().

int getNumExamplesPerClass (const int classIdx) const [inline]
 

Get the number of examples per class.

Parameters:
classIdx The index of the class.
Date:
11/11/2005

Definition at line 176 of file InputData.h.

References InputData::_nExamplesPerClass.

virtual const double getValue (const int idx, const int columnIdx) const [inline, virtual]
 

Get the value of the example idx and column columnIdx.

Parameters:
idx The index of the example
columnIdx The index of the column
Date:
11/11/2005

Definition at line 133 of file InputData.h.

References InputData::_data.

Referenced by StumpLearner::classify().

virtual double getWeight (const int idx, const int classIdx) const [inline, virtual]
 

Return the weight of class classIdx and example idx.

Parameters:
idx The index of the example
classIdx The index of the class
Date:
10/11/2005

Definition at line 156 of file InputData.h.

References InputData::_data.

Referenced by SingleStumpLearner::findThreshold(), MultiStumpLearner::findThresholds(), OutputInfo::outputEdge(), and AdaBoostLearner::updateWeights().

void initOptions (nor_utils::Args &args) [virtual]
 

Set the arguments of the algorithm using the standard interface of the arguments.

Call this to set the arguments asked by the user.

Parameters:
args The arguments defined by the user on the command line.
Date:
14/11/2005

Definition at line 56 of file InputData.cpp.

References InputData::_classInLastColumn, InputData::_hasFileName, and Args::hasArgument().

Referenced by AdaBoostLearner::run().

void initWeights () [protected, virtual]
 

Initialize the weights.

The weights initialization formula is defined as:

\[ w_{i,\ell}^{(1)} = \begin{cases} \frac{1}{2n} & \mbox{ if $\ell$ is the correct class (if $y_{i,\ell} = 1$),} \\ \frac{1}{2n(k-1)} & \mbox{ otherwise (if $y_{i,\ell} = -1$).} \end{cases} \]

where $n$ is the number of examples and $k$ the number of classes.

See also:
Example

_data

Date:
11/11/2005

Definition at line 187 of file InputData.cpp.

References InputData::_data, InputData::_numExamples, and ClassMappings::getNumClasses().
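
The initialization formula above can be sketched as follows. This is a hedged illustration rather than the actual body of initWeights(): the function name initialWeights and the nested-vector layout are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build the initial weight table: weights[i][l] is the weight of example i
// for class l. classes[i] is assumed to hold the correct class of example i.
std::vector<std::vector<double>> initialWeights(const std::vector<int>& classes,
                                                int numClasses) {
    assert(numClasses >= 2);   // k - 1 appears in a denominator
    assert(!classes.empty());  // n appears in a denominator

    const double n = static_cast<double>(classes.size());
    const double k = static_cast<double>(numClasses);
    const double correctWeight = 1.0 / (2.0 * n);            // y = +1
    const double wrongWeight = 1.0 / (2.0 * n * (k - 1.0));  // y = -1

    // Start with the "wrong class" weight everywhere, then overwrite the
    // single correct class of each example.
    std::vector<std::vector<double>> weights(
        classes.size(),
        std::vector<double>(static_cast<std::size_t>(numClasses), wrongWeight));
    for (std::size_t i = 0; i < classes.size(); ++i) {
        assert(classes[i] >= 0 && classes[i] < numClasses);
        weights[i][static_cast<std::size_t>(classes[i])] = correctWeight;
    }
    return weights;
}
```

With this choice the correct labels carry a total weight of 1/2 and the incorrect labels carry the other 1/2, so all weights sum to 1.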

void load (const string &fileName, const eInputType inputType=IT_TRAIN, const int verboseLevel=1) [virtual]
 

Load the given file.

Parameters:
fileName The name of the file to be loaded
inputType The type of input.
verboseLevel The level of verbosity.
See also:
eInputType
Date:
08/11/2005

Reimplemented in SortedData.

Definition at line 71 of file InputData.cpp.

Referenced by SortedData::load(), and AdaBoostLearner::run().

virtual void setWeight (const int idx, const int classIdx, const double value) [inline, virtual]
 

Set the weight of class classIdx for example idx.

Parameters:
idx The index of the example
classIdx The index of the class
value The new value for the weight
Date:
13/11/2005

Definition at line 165 of file InputData.h.

References InputData::_data.

Referenced by AdaBoostLearner::updateWeights().


The documentation for this class was generated from the following files:
InputData.h
InputData.cpp
Generated on Mon Nov 28 21:43:48 2005 for MultiBoost by  doxygen 1.4.5