This class implements a decision stump.
DecisionStump (const MatType &data, const arma::Row< size_t > &labels, const size_t classes, size_t inpBucketSize)
Constructor.
DecisionStump (const DecisionStump<> &ds)
const arma::Col< size_t > BinLabels () const
Access the labels for each split bin.
arma::Col< size_t > & BinLabels ()
Modify the labels for each split bin (be careful!).
void Classify (const MatType &test, arma::Row< size_t > &predictedLabels)
Classification function.
const arma::vec & Split () const
Access the splitting values.
arma::vec & Split ()
Modify the splitting values (be careful!).
int SplitAttribute () const
Access the splitting attribute.
int & SplitAttribute ()
Modify the splitting attribute (be careful!).
template<typename AttType , typename LabelType > double CalculateEntropy (arma::subview_row< LabelType > labels)
Calculate the entropy of the given attribute.
template<typename rType > rType CountMostFreq (const arma::Row< rType > &subCols)
Count the most frequently occurring element in subCols.
template<typename rType > int IsDistinct (const arma::Row< rType > &featureRow)
Returns 1 if the values of featureRow are not all the same.
void MergeRanges ()
After the 'split' matrix has been set up, merge ranges with identical class labels.
double SetupSplitAttribute (const arma::rowvec &attribute, const arma::Row< size_t > &labels)
Sets up attribute as if it were splitting on it and finds entropy when splitting on attribute.
template<typename rType > void TrainOnAtt (const arma::rowvec &attribute, const arma::Row< size_t > &labels)
After having decided the attribute on which to split, train on that attribute.
arma::Col< size_t > binLabels
Stores the labels for each splitting bin.
size_t bucketSize
Size of bucket while determining splitting criterion.
size_t numClass
Stores the number of classes.
arma::vec split
Stores the splitting values after training.
int splitAttribute
Stores the value of the attribute on which to split.
This class implements a decision stump, i.e., a single-level decision tree. It uses entropy to decide splitting ranges.
The stump is parameterized by a splitting attribute (the dimension on which points are split), a vector of bin split values, and a vector of labels for each bin. Bin i is specified by the range [split[i], split[i + 1]). The last bin has no upper bound (split[i + 1] does not exist in that case). Points that are below the first bin will take the label of the first bin.
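The bin rule described above can be sketched in standalone C++. This is an illustration only, not mlpack code: `FindBin` is a hypothetical helper, and it assumes the boundaries are sorted in ascending order and held in a `std::vector` rather than an `arma::vec`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): returns the index of the bin
// that contains `value`, given ascending bin boundaries in `split`.
std::size_t FindBin(const std::vector<double>& split, double value)
{
  // Locate the first boundary strictly greater than `value`.
  const auto it = std::upper_bound(split.begin(), split.end(), value);

  // Values below split[0] take the first bin.
  if (it == split.begin())
    return 0;

  // Otherwise the bin starts at the last boundary <= value. When `it` is
  // split.end(), this is the last bin, which has no upper bound.
  return static_cast<std::size_t>(it - split.begin()) - 1;
}
```

With boundaries {1.0, 3.0, 5.0}, the values 0.5 and 2.9 both land in bin 0, the value 3.0 lands in bin 1, and any value of 5.0 or more lands in bin 2.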
Template Parameters:
MatType Type of matrix that is being used (sparse or dense).
Definition at line 44 of file decision_stump.hpp.
Constructor. Train on the provided data, generating a decision stump from it.
Parameters:
data Input, training data.
labels Labels of training data.
classes Number of distinct classes in labels.
inpBucketSize Minimum size of bucket when splitting.
Access the labels for each split bin.
Definition at line 100 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::binLabels.
Modify the labels for each split bin (be careful!).
Definition at line 102 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::binLabels.
Calculate the entropy of the given attribute.
Parameters:
attribute The attribute of which we calculate the entropy.
labels Corresponding labels of the attribute.
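As a rough illustration of an entropy computation over labels, the sketch below computes the Shannon entropy of a label vector. The function name `LabelEntropy`, the base-2 logarithm, and the use of `std::vector` are assumptions made for this example; the sign convention and logarithm base used inside mlpack may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not mlpack code): Shannon entropy, in bits, of a
// label vector. Labels are assumed to lie in [0, numClasses).
double LabelEntropy(const std::vector<std::size_t>& labels,
                    std::size_t numClasses)
{
  if (labels.empty())
    return 0.0;

  // Count how often each class occurs.
  std::vector<std::size_t> counts(numClasses, 0);
  for (std::size_t label : labels)
    ++counts[label];

  const double n = static_cast<double>(labels.size());
  double entropy = 0.0;
  for (std::size_t count : counts)
  {
    if (count == 0)
      continue;  // A class that never occurs contributes nothing.
    const double p = static_cast<double>(count) / n;
    entropy -= p * std::log2(p);
  }
  return entropy;
}
```

A vector holding two labels of each of two classes has entropy 1 bit, and a vector holding a single class has entropy 0.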
Classification function. After training, classify test, and put the predicted classes in predictedLabels.
Parameters:
test Testing data or data to classify.
predictedLabels Vector to store the predicted classes after classifying test data.
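Classification with a trained stump amounts to reading one attribute of each point and looking up the label of the bin it falls in. The sketch below shows that idea in standalone C++; `StumpSketch` and `ClassifySketch` are hypothetical names, the fields only mirror the documented members, and each point is stored as its own `std::vector` rather than as a column of an Armadillo matrix.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a trained stump (not the mlpack class).
struct StumpSketch
{
  std::size_t splitAttribute;          // Dimension used for splitting.
  std::vector<double> split;           // Ascending bin boundaries.
  std::vector<std::size_t> binLabels;  // One label per bin.
};

// Each inner vector is one point; returns one predicted label per point.
std::vector<std::size_t> ClassifySketch(
    const StumpSketch& stump,
    const std::vector<std::vector<double>>& points)
{
  std::vector<std::size_t> predicted;
  predicted.reserve(points.size());

  for (const auto& point : points)
  {
    const double value = point[stump.splitAttribute];

    // Find the bin: the last boundary <= value, or bin 0 if the value
    // lies below every boundary.
    const auto it =
        std::upper_bound(stump.split.begin(), stump.split.end(), value);
    const std::size_t bin = (it == stump.split.begin())
        ? 0
        : static_cast<std::size_t>(it - stump.split.begin()) - 1;

    predicted.push_back(stump.binLabels[bin]);
  }
  return predicted;
}
```

Only the splitting attribute influences the prediction; every other dimension of a point is ignored.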
Count the most frequently occurring element in subCols.
Parameters:
subCols The vector in which to find the most frequently occurring element.
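One way to find the most frequent element is to tally occurrences in an ordered map. The sketch below is a hypothetical standalone version named `MostFrequent`; breaking ties in favour of the smallest value and returning a default-constructed value for empty input are choices made for this example, not documented mlpack behaviour.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical sketch (not mlpack code): returns the most frequently
// occurring element of `values`.
template<typename T>
T MostFrequent(const std::vector<T>& values)
{
  // Tally occurrences; std::map keeps keys in ascending order.
  std::map<T, std::size_t> counts;
  for (const T& v : values)
    ++counts[v];

  T best = T();
  std::size_t bestCount = 0;
  for (const auto& entry : counts)
  {
    // Strict comparison keeps the smallest value when counts are tied.
    if (entry.second > bestCount)
    {
      best = entry.first;
      bestCount = entry.second;
    }
  }
  return best;
}
```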
Returns 1 if the values of featureRow are not all the same.
Parameters:
featureRow The attribute which is checked for identical values.
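The check can be sketched by comparing every element against the first. `HasDistinctValues` is a hypothetical name, and returning 0 for an empty or single-element row is an assumption of this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch (not mlpack code): returns 1 if `row` holds at least
// two different values, and 0 if all values are identical.
template<typename T>
int HasDistinctValues(const std::vector<T>& row)
{
  for (std::size_t i = 1; i < row.size(); ++i)
  {
    if (row[i] != row[0])
      return 1;  // Found a value that differs from the first.
  }
  return 0;
}
```

An attribute for which this returns 0 carries no information for splitting, since every point has the same value in that dimension.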
After the 'split' matrix has been set up, merge ranges with identical class labels.
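Merging can be pictured as dropping every bin whose label equals that of the bin immediately before it, so the earlier bin absorbs its range. The sketch below is a hypothetical standalone version: it takes the boundaries and labels as arguments instead of using class members, and it assumes both vectors have the same length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch (not mlpack code): merges adjacent bins that share a
// label. split[i] is the lower boundary of bin i; binLabels[i] is its label.
void MergeRangesSketch(std::vector<double>& split,
                       std::vector<std::size_t>& binLabels)
{
  std::vector<double> mergedSplit;
  std::vector<std::size_t> mergedLabels;

  for (std::size_t i = 0; i < binLabels.size(); ++i)
  {
    // Keep a bin only when its label differs from the previous kept bin.
    if (mergedLabels.empty() || binLabels[i] != mergedLabels.back())
    {
      mergedSplit.push_back(split[i]);
      mergedLabels.push_back(binLabels[i]);
    }
  }

  split.swap(mergedSplit);
  binLabels.swap(mergedLabels);
}
```

For boundaries {1, 2, 3, 4} with labels {0, 0, 1, 1}, the result is boundaries {1, 3} with labels {0, 1}.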
Sets up attribute as if it were splitting on it and finds entropy when splitting on attribute.
Parameters:
attribute A row from the training data, which might be a candidate for the splitting attribute.
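To give a feel for how a candidate attribute might be scored, the sketch below sorts the points by attribute value, cuts them into consecutive buckets, and returns the size-weighted average entropy of the buckets. This is a simplified illustration under stated assumptions, not the mlpack algorithm: `SplitEntropySketch` is a hypothetical name, the bucketing rule (fixed-size buckets, with a short remainder folded into the last bucket) is invented for the example, and entropy is measured in bits.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch (not mlpack code): weighted entropy of the labels
// after bucketing points by sorted attribute value. Labels are assumed to
// lie in [0, numClasses).
double SplitEntropySketch(const std::vector<double>& attribute,
                          const std::vector<std::size_t>& labels,
                          std::size_t numClasses,
                          std::size_t bucketSize)
{
  const std::size_t n = attribute.size();
  if (n == 0 || bucketSize == 0)
    return 0.0;

  // Pair each attribute value with its label and sort by value.
  std::vector<std::pair<double, std::size_t>> sorted(n);
  for (std::size_t i = 0; i < n; ++i)
    sorted[i] = std::make_pair(attribute[i], labels[i]);
  std::sort(sorted.begin(), sorted.end());

  double total = 0.0;
  std::size_t begin = 0;
  while (begin < n)
  {
    // Take bucketSize points; fold a too-small remainder into this bucket.
    std::size_t end = begin + bucketSize;
    if (end > n || n - end < bucketSize)
      end = n;

    std::vector<std::size_t> counts(numClasses, 0);
    for (std::size_t i = begin; i < end; ++i)
      ++counts[sorted[i].second];

    const double size = static_cast<double>(end - begin);
    double entropy = 0.0;
    for (std::size_t count : counts)
    {
      if (count == 0)
        continue;
      const double p = static_cast<double>(count) / size;
      entropy -= p * std::log2(p);
    }

    // Weight each bucket by the fraction of points it holds.
    total += (size / static_cast<double>(n)) * entropy;
    begin = end;
  }
  return total;
}
```

Under this scoring, an attribute whose buckets are each dominated by a single class gets a value near 0, while an attribute whose buckets mix classes evenly gets a higher value.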
Access the splitting values.
Definition at line 95 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::split.
Modify the splitting values (be careful!).
Definition at line 97 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::split.
Access the splitting attribute.
Definition at line 90 of file decision_stump.hpp.
Modify the splitting attribute (be careful!).
Definition at line 92 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::splitAttribute.
After having decided the attribute on which to split, train on that attribute.
Parameters:
attribute The attribute chosen by the constructor, on which the decision stump is now trained.
Stores the labels for each splitting bin.
Definition at line 118 of file decision_stump.hpp.
Referenced by mlpack::decision_stump::DecisionStump< MatType >::BinLabels().
Size of bucket while determining splitting criterion.
Definition at line 112 of file decision_stump.hpp.
Stores the number of classes.
Definition at line 106 of file decision_stump.hpp.
Stores the splitting values after training.
Definition at line 115 of file decision_stump.hpp.
Referenced by mlpack::decision_stump::DecisionStump< MatType >::Split().
Stores the value of the attribute on which to split.
Definition at line 109 of file decision_stump.hpp.
Referenced by mlpack::decision_stump::DecisionStump< MatType >::SplitAttribute().
Generated automatically by Doxygen for MLPACK from the source code.