mlpack_optimization_SGD: Stochastic gradient descent is a technique for minimizing a function which can be expressed as a sum of other functions.

SYNOPSIS

Public Member Functions

SGD (DecomposableFunctionType &function, const double stepSize=0.01, const size_t maxIterations=100000, const double tolerance=1e-5, const bool shuffle=true)
    Construct the SGD optimizer with the given function and parameters.

const DecomposableFunctionType & Function () const
    Get the instantiated function to be optimized.

DecomposableFunctionType & Function ()
    Modify the instantiated function.

size_t MaxIterations () const
    Get the maximum number of iterations (0 indicates no limit).

size_t & MaxIterations ()
    Modify the maximum number of iterations (0 indicates no limit).

double Optimize (arma::mat &iterate)
    Optimize the given function using stochastic gradient descent.

template<> double Optimize (arma::mat &parameters)

bool Shuffle () const
    Get whether or not the individual functions are shuffled.

bool & Shuffle ()
    Modify whether or not the individual functions are shuffled.

double StepSize () const
    Get the step size.

double & StepSize ()
    Modify the step size.

double Tolerance () const
    Get the tolerance for termination.

double & Tolerance ()
    Modify the tolerance for termination.

std::string ToString () const

Private Attributes

DecomposableFunctionType & function
    The instantiated function.

size_t maxIterations
    The maximum number of allowed iterations.

bool shuffle
    Controls whether or not the individual functions are shuffled when iterating.

double stepSize
    The step size for each example.

double tolerance
    The tolerance for termination.

Detailed Description

template<typename DecomposableFunctionType> class mlpack::optimization::SGD< DecomposableFunctionType >

Stochastic Gradient Descent is a technique for minimizing a function which can be expressed as a sum of other functions.

That is, suppose we have

\[ f(A) = \sum_{i = 0}^{n - 1} f_i(A) \]

and our task is to find the $ A $ that minimizes $ f(A) $. Stochastic gradient descent iterates over each function $ f_i(A) $, producing the following update scheme:

\[ A_{j + 1} = A_j - \alpha \nabla f_i(A_j) \]

where $ \alpha $ is a parameter which specifies the step size. $ i $ is chosen according to $ j $ (the iteration number). The SGD class supports scanning through the $ n $ functions $ f_i(A) $ either linearly or in a random sequence. The algorithm continues until $ j $ reaches the maximum number of iterations, or until a full sequence of updates through each of the $ n $ functions $ f_i(A) $ changes the objective by less than a certain tolerance $ \epsilon $. That is,

\[ | f(A_{j + n}) - f(A_j) | < \epsilon. \]

The parameter $ \epsilon $ is specified by the tolerance parameter to the constructor; the limit on $ j $ is specified by the maxIterations parameter, and $ n $ is the value returned by the function's NumFunctions() method.
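The update and stopping rule above can be sketched in plain C++. This is a minimal illustration and not the code in sgd.hpp: it assumes a scalar parameter, a hypothetical per-point function f_i(a) = 0.5 (a - x_i)^2, a linear scan, and a full evaluation of the objective after every pass. The names EvaluateOne, EvaluateAll and SgdSketch are invented for this example. The step is subtracted because the goal is to minimize the objective.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-point objective: f_i(a) = 0.5 * (a - x_i)^2.
double EvaluateOne(const std::vector<double>& data, double a, std::size_t i)
{
  const double d = a - data[i];
  return 0.5 * d * d;
}

// Full objective f(a) = sum over i of f_i(a).
double EvaluateAll(const std::vector<double>& data, double a)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < data.size(); ++i)
    sum += EvaluateOne(data, a, i);
  return sum;
}

// Sketch of the update scheme with a linear scan.  Modifies `a` in place and
// returns the objective at the final point.  maxIterations == 0 means no limit.
double SgdSketch(const std::vector<double>& data, double& a,
                 double stepSize, std::size_t maxIterations, double tolerance)
{
  assert(!data.empty());
  const std::size_t n = data.size();
  double lastObjective = EvaluateAll(data, a);

  for (std::size_t j = 0; maxIterations == 0 || j < maxIterations; ++j)
  {
    const std::size_t i = j % n;          // i is chosen according to j.
    const double gradient = a - data[i];  // Gradient of f_i at the current a.
    a -= stepSize * gradient;             // Step against the gradient.

    if (i == n - 1)  // A full pass over the n functions has finished.
    {
      const double objective = EvaluateAll(data, a);
      if (std::abs(objective - lastObjective) < tolerance)
        return objective;  // |f(A_{j+n}) - f(A_j)| < epsilon.
      lastObjective = objective;
    }
  }
  return EvaluateAll(data, a);
}
```

Called on the data {1, 2, 3, 4} from a starting point of 0 with step size 0.01, at most 100000 iterations and tolerance 1e-5, this sketch stops close to 2.5, the mean of the data and the minimizer of the sum.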

This class is useful for data-dependent objective functions that can be expressed as a sum of objective functions, each operating on an individual point. In that case, each SGD update of $ A $ uses the gradient of the objective function for a single point.

For SGD to work, a DecomposableFunctionType template parameter is required. This class must implement the following functions:

    size_t NumFunctions();
    double Evaluate(const arma::mat& coordinates, const size_t i);
    void Gradient(const arma::mat& coordinates, const size_t i, arma::mat& gradient);

NumFunctions() should return the number of functions ($ n $); in the other two functions, the parameter i selects which individual function (or gradient) is evaluated. For a data-dependent function such as NCA (see mlpack::nca::NCA), NumFunctions() should return the number of points in the dataset, and Evaluate(coordinates, 0) will evaluate the objective function on the first point in the dataset (presumably, the dataset is held internally in the DecomposableFunctionType).
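The shape of such a class can be sketched as follows. This is an illustration under stated assumptions, not a class shipped with the library: the type alias Mat stands in for arma::mat so that the sketch compiles without Armadillo, and SquaredDistanceFunction with f_i(A) = 0.5 ||A - x_i||^2 is a hypothetical objective chosen only because its gradient is easy to check by hand. The member functions are marked const here, which the required signatures above do not demand.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for arma::mat so the sketch compiles without Armadillo (assumption).
using Mat = std::vector<double>;

// Hypothetical decomposable function: f_i(A) = 0.5 * ||A - x_i||^2, where x_i
// is the i-th point of a dataset held inside the object.
class SquaredDistanceFunction
{
 public:
  explicit SquaredDistanceFunction(std::vector<Mat> data)
      : points(std::move(data)) { }

  // Number of individual functions n: one per data point.
  std::size_t NumFunctions() const { return points.size(); }

  // Value of the i-th function at the given coordinates.
  double Evaluate(const Mat& coordinates, const std::size_t i) const
  {
    assert(i < points.size() && coordinates.size() == points[i].size());
    double sum = 0.0;
    for (std::size_t k = 0; k < coordinates.size(); ++k)
    {
      const double d = coordinates[k] - points[i][k];
      sum += 0.5 * d * d;
    }
    return sum;
  }

  // Gradient of the i-th function, written into `gradient`.
  void Gradient(const Mat& coordinates, const std::size_t i,
                Mat& gradient) const
  {
    assert(i < points.size() && coordinates.size() == points[i].size());
    gradient.resize(coordinates.size());
    for (std::size_t k = 0; k < coordinates.size(); ++k)
      gradient[k] = coordinates[k] - points[i][k];
  }

 private:
  std::vector<Mat> points;  // The dataset, held internally.
};
```

With three two-dimensional points stored in the object, NumFunctions() returns 3, and Evaluate(coordinates, 0) measures the coordinates against the first point only.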

Template Parameters:

DecomposableFunctionType Decomposable objective function type to be minimized.

Definition at line 86 of file sgd.hpp.

Constructor & Destructor Documentation

template<typename DecomposableFunctionType> mlpack::optimization::SGD< DecomposableFunctionType >::SGD (DecomposableFunctionType &function, const double stepSize = 0.01, const size_t maxIterations = 100000, const double tolerance = 1e-5, const bool shuffle = true)

Construct the SGD optimizer with the given function and parameters.

Parameters:

function Function to be optimized (minimized).

stepSize Step size for each iteration.

maxIterations Maximum number of iterations allowed (0 means no limit).

tolerance Absolute tolerance on the change in the objective, below which the algorithm terminates.

shuffle If true, the function order is shuffled; otherwise, each function is visited in linear order.
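The effect of the shuffle parameter on the visiting order can be sketched as below. This is an assumption-laden illustration rather than the library's own code: the helper name VisitationOrder is hypothetical, and the use of std::shuffle with a std::mt19937 generator is a choice made for this example, not a statement about how sgd.hpp draws its permutation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Build the order in which the n functions are visited during one pass.
// With shuffle == false the order is 0, 1, ..., n - 1; otherwise it is a
// random permutation of those indices drawn from the supplied generator.
std::vector<std::size_t> VisitationOrder(const std::size_t n,
                                         const bool shuffle,
                                         std::mt19937& rng)
{
  assert(n > 0);  // A decomposable function has at least one term.
  std::vector<std::size_t> order(n);
  std::iota(order.begin(), order.end(), std::size_t(0));  // Linear order.
  if (shuffle)
    std::shuffle(order.begin(), order.end(), rng);
  return order;
}
```

In either mode every function is visited exactly once per pass; only the sequence differs.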

Member Function Documentation

template<typename DecomposableFunctionType> const DecomposableFunctionType& mlpack::optimization::SGD< DecomposableFunctionType >::Function () const [inline]

Get the instantiated function to be optimized.

Definition at line 117 of file sgd.hpp.

template<typename DecomposableFunctionType> DecomposableFunctionType& mlpack::optimization::SGD< DecomposableFunctionType >::Function () [inline]

Modify the instantiated function.

Definition at line 119 of file sgd.hpp.

template<typename DecomposableFunctionType> size_t mlpack::optimization::SGD< DecomposableFunctionType >::MaxIterations () const [inline]

Get the maximum number of iterations (0 indicates no limit).

Definition at line 127 of file sgd.hpp.

template<typename DecomposableFunctionType> size_t& mlpack::optimization::SGD< DecomposableFunctionType >::MaxIterations () [inline]

Modify the maximum number of iterations (0 indicates no limit).

Definition at line 129 of file sgd.hpp.

template<typename DecomposableFunctionType> double mlpack::optimization::SGD< DecomposableFunctionType >::Optimize (arma::mat &iterate)

Optimize the given function using stochastic gradient descent. The given starting point will be modified to store the finishing point of the algorithm, and the final objective value is returned.

Parameters:

iterate Starting point (will be modified).

Returns:

Objective value of the final point.

template<> double mlpack::optimization::SGD< mlpack::svd::RegularizedSVDFunction >::Optimize (arma::mat &parameters)

This specialization is used because the gradient affects only a small number of parameters per example, so the general implementation is slower than necessary for this function.
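Why a specialization can help is sketched below. This is not the code of the RegularizedSVDFunction specialization; it is a hypothetical illustration that assumes each example's gradient is available as a short list of (index, value) pairs, and the names SparseGradient, DenseStep and SparseStep are invented for the example. Both functions apply the same update, but the second does work proportional to the number of affected parameters rather than to the total number of parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One example's gradient, stored as (parameter index, partial derivative)
// pairs; every parameter not listed has a zero partial derivative.
using SparseGradient = std::vector<std::pair<std::size_t, double>>;

// General path: expand to a full-size gradient, then update every parameter.
void DenseStep(std::vector<double>& parameters, const SparseGradient& g,
               const double stepSize)
{
  std::vector<double> gradient(parameters.size(), 0.0);
  for (const auto& entry : g)
  {
    assert(entry.first < parameters.size());
    gradient[entry.first] += entry.second;
  }
  for (std::size_t k = 0; k < parameters.size(); ++k)
    parameters[k] -= stepSize * gradient[k];
}

// Specialized path: touch only the parameters the example affects.
void SparseStep(std::vector<double>& parameters, const SparseGradient& g,
                const double stepSize)
{
  for (const auto& entry : g)
  {
    assert(entry.first < parameters.size());
    parameters[entry.first] -= stepSize * entry.second;
  }
}
```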

template<typename DecomposableFunctionType> bool mlpack::optimization::SGD< DecomposableFunctionType >::Shuffle () const [inline]

Get whether or not the individual functions are shuffled.

Definition at line 137 of file sgd.hpp.

template<typename DecomposableFunctionType> bool& mlpack::optimization::SGD< DecomposableFunctionType >::Shuffle () [inline]

Modify whether or not the individual functions are shuffled.

Definition at line 139 of file sgd.hpp.

template<typename DecomposableFunctionType> double mlpack::optimization::SGD< DecomposableFunctionType >::StepSize () const [inline]

Get the step size.

Definition at line 122 of file sgd.hpp.

template<typename DecomposableFunctionType> double& mlpack::optimization::SGD< DecomposableFunctionType >::StepSize () [inline]

Modify the step size.

Definition at line 124 of file sgd.hpp.

template<typename DecomposableFunctionType> double mlpack::optimization::SGD< DecomposableFunctionType >::Tolerance () const [inline]

Get the tolerance for termination.

Definition at line 132 of file sgd.hpp.

template<typename DecomposableFunctionType> double& mlpack::optimization::SGD< DecomposableFunctionType >::Tolerance () [inline]

Modify the tolerance for termination.

Definition at line 134 of file sgd.hpp.

template<typename DecomposableFunctionType> std::string mlpack::optimization::SGD< DecomposableFunctionType >::ToString () const

Member Data Documentation

template<typename DecomposableFunctionType> DecomposableFunctionType& mlpack::optimization::SGD< DecomposableFunctionType >::function [private]

The instantiated function.

Definition at line 146 of file sgd.hpp.

template<typename DecomposableFunctionType> size_t mlpack::optimization::SGD< DecomposableFunctionType >::maxIterations [private]

The maximum number of allowed iterations.

Definition at line 152 of file sgd.hpp.

Referenced by mlpack::optimization::SGD< mlpack::svd::RegularizedSVDFunction >::MaxIterations().

template<typename DecomposableFunctionType> bool mlpack::optimization::SGD< DecomposableFunctionType >::shuffle [private]

Controls whether or not the individual functions are shuffled when iterating.

Definition at line 159 of file sgd.hpp.

Referenced by mlpack::optimization::SGD< mlpack::svd::RegularizedSVDFunction >::Shuffle().

template<typename DecomposableFunctionType> double mlpack::optimization::SGD< DecomposableFunctionType >::stepSize [private]

The step size for each example.

Definition at line 149 of file sgd.hpp.

Referenced by mlpack::optimization::SGD< mlpack::svd::RegularizedSVDFunction >::StepSize().

template<typename DecomposableFunctionType> double mlpack::optimization::SGD< DecomposableFunctionType >::tolerance [private]

The tolerance for termination.

Definition at line 155 of file sgd.hpp.

Referenced by mlpack::optimization::SGD< mlpack::svd::RegularizedSVDFunction >::Tolerance().

Author

Generated automatically by Doxygen for MLPACK from the source code.

mlpack_optimization_SGD (3)