isanet.optimizer.NCG¶
NCG Module. This module provides the NCG class. In this case, the backpropagation computes the gradient of the following objective function (Loss):
Loss = 1/N * sum_i (y_i - y_i'(w))^2 + kernel_regularizer * ||w||^2
So the quantities that will be monitored in the iteration log will be:
loss = loss_mse_reg
val_loss = val_loss_mse_reg
Update rule for parameter w with gradient g:
beta = a_beta_formula()
d = -g + beta*d
alpha = line_search_strong_wolfe()
w += alpha*d
Note
For further details on the implementation refer to Wright and Nocedal, ‘Numerical Optimization’, 1999, pp. 121-125.
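As a concrete illustration of the rule above, a single NCG update with the modified Hestenes-Stiefel ('hs+') beta can be sketched in plain NumPy as follows; the function names here (hs_plus_beta, ncg_step) and the line_search argument are placeholders for this example and are not part of the isanet API:

    import numpy as np

    def hs_plus_beta(g_new, g_old, d_old):
        # Modified Hestenes-Stiefel: beta = max(0, g_new.(g_new - g_old) / d_old.(g_new - g_old))
        y = g_new - g_old
        denom = d_old @ y
        return 0.0 if denom == 0 else max(0.0, (g_new @ y) / denom)

    def ncg_step(w, g_new, g_old, d_old, line_search):
        # d = -g + beta*d, with alpha chosen by a strong Wolfe line search (parameters c1, c2)
        beta = hs_plus_beta(g_new, g_old, d_old)
        d = -g_new + beta * d_old
        alpha = line_search(w, d)
        return w + alpha * d, d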
-
class isanet.optimizer.NCG.NCG(beta_method='hs+', c1=0.0001, c2=0.9, restart=None, ln_maxiter=10, tol=None, n_iter_no_change=None, norm_g_eps=None, l_eps=None, debug=False)¶
Bases: isanet.optimizer.optimizer.Optimizer
Nonlinear Conjugate Gradient (NCG).
- Parameters
beta_method (string, default="hs+") –
Beta formulas available for the NCG.
'fr', Fletcher-Reeves formula.
'pr', Polak-Ribière formula.
'hs', Hestenes-Stiefel formula.
'pr+', modified Polak-Ribière formula.
'hs+', modified Hestenes-Stiefel formula.
c1 (float, default=1e-4) – Parameter for the Armijo-Wolfe line search.
c2 (float, default=0.9) – Parameter for the Armijo-Wolfe line search.
restart (integer, optional) – Every ‘restart’ iterations Beta is set to 0.
ln_maxiter (integer, default=10) – Maximum number of iterations of the Line Search.
tol (float, optional) – Tolerance for the optimization. When the training loss is not improving by at least tol for ‘n_iter_no_change’ consecutive iterations, convergence is considered to be reached and training stops.
n_iter_no_change (integer, optional) – Maximum number of iterations with no improvements > tol.
norm_g_eps (float, optional) – Threshold that is used to decide whether to stop the fitting of the model (it stops if the norm of the gradient reaches ‘norm_g_eps’).
l_eps (float, optional) – Threshold that is used to decide whether to stop the fitting of the model (it stops if the loss function reaches ‘l_eps’).
debug (boolean, default=False) – If True, allows you to perform iterations one at a time by pressing the Enter key.
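For example, the constructor parameters listed above can be combined as follows (the values are arbitrary and only meant to illustrate the options):

    from isanet.optimizer.NCG import NCG

    # Modified Polak-Ribiere variant, restart every 20 iterations,
    # stop when the gradient norm drops below 1e-5
    optimizer = NCG(beta_method='pr+', c1=1e-4, c2=0.9, restart=20,
                    ln_maxiter=10, norm_g_eps=1e-5, debug=False)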
-
history¶
Stores a set of diagnostic values for each iteration.
- Dictionary’s keys:
beta
Beta value.
alpha
Step size chosen by the line search.
norm_g
Gradient norm.
ls_conv
Specifies whether the line search was able to find an alpha.
ls_it
Number of iterations of the line search.
ls_time
Computational time of the line search (includes the computational time of the zoom method, if used).
zoom_used
Specifies whether the zoom method has been used.
zoom_conv
Specifies whether the zoom method was able to find an alpha.
zoom_it
Number of iterations of the zoom method.
- Type
dict
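After training, these values can be inspected through the history attribute; the sketch below assumes each key maps to a sequence with one entry per iteration, which is inferred from the description above rather than stated explicitly:

    # Hypothetical inspection of the per-iteration log kept in 'history'
    hist = optimizer.history
    for it, (beta, alpha, norm_g) in enumerate(zip(hist["beta"], hist["alpha"], hist["norm_g"])):
        print(f"iter {it}: beta={beta:.3e}, alpha={alpha:.3e}, ||g||={norm_g:.3e}")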
-
backpropagation(model, weights, X, Y)¶
Computes the derivative of 1/N sum_i (y_i - y_i')^2 + lambda*||weights||^2.
- Parameters
model (isanet.model.MLP) – Specify the Multilayer Perceptron object to optimize.
weights (list) – List of arrays, the ith array represents all the weights of each neuron in the ith layer.
X (array-like of shape (n_samples, n_features)) – The input data.
Y (array-like of shape (n_samples, n_output)) – The target values.
- Returns
A list containing the gradients for each layer, to be used in the delta rule. The ith element of the list corresponds to the ith layer (from the first hidden layer to the output layer), e.g. 0 -> first hidden layer, ..., n -> output layer, where n is the number of hidden layers in the net.
- Return type
list
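To make the objective concrete, the gradient of the same loss for a single linear layer can be written out directly; this is an illustrative NumPy computation, not the library's internal multi-layer implementation:

    import numpy as np

    def mse_l2_gradient(W, X, Y, lam):
        # Gradient of 1/N * ||X W - Y||^2 + lam * ||W||^2 with respect to W (linear model, for illustration)
        N = X.shape[0]
        residual = X @ W - Y
        return (2.0 / N) * X.T @ residual + 2.0 * lam * W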
-
forward(weights, X)¶
Uses the weights passed to the function to perform the feed-forward step.
- Parameters
weights (list) – List of arrays, the ith array represents all the weights of each neuron in the ith layer.
X (array-like of shape (n_samples, n_features)) – The input data.
- Returns
Output of all neurons for input X.
- Return type
array-like
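A generic feed-forward over a list of layer weight matrices looks roughly like the sketch below; the sigmoid activation and the bias-column convention are assumptions made for illustration, not the exact internal behaviour of isanet:

    import numpy as np

    def feed_forward(weights, X):
        # Propagate X through each layer: prepend a bias column (assumed convention),
        # multiply by the layer's weights and apply a sigmoid activation (assumed).
        A = X
        for W in weights:
            A = np.hstack([np.ones((A.shape[0], 1)), A])
            A = 1.0 / (1.0 + np.exp(-(A @ W)))
        return A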
-
get_batch(X_train, Y_train, batch_size)¶
- Parameters
X_train (array-like of shape (n_samples, n_features)) – The input data.
Y_train (array-like of shape (n_samples, n_output)) – The target values.
batch_size (integer) – Size of minibatches for the optimizer.
- Returns
Each key of the dictionary is an integer value from 0 to number_of_batch - 1 and defines a batch. Each element is a dictionary with two keys, ‘batch_x_train’ and ‘batch_y_train’, which refer to the portions of data and targets used for training that batch.
- Return type
dict of dict
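The returned structure can be pictured with the following illustrative re-implementation, consistent with the description above but not the library's own code:

    def make_batches(X_train, Y_train, batch_size):
        # Split the training set into minibatches keyed by 0 .. number_of_batch - 1
        batches = {}
        for k, start in enumerate(range(0, X_train.shape[0], batch_size)):
            batches[k] = {
                "batch_x_train": X_train[start:start + batch_size],
                "batch_y_train": Y_train[start:start + batch_size],
            }
        return batches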
-
optimize(model, epochs, X_train, Y_train, validation_data=None, batch_size=None, es=None, verbose=0)¶
- Parameters
model (isanet.model.MLP) – Specify the Multilayer Perceptron object to optimize.
epochs (integer) – Maximum number of epochs.
X_train (array-like of shape (n_samples, n_features)) – The input data.
Y_train (array-like of shape (n_samples, n_output)) – The target values.
validation_data (list of arrays-like, [X_val, Y_val], optional) – Validation set.
batch_size (integer, optional) – Size of minibatches for the optimizer. When set to None, the optimizer will perform full-batch optimization.
es (isanet.callbacks.EarlyStopping, optional) – When set to None, only ‘epochs’ is used to finish training. Otherwise, the EarlyStopping object passed will stop training if the model starts to overfit for a number of consecutive iterations. See the docs in the optimizer module for the EarlyStopping class.
verbose (integer, default=0) – Controls the verbosity: the higher, the more messages.
- Returns
- Return type
integer
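A full training call with this optimizer might look like the sketch below; the model and the data arrays (model, X_train, Y_train, X_val, Y_val) are assumed to be already built and are not defined here:

    # Hypothetical training run on an already-built isanet.model.MLP
    optimizer = NCG(beta_method='hs+', norm_g_eps=1e-6, l_eps=1e-7)
    optimizer.optimize(model, epochs=500,
                       X_train=X_train, Y_train=Y_train,
                       validation_data=[X_val, Y_val],
                       batch_size=None,   # full-batch optimization
                       verbose=1)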
-
step(model, X, Y, verbose)¶
Implements the NCG step update method.
- Parameters
model (isanet.model.MLP) – Specify the Multilayer Perceptron object to optimize.
X (array-like of shape (n_samples, n_features)) – The input data.
Y (array-like of shape (n_samples, n_output)) – The target values.
verbose (integer, default=0) – Controls the verbosity: the higher, the more messages.
- Returns
The gradient norm.
- Return type
float