Adam Optimiser¶
adam_optimiser_type
adam_optimiser_type(
learning_rate=0.01,
beta1=0.9,
beta2=0.999,
epsilon=1.0e-8,
num_params=...,
regulariser=...,
clip_dict=...,
lr_decay=...
)
Adaptive Moment Estimation (Adam) optimiser combines ideas from RMSprop and momentum.
The update rule:
where \(\eta\) is the learning rate, \(\beta_1\) and \(\beta_2\) are exponential decay rates, and \(\epsilon\) prevents division by zero.
Arguments¶
learning_rate (real): Step size for parameter updates. Default:
0.01.beta1 (real): Exponential decay rate for first moment estimates. Default:
0.9.beta2 (real): Exponential decay rate for second moment estimates. Default:
0.999.epsilon (real): Small constant for numerical stability. Default:
1.0e-8.num_params (integer): Number of parameters to optimise.
regulariser (class(base_regulariser_type)): Regularisation method (e.g., L2 regularisation).
clip_dict (type(clip_type)): Gradient clipping configuration.
lr_decay (class(base_lr_decay_type)): Learning rate decay schedule.
Notes:¶
Adam is one of the most popular optimisation algorithms for deep learning due to its adaptive learning rates and robustness to hyperparameter choices.