Gradient Clipping
clip_type
Prevents gradient explosion by limiting gradient magnitudes during backpropagation.
clip_type(
    clip_min=...,
    clip_max=...,
    clip_norm=...
)
Arguments
clip_min (real, optional): Minimum allowed gradient value
clip_max (real, optional): Maximum allowed gradient value
clip_norm (real, optional): Maximum allowed L2-norm of the gradient
Value clipping and norm clipping can be used independently or together.
Usage
Norm Clipping
use athena

type(clip_type) :: clipper

! Limit the gradient L2-norm to 1.0
clipper = clip_type(clip_norm=1.0)

! `network` is assumed to be an already-constructed network
call network%compile( &
     optimiser_type=adam_optimiser_type( &
          learning_rate=0.001, &
          clip_dict=clipper), &
     loss_method="mse")
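The effect of norm clipping on a gradient can be illustrated without athena. The program below is a minimal sketch, not athena's internal implementation: it assumes the common rule that a gradient whose L2-norm exceeds clip_norm is rescaled to have exactly that norm, and the names grad and grad_norm are illustrative only.

```fortran
program norm_clip_sketch
  implicit none
  real :: grad(3)
  real :: clip_norm, grad_norm

  grad = [3.0, 4.0, 12.0]   ! example gradient with L2-norm 13
  clip_norm = 1.0           ! same threshold as the usage example

  ! norm2 is the Fortran 2008 intrinsic for the L2-norm
  grad_norm = norm2(grad)

  ! Rescale only when the threshold is exceeded (assumed rule);
  ! every component is scaled by the same factor
  if (grad_norm > clip_norm) grad = grad * (clip_norm / grad_norm)

  print '(a, f8.4)', "norm before: ", grad_norm
  print '(a, f8.4)', "norm after:  ", norm2(grad)
end program norm_clip_sketch
```

Because every component is multiplied by the same factor, the direction of the gradient is unchanged and only its length is reduced.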
Value Clipping
! Clamp each gradient component to the range [-0.5, 0.5]
clipper = clip_type(clip_min=-0.5, clip_max=0.5)

call network%compile( &
     optimiser_type=sgd_optimiser_type( &
          learning_rate=0.01, &
          clip_dict=clipper), &
     loss_method="categorical_crossentropy")
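Value clipping can be pictured as an elementwise clamp. The following standalone sketch does not use athena and is not taken from its source; it assumes that each gradient component is independently limited to the range [clip_min, clip_max], and grad is an illustrative array.

```fortran
program value_clip_sketch
  implicit none
  real :: grad(4)
  real :: clip_min, clip_max

  grad = [-2.0, -0.3, 0.1, 0.9]   ! example gradient
  clip_min = -0.5
  clip_max = 0.5

  ! min and max are elemental, so this clamps every component at once
  grad = max(clip_min, min(clip_max, grad))

  print '(4f8.3)', grad
end program value_clip_sketch
```

Unlike norm clipping, this can change the direction of the gradient, since only the components outside the range are altered.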
Combined Clipping
clipper = clip_type( &
     clip_min=-1.0, &
     clip_max=1.0, &
     clip_norm=5.0)
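How the two kinds of clipping interact can be sketched as follows. This is an illustration under stated assumptions rather than athena's implementation: the helper clip_gradient is hypothetical, and the order of operations (value clipping first, then norm clipping) is an assumption that should be checked against the library source.

```fortran
program combined_clip_sketch
  implicit none
  real :: grad(100)

  grad = 3.0   ! 100 components, all above clip_max
  call clip_gradient(grad, clip_min=-1.0, clip_max=1.0, clip_norm=5.0)

  print '(a, f8.4)', "first element: ", grad(1)
  print '(a, f8.4)', "L2-norm:       ", norm2(grad)

contains

  ! Hypothetical helper: applies whichever limits are supplied
  subroutine clip_gradient(g, clip_min, clip_max, clip_norm)
    real, intent(inout) :: g(:)
    real, intent(in), optional :: clip_min, clip_max, clip_norm
    real :: g_norm

    ! Step 1 (assumed order): clamp each component
    if (present(clip_min)) g = max(clip_min, g)
    if (present(clip_max)) g = min(clip_max, g)

    ! Step 2: rescale if the L2-norm still exceeds the limit
    if (present(clip_norm)) then
      g_norm = norm2(g)
      if (g_norm > clip_norm) g = g * (clip_norm / g_norm)
    end if
  end subroutine clip_gradient

end program combined_clip_sketch
```

Under this assumed order, a gradient with n components clamped to [-1.0, 1.0] has an L2-norm of at most sqrt(n), so clip_norm=5.0 only takes effect when the gradient has more than 25 components.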
Typical Values
RNNs/LSTMs: clip_norm = 0.5 to 2.0
GRUs: clip_norm = 1.0 to 5.0
CNNs: clip_norm = 5.0 to 10.0
GNNs: clip_norm = 0.5 to 2.0
PINNs: clip_min = -0.1, clip_max = 0.1
See Also
Training Configuration: Overview
Learning Rate Decay: Gradually reducing learning rate
Regularisation: Preventing overfitting