Negative Log Likelihood Loss

nll_loss_type

nll_loss_type()

Negative Log Likelihood (NLL) loss is used for multi-class classification when the model outputs log probabilities.

\[L = -\frac{1}{N} \sum_{i=1}^{N} \log(\hat{y}_{i,c_i})\]

where:

- \(\hat{y}_{i,c_i}\) is the predicted probability of the correct class \(c_i\) for sample \(i\), so \(\log(\hat{y}_{i,c_i})\) is the log probability that the model supplies
- \(N\) is the number of samples
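The formula can be sketched in plain Fortran, independent of the library. This is a minimal illustration, not the library's implementation: the module name `nll_sketch`, the function `nll_from_log_probs`, the `(classes, samples)` array layout, and the use of integer class labels in place of an `expected` array are all assumptions made for this example.

```fortran
module nll_sketch
  implicit none
  private
  public :: nll_from_log_probs

contains

  !> Mean negative log likelihood over N samples (hypothetical helper).
  !> log_probs(c, i) is the log probability of class c for sample i.
  !> labels(i) is the 1-based index of the correct class for sample i.
  pure function nll_from_log_probs(log_probs, labels) result(loss)
    real, intent(in) :: log_probs(:, :)
    integer, intent(in) :: labels(:)
    real :: loss
    integer :: i, n

    n = size(labels)
    loss = 0.0
    do i = 1, n
      ! Pick out the log probability of the correct class and negate it
      loss = loss - log_probs(labels(i), i)
    end do
    loss = loss / real(n)
  end function nll_from_log_probs

end module nll_sketch

program nll_demo
  use nll_sketch, only: nll_from_log_probs
  implicit none
  real :: log_probs(3, 2)
  integer :: labels(2)

  ! Two samples, three classes; each column holds log probabilities
  log_probs(:, 1) = log([0.7, 0.2, 0.1])
  log_probs(:, 2) = log([0.1, 0.8, 0.1])
  labels = [1, 2]

  ! Evaluates -(log(0.7) + log(0.8)) / 2, roughly 0.29
  print *, nll_from_log_probs(log_probs, labels)
end program nll_demo
```

Only the entry for the correct class of each sample contributes to the sum, which is why the loss depends on a single log probability per sample.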

Use Cases

Multi-class classification with log-softmax output
When working with pre-computed log probabilities
Maximum likelihood estimation
Statistical modeling

Example

use athena__loss

type(nll_loss_type) :: loss
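! As declared, predicted and expected are assumed-shape arrays, which Fortran
! only permits for dummy arguments of an enclosing procedure
type(array_type), dimension(:,:) :: predicted, expected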
type(array_type), pointer :: loss_value

! Initialise loss function
loss = nll_loss_type()

! Compute loss (predicted should be log probabilities)
loss_value => loss%compute(predicted, expected)

Relationship to Cross Entropy

NLL loss with log-softmax output is mathematically equivalent to categorical cross entropy with softmax output:

\[\text{CCE}(\text{softmax}(x), y) = \text{NLL}(\text{log\_softmax}(x), y)\]

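However, the log-softmax + NLL combination is often more numerically stable, because log-softmax can be evaluated directly in log space instead of taking the logarithm of a softmax probability that may already have underflowed to zero.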
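The equivalence can be checked in a few lines. The derivation below is a sketch that assumes one-hot targets \(y_{i,k}\) (equal to 1 when \(k = c_i\) and 0 otherwise), a CCE loss averaged over the \(N\) samples, and \(K\) classes; the symbols \(y_{i,k}\), \(K\), and \(x_i\) (the logits of sample \(i\)) are introduced here for illustration.

\[\begin{aligned}
\text{CCE}(\text{softmax}(x), y) &= -\frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{K} y_{i,k} \log\big(\text{softmax}(x_i)_k\big) \\
&= -\frac{1}{N} \sum_{i=1}^{N} \log\big(\text{softmax}(x_i)_{c_i}\big) \\
&= -\frac{1}{N} \sum_{i=1}^{N} \text{log\_softmax}(x_i)_{c_i} = \text{NLL}(\text{log\_softmax}(x), y)
\end{aligned}\]

The second line follows because the one-hot target removes every term except \(k = c_i\); with \(\hat{y}_{i,c_i} = \text{softmax}(x_i)_{c_i}\) this is the NLL definition given at the top of the page.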

Notes

Assumes input predictions are log probabilities (not raw logits or probabilities)
Typically used with log-softmax activation
More numerically stable than CCE with softmax for large-magnitude logits
Equivalent to minimizing negative log likelihood in maximum likelihood estimation

Numerical Stability

Using log probabilities directly avoids numerical issues:

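The loss itself never exponentiates its inputs (exponentiation of large logits can overflow)
A probability small enough to underflow to zero, whose logarithm would be negative infinity, is still representable as a finite log probability
Gradients with respect to log probabilities avoid dividing by a near-zero probability, as the derivative of \(-\log(p)\) with respect to \(p\) would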
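The points above can be illustrated with a log-softmax computed through the log-sum-exp trick. This is a minimal sketch in plain Fortran, not the library's activation: the module name `log_softmax_sketch` and the function `log_softmax` are hypothetical, and single-precision `real` is assumed.

```fortran
module log_softmax_sketch
  implicit none
  private
  public :: log_softmax

contains

  !> Log-softmax of one vector of logits via the log-sum-exp trick
  !> (hypothetical helper, not part of the library).
  pure function log_softmax(logits) result(log_probs)
    real, intent(in) :: logits(:)
    real :: log_probs(size(logits))
    real :: shift

    ! Subtracting the maximum makes every exponent <= 0, so exp cannot overflow
    shift = maxval(logits)
    log_probs = logits - shift - log(sum(exp(logits - shift)))
  end function log_softmax

end module log_softmax_sketch

program log_softmax_demo
  use log_softmax_sketch, only: log_softmax
  implicit none
  real :: logits(3), log_probs(3)

  ! exp(1000.0) overflows single precision, so a naive softmax would
  ! produce infinities or NaNs for these logits
  logits = [1000.0, 0.0, -1000.0]
  log_probs = log_softmax(logits)

  ! The NLL contribution stays finite even for a very unlikely correct class
  print *, 'log probabilities:', log_probs
  print *, 'loss if correct class is 2:', -log_probs(2)
end program log_softmax_demo
```

For these logits the softmax probability of class 2 underflows to zero in single precision, so taking its logarithm would give negative infinity, whereas the log-space value is simply the finite number -1000.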

See Also

CCE Loss - Related loss that takes probabilities rather than log probabilities
BCE Loss - Binary version