Creating Custom Activation Functions¶

You can extend athena with custom activation functions by implementing the base_actv_type interface.

Base Activation Type¶

All activation functions inherit from base_actv_type and must implement four key procedures:

Required Procedures¶

apply: Apply the activation function to input values
reset: Reset any internal state or attributes
apply_attributes: Load parameters from ONNX attributes
export_attributes: Export parameters as ONNX attributes

Essential Structure¶

type, extends(base_actv_type) :: custom_actv_type
   ! Add custom parameters here
   real(real32) :: custom_param = 1.0_real32
 contains
   procedure, pass(this) :: apply => apply_custom
   procedure, pass(this) :: reset => reset_custom
   procedure, pass(this) :: apply_attributes => apply_attributes_custom
   procedure, pass(this) :: export_attributes => export_attributes_custom
end type custom_actv_type

Example: Custom Parametric Activation¶

Here’s a complete example of a custom activation function:

module my_custom_activation
  use coreutils, only: real32
  use athena__misc_types, only: base_actv_type, onnx_attribute_type
  use diffstruc, only: array_type
  implicit none

  type, extends(base_actv_type) :: parametric_actv_type
     real(real32) :: alpha = 0.1_real32  ! Custom parameter
   contains
     procedure, pass(this) :: apply => apply_parametric
     procedure, pass(this) :: reset => reset_parametric
     procedure, pass(this) :: apply_attributes => apply_attributes_parametric
     procedure, pass(this) :: export_attributes => export_attributes_parametric
  end type parametric_actv_type

  interface parametric_actv_type
     module function setup(alpha) result(activation)
       real(real32), optional, intent(in) :: alpha
       type(parametric_actv_type) :: activation
     end function setup
  end interface

contains

  function setup(alpha) result(activation)
    real(real32), optional, intent(in) :: alpha
    type(parametric_actv_type) :: activation

    activation%name = "parametric"
    if(present(alpha)) activation%alpha = alpha
  end function setup

  function apply_parametric(this, val) result(output)
    class(parametric_actv_type), intent(in) :: this
    type(array_type), intent(in) :: val
    type(array_type), pointer :: output

    ! Example: f(x) = x if x > 0, alpha * x otherwise
    allocate(output)
    output = merge(val, val * this%alpha, val > 0.0_real32)
  end function apply_parametric

  subroutine reset_parametric(this)
    class(parametric_actv_type), intent(inout) :: this

    ! Reset to default values if needed
    this%alpha = 0.1_real32
    this%scale = 1.0_real32
  end subroutine reset_parametric

  subroutine apply_attributes_parametric(this, attributes)
    class(parametric_actv_type), intent(inout) :: this
    type(onnx_attribute_type), dimension(:), intent(in) :: attributes
    integer :: i

    do i = 1, size(attributes)
      select case(trim(attributes(i)%name))
      case("alpha")
        read(attributes(i)%val, *) this%alpha
      end select
    end do
  end subroutine apply_attributes_parametric

  pure function export_attributes_parametric(this) result(attributes)
    class(parametric_actv_type), intent(in) :: this
    type(onnx_attribute_type), allocatable, dimension(:) :: attributes
    character(20) :: alpha_str

    allocate(attributes(1))
    write(alpha_str, '(F20.10)') this%alpha
    attributes(1) = onnx_attribute_type("alpha", "float", trim(alpha_str))
  end function export_attributes_parametric

end module my_custom_activation

Registering Your Activation¶

To use your custom activation in athena networks, add it to the activation setup:

! In mod_activation.f90, add to activation_setup function:
case("parametric")
  activation = parametric_actv_type()

Mathematical Considerations¶

Gradient Computation¶

When implementing activation functions for training, consider the gradient:

The apply function works with array_type which tracks gradients automatically
Ensure your function is differentiable where needed
Use merge for conditional operations to maintain gradient flow

Numerical Stability¶

Avoid operations that can cause overflow (e.g., large exponentials)
Add small epsilon values where division by zero might occur
Consider the range of expected input values

Best Practices¶

Naming: Use descriptive names that indicate the activation’s behavior
Parameters: Provide sensible defaults for any parameters
Documentation: Include the mathematical formula in comments
Testing: Test with various input ranges including edge cases
ONNX Support: Implement attribute import/export for model portability