Graph Neural Operator Layer¶
graph_nop_layer_type
graph_nop_layer_type(
    num_outputs,
    coord_dim,
    kernel_hidden=...,
    num_inputs=...,
    use_bias=.true.,
    activation="none",
    kernel_initialiser=...,
    bias_initialiser=...,
    verbose=0
)
The graph_nop_layer_type derived type provides a graph neural operator layer for irregular meshes and general sparse graphs.
It combines a local linear bypass with a learned kernel evaluated on per-edge geometric features, typically relative coordinates:
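A form of the node update consistent with the symbols defined below can be written as follows. It assumes plain summation over the incoming neighbours \(\mathcal{N}(i)\) of node \(i\); the layer may instead normalise the aggregate, for example by the neighbour count, so treat the aggregation operator as an assumption:

\[
h_i^{(l+1)} = \sigma\left( \mathbf{W} h_i^{(l)} + \sum_{j \in \mathcal{N}(i)} \kappa_\theta(\Delta x_{ij})\, h_j^{(l)} + \mathbf{b} \right)
\]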
where:
\(h_i^{(l)} \in \mathbb{R}^{F_{in}}\) is the input feature vector at node \(i\)
\(\Delta x_{ij} \in \mathbb{R}^{d}\) is the edge geometry for the directed edge \(j \to i\)
\(\kappa_\theta : \mathbb{R}^{d} \to \mathbb{R}^{F_{out} \times F_{in}}\) is a learnable kernel network
\(\mathbf{W} \in \mathbb{R}^{F_{out} \times F_{in}}\) is the learnable local bypass matrix
\(\mathbf{b} \in \mathbb{R}^{F_{out}}\) is the bias vector when use_bias=.true.
\(\sigma\) is the activation function
The kernel network is a one-hidden-layer MLP applied independently to every directed edge:
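A form consistent with the weight shapes listed below is sketched here. The hidden nonlinearity, written \(\rho\), and the ordering used to reshape the flat output into a matrix are assumptions, since neither is specified in this section:

\[
\kappa_\theta(\Delta x) = \operatorname{reshape}\left( \mathbf{V}\, \rho\left( \mathbf{U} \Delta x + \mathbf{b}_u \right) + \mathbf{b}_v \right) \in \mathbb{R}^{F_{out} \times F_{in}}
\]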
where:
\(\Delta x\) is supplied as an edge feature vector, for example \(x_i - x_j\)
\(\mathbf{U} \in \mathbb{R}^{H \times d}\)
\(\mathbf{V} \in \mathbb{R}^{(F_{out} F_{in}) \times H}\)
\(\mathbf{b}_u \in \mathbb{R}^{H}\)
\(\mathbf{b}_v \in \mathbb{R}^{F_{out} F_{in}}\)
\(H\) is kernel_hidden, the hidden width of the kernel MLP
This makes the layer suitable for operator learning on point clouds, unstructured meshes, and graph-structured discretisations where neighbour interactions depend on geometry.
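To make the shapes in the definitions above concrete, here is a minimal plain-Fortran sketch of one update that does not use ATHENA at all. The helper gno_update, the src/dst edge arrays, the tanh hidden nonlinearity, the sum aggregation, the column-major reshape, and the identity output activation are all assumptions chosen for illustration, not the library's implementation.

```fortran
! Plain-Fortran sketch of one GNO update; it does not call ATHENA.
! Assumed, not confirmed by the documentation: tanh hidden nonlinearity,
! sum aggregation, column-major reshape, identity output activation.
module gno_sketch
  implicit none
contains
  ! Hypothetical helper: edge e runs from node src(e) to node dst(e).
  subroutine gno_update(feat, dx, src, dst, u, b_u, v, b_v, w, b, out)
    real, intent(in) :: feat(:,:), dx(:,:)   ! (F_in, N), (d, E)
    integer, intent(in) :: src(:), dst(:)    ! (E), (E)
    real, intent(in) :: u(:,:), b_u(:)       ! (H, d), (H)
    real, intent(in) :: v(:,:), b_v(:)       ! (F_out*F_in, H), (F_out*F_in)
    real, intent(in) :: w(:,:), b(:)         ! (F_out, F_in), (F_out)
    real, intent(out) :: out(:,:)            ! (F_out, N)
    real :: kappa(size(w, 1), size(w, 2))
    integer :: e

    ! Local bypass W h_i plus bias, for every node at once.
    out = matmul(w, feat) + spread(b, dim=2, ncopies=size(feat, 2))
    do e = 1, size(src)
      ! Kernel MLP on this edge's geometry, reshaped to F_out x F_in.
      kappa = reshape(matmul(v, tanh(matmul(u, dx(:, e)) + b_u)) + b_v, &
                      shape(kappa))
      ! Message from the source node, accumulated at the destination node.
      out(:, dst(e)) = out(:, dst(e)) + matmul(kappa, feat(:, src(e)))
    end do
  end subroutine gno_update
end module gno_sketch

program gno_sketch_demo
  use gno_sketch
  implicit none
  integer, parameter :: f_in = 2, f_out = 3, d = 2, h = 4, n = 3, m = 4
  integer :: src(m) = [2, 3, 1, 1], dst(m) = [1, 1, 2, 3]
  real :: x(d, n), feat(f_in, n), dx(d, m), out(f_out, n)
  real :: u(h, d), b_u(h), v(f_out*f_in, h), b_v(f_out*f_in)
  real :: w(f_out, f_in), b(f_out)
  integer :: e

  x = reshape([0.0, 0.0, 1.0, 0.0, 0.0, 1.0], [d, n])
  do e = 1, m
    dx(:, e) = x(:, dst(e)) - x(:, src(e))   ! relative coordinates x_i - x_j
  end do

  ! Degenerate weights so the result can be checked by hand: the kernel is
  ! the all-ones matrix and the bypass and bias are switched off.
  feat = 1.0
  u = 0.5;  b_u = 0.0
  v = 0.0;  b_v = 1.0
  w = 0.0;  b = 0.0

  call gno_update(feat, dx, src, dst, u, b_u, v, b_v, w, b, out)
  print *, shape(out)    ! (f_out, n) = 3 3
  print *, out(1, :)     ! 4.0 2.0 2.0: node 1 has two incoming edges
end program gno_sketch_demo
```

With these degenerate weights every edge contributes F_in = 2 to each output component, so the result simply counts incoming edges per node. Replacing v and b_v with non-trivial values makes the per-edge matrix depend on dx, which is the geometry dependence described above.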
Arguments¶
num_outputs (integer): Number of output node features \(F_{out}\).
coord_dim (integer): Dimensionality of the edge geometric features \(d\).
kernel_hidden (integer): Hidden width \(H\) of the kernel MLP. If not provided, it defaults to num_outputs.
num_inputs (integer): Number of input node features \(F_{in}\). If not provided, it is inferred when the layer is initialised.
use_bias (logical): If .false., the layer omits the output bias term. Default: .true.
activation (class(*)): Activation function applied after aggregation and bypass combination.
    Accepts character(*) or class(base_actv_type).
    See Activation Functions for available options.
    Default: none_actv_type.
kernel_initialiser (class(*)): Initialiser for the kernel MLP and bypass weights (see Initialisers).
    If activation is selu_actv_type, default: lecun_normal_init_type.
    If activation is a version of relu_actv_type, default: he_normal_init_type.
    For all other activations, default: glorot_uniform_init_type.
bias_initialiser (class(*)): Initialiser for the biases (see Initialisers). Default: zeros_init_type.
verbose (integer, optional): Verbosity level for initialisation. Default: 0.
Shape¶
This layer consumes graph-structured data with two input channels per sample:
input(1, s): Node features of shape (num_inputs, num_vertices)
input(2, s): Edge geometric features of shape (coord_dim, num_edges)
The output has the same two channels: updated node features, and the edge geometry passed through:
output(1, s): Updated node features of shape (num_outputs, num_vertices)
output(2, s): Propagated edge features of shape (coord_dim, num_edges)
The graph connectivity is not inferred from the edge-feature tensor. It must be provided separately through set_graph(...) on the layer, or by passing graph-valued training data through network%train(...) so the network can propagate adjacency information to the message-passing layer.
When stacking GNO layers, output(2, s) is forwarded unchanged so later GNO blocks continue to receive the same geometric edge descriptors.
Parameters¶
The layer contains the following learnable parameters:
U: First kernel MLP weight matrix of shape (kernel_hidden, coord_dim).
b_u: First kernel MLP bias vector of shape (kernel_hidden).
V: Second kernel MLP weight matrix of shape (num_outputs * num_inputs, kernel_hidden).
b_v: Kernel output bias vector of shape (num_outputs * num_inputs).
W: Local bypass matrix of shape (num_outputs, num_inputs).
b: Output bias vector of shape (num_outputs) when use_bias=.true.
Let \(F = F_{out} \cdot F_{in}\), that is, F = num_outputs * num_inputs. The total number of learnable parameters is:
With bias: kernel_hidden * coord_dim + kernel_hidden + F * kernel_hidden + 2 * F + num_outputs
Without bias: kernel_hidden * coord_dim + kernel_hidden + F * kernel_hidden + 2 * F
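As a worked check of these counts, the sketch below adds up the terms one parameter array at a time. The helper gno_num_params is hypothetical and not part of ATHENA. The sizes are those of the single-layer example in the Examples section (num_inputs=3, num_outputs=8, coord_dim=2, kernel_hidden=16), so F = 24.

```fortran
! Stand-alone check of the parameter-count formula; does not use ATHENA.
module gno_param_count
  implicit none
contains
  ! gno_num_params is a hypothetical helper, not part of the library.
  pure function gno_num_params(num_inputs, num_outputs, coord_dim, &
       kernel_hidden, use_bias) result(n)
    integer, intent(in) :: num_inputs, num_outputs, coord_dim, kernel_hidden
    logical, intent(in) :: use_bias
    integer :: n, f

    f = num_outputs * num_inputs
    n = kernel_hidden * coord_dim        ! U
    n = n + kernel_hidden                ! b_u
    n = n + f * kernel_hidden            ! V
    n = n + f                            ! b_v
    n = n + f                            ! W
    if (use_bias) n = n + num_outputs    ! b
  end function gno_num_params
end module gno_param_count

program count_gno_params
  use gno_param_count
  implicit none
  print *, gno_num_params(3, 8, 2, 16, .true.)    ! 488
  print *, gno_num_params(3, 8, 2, 16, .false.)   ! 480
end program count_gno_params
```

The V term (F * kernel_hidden = 384 here) dominates, because the kernel MLP emits a full \(F_{out} \times F_{in}\) matrix per edge. Reducing kernel_hidden is therefore the most direct way to shrink the layer.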
Examples¶
Single GNO layer on a graph:
use athena

type(graph_type), dimension(1) :: graph
type(graph_nop_layer_type) :: layer
type(array_type), allocatable :: input(:,:)

layer = graph_nop_layer_type( &
     num_inputs=3, &
     num_outputs=8, &
     coord_dim=2, &
     kernel_hidden=16, &
     activation="relu")

! graph(1) must already hold its vertices and edges at this point
call layer%set_graph(graph)

! Two channels for one sample: (1) node features, (2) edge geometry
allocate(input(2, 1))
call input(1,1)%allocate(array_shape=[3, graph(1)%num_vertices])
call input(2,1)%allocate(array_shape=[2, graph(1)%num_edges])

! Fill both arrays with data before running the forward pass
call layer%forward(input)
Stacked GNO network:
use athena

type(network_type) :: network

! First layer: 1 input feature -> 8 features, 1-D edge geometry
call network%add(graph_nop_layer_type( &
     num_inputs=1, &
     num_outputs=8, &
     coord_dim=1, &
     kernel_hidden=8, &
     activation="relu"))

! Second layer: num_inputs is omitted and inferred at initialisation;
! coord_dim matches because edge features are forwarded unchanged
call network%add(graph_nop_layer_type( &
     num_outputs=2, &
     coord_dim=1, &
     kernel_hidden=8, &
     activation="none"))

call network%compile( &
     optimiser=base_optimiser_type(learning_rate=0.01_real32), &
     loss_method="mse", &
     metrics=["loss"])
Notes¶
This layer extends msgpass_layer_type and preserves node-level outputs rather than performing graph-level pooling.
The learnable kernel depends on edge geometry, so translation-invariant interactions can be represented by supplying relative coordinates such as \(x_i - x_j\) as edge features.
The forward pass is built entirely from array_type-based differentiable operations, making the layer compatible with ATHENA’s autodiff workflows, including physics-informed use cases.
For stable performance on sparse graphs, ensure the graph adjacency is prepared before calling forward or training the enclosing network.
See Also¶
neural_operator_layer_type - Mean-field neural operator layer on regular discretisations
fixed_lno_layer_type - Laplace neural operator layer with fixed encoder/decoder bases and spectral mixing
kipf_msgpass_layer_type - Degree-normalised message passing layer