Duvenaud Message Passing Layer¶
duvenaud_msgpass_layer_type
duvenaud_msgpass_layer_type(
num_vertex_features,
num_edge_features,
num_time_steps,
max_vertex_degree,
num_outputs,
min_vertex_degree=1,
message_activation="none",
readout_activation="none",
kernel_initialiser=...,
verbose=0
)
The duvenaud_msgpass_layer_type implements the message passing architecture from Duvenaud et al. (2015) for learning on molecular graphs.
This layer aggregates information from neighboring nodes through message passing iterations and performs graph-level readout to produce a fixed-size vector representation.
The operation is performed in two stages:
Message Passing: For each time step \(t\), update node features by aggregating neighbor information:
where \(h_v\) is the feature vector for node \(v\), \(N(v)\) is the set of neighbors, \(e_{vu}\) are edge features, and \(M\) is a learned message function.
Graph Readout: Aggregate all node features into a graph-level representation:
where \(D\) is the maximum vertex degree, \(T\) is the number of time steps, and \(W_d\) are degree-specific weight matrices.
Arguments¶
num_vertex_features (integer, dimension(:)): Dimensionality of vertex features at each layer.
If single value: all time steps use same feature dimension
If array: specifies input and output dimensions (e.g.,
[16, 32]means 16→32)Defines the evolution of feature dimensions through message passing
num_edge_features (integer, dimension(:)): Dimensionality of edge features.
Similar to
num_vertex_featuresEdge features are incorporated into message function
Essential for molecular graphs (bond types, bond orders)
num_time_steps (integer): Number of message passing iterations.
Controls how far information propagates across the graph
Typical values: 2-6 for molecular graphs
More steps = larger receptive field but slower computation
max_vertex_degree (integer): Maximum degree of any vertex in the graph.
Used to allocate degree-specific weight matrices for readout
Should be set to maximum degree in your dataset
Example: for molecules, often 4-6 (typical valence)
num_outputs (integer): Dimensionality of graph-level output after readout.
Output size of the graph representation vector
This output typically feeds into dense layers for prediction
Does not affect node-level features during message passing
min_vertex_degree (integer, optional): Minimum degree of vertices to consider in readout. Default:
1.Vertices with degree less than this are excluded from readout
Useful for ignoring isolated nodes
Typically set to 1 (include all connected nodes)
message_activation (class(*)): Activation function for message updates.
Accepts
character(*)orclass(base_actv_type)See Activation Functions for available options
Default:
"none"(linear)Common choices:
"relu","leaky_relu","tanh"
readout_activation (class(*)): Activation function for graph readout aggregation.
Applied to the aggregated graph representation
Default:
"none"(linear)Common choices:
"softmax"(weighted sum),"sigmoid""softmax"creates an attention-like mechanism over node degrees
kernel_initialiser (character(*)): Initialiser for the weight matrices (see Initialisers).
Default depends on activation function (Glorot/He initialisation)
All weight matrices (message and readout) use the same initialiser
verbose (integer, optional): Verbosity level for initialisation. Default:
0.
Shape¶
Input: Graph structure represented by
graph_type:vertex_features:(num_vertex_features(1), num_vertices)- Node featuresedge_features:(num_edge_features(1), num_edges)- Edge featuresadjacency_matrix: Sparse or dense connectivity matrixBatch dimension handled internally
Output:
(num_outputs, batch_size)- Graph-level representation vector
Key Features¶
Degree-aware readout: Uses separate weights for nodes of different degrees
Edge feature incorporation: Considers both node and edge attributes in message passing
Permutation invariant: Graph-level output is invariant to node ordering
Differentiable: Supports backpropagation through the graph structure
Usage Example¶
Molecular Property Prediction¶
use athena
type(network_type) :: network
type(graph_type), allocatable :: molecules(:,:)
! Add Duvenaud message passing layer
call network%add(duvenaud_msgpass_layer_type( &
num_vertex_features=[6, 16, 16, 16], & ! 4 time steps
num_edge_features=[4, 8, 8, 8], &
num_time_steps=4, &
max_vertex_degree=6, &
num_outputs=32, &
min_vertex_degree=1, &
message_activation="relu", &
readout_activation="softmax", &
kernel_initialiser="glorot_normal", &
batch_size=8))
! Add dense layers for prediction
call network%add(full_layer_type( &
num_inputs=32, num_outputs=128, activation="relu"))
call network%add(full_layer_type( &
num_outputs=64, activation="relu"))
call network%add(full_layer_type( &
num_outputs=1, activation="none"))
! Compile and train
call network%compile( &
optimiser=adam_optimiser_type(), &
loss_method="mse", &
batch_size=8)
call network%train(molecules, target_energies, num_epochs=100)
Notes¶
- Graph Data Requirements
Graphs should have
add_self_loops()called and be converted to sparse format for efficiency:call molecule%add_self_loops() call molecule%convert_to_sparse()
- Computational Complexity
Time complexity: \(O(T \cdot |E| \cdot F)\) where \(T\) is time steps, \(|E|\) is edges, \(F\) is feature dimension
Space complexity: \(O(D \cdot F \cdot F_{\text{out}})\) for readout weights
- When to Use
Molecular property prediction (energy, solubility, toxicity)
Chemical reaction prediction
Graph classification tasks
When edge features are important
When graph-level (not node-level) predictions are needed
See Also¶
kipf_msgpass_layer_type - Alternative message passing (GCN)
Message Passing Example - Complete tutorial with examples
concat_layer_type - For combining with skip connections
References¶
Duvenaud, D. K., et al. (2015). “Convolutional Networks on Graphs for Learning Molecular Fingerprints.” Advances in Neural Information Processing Systems, 28, DOI: 10.48550/arXiv.1509.09292.