Causality-Aware Graph Neural Networks
Prerequisites
First, we need to set up a Python environment with PyTorch, PyTorch Geometric and PathpyG installed. Depending on where you are executing this notebook, this might already be (partially) done. For example, Google Colab has PyTorch installed by default, so we only need to install the remaining dependencies. The DevContainer that is part of our GitHub repository, on the other hand, already has all necessary dependencies installed.
In the following, we install the packages for usage in Google Colab using Jupyter magic commands. For other environments, comment the commands in or out as necessary. For more details on how to install pathpyG, especially if you want to install it with GPU support, we refer to our documentation. Note that %%capture discards the full output of the cell so that this tutorial is not cluttered with unnecessary installation details. If you want to see the output, you can comment out %%capture.
%%capture
!pip install torch
!pip install torch_geometric
!pip install git+https://github.com/pathpy/pathpyG.git
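If you are unsure which of the dependencies are already available in your environment, a quick check along the following lines can help you decide which install commands to keep. This is a minimal sketch that uses only the Python standard library; the helper name missing_modules is hypothetical and not part of pathpyG, and the module names are the ones imported later in this tutorial.

```python
import importlib.util

# Module names as they are imported later in this tutorial.
REQUIRED_MODULES = ["torch", "torch_geometric", "pathpyG"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

The check only tests whether a module can be located. It does not verify versions or GPU support.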
Motivation and Learning Objectives
In previous tutorials, we have introduced causal paths in temporal graphs, and how we can use them to generate higher-order De Bruijn graph models that capture temporal-topological patterns in time series data. In this tutorial, we will show how we can use De Bruijn Graph Neural Networks, a causality-aware deep learning architecture for temporal graph data. The details of this approach are introduced in this paper. The architecture is implemented in pathpyG and can be readily applied to temporal graph data.
Below we illustrate this method in a supervised node classification task, i.e. given a temporal graph, we will use the temporal-topological patterns in the graph to classify nodes.
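As a brief reminder of what a causal path is, the following sketch enumerates time-respecting paths of length two in a small list of time-stamped edges. It is a plain-Python illustration and not the pathpyG implementation. It assumes that two edges (u, v, t1) and (v, w, t2) form a causal path if 0 < t2 - t1 <= delta, where the function name causal_paths_length_two and the parameter name delta are chosen here for illustration only.

```python
from collections import defaultdict


def causal_paths_length_two(edges, delta=1):
    """Find time-respecting paths u -> v -> w in (source, target, time) edges.

    Two edges (u, v, t1) and (v, w, t2) are combined into a path
    if the second edge occurs after the first one and at most
    `delta` time units later, i.e. 0 < t2 - t1 <= delta.
    """
    # Index outgoing edges by their source node for fast lookup.
    out_edges = defaultdict(list)
    for u, v, t in edges:
        out_edges[u].append((v, t))

    paths = []
    for u, v, t1 in edges:
        for w, t2 in out_edges[v]:
            if 0 < t2 - t1 <= delta:
                paths.append((u, v, w))
    return paths


edges = [("a", "b", 1), ("b", "c", 2), ("b", "d", 5)]
print(causal_paths_length_two(edges, delta=1))  # [('a', 'b', 'c')]
print(causal_paths_length_two(edges, delta=4))  # [('a', 'b', 'c'), ('a', 'b', 'd')]
```

With a larger delta, more pairs of edges count as causally connected, which is why the second call also returns the path from a via b to d.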
We start by importing a few modules:
import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
import torch
from torch_geometric.transforms import RandomNodeSplit
from sklearn.metrics import balanced_accuracy_score
from sklearn.manifold import TSNE
import pathpyG as pp
from pathpyG.nn.dbgnn import DBGNN
pp.config['torch']['device'] = 'cuda' if torch.cuda.is_available() else 'cpu'
device = pp.config['torch']['device']
Temporal-Topological Clusters in Temporal Graphs
Let us load a small synthetic toy example of a temporal graph with 60,000 time-stamped interactions between 30 nodes. We load this example as a TemporalGraph from a file containing edges with discrete time-stamps.
t = pp.io.read_csv_temporal_graph('../data/temporal_clusters.tedges', header=False)
This example was created in such a way that the nodes naturally form three clusters, which are highlighted in the interactive visualization below:
style = {}
style['node_color'] = ['green']*10+['red']*10+['blue']*10
pp.plot(t, **style, edge_size=4, edge_color='gray');
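The color list above encodes the cluster membership purely by position. The following sketch makes this explicit by first deriving an integer cluster label for each node and then mapping labels to colors. It assumes that nodes are ordered such that the first ten belong to the first cluster, the next ten to the second, and so on. The names NUM_NODES, CLUSTER_SIZE, CLUSTER_COLORS, labels and node_colors are chosen here for illustration.

```python
NUM_NODES = 30
CLUSTER_SIZE = 10
CLUSTER_COLORS = ["green", "red", "blue"]

# Cluster index (0, 1 or 2) of each node, assuming nodes are ordered by cluster.
labels = [i // CLUSTER_SIZE for i in range(NUM_NODES)]

# Map each cluster label to its color; this reproduces the list used for plotting.
node_colors = [CLUSTER_COLORS[label] for label in labels]
```

Labels of this kind can serve as ground-truth classes in a supervised node classification task, provided the ordering assumption holds for the loaded graph.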