This repo provides the code and datasets used in the paper Classifying graphs as images with Convolutional Neural Networks (Tixier, Nikolentzos, Meladianos and Vazirgiannis, 2017). Note that the paper was published at the ICANN 2019 conference under the title Graph classification with 2D convolutional neural networks. As its name suggests, the paper introduces a technique to perform graph classification with standard Convolutional Neural Networks for images (2D CNNs).
We encode graphs as stacks of 2D histograms of their node embeddings, and pass them to a classical 2D CNN architecture designed for images. The bins of the histograms can be viewed as pixels, and the value of a given pixel is the number of nodes falling into the associated bin.
For instance, below are the node embeddings and corresponding bivariate histograms for graph ID #10001 (577 nodes, 1320 edges) of the REDDIT-12K dataset:
The full image representation of a graph is given by stacking its n_channels bivariate histograms (where n_channels can be 2, 5, ...). Each pixel is thus associated with an n_channels-dimensional vector of counts.
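The encoding described above can be illustrated with a minimal NumPy sketch. It is not the code of get_histograms.py: the function name graph_to_image, the pairing of consecutive embedding dimensions into channels, the number of bins, the value range and the channels-first layout are all assumptions made for the illustration.

```python
import numpy as np


def graph_to_image(embeddings, n_channels=2, n_bins=14, value_range=(-1.0, 1.0)):
    """Turn an (n_nodes, n_dims) embedding matrix into a stack of 2D histograms.

    Hypothetical helper: channel c is the 2D histogram of embedding
    dimensions (2c, 2c + 1). Pairing and defaults are assumptions.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    if embeddings.shape[1] < 2 * n_channels:
        raise ValueError("need at least 2 * n_channels embedding dimensions")
    channels = []
    for c in range(n_channels):
        x = embeddings[:, 2 * c]
        y = embeddings[:, 2 * c + 1]
        # Each bin is a pixel; its value is the number of nodes falling into it.
        counts, _, _ = np.histogram2d(
            x, y, bins=n_bins, range=[value_range, value_range]
        )
        channels.append(counts)
    # Resulting shape: (n_channels, n_bins, n_bins)
    return np.stack(channels, axis=0)


# Toy example: random embeddings for a graph with 577 nodes (not real data).
rng = np.random.RandomState(0)
toy_embeddings = rng.uniform(-1.0, 1.0, size=(577, 10))
image = graph_to_image(toy_embeddings, n_channels=5)
print(image.shape)  # (5, 14, 14)
```

In this sketch, nodes whose coordinates fall outside value_range are not counted, so the range should cover the embeddings if every node is meant to contribute to each channel.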
Despite its simplicity, our method proves very competitive with state-of-the-art graph kernels, and even outperforms them by a wide margin on some datasets.
10-fold CV average test set classification accuracy of state-of-the-art graph kernel and graph CNN baselines (top), vs our 2D CNN approach (bottom):
The results reported in the paper (without data augmentation) are available in the /datasets/results/ subdirectory, with slight variations due to the stochasticity of the approach. You can read them using the read_results.py script.
We can summarize the advantages of our approach as follows:
get_node2vec.py computes the node2vec embeddings of the graphs from their adjacency matrices (parallelized over graphs)
get_histograms.py computes the image representations of the graphs (stacks of 2D histograms) from their node2vec embeddings (parallelized over graphs)
main.py reproduces the experiments in the paper (classification of graphs as images with a 2D CNN architecture, using a 10-fold cross validation scheme)
main_data_augmentation.py is like main.py, but it implements the data augmentation scheme described in the paper (smoothed bootstrap)
Command line examples and descriptions of the parameters are available within each script.
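For readers unfamiliar with the 10-fold cross validation scheme mentioned above, here is a minimal sketch of how the folds and the averaged accuracy can be obtained. It is only an illustration and not the logic of main.py: the helper names k_fold_indices and cross_validated_accuracy, the plain (non-stratified) shuffling and the seed are assumptions.

```python
import numpy as np


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split.

    Hypothetical helper: folds are not stratified by class here.
    """
    rng = np.random.RandomState(seed)
    shuffled = rng.permutation(n_samples)
    folds = np.array_split(shuffled, n_folds)
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate(
            [folds[j] for j in range(n_folds) if j != k]
        )
        yield train_idx, test_idx


def cross_validated_accuracy(fit_and_score, n_samples, n_folds=10, seed=0):
    """Average the test accuracy returned by fit_and_score over all folds.

    fit_and_score(train_idx, test_idx) is a user-supplied callable that
    trains a model on the training graphs and returns its test accuracy.
    """
    scores = [
        fit_and_score(train_idx, test_idx)
        for train_idx, test_idx in k_fold_indices(n_samples, n_folds, seed)
    ]
    return float(np.mean(scores))
```

Each graph appears in exactly one test fold, so the averaged score uses every graph once for testing.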
The code was developed and tested under the Ubuntu 16.04.2 LTS 64-bit operating system, using Python 2.7 and Keras 1.2.2 with the TensorFlow 1.1.0 backend.
If you use some of the code in this repository in your work, please cite:
Conference version (ICANN 2019):
@inproceedings{tixier2019graph,
title={Graph classification with 2d convolutional neural networks},
author={Tixier, Antoine J-P and Nikolentzos, Giannis and Meladianos, Polykarpos and Vazirgiannis, Michalis},
booktitle={International Conference on Artificial Neural Networks},
pages={578--593},
year={2019},
organization={Springer}
}
Pre-print version (2017):
@article{tixier2017classifying,
title={Classifying Graphs as Images with Convolutional Neural Networks},
author={Tixier, Antoine Jean-Pierre and Nikolentzos, Giannis and Meladianos, Polykarpos and Vazirgiannis, Michalis},
journal={arXiv preprint arXiv:1708.02218},
year={2017}
}