DGL v0.4 Release (heterogeneous graph update)
We are thrilled to announce the 0.4 release! This includes:
Heterogeneous Graph Support
What is a heterogeneous graph?
A heterogeneous graph is a graph whose nodes and edges are typed. For example, a graph for an online game store may contain 'user', 'game', and 'developer' nodes, connected by 'plays', 'follows', and 'develops' edges.
What models work on heterogeneous graphs?
The following models have been implemented with the new heterogeneous graph API:
- Graph Convolutional Matrix Completion (GCMC)

Dataset | RMSE (DGL) | RMSE (Official) | Speed (DGL) | Speed (Official) | Speedup |
---|---|---|---|---|---|
MovieLens-100K | 0.9077 | 0.910 | 0.0246s/epoch | 0.1008s/epoch | 5x |
MovieLens-1M | 0.8377 | 0.832 | 0.0695s/epoch | 1.538s/epoch | 22x |
MovieLens-10M (full-graph training) | 0.7875 | 0.777 | 0.6480s/epoch | OOM | - |

- R-GCN [Code in PyTorch]
- We provide an R-GCN model that takes a heterograph as input. The new code can train on the AM dataset (>5M edges) on a single GPU, whereas the original implementation could only run on CPU and consumed 32GB of memory.
- The original implementation takes 51.88s per epoch on CPU; the heterograph-based R-GCN takes only 0.1781s per epoch on a V100 GPU (291x faster!). A minimal sketch of such a relation-wise layer is given after this list.
- Heterogeneous Graph Attention Network (HAN) [Code in PyTorch]
- Metapath2vec [Code in PyTorch]
- The metapath sampler is twice as fast as the original implementation.
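For readers curious how such relation-wise models map onto the new API, below is a minimal sketch. It is not the shipped R-GCN code: the class name, feature names, and toy graph are illustrative. The layer keeps one linear projection per edge type and combines the per-relation messages with multi_update_all.
import torch
import torch.nn as nn
import dgl
import dgl.function as fn

class RelGraphLayer(nn.Module):
    """One linear projection per edge type; messages are summed per relation
    and then across relations (illustrative sketch, no basis decomposition)."""
    def __init__(self, in_feats, out_feats, etypes):
        super().__init__()
        self.weight = nn.ModuleDict(
            {etype: nn.Linear(in_feats, out_feats) for etype in etypes})

    def forward(self, g, feats):
        # feats: dict mapping node type -> input feature tensor
        funcs = {}
        for srctype, etype, dsttype in g.canonical_etypes:
            # project the source features for this relation, then copy and sum
            g.nodes[srctype].data['Wh_' + etype] = self.weight[etype](feats[srctype])
            funcs[etype] = (fn.copy_u('Wh_' + etype, 'm'), fn.sum('m', 'h_new'))
        g.multi_update_all(funcs, 'sum')
        # only node types that received messages appear in the output
        return {ntype: g.nodes[ntype].data['h_new']
                for ntype in g.ntypes if 'h_new' in g.nodes[ntype].data}

# Toy usage on a two-relation graph.
g = dgl.heterograph({
    ('user', 'follows', 'user'): [(0, 1), (1, 2)],
    ('user', 'plays', 'game'): [(0, 0), (1, 0), (2, 1)],
})
feats = {'user': torch.randn(3, 8), 'game': torch.randn(2, 8)}
layer = RelGraphLayer(8, 16, g.etypes)
out = layer(g, feats)  # {'user': (3, 16), 'game': (2, 16)} tensors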
How can I play with a heterogeneous graph?
Here is an example of creating and manipulating a heterogeneous graph:
import dgl
import torch
import dgl.function as fn
g = dgl.heterograph({
    ('user', 'follows', 'user'): [(0, 1), (1, 2)],
    ('user', 'plays', 'game'): [(0, 0), (1, 0), (1, 1), (2, 1)],
    ('game', 'attracts', 'user'): [(0, 0), (0, 1), (1, 1), (1, 2)],
    ('developer', 'develops', 'game'): [(0, 0), (1, 1)],
})
# The user nodes have a single feature named 'x' and the game nodes a single feature named 'y'
# (both 5-dimensional here so that the cross-type aggregation below can sum them)
x = torch.randn(3, 5)
y = torch.randn(2, 5)
g.nodes['user'].data['x'] = x
g.nodes['game'].data['y'] = y
# Edge features are similar
a = torch.randn(2, 5)
b = torch.randn(4, 7)
g.edges['follows'].data['a'] = a
g.edges['plays'].data['b'] = b
# One can also perform message passing.
# The following code passes messages along the "plays" edges and sums them on the game nodes.
g['plays'].update_all(fn.copy_u('x', 'm'), fn.sum('m', 'z'))
z = g.nodes['game'].data['z']
# Game 0 is played by users 0 and 1; game 1 is played by users 1 and 2.
assert torch.allclose(z[0], x[0] + x[1])
assert torch.allclose(z[1], x[1] + x[2])
# Moreover, one can perform message passing on several edge types at once and
# aggregate the per-type results on the shared destination type ('user' here).
g.multi_update_all({
    'follows': (fn.copy_u('x', 'm'), fn.sum('m', 'w')),
    'attracts': (fn.copy_u('y', 'm'), fn.sum('m', 'w')),
}, 'sum')
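After this call, every user node carries a feature 'w' that sums the contributions of both relations. With the tensors defined above, the result can be checked as follows (users 1 and 2 receive 'follows' messages from users 0 and 1, and every user receives 'attracts' messages from the games attracting them):
w = g.nodes['user'].data['w']
assert torch.allclose(w[0], y[0])
assert torch.allclose(w[1], x[0] + y[0] + y[1])
assert torch.allclose(w[2], x[1] + y[1])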
Check out our heterograph tutorial: Working with Heterogeneous Graphs in DGL.
Check out the full API reference.
Knowledge Graph Models
We also released DGL-KE, a subpackage of DGL that trains embeddings on knowledge graphs. This package is adapted from the KnowledgeGraphEmbedding package. We made it fast and scalable while still maintaining the flexibility of the original package. Using a single NVIDIA V100 GPU, DGL-KE can train TransE on FB15k in 6.85 mins, substantially outperforming existing tools such as GraphVite. For graphs with hundreds of millions of edges (such as the full Freebase graph), it takes a couple of hours on one EC2 x1.32xlarge machine.
Currently, the following models are supported (their scoring functions are sketched below):
- TransE
- DistMult
- ComplEx
In addition, the following training modes are supported:
- CPU training
- GPU training
- Joint CPU & GPU training
- Multiprocessing training on CPUs
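For reference, here is a rough sketch of the scoring functions behind these three models (a simplified PyTorch view; the function names and margin value are illustrative, and DGL-KE's actual training adds negative sampling, regularization, and more):
import torch

def transe_score(h, r, t, gamma=12.0):
    # higher score = more plausible (head, relation, tail) triple
    return gamma - torch.norm(h + r - t, p=1, dim=-1)

def distmult_score(h, r, t):
    return (h * r * t).sum(dim=-1)

def complex_score(h, r, t):
    # embeddings are split into real and imaginary halves
    h_re, h_im = h.chunk(2, dim=-1)
    r_re, r_im = r.chunk(2, dim=-1)
    t_re, t_im = t.chunk(2, dim=-1)
    return (h_re * r_re * t_re + h_im * r_re * t_im
            + h_re * r_im * t_im - h_im * r_im * t_re).sum(dim=-1)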
Training results on FB15k using one NVIDIA V100 GPU
Training Speed:
Models | TransE | DistMult | ComplEx |
---|---|---|---|
MAX_STEPS | 20000 | 100000 | 100000 |
TIME | 411s | 690s | 806s |
Training accuracy:
Models | MR | MRR | HITS@1 | HITS@3 | HITS@10 |
---|---|---|---|---|---|
TransE | 69.12 | 0.656 | 0.567 | 0.718 | 0.802 |
DistMult | 43.35 | 0.783 | 0.713 | 0.837 | 0.897 |
ComplEx | 51.99 | 0.785 | 0.720 | 0.832 | 0.889 |
In comparison, GraphVite takes 14 minutes using 4 GPUs. DGL-KE thus trains TransE on FB15k roughly 2x faster than GraphVite while using far fewer resources.
For more information, please refer to this directory.
Miscellaneous
- New builtin message functions: dot product (u_dot_v etc., #831 @classicsong); see the short example after this list
- More efficient data format and serialization (#728 @VoVAllen)
- ClusterGCN (#877, @Zardinality)
- CoraFull, Amazon, KarateClub, Coauthor datasets (#855, @VoVAllen)
- More performance improvements
- More bugfixes
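As a quick illustration of the new dot-product builtin (the graph and feature names below are made up), it produces one score per edge from the features of the edge's endpoints:
import dgl
import torch
import dgl.function as fn

g = dgl.DGLGraph()
g.add_nodes(3)
g.add_edges([0, 1], [1, 2])
g.ndata['h'] = torch.randn(3, 4)
# u_dot_v takes the dot product of source and destination node features
g.apply_edges(fn.u_dot_v('h', 'h', 'score'))
print(g.edata['score'].shape)  # torch.Size([2, 1]): one scalar per edge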
08 October