DGL v0.3 Release
V0.3 release includes many crucial updates:
- Fused message passing kernels that greatly boost the training speed of GNNs on large graphs. Please refer to our blogpost for more details.
- Add components to enable distributed training of GNNs on giant graphs with graph sampling. Please see our blogpost for more details.
- New models and NN modules.
- Many other bugfixes and other enhancement.
As a result, please be aware of the following changes:
Installation
Previous installation methods with pip and conda, i.e.:
pip install dgl
conda install -c dglteam dgl
now only install CPU builds (works for Linux/MacOS/Windows).
Installing CUDA builds with pip
Pip users could install the DGL CUDA builds with the following:
pip install <package-url>
where <package-url> is one of the following:
| CUDA 9.0 | CUDA 10.0 | |
|---|---|---|
| Linux + Py35 | pip install https://data.dgl.ai/wheels/cuda9.0/dgl-0.3-cp35-cp35m-manylinux1_x86_64.whl | pip install https://data.dgl.ai/wheels/cuda10.0/dgl-0.3-cp35-cp35m-manylinux1_x86_64.whl | 
| Linux + Py36 | pip install https://data.dgl.ai/wheels/cuda9.0/dgl-0.3-cp36-cp36m-manylinux1_x86_64.whl | pip install https://data.dgl.ai/wheels/cuda10.0/dgl-0.3-cp36-cp36m-manylinux1_x86_64.whl | 
| Linux + Py37 | pip install https://data.dgl.ai/wheels/cuda9.0/dgl-0.3-cp37-cp37m-manylinux1_x86_64.whl | pip install https://data.dgl.ai/wheels/cuda10.0/dgl-0.3-cp37-cp37m-manylinux1_x86_64.whl | 
| Win + Py35 | pip install https://data.dgl.ai/wheels/cuda9.0/dgl-0.3-cp35-cp35m-win_amd64.whl | pip install https://data.dgl.ai/wheels/cuda10.0/dgl-0.3-cp35-cp35m-win_amd64.whl | 
| Win + Py36 | pip install https://data.dgl.ai/wheels/cuda9.0/dgl-0.3-cp36-cp36m-win_amd64.whl | pip install https://data.dgl.ai/wheels/cuda10.0/dgl-0.3-cp36-cp36m-win_amd64.whl | 
| Win + Py37 | pip install https://data.dgl.ai/wheels/cuda9.0/dgl-0.3-cp37-cp37m-win_amd64.whl | pip install https://data.dgl.ai/wheels/cuda10.0/dgl-0.3-cp37-cp37m-win_amd64.whl | 
| MacOS | N/A | N/A | 
Installing CUDA builds with conda
Conda users could install the CUDA builds with
conda install -c dglteam dgl-cuda9.0   # For CUDA 9.0
conda install -c dglteam dgl-cuda10.0  # For CUDA 10.0
DGL currently support CUDA 9.0 (dgl-cuda9.0) and 10.0 (dgl-cuda10.0). To find your CUDA version, use nvcc --version. To install from source, checkout our installation guide.
New built-in message and reduce functions
We have expanded the list of built-in message and reduce functions to cover more use cases. Previously, DGL only has copy_src, copy_edge, src_mul_edge. With the v0.3 release, we support more combinations. Here is a demonstration of some of the new builtin functions.
import dgl
import dgl.function as fn
import torch as th
g = ... # create a DGLGraph
g.ndata['h'] = th.randn((g.number_of_nodes(), 10)) # each node has feature size 10
g.edata['w'] = th.randn((g.number_of_edges(), 1))  # each edge has feature size 1
# collect features from source nodes and aggregate them in destination nodes
g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h_sum'))
# multiply source node features with edge weights and aggregate them in destination nodes
g.update_all(fn.u_mul_e('h', 'w', 'm'), fn.max('m', 'h_max'))
# compute edge embedding by multiplying source and destination node embeddings
g.apply_edges(fn.u_mul_v('h', 'h', 'w_new'))
As you can see, the syntax is quite straight-forward. u_mul_e means multiplying the source node data with the edge data; u_mul_v means multiplying the source node data with the destination node data, and so on and so forth. Each builtin combination will be mapped to a CPU/CUDA kernel and broadcasting and gradient computation are also supported. Checkout our document for more details.
Training giant graphs
We added new components shared-memory DGLGraph and distributed samplers to support distributed and multi-processing training of graph neural networks.
Two new tutorials are now live:
- Train GNNs by neighbor sampling and its variants (link).
- Scale the sampler-trainer architecture to giant graphs using distributed graph store (link).
We also provide scripts on how to setup such distributed setting (link).
Enhancement and bugfix
- NN modules
    - dgl.nn.[mxnet|pytorch].edge_softmaxnow directly returns the normalized scores on edges.
- Fix a memory leak bug when graph is passed as the input.
 
- Graph
    - DGLGraphnow supports direct conversion from scipy csr matrix rather than conversion to coo matrix first.
- Readonly graph can now be batched via dgl.batch.
- DGLGraphnow supports node/edge removal via- DGLGraph.remove_nodesand- DGLGraph.remove_edges(doc).
- A new API DGLGraph.to(device)that can move all node/edge data to the given device.
- A new API dgl.to_simplethat can convert a graph to a simple graph with no multi-edges.
- A new API dgl.to_bidirectedthat can convert a graph to a bidirectional graph.
- A new API dgl.contrib.sampling.random_walkthat can generate random walks from a graph.
- Allow DGLGraphto be constructed from anotherDGLGraph.
 
- New model examples
    - APPNP
- GIN
- PinSage (slow version)
- DGI
 
- Bugfix
    - Fix a bug where numpy integer is passed in as the argument.
- Fix a bug when constructing from a networkx graph that has no edge.
- Fix a bug in nodeflow where id is not correctly converted sometimes.
- Fix a bug in MiniGC dataset where the number of nodes is not consistent.
- Fix a bug in RGCN example when bfs_level=0.
- Fix a bug where DLContext is not correctly exposed in CFFI.
- Fix a crash during Cython build.
- Fix a bug in sendwhen the given message function is a builtin.
 
12 June

