.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/large/L0_neighbor_sampling_overview.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_tutorials_large_L0_neighbor_sampling_overview.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_large_L0_neighbor_sampling_overview.py:


Introduction to Neighbor Sampling for GNN Training
==================================================

In :doc:`previous tutorials <../blitz/1_introduction>` you have learned how to
train GNNs by computing the representations of all nodes on a graph.
However, sometimes your graph is too large to fit the computation of all
nodes on a single GPU.

By the end of this tutorial, you will be able to

- Understand the pipeline of stochastic GNN training.
- Understand what neighbor sampling is and why it yields a bipartite
  graph for each GNN layer.

.. GENERATED FROM PYTHON SOURCE LINES 19-65

Message Passing Review
----------------------

Recall that in `Gilmer et al. <https://arxiv.org/abs/1704.01212>`__
(also in the :doc:`message passing tutorial <../blitz/3_message_passing>`),
the message passing formulation is as follows:

.. math::

   m_{u\to v}^{(l)} = M^{(l)}\left(h_v^{(l-1)}, h_u^{(l-1)}, e_{u\to v}^{(l-1)}\right)

.. math::

   m_{v}^{(l)} = \sum_{u\in\mathcal{N}(v)}m_{u\to v}^{(l)}

.. math::

   h_v^{(l)} = U^{(l)}\left(h_v^{(l-1)}, m_v^{(l)}\right)

where DGL calls :math:`M^{(l)}` the *message function*, :math:`\sum` the
*reduce function* and :math:`U^{(l)}` the *update function*. Note that
:math:`\sum` here can represent any function and is not necessarily a
summation.

Essentially, the :math:`l`-th layer representation of a single node
depends on the :math:`(l-1)`-th layer representation of the same node,
as well as the :math:`(l-1)`-th layer representations of the neighboring
nodes. Those :math:`(l-1)`-th layer representations in turn depend on the
:math:`(l-2)`-th layer representations of those nodes, as well as their
neighbors.

The following animation shows how a 2-layer GNN is supposed to compute
the output of node 5:

|image1|

You can see that to compute node 5 from the second layer, you will need
its direct neighbors' first layer representations (colored in yellow),
which in turn need their direct neighbors' (i.e. node 5's second-hop
neighbors') representations (colored in green).

.. |image1| image:: https://data.dgl.ai/tutorial/img/sampling.gif

.. GENERATED FROM PYTHON SOURCE LINES 68-91

Neighbor Sampling Overview
--------------------------

You can also see from the previous example that computing representations
for a small number of nodes often requires input features of a
significantly larger number of nodes. Taking all neighbors for message
aggregation is often too costly since the nodes needed for input features
would easily cover a large portion of the graph, especially for real-world
graphs, which are often
`scale-free <https://en.wikipedia.org/wiki/Scale-free_network>`__.

Neighbor sampling addresses this issue by selecting a subset of the
neighbors to perform aggregation. For instance, to compute
:math:`\boldsymbol{h}_5^{(2)}`, you can choose two of the neighbors
instead of all of them to aggregate, as in the following animation:

|image2|

You can see that this method requires far fewer nodes for message
passing in a single minibatch.

.. |image2| image:: https://data.dgl.ai/tutorial/img/bipartite.gif

.. GENERATED FROM PYTHON SOURCE LINES 94-107

You can also notice in the animation above that the computation
dependencies can be described as a series of bipartite graphs. The
output nodes (called *destination nodes*) are on one side and all the
nodes necessary for inputs (called *source nodes*) are on the other
side. The arrows indicate how the sampled neighbors propagate messages
to the nodes. DGL calls such graphs *message flow graphs* (MFGs).

Note that some GNN modules, such as `SAGEConv`, need to use the
destination nodes' features from the previous layer to compute the
outputs. Without loss of generality, DGL always includes the
destination nodes themselves in the source nodes.
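To make this concrete, below is a minimal sketch of how neighbor sampling
produces MFGs in DGL. The toy graph and feature sizes are made up purely for
illustration; the ``NeighborSampler`` and ``DataLoader`` names follow the
DGL 0.8+ API (earlier releases call them ``MultiLayerNeighborSampler`` and
``NodeDataLoader``).

.. code-block:: Python

    import dgl
    import torch

    # A small random graph standing in for a large one; the sizes here
    # are arbitrary and purely for illustration.
    g = dgl.rand_graph(1000, 5000)
    g.ndata['feat'] = torch.randn(1000, 16)

    # Sample at most 2 neighbors per node at each of the 2 GNN layers,
    # mirroring the animation above.
    sampler = dgl.dataloading.NeighborSampler([2, 2])
    dataloader = dgl.dataloading.DataLoader(
        g,
        torch.arange(g.num_nodes()),  # the nodes to compute outputs for
        sampler,
        batch_size=64,
        shuffle=True,
    )

    # Each iteration yields one minibatch: the input (source) nodes, the
    # output (destination) nodes, and one MFG per GNN layer.
    input_nodes, output_nodes, blocks = next(iter(dataloader))
    for i, block in enumerate(blocks):
        print('Layer {}: {} source nodes -> {} destination nodes'.format(
            i, block.num_src_nodes(), block.num_dst_nodes()))

    # The destination nodes always appear first among the source nodes,
    # so their own previous-layer features are available to modules that
    # need them (such as SAGEConv).
    assert torch.equal(
        blocks[-1].srcdata[dgl.NID][:blocks[-1].num_dst_nodes()],
        blocks[-1].dstdata[dgl.NID],
    )

Each ``block`` here corresponds to one bipartite graph in the animation:
the first MFG has the most source nodes, and the last one has exactly the
minibatch's output nodes as its destination nodes.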
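The MFGs can then be consumed layer by layer with ordinary message passing.
The following is a hedged sketch, not this tutorial's model, of how the
formulation from the review above maps onto a block: ``fn.copy_u`` plays the
role of :math:`M^{(l)}`, ``fn.mean`` the reduce function, and a linear layer
plus ReLU the update function :math:`U^{(l)}`. It reuses ``g``, ``blocks``,
and ``input_nodes`` from the previous snippet.

.. code-block:: Python

    import torch.nn as nn
    import torch.nn.functional as F
    import dgl.function as fn

    class TwoLayerGNN(nn.Module):
        """A minimal mean-aggregation GNN operating on a list of MFGs."""

        def __init__(self, in_feats, hidden_feats, out_feats):
            super().__init__()
            self.W1 = nn.Linear(in_feats, hidden_feats)
            self.W2 = nn.Linear(hidden_feats, out_feats)

        def _layer(self, block, h, W):
            with block.local_scope():
                # Destination nodes come first among the source nodes,
                # so their previous-layer features are a prefix of h.
                h_dst = h[:block.num_dst_nodes()]
                block.srcdata['h'] = h
                # Message function M: copy the source node feature;
                # reduce function: average over the sampled neighbors.
                block.update_all(fn.copy_u('h', 'm'), fn.mean('m', 'h_N'))
                # Update function U: combine self and neighbor information.
                return F.relu(W(h_dst + block.dstdata['h_N']))

        def forward(self, blocks, x):
            h = self._layer(blocks[0], x, self.W1)
            return self._layer(blocks[1], h, self.W2)

    model = TwoLayerGNN(16, 32, 8)  # feature sizes match the toy graph above
    h_out = model(blocks, g.ndata['feat'][input_nodes])
    print(h_out.shape)              # (len(output_nodes), 8)

Note how each layer produces one row per destination node, so the final
output covers exactly the minibatch's output nodes rather than the whole
graph.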
.. GENERATED FROM PYTHON SOURCE LINES 110-116

What’s next?
------------

:doc:`Stochastic GNN Training for Node Classification in DGL
<L1_large_node_classification>`

.. GENERATED FROM PYTHON SOURCE LINES 116-120

.. code-block:: Python


    # Thumbnail credits: Understanding graph embedding methods and their applications, Mengjia Xu
    # sphinx_gallery_thumbnail_path = '_static/large_L0_neighbor_sampling_overview.png'


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.000 seconds)


.. _sphx_glr_download_tutorials_large_L0_neighbor_sampling_overview.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: L0_neighbor_sampling_overview.ipynb <L0_neighbor_sampling_overview.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: L0_neighbor_sampling_overview.py <L0_neighbor_sampling_overview.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: L0_neighbor_sampling_overview.zip <L0_neighbor_sampling_overview.zip>`

.. only:: html

  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_