FusedCSCSamplingGraph

class dgl.graphbolt.FusedCSCSamplingGraph(c_csc_graph: ScriptObject)

Bases: SamplingGraph

A sampling graph in CSC format.

copy_to_shared_memory(shared_memory_name: str)

Copy the graph to shared memory.

Parameters:

shared_memory_name (str) – Name of the shared memory.

Returns:

The copied FusedCSCSamplingGraph object on shared memory.

Return type:

FusedCSCSamplingGraph

in_subgraph(nodes: Tensor | Dict[str, Tensor]) → SampledSubgraphImpl

Return the subgraph induced on the inbound edges of the given nodes.

An in-subgraph is equivalent to a new graph built from the incoming edges of the given nodes. The subgraph is compacted according to the order of the passed-in nodes.

Parameters:

nodes (torch.Tensor or Dict[str, torch.Tensor]) –

IDs of the given seed nodes.
  • If nodes is a tensor: the graph is homogeneous and the IDs are homogeneous IDs.

  • If nodes is a dictionary: the keys are node types and the IDs are heterogeneous IDs.

Returns:

The in subgraph.

Return type:

SampledSubgraphImpl

Examples

>>> import dgl.graphbolt as gb
>>> import torch
>>> total_num_nodes = 5
>>> total_num_edges = 12
>>> ntypes = {"N0": 0, "N1": 1}
>>> etypes = {
...     "N0:R0:N0": 0, "N0:R1:N1": 1, "N1:R2:N0": 2, "N1:R3:N1": 3}
>>> indptr = torch.LongTensor([0, 3, 5, 7, 9, 12])
>>> indices = torch.LongTensor([0, 1, 4, 2, 3, 0, 1, 1, 2, 0, 3, 4])
>>> node_type_offset = torch.LongTensor([0, 2, 5])
>>> type_per_edge = torch.LongTensor(
...     [0, 0, 2, 2, 2, 1, 1, 1, 3, 1, 3, 3])
>>> graph = gb.fused_csc_sampling_graph(indptr, indices,
...     node_type_offset=node_type_offset,
...     type_per_edge=type_per_edge,
...     node_type_to_id=ntypes,
...     edge_type_to_id=etypes)
>>> nodes = {"N0":torch.LongTensor([1]), "N1":torch.LongTensor([1, 2])}
>>> in_subgraph = graph.in_subgraph(nodes)
>>> print(in_subgraph.sampled_csc)
{'N0:R0:N0': CSCFormatBase(indptr=tensor([0, 0]),
      indices=tensor([], dtype=torch.int64),
), 'N0:R1:N1': CSCFormatBase(indptr=tensor([0, 1, 2]),
            indices=tensor([1, 0]),
), 'N1:R2:N0': CSCFormatBase(indptr=tensor([0, 2]),
            indices=tensor([0, 1]),
), 'N1:R3:N1': CSCFormatBase(indptr=tensor([0, 1, 3]),
            indices=tensor([0, 1, 2]),
)}
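
To see where the indptr and indices values in the output above come from, the following is a minimal pure-Python sketch of reading inbound edges out of CSC arrays. It uses plain lists and homogeneous IDs instead of GraphBolt tensors, and the helper name in_edges is hypothetical rather than part of the library.

```python
# Illustrative only: plain lists stand in for the tensors in the example above.
indptr = [0, 3, 5, 7, 9, 12]
indices = [0, 1, 4, 2, 3, 0, 1, 1, 2, 0, 3, 4]
type_per_edge = [0, 0, 2, 2, 2, 1, 1, 1, 3, 1, 3, 3]


def in_edges(node):
    """Return (source, edge_type_id) pairs for every edge pointing at `node`."""
    start, end = indptr[node], indptr[node + 1]
    return list(zip(indices[start:end], type_per_edge[start:end]))


# Seeds from the example as homogeneous IDs: N0:1 -> 1, N1:1 -> 3, N1:2 -> 4
# (N1 starts at offset 2 in node_type_offset).
for seed in (1, 3, 4):
    print(seed, in_edges(seed))
```

Grouping these pairs by edge type and converting the sources back to per-type IDs gives the per-type CSC structures printed above.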

pin_memory_()

Copy the FusedCSCSamplingGraph to pinned memory in place. Returns the same object, modified in place.

sample_layer_neighbors(nodes: Tensor | Dict[str, Tensor], fanouts: Tensor, replace: bool = False, probs_name: str | None = None) → SampledSubgraphImpl

Sample neighboring edges of the given nodes and return the induced subgraph via layer-neighbor sampling from the NeurIPS 2023 paper "Layer-Neighbor Sampling – Defusing Neighborhood Explosion in GNNs".

Parameters:
  • nodes (torch.Tensor or Dict[str, torch.Tensor]) –

    IDs of the given seed nodes.
    • If nodes is a tensor: the graph is homogeneous and the IDs are homogeneous IDs.

    • If nodes is a dictionary: the keys are node types and the IDs are heterogeneous IDs.

  • fanouts (torch.Tensor) –

    The number of edges to be sampled for each node with or without considering edge types.

    • When the length is 1, it indicates that the fanout applies to all neighbors of the node as a collective, regardless of the edge type.

    • Otherwise, the length should equal the number of edge types, and each fanout value corresponds to a specific edge type of the nodes.

    The value of each fanout should be >= 0 or = -1.
    • When the value is -1, all neighbors (with non-zero probability, if weighted) will be sampled once regardless of replacement. It is equivalent to selecting all neighbors with non-zero probability when the fanout is >= the number of neighbors (and replace is set to false).

    • When the value is a non-negative integer, it serves as a minimum threshold for selecting neighbors.

  • replace (bool) – Boolean indicating whether the sampling is performed with or without replacement. If True, a value can be selected multiple times. Otherwise, each value can be selected only once.

  • probs_name (str, optional) – An optional string specifying the name of an edge attribute. This attribute tensor should contain (unnormalized) probabilities corresponding to each neighboring edge of a node. It must be a 1D floating-point or boolean tensor, with the number of elements equal to the total number of edges.

Returns:

The sampled subgraph.

Return type:

SampledSubgraphImpl

Examples

>>> import dgl.graphbolt as gb
>>> import torch
>>> ntypes = {"n1": 0, "n2": 1}
>>> etypes = {"n1:e1:n2": 0, "n2:e2:n1": 1}
>>> indptr = torch.LongTensor([0, 2, 4, 6, 7, 9])
>>> indices = torch.LongTensor([2, 4, 2, 3, 0, 1, 1, 0, 1])
>>> node_type_offset = torch.LongTensor([0, 2, 5])
>>> type_per_edge = torch.LongTensor([1, 1, 1, 1, 0, 0, 0, 0, 0])
>>> graph = gb.fused_csc_sampling_graph(indptr, indices,
...     node_type_offset=node_type_offset,
...     type_per_edge=type_per_edge,
...     node_type_to_id=ntypes,
...     edge_type_to_id=etypes)
>>> nodes = {'n1': torch.LongTensor([0]), 'n2': torch.LongTensor([0])}
>>> fanouts = torch.tensor([1, 1])
>>> subgraph = graph.sample_layer_neighbors(nodes, fanouts)
>>> print(subgraph.sampled_csc)
{'n1:e1:n2': CSCFormatBase(indptr=tensor([0, 1]),
            indices=tensor([0]),
), 'n2:e2:n1': CSCFormatBase(indptr=tensor([0, 1]),
            indices=tensor([2]),
)}
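
The idea behind layer-neighbor sampling, as described in the cited paper, is that all seeds in a layer share one random variate per candidate source node, so seeds with overlapping neighborhoods tend to pick the same sources. The sketch below illustrates that idea in pure Python under two stated assumptions: probabilities are uniform, and a source t is kept for seed s when r[t] <= fanout / degree(s). It is not GraphBolt's implementation, and labor_like_sample is a hypothetical name.

```python
import random


def labor_like_sample(neighbors, fanout, seed=0):
    """Pick in-neighbors for each seed using one shared variate per source.

    neighbors: dict mapping seed node -> list of in-neighbor (source) IDs.
    A source t is kept for seed s when r[t] <= fanout / degree(s).
    """
    rng = random.Random(seed)
    sources = sorted({t for nbrs in neighbors.values() for t in nbrs})
    r = {t: rng.random() for t in sources}  # shared across all seeds
    picked = {}
    for s, nbrs in neighbors.items():
        threshold = min(1.0, fanout / len(nbrs)) if nbrs else 0.0
        picked[s] = [t for t in nbrs if r[t] <= threshold]
    return picked


# Two seeds with identical neighborhoods make identical picks.
result = labor_like_sample({10: [0, 1, 2, 3], 11: [0, 1, 2, 3]}, fanout=2)
print(result)
```

Because the variates are shared, the union of sampled sources across seeds stays smaller than with independent per-seed sampling, which is the effect the paper's title refers to.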

sample_negative_edges_uniform(edge_type, node_pairs, negative_ratio)

Sample negative edges by randomly choosing negative source-destination pairs according to a uniform distribution. For each edge (u, v), it generates negative_ratio negative edges (u, v'), where v' is chosen uniformly from all the nodes in the graph. Because u is exactly the same as in the corresponding positive edge, it returns None for negative sources.

Parameters:
  • edge_type (str) – The type of edges in the provided node_pairs. Any negative edges sampled will also have the same type. If set to None, the graph is treated as homogeneous.

  • node_pairs (Tuple[Tensor, Tensor]) – A tuple of two 1D tensors that represent the source and destination of positive edges, where 'positive' indicates that these edges are present in the graph. Note that in a heterogeneous graph, the IDs in these tensors are heterogeneous IDs.

  • negative_ratio (int) – The ratio of the number of negative samples to positive samples.

Returns:

A tuple of two 1D tensors representing the source and destination of negative edges. In the context of a heterogeneous graph, both the input nodes and the selected nodes are represented by heterogeneous IDs, and the formed edges are of the input type edge_type. Note that negative refers to false negatives, which means the edge could be present or not present in the graph.

Return type:

Tuple[Tensor, Tensor]
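
As a rough illustration of the description above, the following pure-Python sketch draws negative_ratio uniform destinations per positive edge and leaves the sources implicit. The function name, the num_nodes argument, and the ordering of negatives (grouped by positive edge) are assumptions made for this sketch, not the library's documented behavior.

```python
import random


def uniform_negative_dsts(pos_src, pos_dst, negative_ratio, num_nodes, seed=0):
    """Draw `negative_ratio` random destinations per positive edge.

    Sources are not materialised: in this sketch, negative i reuses
    pos_src[i // negative_ratio].
    """
    assert len(pos_src) == len(pos_dst)
    rng = random.Random(seed)
    return [rng.randrange(num_nodes) for _ in range(len(pos_src) * negative_ratio)]


pos_src, pos_dst = [0, 1, 2], [3, 4, 0]
neg_dst = uniform_negative_dsts(pos_src, pos_dst, negative_ratio=2, num_nodes=5)
# Recover the implied source of each negative edge.
neg_src = [pos_src[i // 2] for i in range(len(neg_dst))]
print(neg_src, neg_dst)
```

Nothing here checks whether a drawn pair already exists in the graph, which matches the note that a sampled negative may be a false negative.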

sample_negative_edges_uniform_2(edge_type, node_pairs, negative_ratio)

Sample negative edges by randomly choosing negative source-destination edges according to a uniform distribution. For each edge (u, v), it generates negative_ratio negative edges (u, v'), where v' is chosen uniformly from all the nodes in the graph and u is exactly the same as in the corresponding positive edge. It returns the positive edges concatenated with the negative edges. In the negative edges, the negative sources are constructed from the corresponding positive edges.

Parameters:
  • edge_type (str) – The type of edges in the provided node_pairs. Any negative edges sampled will also have the same type. If set to None, the graph is treated as homogeneous.

  • node_pairs (torch.Tensor) – A 2D tensor that represents the N pairs of positive edges in source-destination format, where 'positive' indicates that these edges are present in the graph. Note that in a heterogeneous graph, the IDs in this tensor are heterogeneous IDs.

  • negative_ratio (int) – The ratio of the number of negative samples to positive samples.

Returns:

A 2D tensor representing the positive and negative source-destination node pairs. In the context of a heterogeneous graph, both the input nodes and the selected nodes are represented by heterogeneous IDs, and the formed edges are of the input type edge_type. Note that negative refers to false negatives, which means the edge could be present or not present in the graph.

Return type:

torch.Tensor
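
The concatenated layout described above can be pictured with nested lists. This is a sketch under assumptions: the positives come first, each row is a [source, destination] pair, and the result has N * (1 + negative_ratio) rows. The function name and the num_nodes argument are hypothetical.

```python
import random


def with_uniform_negatives(pairs, negative_ratio, num_nodes, seed=0):
    """Return the positive pairs followed by uniformly sampled negative pairs."""
    rng = random.Random(seed)
    negatives = [
        [u, rng.randrange(num_nodes)]  # keep the source, redraw the destination
        for u, _v in pairs
        for _ in range(negative_ratio)
    ]
    return pairs + negatives


pairs = [[0, 3], [1, 4], [2, 0]]
all_pairs = with_uniform_negatives(pairs, negative_ratio=2, num_nodes=5)
print(all_pairs)
```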

sample_neighbors(nodes: Tensor | Dict[str, Tensor], fanouts: Tensor, replace: bool = False, probs_name: str | None = None) → SampledSubgraphImpl

Sample neighboring edges of the given nodes and return the induced subgraph.

Parameters:
  • nodes (torch.Tensor or Dict[str, torch.Tensor]) –

    IDs of the given seed nodes.
    • If nodes is a tensor: the graph is homogeneous and the IDs are homogeneous IDs.

    • If nodes is a dictionary: the keys are node types and the IDs are heterogeneous IDs.

  • fanouts (torch.Tensor) –

    The number of edges to be sampled for each node with or without considering edge types.

    • When the length is 1, it indicates that the fanout applies to all neighbors of the node as a collective, regardless of the edge type.

    • Otherwise, the length should equal the number of edge types, and each fanout value corresponds to a specific edge type of the nodes.

    The value of each fanout should be >= 0 or = -1.
    • When the value is -1, all neighbors (with non-zero probability, if weighted) will be sampled once regardless of replacement. It is equivalent to selecting all neighbors with non-zero probability when the fanout is >= the number of neighbors (and replace is set to false).

    • When the value is a non-negative integer, it serves as a minimum threshold for selecting neighbors.

  • replace (bool) – Boolean indicating whether the sampling is performed with or without replacement. If True, a value can be selected multiple times. Otherwise, each value can be selected only once.

  • probs_name (str, optional) – An optional string specifying the name of an edge attribute. This attribute tensor should contain (unnormalized) probabilities corresponding to each neighboring edge of a node. It must be a 1D floating-point or boolean tensor, with the number of elements equal to the total number of edges.

Returns:

The sampled subgraph.

Return type:

SampledSubgraphImpl

Examples

>>> import dgl.graphbolt as gb
>>> import torch
>>> ntypes = {"n1": 0, "n2": 1}
>>> etypes = {"n1:e1:n2": 0, "n2:e2:n1": 1}
>>> indptr = torch.LongTensor([0, 2, 4, 6, 7, 9])
>>> indices = torch.LongTensor([2, 4, 2, 3, 0, 1, 1, 0, 1])
>>> node_type_offset = torch.LongTensor([0, 2, 5])
>>> type_per_edge = torch.LongTensor([1, 1, 1, 1, 0, 0, 0, 0, 0])
>>> graph = gb.fused_csc_sampling_graph(indptr, indices,
...     node_type_offset=node_type_offset,
...     type_per_edge=type_per_edge,
...     node_type_to_id=ntypes,
...     edge_type_to_id=etypes)
>>> nodes = {'n1': torch.LongTensor([0]), 'n2': torch.LongTensor([0])}
>>> fanouts = torch.tensor([1, 1])
>>> subgraph = graph.sample_neighbors(nodes, fanouts)
>>> print(subgraph.sampled_csc)
{'n1:e1:n2': CSCFormatBase(indptr=tensor([0, 1]),
            indices=tensor([0]),
), 'n2:e2:n1': CSCFormatBase(indptr=tensor([0, 1]),
            indices=tensor([2]),
)}
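
The fanout values described in the parameter list can be mimicked for a single node with the standard library. The sketch assumes uniform probabilities and assumes that, without replacement, a non-negative fanout k yields min(k, degree) neighbors. pick_neighbors is a hypothetical helper, not a GraphBolt function.

```python
import random


def pick_neighbors(nbrs, fanout, replace=False, rng=None):
    """Uniformly pick in-neighbors of one node for a given fanout value."""
    rng = rng or random.Random(0)
    if fanout < -1:
        raise ValueError("fanout must be >= 0 or equal to -1")
    if fanout == -1 or not nbrs:
        return list(nbrs)  # -1 keeps every neighbor exactly once
    if replace:
        return [rng.choice(nbrs) for _ in range(fanout)]
    return rng.sample(nbrs, min(fanout, len(nbrs)))


print(pick_neighbors([2, 4], fanout=1))   # one of the two neighbors
print(pick_neighbors([2, 4], fanout=-1))  # both neighbors
```

With a per-edge-type fanouts tensor, the same choice would be made separately over the neighbors of each edge type.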

temporal_sample_neighbors(nodes: Tensor | Dict[str, Tensor], input_nodes_timestamp: Tensor | Dict[str, Tensor], fanouts: Tensor, replace: bool = False, probs_name: str | None = None, node_timestamp_attr_name: str | None = None, edge_timestamp_attr_name: str | None = None) → ScriptObject

Temporally sample neighboring edges of the given nodes and return the induced subgraph.

If node_timestamp_attr_name or edge_timestamp_attr_name is given, the sampled neighbor or edge of an input node must have a timestamp that is smaller than that of the input node.

Parameters:
  • nodes (torch.Tensor) – IDs of the given seed nodes.

  • input_nodes_timestamp (torch.Tensor) – Timestamps of the given seed nodes.

  • fanouts (torch.Tensor) –

    The number of edges to be sampled for each node with or without considering edge types.

    • When the length is 1, it indicates that the fanout applies to all neighbors of the node as a collective, regardless of the edge type.

    • Otherwise, the length should equal the number of edge types, and each fanout value corresponds to a specific edge type of the nodes.

    The value of each fanout should be >= 0 or = -1.
    • When the value is -1, all neighbors (with non-zero probability, if weighted) will be sampled once regardless of replacement. It is equivalent to selecting all neighbors with non-zero probability when the fanout is >= the number of neighbors (and replace is set to false).

    • When the value is a non-negative integer, it serves as a minimum threshold for selecting neighbors.

  • replace (bool) – Boolean indicating whether the sampling is performed with or without replacement. If True, a value can be selected multiple times. Otherwise, each value can be selected only once.

  • probs_name (str, optional) – An optional string specifying the name of an edge attribute. This attribute tensor should contain (unnormalized) probabilities corresponding to each neighboring edge of a node. It must be a 1D floating-point or boolean tensor, with the number of elements equal to the total number of edges.

  • node_timestamp_attr_name (str, optional) – An optional string specifying the name of a node attribute.

  • edge_timestamp_attr_name (str, optional) – An optional string specifying the name of an edge attribute.

Returns:

The sampled subgraph.

Return type:

SampledSubgraphImpl
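
The timestamp constraint stated above amounts to a filter applied before any fanout-based selection. A minimal sketch with plain lists follows. The helper name and the sample values are illustrative only.

```python
def temporally_valid(nbrs, nbr_timestamps, seed_timestamp):
    """Keep neighbors whose timestamp is strictly smaller than the seed's."""
    return [n for n, t in zip(nbrs, nbr_timestamps) if t < seed_timestamp]


candidates = temporally_valid([7, 8, 9], [5, 20, 10], seed_timestamp=10)
print(candidates)  # [7]
```

Neighbor 9 is dropped because its timestamp equals the seed's, and the description requires it to be smaller.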

to(device: device) → None

Copy FusedCSCSamplingGraph to the specified device.

property csc_indptr: tensor

Returns the indices pointer in the CSC graph.

Returns:

The indices pointer in the CSC graph. An integer tensor with shape (total_num_nodes+1,).

Return type:

torch.tensor
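
Because the indices pointer has total_num_nodes + 1 entries, node and edge counts and per-node in-degrees can be read directly from it. The short sketch below uses a plain list with the same values as the examples on this page.

```python
indptr = [0, 3, 5, 7, 9, 12]  # shape (total_num_nodes + 1,)

total_num_nodes = len(indptr) - 1
total_num_edges = indptr[-1]
# In-degree of node v is the length of its slice in the indices array.
in_degrees = [indptr[v + 1] - indptr[v] for v in range(total_num_nodes)]
print(total_num_nodes, total_num_edges, in_degrees)
```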

property edge_attributes: Dict[str, Tensor] | None

Returns the edge attributes dictionary.

Returns:

If present, returns a dictionary of edge attributes. Each key represents the attribute's name, while the corresponding value holds the attribute's specific value. The length of each value should match the total number of edges.

Return type:

Dict[str, torch.Tensor] or None

property edge_type_to_id: Dict[str, int] | None

Returns the edge type to id dictionary if present.

Returns:

If present, returns a dictionary mapping edge type to edge type id.

Return type:

Dict[str, int] or None

property indices: tensor

Returns the indices in the CSC graph.

Returns:

The indices in the CSC graph. An integer tensor with shape (total_num_edges,).

Return type:

torch.tensor

Notes

It is assumed that edges of each node are already sorted by edge type ids.
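
The sortedness assumption in the note can be checked on plain lists before building a graph. The function below is a hypothetical validation helper, not something the library provides.

```python
def edges_sorted_by_type(indptr, type_per_edge):
    """True if every node's in-edge slice has non-decreasing edge type IDs."""
    for v in range(len(indptr) - 1):
        types = type_per_edge[indptr[v]:indptr[v + 1]]
        if any(a > b for a, b in zip(types, types[1:])):
            return False
    return True


indptr = [0, 3, 5, 7, 9, 12]
type_per_edge = [0, 0, 2, 2, 2, 1, 1, 1, 3, 1, 3, 3]
print(edges_sorted_by_type(indptr, type_per_edge))  # True
```

The type IDs only need to be ordered within each node's slice, not across the whole array, which is why the sequence above passes even though it is not globally sorted.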

property node_attributes: Dict[str, Tensor] | None

Returns the node attributes dictionary.

Returns:

If present, returns a dictionary of node attributes. Each key represents the attribute's name, while the corresponding value holds the attribute's specific value. The length of each value should match the total number of nodes.

Return type:

Dict[str, torch.Tensor] or None

property node_type_offset: Tensor | None

Returns the node type offset tensor if present. Do not modify the returned tensor in place.

Returns:

If present, returns a 1D integer tensor of shape (num_node_types + 1,). The tensor is in ascending order because nodes of the same type have contiguous IDs, and larger node IDs are paired with larger node type IDs. The first value is 0 and the last value is the number of nodes. Nodes with IDs between node_type_offset_[i] and node_type_offset_[i+1] are of type ID 'i'.

Return type:

torch.Tensor or None
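
One way to use the offsets is to translate a homogeneous node ID into a node type and a per-type ID. The sketch below assumes each type covers the half-open range from offset[i] up to but not including offset[i+1], and uses plain lists with the values from the examples on this page. to_heterogeneous is a hypothetical helper.

```python
from bisect import bisect_right

node_type_offset = [0, 2, 5]
node_type_to_id = {"N0": 0, "N1": 1}
id_to_node_type = {i: name for name, i in node_type_to_id.items()}


def to_heterogeneous(node_id):
    """Map a homogeneous node ID to (node type name, per-type ID)."""
    type_id = bisect_right(node_type_offset, node_id) - 1
    return id_to_node_type[type_id], node_id - node_type_offset[type_id]


print([to_heterogeneous(n) for n in range(5)])
```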

property node_type_to_id: Dict[str, int] | None

Returns the node type to id dictionary if present.

Returns:

If present, returns a dictionary mapping node type to node type id.

Return type:

Dict[str, int] or None

property num_edges: int | Dict[str, int]

The number of edges in the graph.
  • If the graph is homogeneous, returns an integer.

  • If the graph is heterogeneous, returns a dictionary.

Returns:

The number of edges. An integer gives the total number of edges of a homogeneous graph; a dict gives the number of edges per edge type of a heterogeneous graph.

Return type:

Union[int, Dict[str, int]]

Examples

>>> import dgl.graphbolt as gb, torch
>>> total_num_nodes = 5
>>> total_num_edges = 12
>>> ntypes = {"N0": 0, "N1": 1}
>>> etypes = {"N0:R0:N0": 0, "N0:R1:N1": 1,
...     "N1:R2:N0": 2, "N1:R3:N1": 3}
>>> indptr = torch.LongTensor([0, 3, 5, 7, 9, 12])
>>> indices = torch.LongTensor([0, 1, 4, 2, 3, 0, 1, 1, 2, 0, 3, 4])
>>> node_type_offset = torch.LongTensor([0, 2, 5])
>>> type_per_edge = torch.LongTensor(
...     [0, 0, 2, 2, 2, 1, 1, 1, 3, 1, 3, 3])
>>> graph = gb.fused_csc_sampling_graph(indptr, indices,
...     node_type_offset=node_type_offset,
...     type_per_edge=type_per_edge,
...     node_type_to_id=ntypes,
...     edge_type_to_id=etypes)
>>> print(graph.num_edges)
{'N0:R0:N0': 2, 'N0:R1:N1': 4, 'N1:R2:N0': 3, 'N1:R3:N1': 3}

property num_nodes: int | Dict[str, int]

The number of nodes in the graph.
  • If the graph is homogeneous, returns an integer.

  • If the graph is heterogeneous, returns a dictionary.

Returns:

The number of nodes. An integer gives the total number of nodes of a homogeneous graph; a dict gives the number of nodes per node type of a heterogeneous graph.

Return type:

Union[int, Dict[str, int]]

Examples

>>> import dgl.graphbolt as gb, torch
>>> total_num_nodes = 5
>>> total_num_edges = 12
>>> ntypes = {"N0": 0, "N1": 1}
>>> etypes = {"N0:R0:N0": 0, "N0:R1:N1": 1,
...     "N1:R2:N0": 2, "N1:R3:N1": 3}
>>> indptr = torch.LongTensor([0, 3, 5, 7, 9, 12])
>>> indices = torch.LongTensor([0, 1, 4, 2, 3, 0, 1, 1, 2, 0, 3, 4])
>>> node_type_offset = torch.LongTensor([0, 2, 5])
>>> type_per_edge = torch.LongTensor(
...     [0, 0, 2, 2, 2, 1, 1, 1, 3, 1, 3, 3])
>>> graph = gb.fused_csc_sampling_graph(indptr, indices,
...     node_type_offset=node_type_offset,
...     type_per_edge=type_per_edge,
...     node_type_to_id=ntypes,
...     edge_type_to_id=etypes)
>>> print(graph.num_nodes)
{'N0': 2, 'N1': 3}
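
The per-type counts printed above agree with the differences between consecutive node type offsets. This small check uses plain Python and is a sketch of the arithmetic, not of how the property is implemented.

```python
node_type_offset = [0, 2, 5]
node_type_to_id = {"N0": 0, "N1": 1}

# Each type owns the IDs between its offset and the next one.
num_nodes = {
    name: node_type_offset[i + 1] - node_type_offset[i]
    for name, i in node_type_to_id.items()
}
print(num_nodes)  # {'N0': 2, 'N1': 3}
```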

property total_num_edges: int

Returns the number of edges in the graph.

Returns:

The number of edges in the graph.

Return type:

int

property total_num_nodes: int

Returns the number of nodes in the graph.

Returns:

The number of nodes in the graph, which equals the number of rows in the dense format.

Return type:

int

property type_per_edge: Tensor | None

Returns the edge type tensor if present.

Returns:

If present, returns a 1D integer tensor of shape (total_num_edges,) containing the type of each edge in the graph.

Return type:

torch.Tensor or None