dgl.edge_label_informativeness
- dgl.edge_label_informativeness(graph, y, eps=1e-08)[source]
Label informativeness ($\mathrm{LI}$) is a characteristic of labeled graphs proposed in the paper Characterizing Graph Datasets for Node Classification: Homophily-Heterophily Dichotomy and Beyond.
Label informativeness shows how much information about a node’s label we get from knowing its neighbor’s label. Formally, assume that we sample an edge $(\xi, \eta) \in E$. The class labels of nodes $\xi$ and $\eta$ are then random variables $y_\xi$ and $y_\eta$. We want to measure the amount of knowledge the label $y_\eta$ gives for predicting $y_\xi$. The entropy $H(y_\xi)$ measures the ‘hardness’ of predicting the label of $\xi$ without knowing $y_\eta$. Given $y_\eta$, this value is reduced to the conditional entropy $H(y_\xi \mid y_\eta)$. In other words, $y_\eta$ reveals $I(y_\xi, y_\eta) = H(y_\xi) - H(y_\xi \mid y_\eta)$ information about the label. To make the obtained quantity comparable across different datasets, label informativeness is defined as the normalized mutual information of $y_\xi$ and $y_\eta$:
$$\mathrm{LI} = \frac{I(y_\xi, y_\eta)}{H(y_\xi)}$$
Depending on the distribution used for sampling an edge $(\xi, \eta)$, several variants of label informativeness can be obtained. Two of them are particularly intuitive: in edge label informativeness ($\mathrm{LI}_{edge}$), edges are sampled uniformly at random; in node label informativeness ($\mathrm{LI}_{node}$), first a node is sampled uniformly at random and then an edge incident to it is sampled uniformly at random. The two versions differ in how they weight high- and low-degree nodes: edge label informativeness averages over edges, so high-degree nodes are given more weight, while node label informativeness averages over nodes, so all nodes are weighted equally.
This function computes edge label informativeness; a reference computation from this definition is sketched after the examples below.
- Parameters:
  - graph (DGLGraph) – The graph.
  - y (torch.Tensor) – The node labels, a tensor of shape $(|V|)$.
  - eps (float, optional) – A small constant for numerical stability. Default: 1e-08.
- Returns:
The edge label informativeness value.
- Return type:
float
Examples
>>> import dgl
>>> import torch
>>> graph = dgl.graph(([0, 1, 2, 2, 3, 4], [1, 2, 0, 3, 4, 5]))
>>> y = torch.tensor([0, 0, 0, 0, 1, 1])
>>> dgl.edge_label_informativeness(graph, y)
0.25177597999572754
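As a sanity check on the definition, the snippet below is a minimal reference sketch, not DGL’s implementation; the helper name li_edge_reference is hypothetical. It reuses graph, y, and torch from the example above and assumes the graph is treated as undirected, i.e. every edge is counted in both directions, so that $y_\xi$ and $y_\eta$ share the same degree-weighted marginal distribution and $\mathrm{LI}_{edge} = I(y_\xi, y_\eta)/H(y_\xi) = 2 - H(y_\xi, y_\eta)/H(y_\xi)$.

>>> # Hypothetical helper, not part of DGL: computes LI_edge from the definition.
>>> def li_edge_reference(graph, y, eps=1e-8):
...     src, dst = graph.edges()
...     # Count each edge in both directions so y_xi and y_eta share one
...     # degree-weighted marginal distribution.
...     u = torch.cat([src, dst])
...     v = torch.cat([dst, src])
...     num_classes = int(y.max()) + 1
...     # Joint distribution p(c1, c2) of the label pair of a uniformly sampled edge.
...     joint = torch.zeros(num_classes, num_classes)
...     for a, b in zip(y[u].tolist(), y[v].tolist()):
...         joint[a, b] += 1
...     joint = joint / joint.sum()
...     marginal = joint.sum(dim=1)
...     h_joint = -(joint * torch.log(joint + eps)).sum()
...     h_marginal = -(marginal * torch.log(marginal + eps)).sum()
...     # LI_edge = I(y_xi, y_eta) / H(y_xi) = 2 - H(y_xi, y_eta) / H(y_xi),
...     # since the symmetrized marginals of y_xi and y_eta coincide.
...     return (2 - h_joint / h_marginal).item()
>>> round(li_edge_reference(graph, y), 4)
0.2518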