dedensify#

dedensify(G, threshold, prefix=None, copy=True)[source]#

压缩高度数节点周围的邻域

通过添加压缩节点(这些节点概括了指向高度数节点(度数大于给定阈值的节点)的多个同类型边),减少了指向高度数节点的边数量。去稠密化还有额外的好处,即减少高度数节点周围的边数量。当前实现支持只有一种边类型的图。

参数:
G: 图

一个 networkx 图

threshold: 整数

节点被视为高度数节点的最小度数阈值。该阈值必须大于或等于2。

prefix: 字符串或 None,可选 (默认值: None)

用于表示压缩节点的可选前缀

copy: 布尔值,可选 (默认值: True)

指示是否应进行原地去稠密化

返回:
去稠密化的 networkx 图(图, 集合)

包含去稠密化图和压缩节点集合的 2 元组

注意

根据[1]中的算法,通过添加压缩节点来压缩/解压缩高度数节点周围的邻域,从而移除图中的边,这些压缩节点概括了指向高度数节点的多个同类型边。去稠密化仅在这样做能减少给定图中的总边数时才会添加压缩节点。当前实现支持只有一种边类型的图。

参考文献

[1]

Maccioni, A., & Abadi, D. J. (2016, August). Scalable pattern matching over compressed graphs via dedensification. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1755-1764). http://www.cs.umd.edu/~abadi/papers/graph-dedense.pdf

示例

去稠密化仅在这样做会导致边数减少时才会添加压缩节点

>>> original_graph = nx.DiGraph()
>>> original_graph.add_nodes_from(
...     ["1", "2", "3", "4", "5", "6", "A", "B", "C"]
... )
>>> original_graph.add_edges_from(
...     [
...         ("1", "C"), ("1", "B"),
...         ("2", "C"), ("2", "B"), ("2", "A"),
...         ("3", "B"), ("3", "A"), ("3", "6"),
...         ("4", "C"), ("4", "B"), ("4", "A"),
...         ("5", "B"), ("5", "A"),
...         ("6", "5"),
...         ("A", "6")
...     ]
... )
>>> c_graph, c_nodes = nx.dedensify(original_graph, threshold=2)
>>> original_graph.number_of_edges()
15
>>> c_graph.number_of_edges()
14

去稠密化的有向图可以“稠密化”以重建原始图

>>> original_graph = nx.DiGraph()
>>> original_graph.add_nodes_from(
...     ["1", "2", "3", "4", "5", "6", "A", "B", "C"]
... )
>>> original_graph.add_edges_from(
...     [
...         ("1", "C"), ("1", "B"),
...         ("2", "C"), ("2", "B"), ("2", "A"),
...         ("3", "B"), ("3", "A"), ("3", "6"),
...         ("4", "C"), ("4", "B"), ("4", "A"),
...         ("5", "B"), ("5", "A"),
...         ("6", "5"),
...         ("A", "6")
...     ]
... )
>>> c_graph, c_nodes = nx.dedensify(original_graph, threshold=2)
>>> # re-densifies the compressed graph into the original graph
>>> for c_node in c_nodes:
...     all_neighbors = set(nx.all_neighbors(c_graph, c_node))
...     out_neighbors = set(c_graph.neighbors(c_node))
...     for out_neighbor in out_neighbors:
...         c_graph.remove_edge(c_node, out_neighbor)
...     in_neighbors = all_neighbors - out_neighbors
...     for in_neighbor in in_neighbors:
...         c_graph.remove_edge(in_neighbor, c_node)
...         for out_neighbor in out_neighbors:
...             c_graph.add_edge(in_neighbor, out_neighbor)
...     c_graph.remove_node(c_node)
...
>>> nx.is_isomorphic(original_graph, c_graph)
True