论文标题

开采大型准式库,并提供来自顶点社区的质量保证

Mining Large Quasi-cliques with Quality Guarantees from Vertex Neighborhoods

论文作者

Konar, Aritra, Sidiropoulos, Nicholas D.

论文摘要

挖掘密集的子图是一系列图形挖掘任务的重要原始性。在这项工作中,我们正式确定了现实图形的两个经常性特征,即重型度分布和较大的聚类系数,这意味着存在具有高边缘密度的基本较大的顶点社区。该观察结果提出了一种非常简单的方法来提取大型准级别:只需扫描顶点社区,计算每个顶点的聚类系数,然后输出最佳的此类子图。这种方法的实现需要计算图中的三角形,这在图挖掘中是一个充分研究的问题。当经验测试在许多现实世界图上进行了经验测试时,这种方法揭示了一个惊喜:顶点社区包括非平凡尺寸的最大集团,而最佳邻域的密度通常与专用算法产生的子图相比较,以最大程度地提高子图的密度。对于具有小聚类系数的图形,我们证明可以使用局部搜索方法对小顶点邻域进行完善,以``成长''较大的集团和近乎固有的群体。我们的结果表明,与最糟糕的理论结果相反,从现实世界图中的非平凡尺寸的采矿集团和准清算通常并不是一个困难的问题,并为进一步的工作提供了动力,以更好地解释这些经验成功。

Mining dense subgraphs is an important primitive across a spectrum of graph-mining tasks. In this work, we formally establish that two recurring characteristics of real-world graphs, namely heavy-tailed degree distributions and large clustering coefficients, imply the existence of substantially large vertex neighborhoods with high edge-density. This observation suggests a very simple approach for extracting large quasi-cliques: simply scan the vertex neighborhoods, compute the clustering coefficient of each vertex, and output the best such subgraph. The implementation of such a method requires counting the triangles in a graph, which is a well-studied problem in graph mining. When empirically tested across a number of real-world graphs, this approach reveals a surprise: vertex neighborhoods include maximal cliques of non-trivial sizes, and the density of the best neighborhood often compares favorably to subgraphs produced by dedicated algorithms for maximizing subgraph density. For graphs with small clustering coefficients, we demonstrate that small vertex neighborhoods can be refined using a local-search method to ``grow'' larger cliques and near-cliques. Our results indicate that contrary to worst-case theoretical results, mining cliques and quasi-cliques of non-trivial sizes from real-world graphs is often not a difficult problem, and provides motivation for further work geared towards a better explanation of these empirical successes.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源