论文标题

几何模型中的社区恢复

Community Recovery in the Geometric Block Model

论文作者

Galhotra, Sainyam, Mazumdar, Arya, Pal, Soumyabrata, Saha, Barna

论文摘要

为了捕获许多社区检测问题的固有几何特征,我们建议使用称为几何块模型的新的随机图模型。几何块模型建立在随机几何图上(Gilbert,1961),这是空间网络的随机图的基本模型之一,就像研究良好的随机块模型在Erdős-r \'{en} yi随机图上构建一样。它也是受社区发现中最新的理论和实际进步启发的随机社区模型的自然扩展。为了分析几何块模型,我们首先为随机几何图的概括提供了新的连接结果。自引入以来,已经研究了几何图的连接性能,并且由于相关的边缘形成而导致的Erdős-r \'{en} yi对应物比其ERDőS-R \'{en} yi对应物更加困难。 然后,我们使用随机环形图的连通性结果,为几何块模型有效地恢复社区的必要条件。我们表明,一种简单的三角计数算法来检测几何模型中的社区几乎是最佳的。为此,我们考虑以下两个图密度的态度。 在图表的平均程度随着顶点数量增长的状态中,我们表明我们的算法在理论上和实际上都表现出色。相比之下,对数学度方案中随机块模型的三角计数算法远非最佳。我们在真实和合成数据集上模拟我们的结果,以显示新模型以及我们的算法的卓越性能。

To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model. The geometric block model builds on the random geometric graphs (Gilbert, 1961), one of the basic models of random graphs for spatial networks, in the same way that the well-studied stochastic block model builds on the Erdős-R\'{en}yi random graphs. It is also a natural extension of random community models inspired by the recent theoretical and practical advancements in community detection. To analyze the geometric block model, we first provide new connectivity results for random annulus graphs which are generalizations of random geometric graphs. The connectivity properties of geometric graphs have been studied since their introduction, and analyzing them has been more difficult than their Erdős-R\'{en}yi counterparts due to correlated edge formation. We then use the connectivity results of random annulus graphs to provide necessary and sufficient conditions for efficient recovery of communities for the geometric block model. We show that a simple triangle-counting algorithm to detect communities in the geometric block model is near-optimal. For this we consider the following two regimes of graph density. In the regime where the average degree of the graph grows logarithmically with the number of vertices, we show that our algorithm performs extremely well, both theoretically and practically. In contrast, the triangle-counting algorithm is far from being optimum for the stochastic block model in the logarithmic degree regime. We simulate our results on both real and synthetic datasets to show superior performance of both the new model as well as our algorithm.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源