Paper Title
Heterogeneous Network Representation Learning: A Unified Framework with Survey and Benchmark
Authors
Abstract
Since real-world objects and their interactions are often multi-modal and multi-typed, heterogeneous networks have been widely used as a more powerful, realistic, and generic superclass of traditional homogeneous networks (graphs). Meanwhile, representation learning (a.k.a. embedding) has recently been intensively studied and shown to be effective for various network mining and analytical tasks. In this work, we aim to provide a unified framework to deeply summarize and evaluate existing research on heterogeneous network embedding (HNE), which includes but goes beyond a normal survey. Since there is already a broad body of HNE algorithms, as the first contribution of this work, we provide a generic paradigm for the systematic categorization and analysis of the merits of various existing HNE algorithms. Moreover, existing HNE algorithms, though mostly claimed to be generic, are often evaluated on different datasets. While understandable given the application-driven nature of HNE, such indirect comparisons largely hinder the proper attribution of improved task performance to effective data preprocessing and novel technical design, especially considering the various possible ways to construct a heterogeneous network from real-world application data. Therefore, as the second contribution, we create four benchmark datasets with various properties regarding scale, structure, attribute/label availability, etc., from different sources, towards handy and fair evaluations of HNE algorithms. As the third contribution, we carefully refactor and amend the implementations and create friendly interfaces for 13 popular HNE algorithms, and provide all-around comparisons among them over multiple tasks and experimental settings.