A Process for the Evaluation of Node Embedding Methods in the Context of Node Classification

Paper title

A Process for the Evaluation of Node Embedding Methods in the Context of Node Classification

Authors

Christoph Martin, Meike Riebeling

Abstract

Node embedding methods find latent lower-dimensional representations that are used as features in machine learning models. In the last few years, these methods have become extremely popular as a replacement for manual feature engineering. Because authors evaluate node embedding methods in different ways, existing studies can rarely be compared efficiently and accurately. We address this issue by developing a process for a fair and objective evaluation of node embedding procedures with respect to node classification. This process helps researchers and practitioners compare new and existing methods in a reproducible way. We apply this process to four popular node embedding methods and make valuable observations. With an appropriate combination of hyperparameters, good performance can be achieved even with lower-dimensional embeddings, which benefits the run times of both the downstream machine learning task and the embedding algorithm. Multiple hyperparameter combinations yield similar performance. Thus, in most cases, no extensive, time-consuming search is required to achieve reasonable performance.
