Paper title
Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intent
Paper authors
Abstract
In this work, we propose and study annotated code search: the retrieval of code snippets paired with brief descriptions of their intent using natural language queries. On three benchmark datasets, we investigate how code retrieval systems can be improved by leveraging descriptions to better capture the intents of code snippets. Building on recent progress in transfer learning and natural language processing, we create a domain-specific retrieval model for code annotated with a natural language description. We find that our model yields significantly more relevant search results (with absolute gains up to 20.6% in mean reciprocal rank) compared to state-of-the-art code retrieval methods that do not use descriptions but attempt to compute the intent of snippets solely from unannotated code.
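The headline result is reported in mean reciprocal rank (MRR), so a small sketch of how that metric is usually computed may help in reading the 20.6% figure. The function names and toy data below are hypothetical, and the sketch assumes exactly one relevant snippet per query; the paper's actual evaluation protocol is not described in this abstract.

```python
from typing import Hashable, Sequence


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Hashable) -> float:
    """Return 1/rank of the relevant item (1-indexed), or 0.0 if it is absent."""
    for position, item in enumerate(ranked, start=1):
        if item == relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    results: Sequence[Sequence[Hashable]],
    gold: Sequence[Hashable],
) -> float:
    """Average the reciprocal rank over all queries.

    results[i] is the ranked list returned for query i;
    gold[i] is the single relevant item for query i (an assumption of this sketch).
    """
    if len(results) != len(gold):
        raise ValueError("results and gold must have the same length")
    if not results:
        return 0.0
    total = sum(reciprocal_rank(r, g) for r, g in zip(results, gold))
    return total / len(results)


# Toy data (hypothetical snippet IDs): the relevant snippet is ranked
# 1st for query 0, 2nd for query 1, and not retrieved for query 2.
toy_results = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
toy_gold = ["a", "y", "missing"]
toy_mrr = mean_reciprocal_rank(toy_results, toy_gold)  # (1 + 1/2 + 0) / 3
```

Under this definition, an absolute gain of 20.6% corresponds to the MRR value rising by 0.206, for example from 0.40 to 0.606 (illustrative numbers only, not taken from the paper).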