Paper Title
CodeExp: Explanatory Code Document Generation
Paper Authors
Paper Abstract
Developing models that can automatically generate detailed code explanations can greatly benefit software maintenance and programming education. However, existing code-to-text generation models often produce only high-level summaries of code that do not capture the implementation-level choices essential for these scenarios. To fill this gap, we propose the code explanation generation task. We first conducted a human study to identify the criteria for high-quality explanatory docstrings for code. Based on that, we collected and refined a large-scale code-docstring corpus and formulated automatic evaluation metrics that best match human assessments. Finally, we present a multi-stage fine-tuning strategy and baseline models for the task. Our experiments show that (1) our refined training dataset lets models achieve better performance on the explanation generation task than larger unrefined data (15x larger), and (2) fine-tuned models can generate well-structured long docstrings comparable to human-written ones. We envision that our training dataset, human-evaluation protocol, recommended metrics, and fine-tuning strategy can boost future code explanation research. The code and annotated data are available at https://github.com/subercui/CodeExp.
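To make the summary-versus-explanation distinction concrete, below is a minimal Python sketch of the kind of target the task describes. The function, its name, and the docstring are invented for illustration and are not drawn from the CodeExp corpus; a summary-style docstring would stop after the first line, whereas an explanatory docstring also covers implementation-level choices.

def moving_average(values, window):
    """Compute the moving average of `values` over a sliding window.

    Explanatory detail (the part a high-level summary omits): a running
    sum is maintained so each step costs O(1) instead of re-summing the
    window, and a ValueError is raised when `window` is outside
    1..len(values) so callers fail fast on bad input. Returns a list of
    len(values) - window + 1 averages.
    """
    if window <= 0 or window > len(values):
        raise ValueError("window must be in 1..len(values)")
    total = sum(values[:window])          # sum of the first window
    averages = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]  # slide the window in O(1)
        averages.append(total / window)
    return averages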