Paper title
Automated Identification of Toxic Code Reviews Using ToxiCR
Paper authors
Paper abstract
Toxic conversations during software development interactions may have serious repercussions on a Free and Open Source Software (FOSS) development project. For example, victims of toxic conversations may become afraid to express themselves, become demotivated, and eventually leave the project. Automated filtering of toxic conversations may help a FOSS community maintain healthy interactions among its members. However, off-the-shelf toxicity detectors perform poorly on Software Engineering (SE) datasets, such as one curated from code review comments. To address this challenge, we present ToxiCR, a supervised learning-based toxicity identification tool for code review interactions. ToxiCR includes a choice of ten supervised learning algorithms, a selection of text vectorization techniques, eight preprocessing steps, and a large-scale labeled dataset of 19,571 code review comments. Two of those eight preprocessing steps are SE domain specific. Through a rigorous evaluation of models with various combinations of preprocessing steps and vectorization techniques, we have identified the best combination for our dataset, which achieves 95.8% accuracy and an 88.9% F1 score. ToxiCR significantly outperforms existing toxicity detectors on our dataset. We have made our dataset, pre-trained models, evaluation results, and source code publicly available at: https://github.com/WSU-SEAL/ToxiCR
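The abstract reports both accuracy and F1 score, and the two can diverge noticeably. The following is a minimal sketch, not ToxiCR's evaluation code, of how both metrics are computed for a binary toxic/non-toxic classifier. The function name `evaluate`, the `BinaryMetrics` container, and the toy labels are hypothetical and chosen only for illustration. It also assumes that toxic comments are the minority class, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for the positive (toxic) class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels invented for illustration: 1 = toxic, 0 = non-toxic.
y_true = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

In this toy example the classifier misses one of two toxic comments, yet accuracy is still 0.9 while F1 for the toxic class drops to about 0.67. This gap is one reason an evaluation on data with few positive examples would report F1 alongside accuracy.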