Paper Title

Good Classification Measures and How to Find Them

Authors

Martijn Gösgens, Anton Zhiyanov, Alexey Tikhonov, Liudmila Prokhorenkova

Abstract

Several performance measures can be used for evaluating classification results: accuracy, F-measure, and many others. Can we say that some of them are better than others, or, ideally, choose one measure that is best in all situations? To answer this question, we conduct a systematic analysis of classification performance measures: we formally define a list of desirable properties and theoretically analyze which measures satisfy which properties. We also prove an impossibility theorem: some desirable properties cannot be simultaneously satisfied. Finally, we propose a new family of measures satisfying all desirable properties except one. This family includes the Matthews Correlation Coefficient and a so-called Symmetric Balanced Accuracy that was not previously used in classification literature. We believe that our systematic approach gives an important tool to practitioners for adequately evaluating classification results.
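
To make the question in the abstract concrete, the following minimal sketch computes three of the measures it names (accuracy, F-measure in its common F1 form, and the Matthews Correlation Coefficient) from binary confusion-matrix counts. It uses the standard textbook formulas rather than anything taken from the paper itself. The function names, the zero-denominator convention, and the example counts are assumptions chosen for illustration.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn, tn):
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    # Assumed convention: return 0.0 when the score is undefined.
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical imbalanced data: 90 positives, 10 negatives,
# and a trivial classifier that labels every item as positive.
counts = dict(tp=90, fp=10, fn=0, tn=0)

for measure in (accuracy, f1_score, mcc):
    print(f"{measure.__name__}: {measure(**counts):.3f}")
```

On these hypothetical counts, accuracy and F1 both come out high while MCC is zero under the assumed convention, which shows how different measures can rank the same classifier very differently.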
