Paper Title
XtracTree: a Simple and Effective Method for Regulator Validation of Bagging Methods Used in Retail Banking
Authors
Abstract
Bootstrap aggregation, known as bagging, is one of the most popular ensemble methods used in machine learning (ML). An ensemble method is an ML method that combines multiple hypotheses to form a single hypothesis used for prediction. A bagging algorithm combines multiple classifiers, each trained on a different sub-sample of the same data set, to build one large classifier. Banks now use ML algorithms, including decision trees and random forests, to optimize the processes of their retail banking activities. However, banks must comply with regulators and internal governance, which makes delivering effective ML solutions a challenging task. The process starts with the bank's validation and governance department, continues with the deployment of the solution in a production environment, and ends with external validation by the national financial regulator. Each proposed ML model has to be validated, and every algorithm-based decision must be justified by clear rules. In this context, we propose XtracTree, an algorithm capable of efficiently converting an ML bagging classifier, such as a random forest, into simple "if-then" rules that satisfy the requirements of model validation. We use a public loan data set from Kaggle to illustrate the usefulness of our approach. Our experiments demonstrate that XtracTree can convert an ML model into a rule-based algorithm, leading to easier model validation by national financial regulators and the bank's validation department. The proposed approach allowed our banking institution to reduce the delivery time of our AI solutions to the end user by up to 50%.
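To make the idea of turning a bagged ensemble into "if-then" rules concrete, the following is a minimal sketch and not the XtracTree algorithm itself, whose details are not given in the abstract. It assumes each tree is a binary tree of threshold tests, walks every root-to-leaf path, and prints one rule per path. The `Leaf` and `Node` classes, the helper functions, and the feature names (`income`, `loan_amount`, `credit_history`) are all hypothetical and chosen only for illustration; they are not taken from the Kaggle data set.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    label: int  # predicted class (hypothetical: 1 = approve, 0 = reject)


@dataclass
class Node:
    feature: str                 # hypothetical feature name
    threshold: float             # split point
    left: Union["Node", Leaf]    # branch taken when feature <= threshold
    right: Union["Node", Leaf]   # branch taken when feature > threshold


Tree = Union[Node, Leaf]


def extract_rules(tree: Tree, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Return one 'IF ... THEN class = k' string per root-to-leaf path."""
    if isinstance(tree, Leaf):
        body = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {body} THEN class = {tree.label}"]
    go_left = conditions + (f"{tree.feature} <= {tree.threshold}",)
    go_right = conditions + (f"{tree.feature} > {tree.threshold}",)
    return extract_rules(tree.left, go_left) + extract_rules(tree.right, go_right)


def predict_tree(tree: Tree, row: Dict[str, float]) -> int:
    """Follow the threshold tests down to a leaf and return its label."""
    while isinstance(tree, Node):
        tree = tree.left if row[tree.feature] <= tree.threshold else tree.right
    return tree.label


def predict_forest(forest: List[Tree], row: Dict[str, float]) -> int:
    """Majority vote across trees; ties go to the smallest label."""
    votes = [predict_tree(t, row) for t in forest]
    return max(sorted(set(votes)), key=votes.count)


# A hand-built two-tree "forest" standing in for a trained bagging classifier.
forest: List[Tree] = [
    Node("income", 3000.0, Leaf(0),
         Node("loan_amount", 150.0, Leaf(1), Leaf(0))),
    Node("credit_history", 0.5, Leaf(0), Leaf(1)),
]

for i, tree in enumerate(forest):
    for rule in extract_rules(tree):
        print(f"tree {i}: {rule}")
```

Each printed rule can be read and checked independently of the model that produced it, which is the property a validation department would rely on. A real implementation would extract the trees from a trained classifier rather than building them by hand.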