Enhancement to Training of Bidirectional GAN: An Approach to Demystify Tax Fraud

Paper Title

Enhancement to Training of Bidirectional GAN: An Approach to Demystify Tax Fraud

Authors

Priya Mehta, Sandeep Kumar, Ravi Kumar, Ch. Sobhan Babu

Abstract

Outlier detection is a challenging task, and several machine learning techniques have been proposed for it in the literature. In this article, we propose a new training approach for the bidirectional GAN (BiGAN) to detect outliers. To validate the approach, we use it to train a BiGAN that detects taxpayers who are manipulating their tax returns. For each taxpayer, we derive six correlation parameters and three ratio parameters from the tax returns he or she has submitted, and we train the BiGAN on this nine-dimensional derived ground-truth data set. Next, we encode the data set with the $encoder$ to obtain its latent representation, and then decode that latent representation with the $generator$ to regenerate the data set. For each taxpayer, we compute the cosine similarity between his or her ground-truth data and the regenerated data. Taxpayers with lower cosine similarity are potential return manipulators. We applied our method to the iron and steel taxpayers data set provided by the Commercial Taxes Department, Government of Telangana, India.
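
The scoring step described in the abstract (encode, decode, then rank taxpayers by the cosine similarity between original and regenerated data) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cosine_similarity_rows` and `rank_suspects` are hypothetical, and the `toy_encoder` and `toy_generator` below are trivial stand-ins for a trained BiGAN encoder and generator, assumed here only so that the example runs.

```python
import numpy as np


def cosine_similarity_rows(a, b, eps=1e-12):
    """Row-wise cosine similarity between two arrays of shape (n, d)."""
    dot = np.sum(a * b, axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    # Guard against division by zero for all-zero rows.
    return dot / np.maximum(norms, eps)


def rank_suspects(x, encoder, generator, top_k):
    """Return the indices of the top_k rows with the lowest similarity
    between the input and its reconstruction, plus all similarities."""
    z = encoder(x)          # latent representation of every row
    x_hat = generator(z)    # regenerated data
    sims = cosine_similarity_rows(x, x_hat)
    order = np.argsort(sims)  # ascending: least similar rows first
    return order[:top_k], sims


# Stand-ins (assumptions, not a trained BiGAN): the "encoder" keeps the
# first two of nine features and the "generator" pads them back with zeros.
def toy_encoder(x):
    return x[:, :2]


def toy_generator(z):
    return np.hstack([z, np.zeros((z.shape[0], 7))])


# Four synthetic nine-dimensional rows; row 2 carries mass outside the
# subspace that the toy models can reproduce.
x = np.zeros((4, 9))
x[:, 0] = 1.0
x[:, 1] = [1.0, 2.0, 3.0, 4.0]
x[2, 5] = 10.0

suspects, sims = rank_suspects(x, toy_encoder, toy_generator, top_k=1)
```

In this toy setting the rows that the stand-in models reproduce exactly receive a similarity of 1, while the row with unreproduced structure scores lower and is ranked first. With a real trained model, `encoder` and `generator` would be replaced by the corresponding network forward passes.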

