An End-to-End Attack on Text-based CAPTCHAs Based on Cycle-Consistent Generative Adversarial Network

Paper Title

An End-to-End Attack on Text-based CAPTCHAs Based on Cycle-Consistent Generative Adversarial Network

Authors

Chunhui Li, Xingshu Chen, Haizhou Wang, Yu Zhang, Peiming Wang

Abstract

As a widely deployed security scheme, text-based CAPTCHAs have become increasingly difficult to defend against machine learning-based attacks. Many researchers have studied attacks on text-based CAPTCHAs deployed by different companies (such as Microsoft, Amazon, and Apple) and achieved some success. However, most of these attacks have shortcomings: the attack methods port poorly to other schemes, they require a series of data preprocessing steps, and they rely on large amounts of labeled CAPTCHAs. In this paper, we propose an efficient and simple end-to-end attack method based on cycle-consistent generative adversarial networks (Cycle-GAN). Compared with previous studies, our method greatly reduces the cost of data labeling. It is also highly portable: it can attack common text-based CAPTCHA schemes by modifying only a few configuration parameters, which makes the attack easier. First, we train CAPTCHA synthesizers based on the Cycle-GAN to generate fake samples. Basic recognizers based on the convolutional recurrent neural network are then trained on the fake data. Subsequently, an active transfer learning method is used to optimize the basic recognizer with a small number of labeled real-world CAPTCHA samples. Our approach efficiently cracked the CAPTCHA schemes deployed by 10 popular websites, indicating that the attack is likely very general. Additionally, we analyzed the most popular current anti-recognition mechanisms. The results show that combining more anti-recognition mechanisms can improve the security of CAPTCHAs, but the improvement is limited. At the same time, generating more complex CAPTCHAs may cost more resources and reduce their usability.
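
The cycle-consistency idea behind the CAPTCHA synthesizer can be illustrated with a toy sketch. The code below is not the authors' implementation: it only shows the standard cycle-consistency term used by Cycle-GAN-style models, where a mapping G (for example, clean rendered text to CAPTCHA style) and a mapping F (the reverse direction) are penalized when a round trip fails to reproduce the input. The names `G`, `F`, `shift_up`, `shift_down`, and the tiny vectors standing in for images are all hypothetical, and the per-element mean used here is an assumed normalization of the L1 distance.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Mapping = Callable[[Vector], Vector]


def mean_abs_error(a: Vector, b: Vector) -> float:
    """Mean absolute (L1) difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    G: Mapping, F: Mapping, xs: Sequence[Vector], ys: Sequence[Vector]
) -> float:
    """Average round-trip error: F(G(x)) vs. x plus G(F(y)) vs. y."""
    forward = sum(mean_abs_error(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(mean_abs_error(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


# Hypothetical toy "generators": each one shifts every value by a constant.
def shift_up(v: Vector) -> Vector:
    return [x + 1.0 for x in v]


def shift_down(v: Vector) -> Vector:
    return [x - 1.0 for x in v]


def shift_down_half(v: Vector) -> Vector:
    return [x - 0.5 for x in v]


# Toy samples standing in for images from the two domains.
xs = [[0.0, 0.25], [0.5, 1.0]]
ys = [[1.0, 1.25], [1.5, 2.0]]

# Exact inverses give zero loss; a mismatched inverse is penalized.
perfect = cycle_consistency_loss(shift_up, shift_down, xs, ys)
imperfect = cycle_consistency_loss(shift_up, shift_down_half, xs, ys)
```

In a real model, G and F would be neural networks and this term would be added to the adversarial losses. Because the round-trip constraint does not need paired examples, the synthesizer can be trained on unlabeled real-world CAPTCHAs, which is consistent with the reduced labeling cost the abstract reports.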
