Paper Title
Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting
Authors
Abstract
Many approaches have recently been proposed to detect irregular scene text and have achieved promising results. However, their localization results may not suit the subsequent text recognition stage, mainly for two reasons: 1) recognizing arbitrary-shaped text is still a challenging task, and 2) the prevalent non-trainable pipeline strategies between text detection and text recognition lead to suboptimal performance. To handle this incompatibility problem, in this paper we propose an end-to-end trainable text spotting approach named Text Perceptron. Concretely, Text Perceptron first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information. Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies without extra parameters. It unites text detection and the subsequent recognition part into a single framework and helps the whole network achieve global optimization. Experiments show that our method achieves competitive performance on two standard text benchmarks, i.e., ICDAR 2013 and ICDAR 2015, and clearly outperforms existing methods on the irregular text benchmarks SCUT-CTW1500 and Total-Text.
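The abstract does not spell out how STM maps a curved region onto a regular grid, so the following is only an illustrative sketch of one parameter-free way to do such a rectification, not the paper's STM. It assumes the detector supplies ordered boundary points along the top and bottom edges of a text region (left to right, at least two per edge) and resamples a single-channel feature map between them with bilinear interpolation. The names `bilinear`, `point_on_polyline`, and `rectify_region` are hypothetical, and plain Python lists are used to keep the sketch dependency-free.

```python
import math


def bilinear(feat, x, y):
    """Bilinearly sample a 2-D grid `feat` (list of rows) at float (x, y)."""
    h, w = len(feat), len(feat[0])
    # Clamp the sampling location to the valid index range.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bot = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bot * wy


def point_on_polyline(pts, t):
    """Point at fraction t in [0, 1] along `pts`, uniform in vertex index.

    Assumes `pts` holds at least two (x, y) pairs.
    """
    pos = t * (len(pts) - 1)
    i = min(int(pos), len(pts) - 2)  # index of the segment containing pos
    f = pos - i                      # fraction along that segment
    (xa, ya), (xb, yb) = pts[i], pts[i + 1]
    return xa + f * (xb - xa), ya + f * (yb - ya)


def rectify_region(feat, top_pts, bottom_pts, out_h, out_w):
    """Resample the region between two boundary polylines into out_h x out_w.

    Assumption (not from the paper): each output column corresponds to the
    same fraction along the top and bottom boundaries, and each output row
    interpolates linearly between the two boundary points of its column.
    """
    out = []
    for r in range(out_h):
        v = r / (out_h - 1) if out_h > 1 else 0.5
        row = []
        for c in range(out_w):
            u = c / (out_w - 1) if out_w > 1 else 0.5
            tx, ty = point_on_polyline(top_pts, u)
            bx, by = point_on_polyline(bottom_pts, u)
            row.append(bilinear(feat, tx + v * (bx - tx), ty + v * (by - ty)))
        out.append(row)
    return out
```

Calling `rectify_region(feat, top_pts, bottom_pts, out_h, out_w)` with an arched pair of boundaries returns a fixed-size rectangular grid that a standard recognizer could consume. The mapping has no learnable weights, which is the property the abstract attributes to STM; the actual sampling scheme in the paper may differ.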