A Hybrid Deep Learning Model for Arabic Text Recognition

Paper Title

A Hybrid Deep Learning Model for Arabic Text Recognition

Authors

Fasha, Mohammad; Hammo, Bassam; Obeid, Nadim; Widian, Jabir

Abstract

Arabic text recognition is a challenging task because of the cursive nature of the Arabic writing system, its joined writing scheme, its large number of ligatures, and many other difficulties. Deep learning (DL) models have achieved significant progress in numerous domains, including computer vision and sequence modelling. This paper presents a model that can recognize Arabic text printed in multiple font types, including fonts that mimic Arabic handwritten scripts. The proposed model employs a hybrid DL network that can recognize printed Arabic text without the need for character segmentation. The model was tested on a custom dataset comprising over two million word samples generated using 18 different Arabic font types. The objective of the testing process was to assess the model's capability in recognizing a diverse set of Arabic fonts representing varied cursive styles. The model achieved good results in recognizing characters and words, and it also achieved promising results in recognizing characters when tested on unseen data. The prepared model, the custom datasets, and the toolkit for generating similar datasets are publicly available; these tools can be used to prepare models for recognizing other font types, as well as to further extend and enhance the performance of the proposed model.
