CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

Paper Title

CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

Authors

Madhav Agarwal, Ajoy Mondal, C. V. Jawahar

Abstract

Localizing page elements/objects such as tables, figures, and equations is the primary step in extracting information from document images. We propose a novel end-to-end trainable deep network (CDeC-Net) for detecting tables present in documents. The proposed network consists of a multistage extension of Mask R-CNN with a dual backbone having deformable convolution, for detecting tables that vary in scale with high detection accuracy at higher IoU thresholds. We empirically evaluate CDeC-Net on all the publicly available benchmark datasets (ICDAR-2013, ICDAR-2017, ICDAR-2019, UNLV, Marmot, PubLayNet, and TableBank) with extensive experiments. Our solution has three important properties: (i) a single trained model, CDeC-Net‡, performs well across all the popular benchmark datasets; (ii) we report excellent performance across multiple IoU thresholds, including higher ones; (iii) by following the same protocol as the recent papers for each of the benchmarks, we consistently demonstrate superior quantitative performance. Our code and models will be publicly released to enable reproducibility of the results.
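
The abstract reports detection accuracy at several IoU (Intersection over Union) thresholds. The following is a minimal Python sketch of how a predicted table box can be scored against a ground-truth box at a given threshold. It is not taken from the CDeC-Net code; the function names `iou` and `is_true_positive`, the `(x1, y1, x2, y2)` box format, and the example coordinates are assumptions made for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold):
    """A detection counts as correct when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical boxes: the prediction misses a 10-pixel strip of the table.
gt = (0, 0, 100, 100)
pred = (10, 0, 100, 100)
print(is_true_positive(pred, gt, 0.6))   # accepted at a loose threshold
print(is_true_positive(pred, gt, 0.95))  # rejected at a strict threshold
```

In this example the same prediction is accepted at a threshold of 0.6 but rejected at 0.95, which is why results at higher IoU thresholds demand tighter localization of table boundaries.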
