Paper Title

ImageSubject: A Large-scale Dataset for Subject Detection

Authors

Xin Miao, Jiayi Liu, Huayan Wang, Jun Fu

Abstract

Main subjects usually exist in images and videos, as they are the objects that the photographer wants to highlight. Human viewers can easily identify them, but algorithms often confuse them with other objects. Detecting the main subjects is an important technique for helping machines understand the content of images and videos. We present a new dataset with the goal of training models to understand the layout of the objects and the context of the image, and then to find the main subjects among them. This is achieved in three aspects. By gathering images from movie shots created by directors with professional shooting skills, we collect a dataset with strong diversity: it contains 107,700 images from 21,540 movie shots. We label them with bounding boxes for two classes: subject and non-subject foreground object. We present a detailed analysis of the dataset and compare the task with saliency detection and object detection. ImageSubject is the first dataset that tries to localize the subject that the photographer wants to highlight in an image. Moreover, we find that a transformer-based detection model offers the best result among popular model architectures. Finally, we discuss potential applications and conclude with the importance of the dataset.
