Plug & Play Convolutional Regression Tracker for Video Object Detection

Paper Title

Plug & Play Convolutional Regression Tracker for Video Object Detection

Authors

Ye Lyu, Michael Ying Yang, George Vosselman, Gui-Song Xia

Abstract

Video object detection aims to simultaneously localize the bounding boxes of objects and identify their classes in a given video. One challenge is to detect all objects consistently across the whole video. Because the appearance of objects may deteriorate in some frames, features or detections from other frames are commonly used to enhance the prediction. In this paper, we propose a Plug & Play scale-adaptive convolutional regression tracker for the video object detection task, which can be easily and compatibly implanted into current state-of-the-art detection networks. Because the tracker reuses the features from the detector, it is a very lightweight addition to the detection network, and the whole network runs at a speed close to that of a standard object detector. With our new video object detection pipeline design, image object detectors can be easily turned into efficient video object detectors without modifying any parameters. Performance is evaluated on the large-scale ImageNet VID dataset. Our Plug & Play design improves the mAP score of the image detector by around 5% with only a small drop in speed.
