Paper title
End-to-end Optimized Video Compression with MV-Residual Prediction
Paper authors
Paper abstract
We present an end-to-end trainable framework for P-frame compression. A joint motion vector (MV) and residual prediction network, MV-Residual, takes two successive frames as input and extracts combined features of the motion representation and the residual information. The prior probability of the latent representations is modeled by a hyperprior autoencoder that is trained jointly with the MV-Residual network. Specifically, spatially-displaced convolution is applied for video frame prediction: a motion kernel is learned for each pixel, and each predicted pixel is generated by applying that kernel at a displaced location in the source image. Finally, novel rate allocation and post-processing strategies are used to produce the final compressed bits under the bit constraint of the challenge. Experimental results on the validation set show that the proposed optimized framework can achieve the highest MS-SSIM in the P-frame compression competition.
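The spatially-displaced convolution step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `sdc_predict`, the array layouts, the single-channel input, the rounding of displacements to whole pixels, and the clamping at image borders are all assumptions made to keep the example short. In a trained system the displacement field and the per-pixel kernels would be predicted by a network, whereas here they are set by hand.

```python
import numpy as np

def sdc_predict(src, flow, kernels):
    """Predict a frame with a simplified spatially-displaced convolution.

    src:     (H, W) source frame (single channel, an assumption).
    flow:    (H, W, 2) per-pixel displacement as (dx, dy).
    kernels: (H, W, k, k) per-pixel motion kernels, k odd.
    """
    h, w = src.shape
    k = kernels.shape[2]
    r = k // 2
    ys, xs = np.mgrid[0:h, 0:w]
    # Displaced kernel centre for every output pixel.
    # Assumption: round to the nearest pixel instead of interpolating.
    cx = xs + np.rint(flow[..., 0]).astype(int)
    cy = ys + np.rint(flow[..., 1]).astype(int)
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(k):
        for j in range(k):
            # Source coordinates of kernel tap (i, j), clamped to the image.
            sy = np.clip(cy + i - r, 0, h - 1)
            sx = np.clip(cx + j - r, 0, w - 1)
            out += kernels[:, :, i, j] * src[sy, sx]
    return out

# Toy inputs: shift every pixel one step to the right with identity kernels.
src = np.arange(16, dtype=np.float64).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
kernels = np.zeros((4, 4, 3, 3))
kernels[:, :, 1, 1] = 1.0
pred = sdc_predict(src, flow, kernels)
```

With identity kernels the sketch reduces to plain motion-compensated warping. A non-trivial kernel lets each predicted pixel blend a small neighbourhood around its displaced location, which is the extra freedom the abstract attributes to the learned per-pixel kernels.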