Paper Title

Learning Disentangled Representations of Video with Missing Data

Paper Authors

Armand Comas-Massagué, Chi Zhang, Zlatan Feric, Octavia Camps, Rose Yu

Paper Abstract

Missing data poses significant challenges for learning representations of video sequences. We present the Disentangled Imputed Video autoEncoder (DIVE), a deep generative model that imputes and predicts future video frames in the presence of missing data. Specifically, DIVE introduces a missingness latent variable and disentangles the hidden video representation into static and dynamic appearance, pose, and missingness factors for each object. DIVE imputes each object's trajectory where data is missing. On a Moving MNIST dataset with various missing-data scenarios, DIVE outperforms state-of-the-art baselines by a substantial margin. We also present comparisons on the real-world MOTSChallenge pedestrian dataset, demonstrating the practical value of our method in a more realistic setting. Our code and data can be found at https://github.com/Rose-STL-Lab/DIVE.
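To make the factorization described in the abstract concrete, here is a minimal illustrative sketch (not the authors' implementation): a per-object latent vector is split into static appearance, dynamic pose, and missingness parts, and the pose trajectory is filled in where frames are marked missing. The function names and dimensions are hypothetical, and simple linear interpolation stands in for DIVE's learned imputation.

```python
import numpy as np

def split_latent(z, d_app, d_pose):
    """Split a per-object latent into (appearance, pose, missingness) parts.

    Hypothetical helper: DIVE disentangles the hidden representation into
    these factors; here we just slice a flat vector for illustration.
    """
    return z[:d_app], z[d_app:d_app + d_pose], z[d_app + d_pose:]

def impute_trajectory(poses, missing):
    """Fill missing pose entries by linear interpolation over time.

    `poses` is a (T, D) array of an object's pose per frame; `missing` is a
    boolean mask of length T marking frames with no observation. Linear
    interpolation is a stand-in for the model's learned imputation.
    """
    poses = poses.astype(float).copy()
    t = np.arange(len(poses))
    known = ~missing
    for dim in range(poses.shape[1]):
        poses[missing, dim] = np.interp(t[missing], t[known], poses[known, dim])
    return poses

# Toy 2-D pose trajectory for one object, with frames 2 and 3 missing
# (their stored values are arbitrary and get overwritten by imputation).
poses = np.array([[0., 0.], [1., 1.], [9., 9.], [9., 9.], [4., 4.]])
missing = np.array([False, False, True, True, False])
print(impute_trajectory(poses, missing))
# The missing frames are filled along the line from (1, 1) to (4, 4).
```

In the actual model the pose factor evolves under a learned dynamics prior rather than straight-line interpolation, but the masking-and-filling structure is the same idea.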
