Understanding Human Hands in Contact at Internet Scale

Paper Title

Understanding Human Hands in Contact at Internet Scale

Authors

Dandan Shan, Jiaqi Geng, Michelle Shu, David F. Fouhey

Abstract

Hands are the central means by which humans manipulate their world, and being able to reliably extract hand state information from Internet videos of humans using their hands has the potential to pave the way to systems that can learn from petabytes of video data. This paper proposes steps towards this by inferring a rich representation of hands engaged in interaction that includes: hand location, side, contact state, and a box around the object in contact. To support this effort, we gather a large-scale dataset of hands in contact with objects consisting of 131 days of footage as well as a 100K annotated hand-contact video frame dataset. A model trained on this dataset can serve as a foundation for hand-contact understanding in videos. We quantitatively evaluate it both on its own and in service of predicting and learning from 3D meshes of human hands.
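
The per-hand representation named in the abstract (hand location, side, contact state, and a box around the object in contact) could be held in a small record like the one below. This is only an illustrative sketch, not the authors' code: the names `HandDetection`, `Side`, and `Box`, the corner-based box convention, and the free-form contact-state string are all assumptions, since the abstract does not specify field names, coordinate formats, or the set of contact labels.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

# Assumed convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


class Side(Enum):
    """Which hand was detected (hypothetical encoding)."""
    LEFT = "left"
    RIGHT = "right"


@dataclass
class HandDetection:
    """One detected hand in one video frame (hypothetical container)."""
    hand_box: Box                     # hand location
    side: Side                        # left or right hand
    contact_state: str                # label set not given in the abstract
    object_box: Optional[Box] = None  # box around the contacted object, if any

    def in_contact_with_object(self) -> bool:
        # True only when an object box accompanies the hand.
        return self.object_box is not None


# Example: a right hand holding something, and a free left hand.
holding = HandDetection((10, 20, 50, 80), Side.RIGHT, "object", (40, 30, 120, 90))
free = HandDetection((200, 40, 240, 95), Side.LEFT, "none")
```

Making the object box optional reflects that a hand need not be touching anything in a given frame; a frame-level result would then simply be a list of such records.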
