基于动态时空关系的人类活动识别

论文标题

基于动态时空关系的人类活动识别

Human Activity Recognition based on Dynamic Spatio-Temporal Relations

论文作者

Liu, Zhenyu, Yao, Yaqiang, Liu, Yan, Zhu, Yuening, Tao, Zhenchao, Wang, Lei, Feng, Yuhong

论文摘要

通常由几种行动组成的人类活动通常涵盖人员和 /或物体之间的互动。特别是，人类行为涉及某些空间和时间关系，是更复杂活动的组成部分，并且随着时间的流逝而动态发展。因此，对人类行动的单一行动的描述和连续行动的演变的建模是人类活动识别的两个主要问题。在本文中，我们开发了一种解决这两个问题的人类活动识别方法。在提出的方法中，将活动分为以时空模式为代表的几个连续动作，这些动作的演变由顺序模型捕获。精致的综合时空图表被用来代表一个动作，这是人类行动的定性表示，该动作既包含参与者对象的空间和时间关系。接下来，应用离散的隐藏马尔可夫模型来对动作序列的演变进行建模。此外，提出了一种全自动分区方法，将长期人类活动视频分为基于变异对象和定性空间关系的几种人类行为。最后，引入了人体的分层分解，以获得单个动作的歧视性表示。康奈尔活动数据集的实验结果证明了拟议方法的效率和有效性，这将使人类活动的长期视频得到更好的认识。

Human activity, which usually consists of several actions, generally covers interactions among persons and or objects. In particular, human actions involve certain spatial and temporal relationships, are the components of more complicated activity, and evolve dynamically over time. Therefore, the description of a single human action and the modeling of the evolution of successive human actions are two major issues in human activity recognition. In this paper, we develop a method for human activity recognition that tackles these two issues. In the proposed method, an activity is divided into several successive actions represented by spatio temporal patterns, and the evolution of these actions are captured by a sequential model. A refined comprehensive spatio temporal graph is utilized to represent a single action, which is a qualitative representation of a human action incorporating both the spatial and temporal relations of the participant objects. Next, a discrete hidden Markov model is applied to model the evolution of action sequences. Moreover, a fully automatic partition method is proposed to divide a long-term human activity video into several human actions based on variational objects and qualitative spatial relations. Finally, a hierarchical decomposition of the human body is introduced to obtain a discriminative representation for a single action. Experimental results on the Cornell Activity Dataset demonstrate the efficiency and effectiveness of the proposed approach, which will enable long videos of human activity to be better recognized.

下载PDF全文

下载文献需遵守相关版权规定

论文标题