首页 | 本学科首页   官方微博 | 高级检索  
     


Learning motion and content-dependent features with convolutions for action recognition
Authors:Cong Liu  Weisheng Xu  Qidi Wu  Gelan Yang
Abstract:A variety of recognizing architectures based on deep convolutional neural networks have been devised for labeling videos containing human motion with action labels. However, so far, most works cannot properly deal with the temporal dynamics encoded in multiple contiguous frames, which distinguishes action recognition from other recognition tasks. This paper develops a temporal extension of convolutional neural networks to exploit motion-dependent features for recognizing human action in video. Our approach differs from other recent attempts in that it uses multiplicative interactions between convolutional outputs to describe motion information across contiguous frames. Interestingly, the representation of image content arises when we are at work on extracting motion pattern, which makes our model effectively incorporate both of them to analysis video. Additional theoretical analysis proves that motion and content-dependent features arise simultaneously from the developed architecture, whereas previous works mostly deal with the two separately. Our architecture is trained and evaluated on the standard video actions benchmarks of KTH and UCF101, where it matches the state of the art and has distinct advantages over previous attempts to use deep convolutional architectures for action recognition.
Keywords:
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号