Unsupervised view and rate invariant clustering of video sequences |
| |
Affiliation: | 1. Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, China; 2. Department of Computer Science, Guangxi Normal University, Guilin 541004, China; 3. Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, Guangxi Normal University, Guilin 541004, China; 1. Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China; 2. Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China; 3. School of Electronic and Information Engineering, Liaoning University of Technology, Jinzhou 121001, China; 1. School of Computer and Control Engineering, University of Chinese Academy of Sciences (CAS), Beijing 100190, China; 2. Key Lab. of Intell. Info. Process., Inst. of Comput. Tech., CAS, Beijing 100080, China; 3. Key Laboratory of Big Data Mining and Knowledge Management, CAS, China; 4. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China |
| |
Abstract: | Videos play an ever-increasing role in our everyday lives, with applications ranging from news and entertainment to scientific research, security, and surveillance. Because cameras and storage media are also becoming less expensive, people are producing more video content than ever before. This necessitates the development of efficient indexing and retrieval algorithms for video data. Most state-of-the-art techniques index videos according to the global content of the scene, such as color, texture, and brightness. In this paper, we discuss the problem of activity-based indexing of videos. To address the problem, we first describe activities as a cascade of dynamical systems, which significantly enhances the expressive power of the model while retaining many of the computational advantages of dynamical models. Second, we derive methods to incorporate view and rate invariance into these models so that similar actions are clustered together irrespective of the viewpoint or the rate of execution of the activity. We also derive algorithms to learn the model parameters from a video stream and demonstrate how a single video sequence may be partitioned into clusters, each of which represents an activity. Experimental results on five different databases show that the clusters found by the algorithm correspond to semantically meaningful activities. |
| |
Keywords: | |
This article is indexed in ScienceDirect and other databases. |
|