Similar literature
 20 similar articles found
1.
Understanding human behavior from motion imagery   Total citations: 3, self-citations: 0, citations by others: 3
Computer vision is gradually making the transition from image understanding to video understanding. This is due to the enormous success in analyzing sequences of images that has been achieved in recent years. The main shift in the paradigm has been from recognition followed by reconstruction (shape from X) to motion-based recognition. Since most videos are about people, this work has focused on the analysis of human motion. In this paper, I present my perspective on understanding human behavior. Automatically understanding human behavior from motion imagery involves extraction of relevant visual information from a video sequence, representation of that information in a suitable form, and interpretation of visual information for the purpose of recognition and learning about human behavior. Significant progress has been made in human tracking over the last few years. Compared with tracking, not much progress has been made in understanding human behavior, and the issue of representation has largely been ignored. I present my opinion on possible reasons and hurdles for slower progress in understanding human behavior, briefly present our work in tracking, representation, and recognition, and comment on the next steps in all three areas. Published online: 28 August 2003

2.
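Research on a General Framework for Video Image Understanding*   Total citations: 2, self-citations: 2, citations by others: 0
Video image understanding focuses on interpreting video sequences. It involves both the spatial characteristics of images and the temporal characteristics of video sequences, and it is currently an active research topic in computer vision. This paper reviews the state of research on video image understanding methods and proposes a general framework for video image understanding, covering its hierarchical structure, the technical fields involved, and the system architecture of applications. A real application is used as an example to explain the hierarchical structure of the framework.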

3.
4.
This paper proposes a new exemplar-based method for real-time human motion recognition using Motion Capture (MoCap) data. We have formalized streamed recognizable actions, coming from an online MoCap engine, into a motion graph that is similar to an animation motion graph. This graph is used as an automaton to recognize known actions as well as to add new ones. We have defined and used a spatio-temporal metric for similarity measurements to achieve more accurate feedback on classification. The proposed method has the advantage of being linear and incremental, making the recognition process very fast and the addition of a new action straightforward. Furthermore, actions can be recognized with a score even before they are fully completed. Thanks to the use of a skeleton-centric coordinate system, our recognition method is view-invariant. We have successfully tested our action recognition method on both synthetic and real data. We have also compared our results with four state-of-the-art methods using three well-known datasets for human action recognition. The comparisons show the advantage of our method through better recognition rates.

5.
6.
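A Survey of Vision-Based Human Behavior Understanding*   Total citations: 1, self-citations: 0, citations by others: 1
Vision-based human motion analysis is one of the most closely watched frontier directions in the computer field. Human behavior understanding, with its broad application prospects in intelligent surveillance, human-computer interaction, virtual reality, and content-based video retrieval, has become one of the forward-looking directions for future research. The behavior understanding problem generally follows this basic process: feature extraction and motion representation; behavior recognition; high-level behavior and scene understanding. This paper reviews, for each of these three aspects in turn, the recent state of development and the commonly used methods in human behavior understanding research, and analyzes in some detail the open problems and future trends in this research direction.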

7.
Inger Lytje 《AI & Society》1989,4(4):276-290
The article argues that cognitive linguistic theory may prove an alternative to the Montague paradigm for designing natural language understanding systems. Within this framework it describes a system which models language understanding as a dialogical process between user and computer. The system operates with natural language texts as input and represents language meaning as entity-relationship diagrams.

8.
This paper presents a real-time video understanding system which automatically recognises activities occurring in environments observed through video surveillance cameras. Our approach consists of three main stages: Scene Tracking, Coherence Maintenance, and Scene Understanding. The main challenges are to provide a tracking process robust enough to recognise events outdoors and under real application conditions, to allow the monitoring of a large scene through a camera network, and to automatically recognise complex events involving several actors interacting with each other. This approach has been validated for Airport Activity Monitoring in the framework of the European project AVITRACK.

9.
10.
In this paper, we describe a technique for representing and recognizing human motions using directional motion history images. A motion history image is a single human motion image produced by superposing binarized successive motion image frames so that older frames have smaller weights. It suffers, however, from the difficulty that the latest motion overwrites older motions, resulting in inexact motion representation and therefore incorrect recognition. To overcome this difficulty, we propose directional motion history images, which describe a motion with respect to four directions of movement, i.e. up, down, right and left, employing optical flow. The directional motion history images are thus a set of four motion history images defined on four optical flow images. Experimental results show that the proposed technique achieves better performance in the recognition of human motions than the existing motion history images. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.
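
The idea in the abstract above can be illustrated with a minimal sketch. It assumes the commonly used decay rule for motion history images (a moving pixel is set to a maximum value `tau`, every other pixel decays by one per frame), which is not necessarily the exact weighting the authors use. The function names, the `threshold` parameter, and the image-coordinate convention (positive `flow_y` means downward movement) are all hypothetical choices made for illustration.

```python
from typing import Dict, List

Grid = List[List[float]]


def update_mhi(mhi: Grid, motion_mask: Grid, tau: int) -> Grid:
    """One update step: moving pixels are set to tau, all others decay by 1 (floor 0)."""
    return [
        [tau if moving else max(0, old - 1) for old, moving in zip(h_row, m_row)]
        for h_row, m_row in zip(mhi, motion_mask)
    ]


def direction_masks(flow_x: Grid, flow_y: Grid, threshold: float = 0.5) -> Dict[str, Grid]:
    """Split a dense optical flow field into four binary masks, one per direction.

    Assumed convention: x grows to the right, y grows downward.
    """
    masks: Dict[str, Grid] = {"up": [], "down": [], "left": [], "right": []}
    for fx_row, fy_row in zip(flow_x, flow_y):
        masks["right"].append([int(fx > threshold) for fx in fx_row])
        masks["left"].append([int(fx < -threshold) for fx in fx_row])
        masks["down"].append([int(fy > threshold) for fy in fy_row])
        masks["up"].append([int(fy < -threshold) for fy in fy_row])
    return masks


def update_directional_mhis(mhis: Dict[str, Grid], flow_x: Grid, flow_y: Grid,
                            tau: int, threshold: float = 0.5) -> Dict[str, Grid]:
    """Keep one history image per direction, so motion in one direction
    does not overwrite the history of motion in another."""
    masks = direction_masks(flow_x, flow_y, threshold)
    return {name: update_mhi(mhis[name], masks[name], tau) for name in masks}
```

Because each direction has its own history image, a rightward motion followed by a leftward motion over the same pixels leaves both traces intact, which is the overwriting problem the abstract describes for a single motion history image.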

11.
In this paper we propose a new optimization framework that unites some of the existing tensor-based methods for face recognition on a common mathematical basis. Tensor-based approaches rely on the ability to decompose an image into its constituent factors (i.e. person, lighting, viewpoint, etc.) and then utilize these factor spaces for recognition. We first develop a multilinear optimization problem relating an image to its constituent factors and then develop our framework by formulating a set of strategies that can be followed to solve this optimization problem. The novelty of our research is that the proposed framework offers an effective methodology for explicit non-empirical comparison of the different tensor methods, as well as a way to determine the applicability of these methods to different recognition scenarios. Importantly, the framework allows comparative analysis on the basis of the quality of the solutions offered by these methods. Our theoretical contribution has been validated by extensive experimental results using four benchmark datasets, which we present along with a detailed discussion.

12.
This study focuses on the question of how humans can be inherently integrated into cyber-physical systems (CPS) to reinforce their involvement in the increasingly automated industrial processes. After a use-case oriented review of the related research literature, a human-integration framework and associated data models are presented as part of a multi-agent IoT middleware called CHARIOT. The framework enables human actors to be semantically represented and registered, together with other IoT entities, in a common service directory, thereby facilitating their inclusion in complex service chains. To validate and evaluate the proposed framework, a user study is conducted on a setup where a human and a robot arm collaborate on a “pick-assemble-place” job on a conveyor belt. Based on the human skill set parameters obtained from the user study, online and offline variants of task assignment on the conveyor belt setup are implemented and analyzed over the presented framework. The results illustrate possible efficiency gains through the consolidated online monitoring and control of all cyber-physical system entities, including human actors.

13.
Learning the activities of, and interactions between, small groups is a key step in understanding team sports videos. Recent research on team sports videos takes strictly the perspective of the audience rather than that of the athlete. Team sports videos such as volleyball and basketball videos contain plenty of intra-team and inter-team relations. In this paper, a new task named Group Scene Graph Generation is introduced to better understand intra-team and inter-team relations in sports videos. To tackle this problem, a novel Hierarchical Relation Network is proposed. After all players in a video are divided into two teams, the features of the two teams’ activities and interactions are enhanced by Graph Convolutional Networks and are finally recognized to generate the Group Scene Graph. For evaluation, a Volleyball+ dataset is proposed, built on the Volleyball dataset with an additional 9660 team activity labels. A baseline is set for better comparison, and our experimental results demonstrate the effectiveness of our method. Moreover, the idea of our method can be directly applied to another video-based task, Group Activity Recognition. Experiments show the superiority of our method and reveal the link between the two tasks. Finally, from the athlete’s view, we present an interpretation that shows how to use the Group Scene Graph to analyze teams’ activities and provide professional gaming suggestions.

14.
We describe a motion recognition method for reducing recognition errors. The method has a two-layer structure: a lower layer for motion recognition that is affected by the distribution of topics used as context information in the upper layer, and an upper layer for the topic distribution that is affected by motion recognition in the lower layer. We introduce an algorithm for the lower layer that integrates the motion likelihood calculated using a hidden Markov model with motion appearance probabilities obtained by a topic model. We also use a set of particles to estimate and update contexts on the basis of the result of motion recognition in the lower layer. The set of particles represents a probabilistic distribution of motion topics, and motion recognition and particle update procedures are performed on each particle. We experimented with two types of sequential motion: a sequence of 33 daily motions and complex motion sequences assuming actual observation. The results showed that the proposed method reduced recognition errors and tracked latent topics better than conventional methods did.
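
One plausible reading of the lower-layer integration in the abstract above is a product of the HMM likelihood and a topic-weighted motion appearance probability, followed by normalization. The sketch below shows only that generic fusion under this assumption. It is not the authors' algorithm, it omits the particle update entirely, and every name and number in it is hypothetical.

```python
from typing import Dict


def fuse_scores(hmm_likelihood: Dict[str, float],
                topic_weights: Dict[str, float],
                motion_given_topic: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Combine per-motion HMM likelihoods with context from a topic mixture.

    hmm_likelihood[m]        : likelihood of the observation under motion m's HMM
    topic_weights[k]         : current weight of topic k (the context estimate)
    motion_given_topic[k][m] : probability that motion m appears under topic k
    """
    fused = {}
    for motion, likelihood in hmm_likelihood.items():
        # Appearance probability of this motion under the current topic mixture.
        prior = sum(weight * motion_given_topic[topic].get(motion, 0.0)
                    for topic, weight in topic_weights.items())
        fused[motion] = likelihood * prior
    total = sum(fused.values())
    if total == 0.0:
        raise ValueError("all fused scores are zero")
    return {motion: score / total for motion, score in fused.items()}


# Illustrative numbers only: the HMM alone slightly prefers "wave",
# but a context dominated by the "cooking" topic tips the decision to "stir".
posterior = fuse_scores(
    hmm_likelihood={"stir": 0.4, "wave": 0.6},
    topic_weights={"cooking": 0.9, "greeting": 0.1},
    motion_given_topic={
        "cooking": {"stir": 0.8, "wave": 0.2},
        "greeting": {"stir": 0.1, "wave": 0.9},
    },
)
best_motion = max(posterior, key=posterior.get)
```

The example shows how context can correct an ambiguous observation, which is the kind of error reduction the abstract reports.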

15.
A motion field generation algorithm using block matching of edge flag histograms has been developed for application to motion recognition systems. Using edge flags instead of pixel intensities makes the algorithm robust against illumination changes. In order to detect local motions of interest effectively, a new adaptive frame interval adjustment scheme has been introduced in which only the edge flags due to local motions present in the frame are accumulated and utilized in block matching. These edge flags are projected onto the x and y axes to generate histograms, and the motions in the x and y directions are determined by histogram matching. As a result, the computational cost of the best-match search has been substantially reduced. A vector representation of the motion field, called the projected principal-motion distribution (PPMD), has also been proposed. It was applied to preliminary motion recognition experiments using Hidden Markov Models (HMMs), and its effectiveness has been confirmed. Moreover, the advantage of the proposed motion field generation method over simple optical flow, as well as over the conventional block matching method using pixel intensities, has been demonstrated.
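
The projection step described in the abstract above can be sketched as follows. The sketch assumes binary edge-flag blocks and a mean absolute difference as the matching cost, and it leaves out the adaptive frame interval scheme and the PPMD representation. The function names and the cost function are assumptions made for illustration, not the paper's definitions.

```python
from typing import List, Tuple

Block = List[List[int]]


def project(edge_flags: Block) -> Tuple[List[int], List[int]]:
    """Project binary edge flags onto the x axis (column sums) and y axis (row sums)."""
    x_hist = [sum(column) for column in zip(*edge_flags)]
    y_hist = [sum(row) for row in edge_flags]
    return x_hist, y_hist


def best_shift(prev_hist: List[int], curr_hist: List[int], max_shift: int) -> int:
    """Return the shift d that best aligns prev_hist[i] with curr_hist[i + d].

    Cost is the mean absolute difference over the overlapping bins.
    Candidates are tried in order of increasing |d|, so ties favor small motion.
    """
    n = len(prev_hist)
    best_d, best_cost = 0, float("inf")
    for d in sorted(range(-max_shift, max_shift + 1), key=abs):
        pairs = [(prev_hist[i], curr_hist[i + d]) for i in range(n) if 0 <= i + d < n]
        if not pairs:
            continue
        cost = sum(abs(a - b) for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def estimate_motion(prev_block: Block, curr_block: Block, max_shift: int) -> Tuple[int, int]:
    """Estimate (dx, dy) from two 1D histogram matches instead of a 2D block search."""
    prev_x, prev_y = project(prev_block)
    curr_x, curr_y = project(curr_block)
    return best_shift(prev_x, curr_x, max_shift), best_shift(prev_y, curr_y, max_shift)
```

Matching two one-dimensional histograms needs on the order of `2 * (2 * max_shift + 1)` comparisons, whereas an exhaustive two-dimensional block search needs `(2 * max_shift + 1) ** 2`, which is consistent with the reduced search cost the abstract mentions.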

16.
Current research in content-based semantic image understanding is largely confined to exemplar-based approaches built on low-level feature extraction and classification. The ability to extract both low-level and semantic features and perform knowledge integration of different types of features is expected to raise semantic image understanding to a new level. Belief networks, or Bayesian networks (BN), have proven to be an effective knowledge representation and inference engine in artificial intelligence and expert systems research. Their effectiveness is due to the ability to explicitly integrate domain knowledge in the network structure and to reduce a joint probability distribution to conditional independence relationships. In this paper, we present a general-purpose knowledge integration framework that employs BN in integrating both low-level and semantic features. The efficacy of this framework is demonstrated via three applications involving semantic understanding of pictorial images. The first application aims at detecting main photographic subjects in an image, the second aims at selecting the most appealing image in an event, and the third aims at classifying images into indoor or outdoor scenes. With these diverse examples, we demonstrate that effective inference engines can be built within this powerful and flexible framework according to specific domain knowledge and available training data to solve inherently uncertain vision problems.

17.
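Population aging is a social factor that will significantly affect the future of most countries today. To address the enormous economic and social pressure it brings, the currently practical approach is to rely on highly developed information and communication technologies to automatically recognize and understand people's activities of daily living (ADL) in the home environment, and to provide older people with daily-living assistance so that they can live independently at home for as long as possible. Because the physical environment in which daily life takes place is an unstructured natural environment, the task of ADL recognition and understanding is the process of observing, processing, analyzing, reasoning about, and making decisions on a user's daily activities in time and space under a dynamic context. This essentially requires the system to have human-like cognitive abilities. ADL recognition and understanding is therefore the meeting point between basic theoretical research, namely traditional information processing methods (especially computer vision methods), cognitive computing, and reasoning, and a major application for the future aging society. Research on ADL recognition and understanding will help advance the discipline and benefit society. This paper discusses the technical challenges and fundamental scientific problems in ADL recognition and understanding and, by analyzing the current state of related research and the gap between it and the requirements of ADL recognition, offers new directions for exploring research approaches.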

18.
《Graphical Models》2014,76(3):116-127
Motion blur effects are important to motion perception in visual arts, interactive games and animation applications. Such motion blur rendering is usually quite time consuming, which blocks the online or interactive use of the effects. Motivated by human perception of moving objects, this paper presents simplified geometric models that speed up motion blur rendering, an approach that has not been explored specifically for motion blur rendering. We develop a novel motion-aware algorithm that simplifies models while preserving the features that remain perceivable in motion. We derive a formula that determines the level-of-detail simplification from the object's velocity. Using our simplified models, motion blur rendering methods can achieve rendering quality comparable to that obtained with the original models while running considerably faster. The experimental results show the effectiveness of our approach, with greater acceleration for larger models or faster motion (e.g. for the dragon model with over a million facets, motion blur rendering via hierarchical stochastic rasterization is sped up by over 27 times).

19.
A combined 2D, 3D approach is presented that allows for robust tracking of moving people and recognition of actions. It is assumed that the system observes multiple moving objects via a single, uncalibrated video camera. Low-level features are often insufficient for detection, segmentation, and tracking of non-rigid moving objects. Therefore, an improved mechanism is proposed that integrates low-level (image processing), mid-level (recursive 3D trajectory estimation), and high-level (action recognition) processes. A novel extended Kalman filter formulation is used in estimating the relative 3D motion trajectories up to a scale factor. The recursive estimation process provides a prediction and error measure that is exploited in higher-level stages of action recognition. Conversely, higher-level mechanisms provide feedback that allows the system to reliably segment and maintain the tracking of moving objects before, during, and after occlusion. Heading-guided recognition (HGR) is proposed as an efficient method for adaptive classification of activity. The HGR approach is demonstrated using “motion history images” that are then recognized via a mixture-of-Gaussians classifier. The system is tested in recognizing various dynamic human outdoor activities: running, walking, roller blading, and cycling. In addition, experiments with real and synthetic data sets are used to evaluate stability of the trajectory estimator with respect to noise.

20.