Similar Articles
20 similar articles found.
1.
A novel method based on the fusion of texture and shape information is proposed for facial expression and Facial Action Unit (FAU) recognition from video sequences. For facial expression recognition, a subspace method based on Discriminant Non-negative Matrix Factorization (DNMF) is applied to the images to extract the texture information. To extract the shape information, the system first extracts the deformed Candide facial grid that corresponds to the facial expression depicted in the video sequence. A Support Vector Machine (SVM) system, designed on a Euclidean space defined over a novel metric between grids, is used to classify the shape information. For FAU recognition, the texture extraction method (DNMF) is applied to the difference images of the video sequence, computed between the neutral frame and the expressive frame. An SVM system is used for FAU classification from the shape information, which in this case consists of the grid-node coordinate displacements between the neutral frame and the frame of the expressed facial expression. The fusion of texture and shape information is performed using various approaches, among them SVMs and Median Radial Basis Functions (MRBFs), to detect the facial expression and the set of FAUs present. The accuracy achieved on the Cohn–Kanade database is 92.3% when recognizing the seven basic facial expressions (anger, disgust, fear, happiness, sadness, surprise and neutral), and 92.1% when recognizing the 17 FAUs responsible for facial expression development.
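As a toy illustration of the decision-level fusion step described above, the per-class scores of a texture classifier and a shape classifier can be combined by a weighted sum before picking the winning expression. The function name, weights and score vectors below are invented for illustration; the paper's actual fusion uses trained SVMs and MRBFs.

```python
import numpy as np

def fuse_scores(texture_scores, shape_scores, w_texture=0.5):
    """Weighted late fusion of two classifiers' per-class scores (toy sketch)."""
    t = np.asarray(texture_scores, dtype=float)
    s = np.asarray(shape_scores, dtype=float)
    fused = w_texture * t + (1.0 - w_texture) * s
    return int(np.argmax(fused))  # index of the fused winning class

# Seven basic expressions: anger, disgust, fear, happiness, sadness, surprise, neutral
texture = [0.1, 0.05, 0.05, 0.6, 0.05, 0.1, 0.05]
shape   = [0.2, 0.05, 0.05, 0.4, 0.1,  0.1, 0.1]
print(fuse_scores(texture, shape))  # here both classifiers favor "happiness"
```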

2.
Many information fusion applications are characterized by a high degree of complexity because: (1) data are often acquired from sensors of different modalities and with different degrees of uncertainty; (2) decisions must be made efficiently; and (3) the world situation evolves over time. To address these issues, we propose an information fusion framework based on dynamic Bayesian networks that provides active, dynamic, purposive and sufficing information fusion, arriving at a reliable conclusion within reasonable time and with limited resources. The proposed framework suits applications where decisions must be made efficiently from dynamically available information from diverse and disparate sources.
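The simplest instance of inference in a dynamic Bayesian network is a discrete Bayes-filter update: propagate the belief through the dynamics model, then reweight by the sensor evidence. The states, matrices and likelihood below are invented; this is only a sketch of the recursion, not the paper's framework.

```python
import numpy as np

def bayes_filter_step(belief, transition, likelihood):
    """One predict-update cycle: belief' is proportional to likelihood * (T^T belief)."""
    predicted = transition.T @ belief   # predict with the dynamics model
    updated = likelihood * predicted    # weight by sensor evidence
    return updated / updated.sum()      # normalize to a proper distribution

belief = np.array([0.5, 0.5])                    # uniform prior over two states
T = np.array([[0.9, 0.1], [0.2, 0.8]])           # row-stochastic transition model
lik = np.array([0.8, 0.1])                       # sensor strongly favors state 0
print(bayes_filter_step(belief, T, lik))
```

Repeating this step as new observations arrive is exactly how the belief tracks an evolving situation over time.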

3.
4.
This paper addresses the problem of scene understanding for driver assistance systems. To recognize the large number of objects that may be found on the road, several sensors and decision algorithms have to be combined. The proposed approach is based on representing all available information in over-segmented image regions. The main novelty of the framework is its ability to incorporate new classes of objects and to include new sensors or detection methods while remaining robust to sensor failures. Several classes, such as ground, vegetation and sky, are considered, along with three different sensors. The approach was evaluated on publicly available real urban driving-scene data.

5.
Automatic analysis of human facial expression is a challenging problem with many applications. Most existing automated systems for facial expression analysis attempt to recognize a few prototypic emotional expressions, such as anger and happiness. Instead of representing another approach to machine analysis of prototypic facial expressions of emotion, the method presented in this paper attempts to handle a large range of human facial behavior by recognizing the facial muscle actions that produce expressions. Virtually all existing vision systems for facial muscle action detection deal only with frontal-view face images and cannot handle the temporal dynamics of facial actions. In this paper, we present a system for automatic recognition of facial action units (AUs) and their temporal models from long, profile-view face image sequences. We exploit particle filtering to track 15 facial points in an input face-profile sequence, and we introduce facial-action-dynamics recognition from continuous video input using temporal rules. The algorithm performs both automatic segmentation of an input video into the facial expressions pictured and recognition of the temporal segments (i.e., onset, apex, offset) of 27 AUs occurring alone or in combination in the input face-profile video. A recognition rate of 87% is achieved.
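A minimal caricature of the temporal-rule idea above is to label each frame transition of an AU intensity signal as onset, apex, or offset from its frame-to-frame slope. The threshold and signal below are invented; the paper's actual rules operate on tracked facial-point geometry.

```python
def label_phases(intensity, eps=0.05):
    """Label each frame-to-frame transition as onset / apex / offset (toy rule)."""
    labels = []
    for prev, cur in zip(intensity[:-1], intensity[1:]):
        d = cur - prev
        if d > eps:
            labels.append("onset")    # intensity rising
        elif d < -eps:
            labels.append("offset")   # intensity falling
        else:
            labels.append("apex")     # roughly constant peak
    return labels

print(label_phases([0.0, 0.3, 0.6, 0.6, 0.3, 0.0]))
```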

6.
Information extraction of facial expressions deals with facial-feature detection, feature tracking, and capture of the spatiotemporal relationships among features. It is a fundamental task in facial expression analysis and will ultimately determine the performance of expression recognition. For a real-world facial expression sequence, there are three challenges: (1) detection failure of some or all facial features due to changes in illumination and rapid head movement; (2) nonrigid object tracking resulting from facial expression change; and (3) feature occlusion due to out-of-plane head rotation. In this paper, a new approach is proposed to tackle these challenges. First, we use active infrared (IR) illumination to reliably detect pupils under variable lighting conditions and head orientations. The pupil positions are then used to guide the entire information-extraction process. The simultaneous use of a global head motion constraint and Kalman filtering can robustly track individual facial features even under rapid head motion and significant expression change. To handle feature occlusion, we propose a warping-based reliability propagation method. The reliable neighbor features and the spatial semantics among these features are used to detect and infer occluded features through an interframe warping transformation. Experimental results show that accurate information extraction can be achieved for video sequences with real-world facial expressions. Received: 16 August 2003; Accepted: 20 September 2004; Published online: 20 December 2004. Correspondence to: Qiang Ji.
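To make the Kalman-filtering step concrete, here is a minimal constant-velocity filter for one tracked coordinate of a facial point. The noise parameters and measurements are invented, and the real system couples this with a global head-motion constraint; this is only a sketch of the tracking recursion.

```python
import numpy as np

def kalman_track(measurements, q=1e-3, r=0.25):
    """Smooth a 1-D position track with a constant-velocity Kalman filter."""
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition (position, velocity)
    H = np.array([[1.0, 0.0]])               # we observe position only
    x = np.array([measurements[0], 0.0])     # initial state estimate
    P = np.eye(2)                            # initial state covariance
    out = []
    for z in measurements:
        # predict
        x = F @ x
        P = F @ P @ F.T + q * np.eye(2)
        # update with measurement z
        S = H @ P @ H.T + r
        K = (P @ H.T) / S                    # Kalman gain
        x = x + (K * (z - H @ x)).ravel()
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0])
    return out

print(kalman_track([float(i) for i in range(10)]))  # tracks a point moving at unit speed
```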

7.
In this paper, we present a fully automatic and real-time approach for person-independent recognition of facial expressions from dynamic sequences of 3D face scans. In the proposed solution, first a set of 3D facial landmarks is automatically detected; then the local characteristics of the face in the neighborhoods of the facial landmarks and their mutual distances are used to model the facial deformation. By training two hidden Markov models for each facial expression to be recognized and combining them into a multiclass classifier, an average recognition rate of 79.4% is obtained on the 3D dynamic sequences showing the six prototypical facial expressions of the Binghamton University 4D Facial Expression database. Comparisons with competing approaches on the same database show that our solution obtains effective results with the advantage of being able to process facial sequences in real time.

8.
This paper describes a set of methods that make it possible to estimate the position of a feature inside a three-dimensional (3D) space by starting from a sequence of two-dimensional (2D) acoustic images of the seafloor acquired with a sonar system. Typical sonar imaging systems are able to generate only 2D images, and the acquisition of 3D information involves sharp increases in complexity and cost. The front-scan sonar proposed in this paper is new equipment devoted to acquiring a 2D image of the seafloor to be sailed over, and allows one to collect a sequence of images showing a specific feature during the approach of the ship. This makes it possible to recover the 3D position of a feature by comparing the feature positions along the sequence of images acquired from different (known) ship positions. This opportunity is investigated in the paper, where it is shown that encouraging results have been obtained by a processing chain composed of blocks devoted to low-level processing, feature extraction and analysis, a Kalman filter for robust feature tracking, and some ad hoc equations for depth estimation and averaging. A statistical error analysis demonstrated the great potential of the proposed system even when some inaccuracies affect the sonar measurements and the knowledge of the ship position. This was also confirmed by several tests performed on both simulated and real sequences, obtaining satisfactory results on both the feature tracking and, above all, the estimation of the 3D position.

9.
Fan Guodong, Hua Zhen, Li Jinjiang. Applied Intelligence, 2021, 51(10): 7262-7280.
According to the atmospheric physical model, we can use accurate transmittance and atmospheric light information to convert a hazy image into a clean one. The scene-depth...

10.
This paper presents an approach to understanding general 3-D motion of a rigid body from image sequences. Based on dynamics, a locally constant angular momentum (LCAM) model is introduced. The model is local in the sense that it is applied to a limited number of image frames at a time. Specifically, the model constrains the motion, over a local frame subsequence, to be a superposition of precession and translation. Thus, the instantaneous rotation axis of the object is allowed to change through the subsequence. The trajectory of the rotation center is approximated by a vector polynomial. The parameters of the model evolve in time so that they can adapt to long-term changes in motion characteristics. The nature and parameters of short-term motion can be estimated continuously with the goal of understanding motion through the image sequence. The estimation algorithm presented in this paper is linear, i.e., the algorithm consists of solving simultaneous linear equations. Based on the assumption that the motion is smooth, object positions and motion in the near future can be predicted, and short missing subsequences can be recovered. Noise smoothing is achieved by overdetermination and a least-squares criterion. The framework is flexible in the sense that it allows overdetermination in both the number of feature points and the number of image frames.
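The noise-smoothing idea above, solving an overdetermined linear system by least squares, can be sketched in a few lines. The matrices here are synthetic stand-ins, not the LCAM equations themselves.

```python
import numpy as np

# Synthetic overdetermined system: more equations (20) than unknowns (4),
# standing in for the stacked linear motion-parameter equations.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 4))
x_true = np.array([1.0, -2.0, 0.5, 3.0])
b = A @ x_true + 0.01 * rng.standard_normal(20)  # noisy observations

# Least-squares solve: overdetermination averages out the noise.
x_hat, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x_hat, 2))
```

Adding more feature points or frames simply adds rows to `A`, which is exactly the flexibility the abstract describes.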

11.
12.
Wang Shan, Shen Xukun, Zhang Yan. Multimedia Tools and Applications, 2018, 77(17): 22231-22246.
Large-scale multimedia datasets such as the Internet image and video collections provide new opportunities to understand and analyze human actions, among which...

13.
An approach to the analysis of dynamic facial images for the purposes of estimating and resynthesizing dynamic facial expressions is presented. The approach exploits a sophisticated generative model of the human face originally developed for realistic facial animation. The face model, which may be simulated and rendered at interactive rates on a graphics workstation, incorporates a physics-based synthetic facial tissue and a set of anatomically motivated facial muscle actuators. The estimation of dynamic facial muscle contractions from video sequences of expressive human faces is considered. An estimation technique that uses deformable contour models (snakes) to track the nonrigid motions of facial features in video images is developed. The technique estimates muscle actuator controls with sufficient accuracy to permit the face model to resynthesize transient expressions.

14.
In this paper, we present definitions for a dynamic knowledge-based image understanding system. From a sequence of grey-level images, the system produces a flow of image interpretations. We use a semantic network to represent the knowledge embodied in the system. Dynamic representation is achieved by a hypotheses network. This network is a graph in which nodes represent information and arcs represent relations. A control strategy performs a continuous update of this network. The originality of our work lies in the control strategy: it includes a structure-tracking phase that uses the representation structure obtained from previous images to reduce the computational complexity of the understanding processes. We demonstrate that in our case the computational complexity, which is exponential if we only use a purely data-driven bottom-up scheme, is polynomial when using the hypotheses-tracking mechanism. That is, the gain in computation time is a major reason for dynamic understanding. The proposed system is implemented; experimental results for road-mark detection and tracking are given.

15.
Research on moving-target detection algorithms based on image fusion (cited 1 time: 0 self-citations, 1 by others)
To address the susceptibility of common moving-target detection algorithms to noise and illumination changes, and their tendency to produce holes, shadows and false edges, a moving-target detection algorithm is proposed that fuses a five-consecutive-frame inter-frame difference with a Surendra background-edge difference. The method first builds a motion-region model using the Surendra adaptive background-extraction algorithm, performs background edge detection and differencing with an optimized Canny operator, and then fuses the result with the five-frame difference; bidirectional template filling and post-processing yield a complete, accurate moving-target region while the background is updated in real time. Experimental results show that the algorithm is fast and accurate and meets the requirements of real-time detection.
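A toy sketch of the fusion idea above: AND together a multi-frame difference mask and a background-subtraction mask, so a pixel counts as moving only if it both changed between frames and differs from the background model. The thresholds and arrays are invented; the real algorithm uses five frames, a Surendra background model and Canny edges.

```python
import numpy as np

def moving_mask(frames, background, diff_thr=20, bg_thr=20):
    """Fuse inter-frame differencing with background subtraction (toy version)."""
    frames = [np.asarray(f, dtype=np.int16) for f in frames]
    bg = np.asarray(background, dtype=np.int16)
    # accumulate pairwise inter-frame differences
    diff = np.zeros_like(bg, dtype=bool)
    for a, b in zip(frames[:-1], frames[1:]):
        diff |= np.abs(b - a) > diff_thr
    # background subtraction on the most recent frame
    bg_mask = np.abs(frames[-1] - bg) > bg_thr
    return diff & bg_mask  # moving AND different from background

bg = np.zeros((5, 5))
f0 = np.zeros((5, 5))
f1 = np.zeros((5, 5)); f1[2, 2] = 100   # one pixel "moves"
print(moving_mask([f0, f1], bg).sum())  # exactly one moving pixel detected
```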

16.
The primary goal in motion vision is to extract information about the motion and shape of an object in a scene that is encoded in the optic flow. While many solutions to this problem, both iterative and in closed form, have been proposed, practitioners still view the problem as unsolved, since these methods, for the most part, cannot deal with some important aspects of realistic scenes. Among these are complex unsegmented scenes, nonsmooth objects, and general motion of the camera. In addition, the performance of many methods degrades ungracefully as the quality of the data deteriorates. Here, we derive a closed-form solution for motion estimation based on the first-order information from two image regions with distinct flow structures. A unique solution is guaranteed when these correspond to two surface patches with different normal vectors. Given an image sequence, we show how the image may be segmented into regions with the necessary properties, how optical flow is computed for these regions, and how the motion parameters are calculated. The method can be applied to arbitrary scenes and any camera motion. We show theoretically why the method is more robust than other proposed techniques that require knowledge of the full flow or information up to its second-order terms. Experimental results are presented to support the theoretical derivations.

17.
18.
In this paper, the effect of partial occlusion on facial expression recognition is investigated. The classification of partially occluded images into one of the six basic facial expressions is performed using a method based on Gabor-wavelet texture information extraction, a supervised image decomposition method based on Discriminant Non-negative Matrix Factorization, and a shape-based method that exploits the geometrical displacement of certain facial features. We demonstrate how partial occlusion affects the above methods in the classification of the six basic facial expressions, and indicate how partial occlusion affects human observers when recognizing facial expressions. We also attempt to specify which part of the face (left, right, lower or upper region) contains the most discriminant information for each facial expression, and draw conclusions about the pairs of facial-expression misclassifications that each type of occlusion introduces.

19.
Captured images are often not focused everywhere. Many applications of pattern recognition and computer vision require all parts of the image to be well-focused. The all-in-focus image obtained through the improved image fusion scheme is useful for downstream image-processing tasks such as image enhancement, image segmentation, and edge detection. Most fusion techniques use feature-level information extracted from the spatial or transform domain. In contrast, we propose a random forest (RF)-based novel scheme that incorporates both feature-level and decision-level information. In the proposed scheme, useful features are extracted from both the spatial and transform domains. These features are used to train the randomly generated trees of the RF algorithm. The predicted information of the trees is aggregated to construct a more accurate decision map for fusion. Our proposed scheme yields a better-fused image than previous approaches based on principal component analysis and the Wavelet transform, which use only simple feature-level information. Moreover, our approach generates better-fused images than individual machine-learning approaches based on Support Vector Machines and Probabilistic Neural Networks. The performance of the proposed scheme is evaluated using various qualitative and quantitative measures. The proposed scheme reports 98.83%, 97.29%, 98.97%, 97.78%, and 98.14% accuracy for the standard images Elaine, Barbara, Boat, Lena, and Cameraman, respectively. Further, it yields 97.94%, 98.84%, 97.55%, and 98.09% accuracy for the real blurred images Calendar, Leaf, Tree, and Lab, respectively.
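The decision-map idea above can be sketched by choosing, per pixel, the source image that is locally sharper. A simple local-variance focus measure stands in here for the trained random forest; the function name, window size, and images are all invented for illustration.

```python
import numpy as np

def fuse_by_focus(img_a, img_b, win=3):
    """Per-pixel fusion: take each pixel from the locally sharper source (toy sketch)."""
    a, b = np.asarray(img_a, float), np.asarray(img_b, float)
    pad = win // 2

    def local_var(x):
        # variance in a win x win neighborhood, edge-padded
        xp = np.pad(x, pad, mode="edge")
        out = np.empty_like(x)
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                out[i, j] = xp[i:i + win, j:j + win].var()
        return out

    decision = local_var(a) >= local_var(b)  # True -> pixel comes from a
    return np.where(decision, a, b)

# Synthetic test pair: each image is sharp (checkerboard) in one half, flat in the other.
pattern = np.fromfunction(lambda i, j: ((i + j) % 2) * 255.0, (10, 10))
a = np.full((10, 10), 128.0); a[:, :5] = pattern[:, :5]   # sharp left half
b = np.full((10, 10), 128.0); b[:, 5:] = pattern[:, 5:]   # sharp right half
fused = fuse_by_focus(a, b)
```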

20.
The widespread use of image fusion increases the importance of assessing the performance of different fusion algorithms. The difficulty of introducing a suitable quality measure for image fusion lies in defining an ideal fused image. In this paper, we propose a non-reference objective image fusion metric based on mutual information, which calculates the amount of information conveyed from the source images to the fused image. The considered information is represented by image features such as gradients or edges, which are often in the form of two-dimensional signals. A method of estimating the joint probability distribution from the marginal distributions, employed in the calculation of mutual information, is also presented. The proposed method is compared with the most popular existing algorithms. Various experiments, performed on several databases, confirm the effectiveness of our method, which is more consistent with subjective criteria.
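The core quantity in such a metric, the mutual information between a source image and the fused image, can be estimated from their joint gray-level histogram. This is a simplification of the feature-based metric described above (the paper computes it on gradient/edge features and with its own joint-distribution estimate); the bin count and images below are invented.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Mutual information (in bits) between two images via a joint gray-level histogram."""
    x = np.asarray(x, float).ravel()
    y = np.asarray(y, float).ravel()
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y
    nz = pxy > 0                              # avoid log(0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(1)
x = rng.integers(0, 256, (64, 64)).astype(float)
y = rng.integers(0, 256, (64, 64)).astype(float)
print(mutual_information(x, x))  # high: an image shares all information with itself
print(mutual_information(x, y))  # near zero for independent images
```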
