Similar Articles
20 similar articles found.
1.
A novel method based on the fusion of texture and shape information is proposed for facial expression and Facial Action Unit (FAU) recognition from video sequences. For facial expression recognition, a subspace method based on Discriminant Non-negative Matrix Factorization (DNMF) is applied to the images to extract the texture information. To extract the shape information, the system first extracts the deformed Candide facial grid that corresponds to the facial expression depicted in the video sequence. A Support Vector Machine (SVM) system, designed on a Euclidean space defined over a novel metric between grids, is used to classify the shape information. For FAU recognition, the texture extraction method (DNMF) is applied to the difference images of the video sequence, computed between the neutral frame and the expressive frame. An SVM system is used for FAU classification from the shape information, which in this case consists of the grid-node coordinate displacements between the neutral frame and the frame of the expressed facial expression. The fusion of texture and shape information is performed using various approaches, among them SVMs and Median Radial Basis Functions (MRBFs), to detect the facial expression and the set of FAUs present. The accuracy achieved on the Cohn–Kanade database is 92.3% when recognizing the seven basic facial expressions (anger, disgust, fear, happiness, sadness, surprise and neutral), and 92.1% when recognizing the 17 FAUs responsible for facial expression development.
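As a toy illustration of the decision-level fusion step described above, the per-class scores of a texture classifier and a shape classifier can be combined by a weighted sum before picking the winning expression. The function name, weights and score vectors below are invented for illustration; the paper's actual fusion uses trained SVMs and MRBFs.

```python
import numpy as np

def fuse_scores(texture_scores, shape_scores, w_texture=0.5):
    """Weighted late fusion of two classifiers' per-class scores (toy sketch)."""
    t = np.asarray(texture_scores, dtype=float)
    s = np.asarray(shape_scores, dtype=float)
    fused = w_texture * t + (1.0 - w_texture) * s
    return int(np.argmax(fused))  # index of the fused winning class

# Seven basic expressions: anger, disgust, fear, happiness, sadness, surprise, neutral
texture = [0.1, 0.05, 0.05, 0.6, 0.05, 0.1, 0.05]
shape   = [0.2, 0.05, 0.05, 0.4, 0.1,  0.1, 0.1]
print(fuse_scores(texture, shape))  # here both classifiers favor "happiness"
```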

2.
Many information fusion applications are characterized by a high degree of complexity because: (1) data are often acquired from sensors of different modalities and with different degrees of uncertainty; (2) decisions must be made efficiently; and (3) the world situation evolves over time. To address these issues, we propose an information fusion framework based on dynamic Bayesian networks that provides active, dynamic, purposive and sufficing information fusion, arriving at a reliable conclusion within reasonable time and with limited resources. The proposed framework suits applications where decisions must be made efficiently from dynamically available information from diverse and disparate sources.
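The simplest instance of inference in a dynamic Bayesian network is a discrete Bayes-filter update: propagate the belief through the dynamics model, then reweight by the sensor evidence. The states, matrices and likelihood below are invented; this is only a sketch of the recursion, not the paper's framework.

```python
import numpy as np

def bayes_filter_step(belief, transition, likelihood):
    """One predict-update cycle: belief' is proportional to likelihood * (T^T belief)."""
    predicted = transition.T @ belief   # predict with the dynamics model
    updated = likelihood * predicted    # weight by sensor evidence
    return updated / updated.sum()      # normalize to a proper distribution

belief = np.array([0.5, 0.5])                    # uniform prior over two states
T = np.array([[0.9, 0.1], [0.2, 0.8]])           # row-stochastic transition model
lik = np.array([0.8, 0.1])                       # sensor strongly favors state 0
print(bayes_filter_step(belief, T, lik))
```

Repeating this step as new observations arrive is exactly how the belief tracks an evolving situation over time.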

3.
4.
This paper addresses the problem of scene understanding for driver assistance systems. To recognize the large number of objects that may be found on the road, several sensors and decision algorithms have to be combined. The proposed approach is based on representing all available information in over-segmented image regions. The main novelty of the framework is its ability to incorporate new classes of objects and to include new sensors or detection methods while remaining robust to sensor failures. Several classes, such as ground, vegetation and sky, are considered, along with three different sensors. The approach was evaluated on publicly available real urban driving-scene data.

5.
Automatic analysis of human facial expression is a challenging problem with many applications. Most existing automated systems for facial expression analysis attempt to recognize a few prototypic emotional expressions, such as anger and happiness. Instead of representing another approach to machine analysis of prototypic facial expressions of emotion, the method presented in this paper attempts to handle a large range of human facial behavior by recognizing the facial muscle actions that produce expressions. Virtually all existing vision systems for facial muscle action detection deal only with frontal-view face images and cannot handle the temporal dynamics of facial actions. In this paper, we present a system for automatic recognition of facial action units (AUs) and their temporal models from long, profile-view face image sequences. We exploit particle filtering to track 15 facial points in an input face-profile sequence, and we introduce facial-action-dynamics recognition from continuous video input using temporal rules. The algorithm performs both automatic segmentation of an input video into the facial expressions pictured and recognition of the temporal segments (i.e., onset, apex, offset) of 27 AUs occurring alone or in combination in the input face-profile video. A recognition rate of 87% is achieved.
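A minimal caricature of the temporal-rule idea above is to label each frame transition of an AU intensity signal as onset, apex, or offset from its frame-to-frame slope. The threshold and signal below are invented; the paper's actual rules operate on tracked facial-point geometry.

```python
def label_phases(intensity, eps=0.05):
    """Label each frame-to-frame transition as onset / apex / offset (toy rule)."""
    labels = []
    for prev, cur in zip(intensity[:-1], intensity[1:]):
        d = cur - prev
        if d > eps:
            labels.append("onset")    # intensity rising
        elif d < -eps:
            labels.append("offset")   # intensity falling
        else:
            labels.append("apex")     # roughly constant peak
    return labels

print(label_phases([0.0, 0.3, 0.6, 0.6, 0.3, 0.0]))
```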

6.
Information extraction of facial expressions deals with facial-feature detection, feature tracking, and capture of the spatiotemporal relationships among features. It is a fundamental task in facial expression analysis and will ultimately determine the performance of expression recognition. For a real-world facial expression sequence, there are three challenges: (1) detection failure of some or all facial features due to changes in illumination and rapid head movement; (2) nonrigid object tracking resulting from facial expression change; and (3) feature occlusion due to out-of-plane head rotation. In this paper, a new approach is proposed to tackle these challenges. First, we use active infrared (IR) illumination to reliably detect pupils under variable lighting conditions and head orientations. The pupil positions are then used to guide the entire information-extraction process. The simultaneous use of a global head motion constraint and Kalman filtering can robustly track individual facial features even under rapid head motion and significant expression change. To handle feature occlusion, we propose a warping-based reliability propagation method. The reliable neighbor features and the spatial semantics among these features are used to detect and infer occluded features through an interframe warping transformation. Experimental results show that accurate information extraction can be achieved for video sequences with real-world facial expressions. Received: 16 August 2003; Accepted: 20 September 2004; Published online: 20 December 2004. Correspondence to: Qiang Ji.
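To make the Kalman-filtering step concrete, here is a minimal constant-velocity filter for one tracked coordinate of a facial point. The noise parameters and measurements are invented, and the real system couples this with a global head-motion constraint; this is only a sketch of the tracking recursion.

```python
import numpy as np

def kalman_track(measurements, q=1e-3, r=0.25):
    """Smooth a 1-D position track with a constant-velocity Kalman filter."""
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition (position, velocity)
    H = np.array([[1.0, 0.0]])               # we observe position only
    x = np.array([measurements[0], 0.0])     # initial state estimate
    P = np.eye(2)                            # initial state covariance
    out = []
    for z in measurements:
        # predict
        x = F @ x
        P = F @ P @ F.T + q * np.eye(2)
        # update with measurement z
        S = H @ P @ H.T + r
        K = (P @ H.T) / S                    # Kalman gain
        x = x + (K * (z - H @ x)).ravel()
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0])
    return out

print(kalman_track([float(i) for i in range(10)]))  # tracks a point moving at unit speed
```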

7.
In this paper, we present a fully automatic and real-time approach for person-independent recognition of facial expressions from dynamic sequences of 3D face scans. In the proposed solution, first a set of 3D facial landmarks is automatically detected; then the local characteristics of the face in the neighborhoods of the facial landmarks and their mutual distances are used to model the facial deformation. By training two hidden Markov models for each facial expression to be recognized and combining them into a multiclass classifier, an average recognition rate of 79.4% is obtained on the 3D dynamic sequences showing the six prototypical facial expressions of the Binghamton University 4D Facial Expression database. Comparisons with competing approaches on the same database show that our solution obtains effective results with the advantage of being able to process facial sequences in real time.

8.
This paper describes a set of methods that make it possible to estimate the position of a feature inside a three-dimensional (3D) space by starting from a sequence of two-dimensional (2D) acoustic images of the seafloor acquired with a sonar system. Typical sonar imaging systems are able to generate only 2D images, and the acquisition of 3D information involves sharp increases in complexity and cost. The front-scan sonar proposed in this paper is new equipment devoted to acquiring a 2D image of the seafloor to be sailed over, and allows one to collect a sequence of images showing a specific feature during the approach of the ship. This makes it possible to recover the 3D position of a feature by comparing the feature positions along the sequence of images acquired from different (known) ship positions. This opportunity is investigated in the paper, where it is shown that encouraging results have been obtained by a processing chain composed of blocks devoted to low-level processing, feature extraction and analysis, a Kalman filter for robust feature tracking, and some ad hoc equations for depth estimation and averaging. A statistical error analysis demonstrated the great potential of the proposed system even when some inaccuracies affect the sonar measurements and the knowledge of the ship position. This was also confirmed by several tests performed on both simulated and real sequences, obtaining satisfactory results on both the feature tracking and, above all, the estimation of the 3D position.

9.
Fan Guodong, Hua Zhen, Li Jinjiang. Applied Intelligence, 2021, 51(10): 7262-7280.
According to the atmospheric physical model, we can use accurate transmittance and atmospheric light information to convert a hazy image into a clean one. The scene-depth...

10.
This paper presents an approach to understanding general 3-D motion of a rigid body from image sequences. Based on dynamics, a locally constant angular momentum (LCAM) model is introduced. The model is local in the sense that it is applied to a limited number of image frames at a time. Specifically, the model constrains the motion, over a local frame subsequence, to be a superposition of precession and translation. Thus, the instantaneous rotation axis of the object is allowed to change through the subsequence. The trajectory of the rotation center is approximated by a vector polynomial. The parameters of the model evolve in time so that they can adapt to long-term changes in motion characteristics. The nature and parameters of short-term motion can be estimated continuously with the goal of understanding motion through the image sequence. The estimation algorithm presented in this paper is linear, i.e., the algorithm consists of solving simultaneous linear equations. Based on the assumption that the motion is smooth, object positions and motion in the near future can be predicted, and short missing subsequences can be recovered. Noise smoothing is achieved by overdetermination and a least-squares criterion. The framework is flexible in the sense that it allows overdetermination in both the number of feature points and the number of image frames.
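The noise-smoothing idea above, solving an overdetermined linear system by least squares, can be sketched in a few lines. The matrices here are synthetic stand-ins, not the LCAM equations themselves.

```python
import numpy as np

# Synthetic overdetermined system: more equations (20) than unknowns (4),
# standing in for the stacked linear motion-parameter equations.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 4))
x_true = np.array([1.0, -2.0, 0.5, 3.0])
b = A @ x_true + 0.01 * rng.standard_normal(20)  # noisy observations

# Least-squares solve: overdetermination averages out the noise.
x_hat, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x_hat, 2))
```

Adding more feature points or frames simply adds rows to `A`, which is exactly the flexibility the abstract describes.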

11.
12.
Wang Shan, Shen Xukun, Zhang Yan. Multimedia Tools and Applications, 2018, 77(17): 22231-22246.
Large-scale multimedia datasets such as the Internet image and video collections provide new opportunities to understand and analyze human actions, among which...

13.
An approach to the analysis of dynamic facial images for the purposes of estimating and resynthesizing dynamic facial expressions is presented. The approach exploits a sophisticated generative model of the human face originally developed for realistic facial animation. The face model, which may be simulated and rendered at interactive rates on a graphics workstation, incorporates a physics-based synthetic facial tissue and a set of anatomically motivated facial muscle actuators. The estimation of dynamic facial muscle contractions from video sequences of expressive human faces is considered. An estimation technique that uses deformable contour models (snakes) to track the nonrigid motions of facial features in video images is developed. The technique estimates muscle actuator controls with sufficient accuracy to permit the face model to resynthesize transient expressions.

14.
In this paper, we present definitions for a dynamic knowledge-based image understanding system. From a sequence of grey-level images, the system produces a flow of image interpretations. We use a semantic network to represent the knowledge embodied in the system. Dynamic representation is achieved by a hypotheses network. This network is a graph in which nodes represent information and arcs represent relations. A control strategy performs a continuous update of this network. The originality of our work lies in the control strategy: it includes a structure-tracking phase that uses the representation structure obtained from previous images to reduce the computational complexity of the understanding processes. We demonstrate that in our case the computational complexity, which is exponential if we only use a purely data-driven bottom-up scheme, is polynomial when using the hypotheses-tracking mechanism. That is, the gain in computation time is a major reason for dynamic understanding. The proposed system is implemented; experimental results for road-mark detection and tracking are given.

15.
Research on moving-target detection algorithms based on image fusion (cited 1 time: 0 self-citations, 1 by others)
To address the susceptibility of common moving-target detection algorithms to noise and illumination changes, and their tendency to produce holes, shadows and false edges, a moving-target detection algorithm is proposed that fuses a five-consecutive-frame inter-frame difference with a Surendra background-edge difference. The method first builds a motion-region model using the Surendra adaptive background-extraction algorithm, performs background edge detection and differencing with an optimized Canny operator, and then fuses the result with the five-frame difference; bidirectional template filling and post-processing yield a complete, accurate moving-target region while the background is updated in real time. Experimental results show that the algorithm is fast and accurate and meets the requirements of real-time detection.
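A toy sketch of the fusion idea above: AND together a multi-frame difference mask and a background-subtraction mask, so a pixel counts as moving only if it both changed between frames and differs from the background model. The thresholds and arrays are invented; the real algorithm uses five frames, a Surendra background model and Canny edges.

```python
import numpy as np

def moving_mask(frames, background, diff_thr=20, bg_thr=20):
    """Fuse inter-frame differencing with background subtraction (toy version)."""
    frames = [np.asarray(f, dtype=np.int16) for f in frames]
    bg = np.asarray(background, dtype=np.int16)
    # accumulate pairwise inter-frame differences
    diff = np.zeros_like(bg, dtype=bool)
    for a, b in zip(frames[:-1], frames[1:]):
        diff |= np.abs(b - a) > diff_thr
    # background subtraction on the most recent frame
    bg_mask = np.abs(frames[-1] - bg) > bg_thr
    return diff & bg_mask  # moving AND different from background

bg = np.zeros((5, 5))
f0 = np.zeros((5, 5))
f1 = np.zeros((5, 5)); f1[2, 2] = 100   # one pixel "moves"
print(moving_mask([f0, f1], bg).sum())  # exactly one moving pixel detected
```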

16.
The primary goal in motion vision is to extract information about the motion and shape of an object in a scene that is encoded in the optic flow. While many solutions to this problem, both iterative and in closed form, have been proposed, practitioners still view the problem as unsolved, since these methods, for the most part, cannot deal with some important aspects of realistic scenes. Among these are complex unsegmented scenes, nonsmooth objects, and general motion of the camera. In addition, the performance of many methods degrades ungracefully as the quality of the data deteriorates. Here, we derive a closed-form solution for motion estimation based on the first-order information from two image regions with distinct flow structures. A unique solution is guaranteed when these correspond to two surface patches with different normal vectors. Given an image sequence, we show how the image may be segmented into regions with the necessary properties, how optical flow is computed for these regions, and how the motion parameters are calculated. The method can be applied to arbitrary scenes and any camera motion. We show theoretically why the method is more robust than other proposed techniques that require knowledge of the full flow or information up to its second-order terms. Experimental results are presented to support the theoretical derivations.

17.
18.
In this paper, the effect of partial occlusion on facial expression recognition is investigated. The classification of partially occluded images into one of the six basic facial expressions is performed using a method based on Gabor-wavelet texture information extraction, a supervised image decomposition method based on Discriminant Non-negative Matrix Factorization, and a shape-based method that exploits the geometrical displacement of certain facial features. We demonstrate how partial occlusion affects the above methods in the classification of the six basic facial expressions, and indicate how partial occlusion affects human observers when recognizing facial expressions. We also attempt to specify which part of the face (left, right, lower or upper region) contains the most discriminant information for each facial expression, and draw conclusions about the pairs of facial-expression misclassifications that each type of occlusion introduces.

19.
Captured images are often not focused everywhere. Many applications of pattern recognition and computer vision require all parts of the image to be well-focused. The all-in-focus image obtained through the improved image fusion scheme is useful for downstream image-processing tasks such as image enhancement, image segmentation, and edge detection. Most fusion techniques use feature-level information extracted from the spatial or transform domain. In contrast, we propose a random forest (RF)-based novel scheme that incorporates both feature-level and decision-level information. In the proposed scheme, useful features are extracted from both the spatial and transform domains. These features are used to train the randomly generated trees of the RF algorithm. The predicted information of the trees is aggregated to construct a more accurate decision map for fusion. Our proposed scheme yields a better-fused image than previous approaches based on principal component analysis and the Wavelet transform, which use only simple feature-level information. Moreover, our approach generates better-fused images than individual machine-learning approaches based on Support Vector Machines and Probabilistic Neural Networks. The performance of the proposed scheme is evaluated using various qualitative and quantitative measures. The proposed scheme reports 98.83%, 97.29%, 98.97%, 97.78%, and 98.14% accuracy for the standard images Elaine, Barbara, Boat, Lena, and Cameraman, respectively. Further, it yields 97.94%, 98.84%, 97.55%, and 98.09% accuracy for the real blurred images Calendar, Leaf, Tree, and Lab, respectively.
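The decision-map idea above can be sketched by choosing, per pixel, the source image that is locally sharper. A simple local-variance focus measure stands in here for the trained random forest; the function name, window size, and images are all invented for illustration.

```python
import numpy as np

def fuse_by_focus(img_a, img_b, win=3):
    """Per-pixel fusion: take each pixel from the locally sharper source (toy sketch)."""
    a, b = np.asarray(img_a, float), np.asarray(img_b, float)
    pad = win // 2

    def local_var(x):
        # variance in a win x win neighborhood, edge-padded
        xp = np.pad(x, pad, mode="edge")
        out = np.empty_like(x)
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                out[i, j] = xp[i:i + win, j:j + win].var()
        return out

    decision = local_var(a) >= local_var(b)  # True -> pixel comes from a
    return np.where(decision, a, b)

# Synthetic test pair: each image is sharp (checkerboard) in one half, flat in the other.
pattern = np.fromfunction(lambda i, j: ((i + j) % 2) * 255.0, (10, 10))
a = np.full((10, 10), 128.0); a[:, :5] = pattern[:, :5]   # sharp left half
b = np.full((10, 10), 128.0); b[:, 5:] = pattern[:, 5:]   # sharp right half
fused = fuse_by_focus(a, b)
```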

20.
The widespread use of image fusion increases the importance of assessing the performance of different fusion algorithms. The difficulty of introducing a suitable quality measure for image fusion lies in defining an ideal fused image. In this paper, we propose a non-reference objective image fusion metric based on mutual information, which calculates the amount of information conveyed from the source images to the fused image. The considered information is represented by image features such as gradients or edges, which are often in the form of two-dimensional signals. A method of estimating the joint probability distribution from the marginal distributions, employed in the calculation of mutual information, is also presented. The proposed method is compared with the most popular existing algorithms. Various experiments, performed on several databases, confirm the effectiveness of our method, which is more consistent with subjective criteria.
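The core quantity in such a metric, the mutual information between a source image and the fused image, can be estimated from their joint gray-level histogram. This is a simplification of the feature-based metric described above (the paper computes it on gradient/edge features and with its own joint-distribution estimate); the bin count and images below are invented.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Mutual information (in bits) between two images via a joint gray-level histogram."""
    x = np.asarray(x, float).ravel()
    y = np.asarray(y, float).ravel()
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y
    nz = pxy > 0                              # avoid log(0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(1)
x = rng.integers(0, 256, (64, 64)).astype(float)
y = rng.integers(0, 256, (64, 64)).astype(float)
print(mutual_information(x, x))  # high: an image shares all information with itself
print(mutual_information(x, y))  # near zero for independent images
```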
