Similar Documents
Found 20 similar documents (search time: 31 ms)
1.
Appraisal theories in psychology study facial expressions in order to deduce information regarding the underlying emotion elicitation processes. Scherer’s component process model predicts particular facial muscle deformations as reactions to the cognitive appraisal of stimuli during emotion episodes. In the current work, MPEG-4 facial animation parameters are used to evaluate these theoretical predictions for intermediate and final expressions of a given emotion episode. We manipulate parameters such as the intensity and temporal evolution of synthesized facial expressions. For emotion episodes originating from identical stimuli, varying the cognitive appraisals of the stimuli and mapping them to different expression intensities and timings generates various behavioral patterns, so that different agent character profiles can be defined. The results of the synthesis process are then applied to Embodied Conversational Agents (ECAs), aiming to render their interaction with humans, or with other ECAs, more affective.
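As a rough illustration of the kind of manipulation described, the sketch below rescales the intensity and timing of a sequence of MPEG-4 FAP keyframes to produce different character profiles. The FAP numbers (31/32, the inner-eyebrow raises) and the profile values are illustrative assumptions, not the authors' actual parameters:

```python
from dataclasses import dataclass

@dataclass
class FAPKeyframe:
    time_ms: int                  # time of the keyframe within the episode
    fap_values: dict[int, float]  # FAP number -> amplitude (FAPU-normalized)

def apply_profile(keyframes, intensity_scale, time_scale):
    """Rescale a synthesized expression: amplify/attenuate FAP amplitudes
    and stretch/compress the temporal evolution of the episode."""
    return [
        FAPKeyframe(
            time_ms=int(kf.time_ms * time_scale),
            fap_values={fap: v * intensity_scale for fap, v in kf.fap_values.items()},
        )
        for kf in keyframes
    ]

# Illustrative episode: raising the inner eyebrows (FAPs 31, 32) after a stimulus.
episode = [FAPKeyframe(0, {31: 0.0, 32: 0.0}), FAPKeyframe(400, {31: 0.6, 32: 0.6})]
subdued_agent   = apply_profile(episode, intensity_scale=0.5, time_scale=1.5)
excitable_agent = apply_profile(episode, intensity_scale=1.2, time_scale=0.7)
```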

2.
Igor S. Pandzic   《Graphical Models》2003,65(6):385-404
We propose a method for automatically copying facial motion from one 3D face model to another, while preserving the compliance of the motion with the MPEG-4 Face and Body Animation (FBA) standard. Despite the enormous progress in the field of facial animation, producing a new animatable face from scratch is still a tremendous task for an artist. Although many methods exist to animate a face automatically based on procedural rules, these methods still need to be initialized by defining facial regions or similar, and they lack flexibility because the artist can obtain only the facial motion that a particular algorithm offers. Therefore a very common approach is interpolation between key facial expressions, usually called morph targets, containing either speech elements (visemes) or emotional expressions. Following the same approach, the MPEG-4 Facial Animation specification offers a method for interpolating facial motion from key positions, called Facial Animation Tables, which are essentially morph targets corresponding to all possible motions specified in MPEG-4. The problem with this approach is that the artist needs to create a new set of morph targets for each new face model; in the case of MPEG-4 there are 86 morph targets, which is a great deal of work to create manually. Our method solves this problem by cloning the morph targets, i.e. by automatically copying the motion of vertices, as well as geometry transforms, from the source face to the target face while maintaining the regional correspondences and the correct scale of motion. It requires the user only to identify a subset of the MPEG-4 Feature Points in the source and target faces. The scale of the movement is normalized with respect to MPEG-4 normalization units (FAPUs), so the MPEG-4 FBA compliance of the copied motion is preserved. Our method is therefore suitable not only for cloning free facial expressions, but also MPEG-4-compatible facial motion, in particular the Facial Animation Tables. We believe that Facial Motion Cloning offers dramatic time savings to artists producing morph targets for facial animation or MPEG-4 Facial Animation Tables.
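A minimal sketch of the core transfer step, under the simplifying assumptions of a precomputed vertex correspondence and a single scalar FAPU ratio per axis (the paper derives the correspondence from the matched MPEG-4 Feature Points):

```python
import numpy as np

def clone_morph_target(src_neutral, src_morph, tgt_neutral,
                       src_fapu, tgt_fapu, correspondence):
    """Copy vertex motion from a source morph target to a target face.

    src_neutral, src_morph: (N, 3) source vertex arrays; tgt_neutral: (M, 3).
    src_fapu, tgt_fapu: per-axis normalization units of each face.
    correspondence: maps each target vertex index to a source vertex index
                    (in practice derived from the matched feature points).
    """
    displacement = src_morph - src_neutral   # motion in source space
    scale = tgt_fapu / src_fapu              # keep the motion FBA-compliant
    tgt_morph = tgt_neutral.copy()
    for tgt_idx, src_idx in correspondence.items():
        tgt_morph[tgt_idx] += displacement[src_idx] * scale
    return tgt_morph
```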

3.
For effective interaction between humans and socially adept, intelligent service robots, a key capability required by this class of sociable robots is the successful interpretation of visual data. In addition to crucial techniques like human face detection and recognition, an important next step for enabling intelligence and empathy within social robots is emotion recognition. In this paper, an automated and interactive computer vision system is investigated for human facial expression recognition and tracking based on facial structure features and movement information. Twenty facial features are adopted since they are informative and prominent and reduce ambiguity during classification. An unsupervised learning algorithm, distributed locally linear embedding (DLLE), is introduced to recover the inherent properties of scattered data lying on a manifold embedded in high-dimensional input facial images, and the selected person-dependent facial expression images in a video are classified using DLLE. In addition, facial expression motion energy is introduced to describe the tension of the facial muscles during expressions, enabling person-independent tracking and recognition; this takes advantage of optical flow, which tracks the movement of the feature points. Finally, experimental results show that our approach is able to separate different expressions successfully.
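The paper's DLLE algorithm is not available as a library; as a stand-in, the sketch below shows the analogous pipeline with standard locally linear embedding from scikit-learn (flatten face images, recover the low-dimensional manifold, classify by nearest neighbors in the embedded space):

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.neighbors import KNeighborsClassifier

def train_expression_classifier(train_faces, train_labels, n_components=10):
    """train_faces: (n_samples, h*w) flattened grayscale face images."""
    lle = LocallyLinearEmbedding(n_neighbors=12, n_components=n_components)
    embedded = lle.fit_transform(train_faces)   # recover the manifold
    knn = KNeighborsClassifier(n_neighbors=5).fit(embedded, train_labels)
    return lle, knn

def predict_expression(lle, knn, face):
    # out-of-sample points are mapped into the learned embedding, then classified
    return knn.predict(lle.transform(face.reshape(1, -1)))[0]
```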

4.
3D facial expression control based on free-form deformation
Facial expression control is an important topic in biometrics research. This paper proposes a method for synthesizing 3D facial expressions based on the FAP and FAT specifications of MPEG-4. Key feature points are first defined on the face and the face is partitioned into regions; a surface-oriented free-form deformation (SOFFD) method is then used to generate the expressions. To generate expressions for a specific face, a proportional blending of basis expressions is employed. Experiments show that the method can effectively synthesize a variety of realistic facial expressions.
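The proportional blending of basis expressions mentioned above can be sketched as a weighted sum of per-vertex displacement fields; the basis set and weights here are illustrative assumptions:

```python
import numpy as np

def blend_basis_expressions(neutral, basis_displacements, weights):
    """neutral: (N, 3) mesh vertices; basis_displacements: list of (N, 3)
    per-vertex offsets, one per basis expression; weights: blend proportions."""
    result = neutral.copy()
    for disp, w in zip(basis_displacements, weights):
        result += w * disp
    return result

# e.g. 60% smile + 30% surprise applied to a neutral mesh:
# face = blend_basis_expressions(neutral, [smile_disp, surprise_disp], [0.6, 0.3])
```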

5.
Avatars are increasingly used to express our emotions in our online communications. Such avatars are used based on the assumption that avatar expressions are interpreted universally among all cultures. This paper investigated cross-cultural evaluations of avatar expressions designed by Japanese and Western designers. The goals of the study were: (1) to investigate cultural differences in avatar expression evaluation and apply findings from psychological studies of human facial expression recognition, (2) to identify expressions and design features that cause cultural differences in avatar facial expression interpretation. The results of our study confirmed that (1) there are cultural differences in interpreting avatars’ facial expressions, and the psychological theory that suggests physical proximity affects facial expression recognition accuracy is also applicable to avatar facial expressions, (2) positive expressions have wider cultural variance in interpretation than negative ones, (3) use of gestures and gesture marks may sometimes cause counter-effects in recognizing avatar facial expressions.

6.
This work concerns multimodal, expressive synthesis on virtual agents based on the analysis of actions performed by human users. As input we consider the image sequence of the recorded human behavior. Computer vision and image processing techniques are incorporated in order to detect the cues needed for extracting expressivity features. The multimodality of the approach lies in the fact that both facial and gestural aspects of the user’s behavior are analyzed and processed. The mimicry consists of perception, interpretation, planning and animation of the expressions shown by the human, resulting not in an exact duplicate but rather in an expressive model of the user’s original behavior.

7.
8.
Variations in illumination degrade the performance of appearance-based face recognition. We present a novel algorithm for the normalization of color facial images using a single image and its co-registered 3D pointcloud (3D image). The algorithm borrows the physically based Phong lighting model from computer graphics, where it is used for rendering images, and employs it in reverse to calculate face albedo from real facial images. Our algorithm estimates the number of dominant light sources and their directions from the specularities in the facial image and the corresponding 3D points. The intensities of the light sources and the parameters of the Phong model are estimated by fitting the model to the facial skin data. Unlike existing approaches, our algorithm takes into account both Lambertian and specular reflections as well as attached and cast shadows. Moreover, it is invariant to facial pose and expression and can effectively handle the case of multiple extended light sources. The algorithm was tested on the challenging FRGC v2.0 data and satisfactory results were achieved: the mean fitting error was 6.3% of the maximum color value, and performing face recognition on the normalized images increased both identification and verification rates.
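A much-simplified sketch of the inversion idea for a single known directional light (the actual algorithm estimates multiple extended sources and handles attached and cast shadows, which this toy version omits):

```python
import numpy as np

def estimate_albedo(intensity, normals, light_dir, light_int,
                    ambient, ks, shininess, view_dir):
    """Invert I = ambient*rho + light*rho*max(n.l, 0) + light*ks*max(r.v, 0)^a
    for the diffuse albedo rho at each pixel (single directional light).

    intensity: (N,) observed pixel values; normals: (N, 3) unit surface normals.
    """
    n_dot_l = np.clip(normals @ light_dir, 0.0, None)
    r = 2.0 * n_dot_l[:, None] * normals - light_dir        # reflection vector
    spec = light_int * ks * np.clip(r @ view_dir, 0.0, None) ** shininess
    diffuse_gain = ambient + light_int * n_dot_l
    # avoid blow-ups where the surface faces away from the light
    return np.where(diffuse_gain > 1e-3, (intensity - spec) / diffuse_gain, 0.0)
```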

9.
We introduce the WASABI ([W]ASABI [A]ffect [S]imulation for [A]gents with [B]elievable [I]nteractivity) Affect Simulation Architecture, in which a virtual human’s cognitive reasoning capabilities are combined with simulated embodiment to achieve the simulation of primary and secondary emotions. In modeling primary emotions we follow the idea of “Core Affect” combined with a continuous progression of bodily feeling in three-dimensional emotion space (PAD space), which is subsequently categorized into discrete emotions. In humans, primary emotions are understood as ontogenetically earlier emotions, which directly influence facial expressions. Secondary emotions, in contrast, afford the ability to reason about current events in the light of experiences and expectations. By technically representing aspects of each secondary emotion’s connotative meaning in PAD space, we not only assure their mood-congruent elicitation but also combine them with the facial expressions that are concurrently driven by primary emotions. Results of an empirical study suggest that human players in a card game scenario judge our virtual human MAX significantly older when secondary emotions are simulated in addition to primary ones.
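A minimal sketch of the PAD-space categorization step: a continuously evolving pleasure-arousal-dominance point is mapped to the nearest discrete primary emotion. The anchor coordinates and threshold are illustrative assumptions, not WASABI's actual values:

```python
import numpy as np

# Illustrative PAD anchors in [-1, 1]^3 (pleasure, arousal, dominance).
PRIMARY_EMOTIONS = {
    "happy":        ( 0.8,  0.5,  0.4),
    "angry":        (-0.6,  0.7,  0.6),
    "fearful":      (-0.6,  0.6, -0.6),
    "sad":          (-0.6, -0.3, -0.4),
    "bored":        (-0.2, -0.6, -0.2),
    "concentrated": ( 0.0,  0.3,  0.2),
}

def categorize(pad_point, threshold=0.8):
    """Map a continuous PAD point to the closest discrete primary emotion,
    or to a neutral state if nothing is close enough."""
    name, anchor = min(PRIMARY_EMOTIONS.items(),
                       key=lambda kv: np.linalg.norm(np.subtract(pad_point, kv[1])))
    if np.linalg.norm(np.subtract(pad_point, anchor)) < threshold:
        return name
    return "neutral"
```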

10.
Given a person’s neutral face, we can predict his or her unseen expressions with machine learning techniques for image processing. Differing from prior expression cloning or image analogy approaches, we try to hallucinate the person’s plausible facial expression with the help of a large facial expression database. In the first step, regularization-network-based nonlinear manifold learning is used to obtain a smooth estimate of the unseen facial expression, which is better than the reconstruction results of PCA. In the second step, a Markov network is adopted to learn the relationship between low-level local facial features of the residual neutral and expressional face image patches in the training set; belief propagation is then employed to infer the expressional residual face image for that person. By integrating the two approaches, we obtain the final results. The experimental results show that the hallucinated facial expression is not only expressive but also close to the ground truth.
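The first step, a regularization network producing a smooth global estimate, behaves much like RBF-kernel ridge regression from neutral-face vectors to expression-face vectors; the sketch below uses scikit-learn's KernelRidge as a stand-in and omits the patch-level Markov-network refinement:

```python
from sklearn.kernel_ridge import KernelRidge

def fit_global_expression_model(neutral_faces, smile_faces):
    """neutral_faces, smile_faces: (n_people, h*w) aligned image vectors.
    Learns a smooth neutral -> expression mapping over the face database."""
    model = KernelRidge(kernel="rbf", alpha=1e-2, gamma=1e-7)
    model.fit(neutral_faces, smile_faces)
    return model

# smooth_estimate = model.predict(new_neutral.reshape(1, -1))
# The residual (ground truth minus smooth estimate) is what the paper's
# Markov network then infers patch by patch via belief propagation.
```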

11.
12.
This paper presents a compressed-domain motion object extraction algorithm based on optical flow approximation for MPEG-2 video stream. The discrete cosine transform (DCT) coefficients of P and B frames are estimated to reconstruct DC + 2AC image using their motion vectors and the DCT coefficients in I frames, which can be directly extracted from MPEG-2 compressed domain. Initial optical flow is estimated with Black’s optical flow estimation framework, in which DC image is substituted by DC + 2AC image to provide more intensity information. A high confidence measure is exploited to generate dense and accurate motion vector field by removing noisy and false motion vectors. Global motion estimation and iterative rejection are further utilized to separate foreground and background motion vectors. Region growing with automatic seed selection is performed to extract accurate object boundary by motion consistency model. The object boundary is further refined by partially decoding the boundary blocks to improve the accuracy. Experimental results on several test sequences demonstrate that the proposed approach can achieve compressed-domain video object extraction for MPEG-2 video stream in CIF format with real-time performance.
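The foreground/background separation step translates directly into code: fit a global affine motion model to all block motion vectors, then iteratively reject outliers; the survivors follow the camera motion and the rejects are candidate object blocks. The affine model and threshold below are generic assumptions, not the paper's exact formulation:

```python
import numpy as np

def separate_foreground(positions, motion_vectors, n_iter=5, thresh=1.5):
    """positions: (N, 2) block centers; motion_vectors: (N, 2) MVs from the stream.
    Fits mv ~ A @ [x, y, 1] by least squares with iterative outlier rejection."""
    X = np.hstack([positions, np.ones((len(positions), 1))])
    inliers = np.ones(len(positions), dtype=bool)
    for _ in range(n_iter):
        A, *_ = np.linalg.lstsq(X[inliers], motion_vectors[inliers], rcond=None)
        residual = np.linalg.norm(X @ A - motion_vectors, axis=1)
        inliers = residual < thresh * residual[inliers].std()
    return ~inliers   # True where a block moves inconsistently with the camera
```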

13.
A human face not only plays a role in the identification of an individual but also communicates useful information about a person’s emotional state at a particular time. It is no wonder that automatic facial expression recognition has become an area of great interest within the computer science, psychology, medicine, and human–computer interaction research communities. Various feature extraction techniques based on statistical and geometrical data have been used for recognition of expressions from static images as well as real-time videos. In this paper, we present a method for automatic recognition of facial expressions from face images by providing discrete wavelet transform features to a bank of seven parallel support vector machines (SVMs). Each SVM is trained to recognize a particular facial expression, so that it is most sensitive to that expression. Multi-class classification is achieved by combining the binary SVMs in a one-against-all scheme and merging their outputs with a maximum function. Classification efficiency is tested on static images from the publicly available Japanese Female Facial Expression database, and the experiments demonstrate promising results.
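A compact sketch of the described pipeline using PyWavelets and scikit-learn: one-level 2D DWT coefficients as features, seven one-against-all SVMs, and a max-rule combiner. The wavelet choice and SVM settings are assumptions:

```python
import numpy as np
import pywt
from sklearn.svm import LinearSVC

EXPRESSIONS = ["anger", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def dwt_features(img):
    cA, (cH, cV, cD) = pywt.dwt2(img, "db4")   # one-level 2D wavelet transform
    return np.concatenate([c.ravel() for c in (cA, cH, cV, cD)])

def train(faces, labels):
    """faces: list of 2D grayscale arrays; labels: int array indexing EXPRESSIONS."""
    X = np.array([dwt_features(f) for f in faces])
    # one binary SVM per expression (one-against-all)
    return [LinearSVC().fit(X, (labels == k).astype(int)) for k in range(7)]

def predict(svms, face):
    x = dwt_features(face).reshape(1, -1)
    scores = [svm.decision_function(x)[0] for svm in svms]  # max-rule combiner
    return EXPRESSIONS[int(np.argmax(scores))]
```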

14.
Realistic 3D facial animation
张青山  陈国良 《软件学报》2003,14(3):643-650
Constructing and animating realistic 3D face models is an important research topic in computer graphics, and simulating facial motion on a 3D model in real time to produce realistic expressions and movements is one of its difficult problems. This paper proposes a real-time 3D facial animation method that partitions the face model into several functional regions whose motions are relatively independent, and simulates the motion of each region with a proposed hybrid technique combining weighted Dirichlet free-form deformation (DFFD) and rigid-body motion simulation. Mutual influence between the motions of neighboring regions is modeled through shared control points. In this method the face model is driven by moving control points; to simplify driving the model, two high-level driving schemes are proposed, one based on MPEG-4 facial animation parameter (FAP) streams and one based on a muscle model. Both schemes achieve good realism as well as good computational performance, and can simulate the expressions and movements of real faces in real time.
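The cross-region coupling described above can be sketched as a per-vertex weighted blend of regional displacement fields, so that vertices influenced by control points of more than one functional region receive a mix of the regional motions; the weight layout is an illustrative assumption:

```python
import numpy as np

def blend_regional_deformations(vertices, regional_disps, weights):
    """vertices: (N, 3) face mesh; regional_disps: list of (N, 3) displacement
    fields, one per functional region (zero outside the region);
    weights: (R, N) per-vertex influence of each region, columns summing to <= 1.
    Vertices shared by several regions get a weighted mix of their motions."""
    out = vertices.copy()
    for disp, w in zip(regional_disps, weights):
        out += w[:, None] * disp
    return out
```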

15.
The object-based coding format introduced by MPEG-4 treats the human face as a special object, laying a foundation for research on face modeling and animation. Based on an analysis of the MPEG-4 facial animation standard, this paper presents the design ideas behind an MPEG-4-based facial animation system and the key problems that must be solved.

16.
Facial expression image morphing based on MPEG-4
To generate natural, realistic facial expressions in real time, a facial expression image morphing method based on the MPEG-4 facial animation framework is proposed. The method first uses a face alignment tool to extract 88 feature points from a face photograph; on this basis, a standard face mesh is calibrated and deformed to produce a triangle mesh for the specific face. Facial animation parameters (FAPs) then move the corresponding key facial feature points together with the associated feature points near them, while the topology of the face triangle mesh is kept unchanged under the action of multiple FAPs. Finally, facial texture is filled into all the deformed triangle regions by affine transformation, yielding the facial expression image defined by the FAPs. The input of the method is a neutral face photograph and a set of facial animation parameters; the output is the corresponding facial expression image. To synthesize subtle expression movements and a virtual talking head, algorithms for generating eye-gaze movements and intra-oral texture details were also designed. A subjective evaluation on a 5-point mean opinion score (MOS) scale showed that expression images generated by this method scored 3.67 for naturalness. Experiments on virtual talking-head synthesis show that the method runs in real time, with an average processing speed of 66.67 fps on an ordinary PC, making it suitable for real-time video processing and facial animation generation.
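The per-triangle texture fill can be sketched with OpenCV: for each deformed triangle, estimate the affine map from its source to its destination coordinates and warp the enclosed patch, masked to the triangle. All names here are illustrative, not the paper's implementation:

```python
import cv2
import numpy as np

def warp_triangle(src_img, dst_img, src_tri, dst_tri):
    """Warp one textured triangle from the neutral photo into its
    deformed (expression) position via an affine transform."""
    r1 = cv2.boundingRect(np.float32([src_tri]))
    r2 = cv2.boundingRect(np.float32([dst_tri]))
    src_local = [(p[0] - r1[0], p[1] - r1[1]) for p in src_tri]
    dst_local = [(p[0] - r2[0], p[1] - r2[1]) for p in dst_tri]
    patch = src_img[r1[1]:r1[1]+r1[3], r1[0]:r1[0]+r1[2]]
    M = cv2.getAffineTransform(np.float32(src_local), np.float32(dst_local))
    warped = cv2.warpAffine(patch, M, (r2[2], r2[3]), flags=cv2.INTER_LINEAR)
    mask = np.zeros((r2[3], r2[2], 3), dtype=np.uint8)
    cv2.fillConvexPoly(mask, np.int32(dst_local), (1, 1, 1))
    roi = dst_img[r2[1]:r2[1]+r2[3], r2[0]:r2[0]+r2[2]]
    dst_img[r2[1]:r2[1]+r2[3], r2[0]:r2[0]+r2[2]] = roi * (1 - mask) + warped * mask
```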

17.
Visual speech parameter estimation plays an important role in visual speech research. From the facial animation parameters (FAPs) defined in MPEG-4, 24 parameters directly related to articulation are selected to describe visual speech. Combining statistical learning with rule-based methods, lip contours and facial feature points are tracked using the probability distribution of facial color together with prior knowledge of shape and edges, yielding fairly accurate tracking results. After high-frequency noise in the reference-point tracks is filtered out, the dominant head pose is estimated from the four most prominent reference points on the face, eliminating the influence of global motion. Finally, accurate visual speech parameters are computed from the motion of these facial feature points; the method has been put to practical use.

18.
Cognitive appraisal theories, which link human emotional experience to interpretations of events happening in the environment, are leading approaches to modeling emotions. They have often been used both for simulating “real emotions” in virtual characters and for predicting the human user’s emotional experience to facilitate human–computer interaction. In this work, we investigate the computational modeling of appraisal in a multi-agent decision-theoretic framework using Partially Observable Markov Decision Process-based (POMDP) agents. Domain-independent approaches are developed for five key appraisal dimensions (motivational relevance, motivation congruence, accountability, control and novelty). We also discuss how the modeling of theory of mind (recursive beliefs about self and others) is realized in the agents and why it is critical for simulating social emotions. Our model of appraisal is applied to three different scenarios to illustrate its use. This work not only provides a solution for computationally modeling emotion in POMDP-based agents, but also illustrates the tight relationship between emotion and cognition: the appraisal dimensions are derived from the processes and information required for the agent’s decision-making and belief maintenance, which suggests a uniform cognitive structure for emotion and cognition.
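A toy sketch of how two appraisal dimensions might fall out of quantities a decision-theoretic agent already computes: motivational relevance as the magnitude of the change in expected utility caused by a belief update, and motivation congruence as its sign. The utility function is a placeholder, not the paper's model:

```python
def appraise_event(agent_utility, belief_before, belief_after):
    """Derive two appraisal dimensions from a belief update.

    agent_utility: maps a belief state to expected utility (e.g. from the
                   agent's own POMDP policy evaluation).
    """
    u_before = agent_utility(belief_before)
    u_after = agent_utility(belief_after)
    relevance = abs(u_after - u_before)            # how much the event matters
    congruence = 1 if u_after >= u_before else -1  # helps vs. hinders the goals
    return {"motivational_relevance": relevance,
            "motivation_congruence": congruence}
```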

19.
This paper proposes a new 3D facial animation model that is based on a 3D morphable face model and is compatible with MPEG-4. Alignment between prototype 3D faces is established by uniform mesh resampling, and the facial animation rules defined in MPEG-4 are applied to drive the 3D model so that realistic facial animation is generated automatically. Given a single face image, the model automatically reconstructs a realistic 3D face and, driven by FAP parameters, automatically generates facial animation.

20.
To date the most popular and sophisticated types of virtual worlds can be found in the area of video gaming, especially in the genre of Massively Multiplayer Online Role Playing Games (MMORPG). Game developers have made great strides in achieving game worlds that look and feel increasingly realistic. However, despite these achievements in the visual realism of virtual game worlds, the games are much less sophisticated when it comes to modeling face-to-face interaction. In face-to-face interaction, ordinary social activities are “accountable”; that is, people use a variety of kinds of observational information about what others are doing in order to make sense of others’ actions and to tightly coordinate their own actions with them. Such information includes: (1) the real-time unfolding of turns-at-talk; (2) the observability of embodied activities; and (3) the direction of eye gaze for the purpose of gesturing. But despite the fact that today’s games provide virtual bodies, or “avatars,” for players to control, these avatars display much less information about players’ current state than real bodies do. In this paper, we discuss the impact of the lack of each type of information on players’ ability to tightly coordinate their activities and offer guidelines for improving coordination and, ultimately, the players’ social experience. “They come here to talk turkey with suits from around the world, and they consider it just as good as a face-to-face. They more or less ignore what is being said—a lot gets lost in translation, after all. They pay attention to the facial expressions and body language of the people they are talking to. And that’s how they know what’s going on inside a person’s head—by condensing fact from the vapor of nuance.” —Neal Stephenson, Snow Crash, 1992
