Similar Literature
A total of 20 similar documents were retrieved.
1.
To address the problems that existing virtual talking heads have rather monotonous facial expressions and that their expressions and movements are poorly coordinated, a method for building a realistic, emotionally expressive virtual human is proposed. The method first simulates dynamic facial expressions with three parameters (onset, hold, and decay) and synthesizes complex expressions with blendshape fusion; eye and head movements are then designed on the basis of statistical data from human psychology, making the virtual human appear more lifelike; finally, the influence of external conditions such as camera position and lighting on the perceived realism of the virtual human is analyzed. Experimental results show that the virtual human built with this method is not only natural and emotionally expressive, but also achieves good synchronization among speech, dynamic facial expressions, eye movements, and head movements.
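As a rough illustration of the onset/hold/decay timing and blendshape-style fusion described above, the following Python sketch (not the authors' code; all parameter values and mesh data are placeholders) computes a three-phase intensity curve and mixes expression offsets onto a neutral mesh:

```python
import numpy as np

def expression_intensity(t, onset=0.3, hold=0.8, decay=0.6, peak=1.0):
    """Piecewise intensity curve for one dynamic expression:
    ramp up during onset, stay at the peak during hold, fall back during decay."""
    if t < onset:                          # onset phase: ramp 0 -> peak
        return peak * (t / onset)
    if t < onset + hold:                   # hold phase: sustain the peak
        return peak
    if t < onset + hold + decay:           # decay phase: ramp peak -> 0
        return peak * (1.0 - (t - onset - hold) / decay)
    return 0.0

def blend_expressions(neutral, targets, weights):
    """Blendshape-style fusion: neutral mesh plus weighted offsets of target expressions.
    `neutral` and each entry of `targets` are (N, 3) vertex arrays."""
    result = neutral.copy()
    for target, w in zip(targets, weights):
        result += w * (target - neutral)
    return result

# Example: mix 60% "happy" with 30% "surprise" at t = 0.5 s (stand-in meshes)
neutral = np.zeros((100, 3))
happy = np.random.rand(100, 3) * 0.01
surprise = np.random.rand(100, 3) * 0.01
w = expression_intensity(0.5)
frame = blend_expressions(neutral, [happy, surprise], [0.6 * w, 0.3 * w])
```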

2.
刘涛  周先春  严锡君 《计算机科学》2018,45(10):286-290, 319
This paper proposes a new facial expression recognition method that uses dynamic optical-flow features to describe the variation between facial expressions and thereby improve the recognition rate. First, optical-flow features are computed between the facial expression image and the neutral expression image; then, traditional linear discriminant analysis (LDA) is extended, and a Gaussian LDA method is used to map the optical-flow features into a feature vector for the expression image; finally, a multi-class support vector machine classifier is designed to classify and recognize the facial expressions. Expression recognition experiments on the JAFFE and CK facial expression databases show that the average recognition rate of the proposed method is more than 2% higher than those of three comparison methods.
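A minimal sketch of this pipeline, using OpenCV's Farneback dense optical flow, scikit-learn's standard LDA as a stand-in for the paper's Gaussian LDA extension, and an RBF multi-class SVM (the function names and parameters below are illustrative assumptions, not the authors' implementation):

```python
import cv2
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC

def flow_feature(neutral_img, expr_img):
    """Dense optical flow (Farneback) from the neutral face to the expressive face,
    flattened into one feature vector."""
    flow = cv2.calcOpticalFlowFarneback(neutral_img, expr_img, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return flow.reshape(-1)

def train(X_pairs, y):
    """X_pairs: list of (neutral, expression) grayscale images; y: expression labels."""
    feats = np.stack([flow_feature(n, e) for n, e in X_pairs])
    lda = LinearDiscriminantAnalysis()            # stand-in for the paper's Gaussian LDA
    feats_lda = lda.fit_transform(feats, y)
    clf = SVC(kernel="rbf", decision_function_shape="ovr")  # multi-class SVM
    clf.fit(feats_lda, y)
    return lda, clf
```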

3.
Robust Facial Expression Recognition Based on Generative Adversarial Networks
Natural emotional communication is often accompanied by head rotation and body movements, which frequently cause large facial occlusions and thus a loss of expression information in face images. Most existing expression recognition methods rely on generic facial features and recognition algorithms without considering the difference between expression and identity, which makes them insufficiently robust for new users. This paper proposes a method for user-independent expression recognition on partially occluded face images. The method consists of a face image generation network based on the Wasserstein generative adversarial net (WGAN), which produces context-consistent completions for the occluded regions, and an expression recognition network, which extracts user-independent expression features and infers the expression class by establishing an adversarial relationship between the expression recognition task and the identity recognition task. Experimental results show that our method achieves a user-independent average recognition accuracy of over 90% on a mixed dataset composed of CK+, Multi-PIE, and JAFFE, and 96% user-independent accuracy on CK+, of which 4.5 percentage points of improvement come from the proposed adversarial expression feature extraction. In addition, within a 45° range of head rotation, the method also improves the recognition accuracy of non-frontal expressions.
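The abstract does not specify how the adversarial relation between the expression and identity tasks is implemented; one common way to realize such a setup is gradient reversal on the identity branch. The PyTorch sketch below illustrates that idea only (the architecture, dimensions, and training step are placeholders, and the WGAN completion network is omitted):

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity in the forward pass, negated gradient backward,
    used to make features informative for expression but uninformative for identity."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return grad.neg()

class ExpressionNet(nn.Module):
    def __init__(self, feat_dim=256, n_expr=7, n_id=100):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, feat_dim), nn.ReLU())
        self.expr_head = nn.Linear(feat_dim, n_expr)   # expression classifier
        self.id_head = nn.Linear(feat_dim, n_id)       # adversarial identity branch

    def forward(self, x):
        f = self.backbone(x)
        return self.expr_head(f), self.id_head(GradReverse.apply(f))

# One illustrative training step: minimize both losses; the identity branch receives
# reversed gradients, pushing the shared features to discard identity information.
model = ExpressionNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()
x = torch.randn(8, 1, 64, 64)
y_expr = torch.randint(0, 7, (8,)); y_id = torch.randint(0, 100, (8,))
expr_logits, id_logits = model(x)
loss = ce(expr_logits, y_expr) + ce(id_logits, y_id)
loss.backward(); opt.step()
```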

4.
We describe a computer vision system for observing facial motion by using an optimal estimation optical flow method coupled with geometric, physical and motion-based dynamic models describing the facial structure. Our method produces a reliable parametric representation of the face's independent muscle action groups, as well as an accurate estimate of facial motion. Previous efforts at analysis of facial expression have been based on the facial action coding system (FACS), a representation developed in order to allow human psychologists to code expression from static pictures. To avoid use of this heuristic coding scheme, we have used our computer vision system to probabilistically characterize facial motion and muscle activation in an experimental population, thus deriving a new, more accurate, representation of human facial expressions that we call FACS+. Finally, we show how this method can be used for coding, analysis, interpretation, and recognition of facial expressions.

5.
As technology advances, robots and virtual agents will be introduced into the home and healthcare settings to assist individuals, both young and old, with everyday living tasks. Understanding how users recognize an agent's social cues is therefore imperative, especially in social interactions. Facial expression, in particular, is one of the most common non-verbal cues used to display and communicate emotion in on-screen agents (Cassell et al., 2000). Age is important to consider because age-related differences in emotion recognition of human facial expression have been supported (Ruffman et al., 2008), with older adults showing a deficit for recognition of negative facial expressions. Previous work has shown that younger adults can effectively recognize facial emotions displayed by agents (Bartneck and Reichenbach, 2005, Courgeon et al., 2009, Courgeon et al., 2011, Breazeal, 2003); however, little research has compared in depth younger and older adults' ability to label a virtual agent's facial emotions, an important consideration because social agents will be required to interact with users of varying ages. If such age-related differences exist for recognition of virtual agent facial expressions, we aim to understand whether those differences are influenced by the intensity of the emotion, the dynamic formation of the emotion (i.e., a neutral expression developing into an expression of emotion through motion), or the type of virtual character, which differs in human-likeness. Study 1 investigated the relationship between age-related differences, the implication of dynamic formation of emotion, and the role of emotion intensity in emotion recognition of the facial expressions of a virtual agent (iCat). Study 2 examined age-related differences in recognition of expressions displayed by three types of virtual characters differing in human-likeness (non-humanoid iCat, synthetic human, and human). Study 2 also investigated the role of configural and featural processing as a possible explanation for age-related differences in emotion recognition. First, our findings show age-related differences in the recognition of emotions expressed by a virtual agent, with older adults showing lower recognition for the emotions of anger, disgust, fear, happiness, sadness, and neutral. These age-related differences might be explained by older adults having difficulty discriminating similarity in the configural arrangement of facial features for certain emotions; for example, older adults often mislabeled the similar emotion of fear as surprise. Second, our results did not provide evidence for dynamic formation improving emotion recognition; in general, however, the intensity of the emotion improved recognition. Lastly, we learned that emotion recognition, for older and younger adults, differed by character type, from best to worst: human, synthetic human, and then iCat. Our findings provide guidance for design, as well as the development of a framework of age-related differences in emotion recognition.

6.
Objective: To address the low facial expression recognition rate caused by intra-class variation in real environments, and the difficulty of recognizing expressions with large intra-class differences in complex indoor and outdoor scenes, a facial expression recognition method based on a generative adversarial network (GAN) is proposed. Method: Following the adversarial idea of GANs, an IC-GAN (intra-class gap GAN) architecture is constructed; convolutional layers are used to build the encoder and decoder for deeper feature extraction from a self-built mixed expression image set, and the momentum-based Adam (adaptive moment estimation) optimization algorithm is used to update the network weights, focusing on expressions with large intra-class differences so that the model adapts better to such tasks in real environments. Results: The network was trained on the self-built facial expression dataset in a PyTorch environment and tested on a facial expression validation set; in comparison experiments with a deep belief network (DBN) and GoogLeNet, the IC-GAN improved the recognition result by 11% and 8.3%, respectively. Conclusion: The experiments verify the accuracy of IC-GAN in recognizing facial expressions with large intra-class differences, reduce the misrecognition rate in such cases, improve the robustness of the system, and lay a solid foundation for facial expression generation.
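A minimal PyTorch sketch of a convolutional encoder-decoder trained with the momentum-based Adam optimizer, as the abstract describes; the layer sizes, reconstruction loss, and data below are illustrative assumptions rather than the IC-GAN architecture itself (the adversarial discriminator is omitted):

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Minimal convolutional encoder-decoder of the kind an IC-GAN generator might use;
    the layer sizes here are illustrative, not the paper's."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),            # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),           # 32 -> 16
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Tanh(),   # 32 -> 64
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = EncoderDecoder()
# Adam is a momentum-based adaptive optimizer, used here for the weight updates
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.5, 0.999))
x = torch.randn(4, 1, 64, 64)
recon = model(x)
loss = nn.functional.l1_loss(recon, x)   # reconstruction term; a full GAN adds an adversarial loss
loss.backward(); optimizer.step()
```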

7.
This research explores and evaluates the contribution that facial expressions might have regarding improved comprehension and acceptability in sign language avatars. Focusing specifically on Irish sign language (ISL), the Deaf (the uppercase "D" in the word "Deaf" indicates Deaf as a culture as opposed to "deaf" as a medical condition) community's responsiveness to sign language avatars is examined. The hypothesis is as follows: augmenting an existing avatar with the seven widely accepted universal emotions identified by Ekman (Basic emotions: handbook of cognition and emotion. Wiley, London, 2005) to achieve underlying facial expressions will make that avatar more human-like and improve usability and understandability for the ISL user. Using human evaluation methods (Huenerfauth et al. in Trans Access Comput (ACM) 1:1, 2008), an augmented set of avatar utterances is compared against a baseline set, focusing on two key areas: comprehension and naturalness of facial configuration. The approach to the evaluation, including the choice of ISL participants, interview environment, and evaluation methodology, is then outlined. The evaluation results reveal that in a comprehension test there was little difference between the baseline avatars and those augmented with emotional facial expression. It was also found that the avatars are lacking various linguistic attributes.

8.
Emotive audio–visual avatars are virtual computer agents which have the potential of improving the quality of human-machine interaction and human-human communication significantly. However, the understanding of human communication has not yet advanced to the point where it is possible to make realistic avatars that demonstrate interactions with natural-sounding emotive speech and realistic-looking emotional facial expressions. In this paper, we propose the technical approaches of a novel multimodal framework leading to a text-driven emotive audio–visual avatar. Our primary work is focused on emotive speech synthesis, realistic emotional facial expression animation, and the co-articulation between speech gestures (i.e., lip movements) and facial expressions. A general framework of emotive text-to-speech (TTS) synthesis using a diphone synthesizer is designed and integrated into a generic 3-D avatar face model. Under the guidance of this framework, we developed a realistic 3-D avatar prototype. A rule-based emotive TTS synthesis system module based on the Festival-MBROLA architecture has been designed to demonstrate the effectiveness of the framework design. Subjective listening experiments were carried out to evaluate the expressiveness of the synthetic talking avatar.

9.
For effective interaction between humans and socially adept, intelligent service robots, a key capability required by this class of sociable robots is the successful interpretation of visual data. In addition to crucial techniques like human face detection and recognition, an important next step for enabling intelligence and empathy within social robots is that of emotion recognition. In this paper, an automated and interactive computer vision system is investigated for human facial expression recognition and tracking based on the facial structure features and movement information. Twenty facial features are adopted since they are more informative and prominent for reducing the ambiguity during classification. An unsupervised learning algorithm, distributed locally linear embedding (DLLE), is introduced to recover the inherent properties of scattered data lying on a manifold embedded in high-dimensional input facial images. The selected person-dependent facial expression images in a video are classified using the DLLE. In addition, facial expression motion energy is introduced to describe the facial muscles' tension during the expressions for person-independent tracking and recognition. This method takes advantage of the optical flow which tracks the feature points' movement information. Finally, experimental results show that our approach is able to separate different expressions successfully.

10.
We propose an efficient algorithm for recognizing facial expressions using biologically plausible features: contours of the face and its components with a radial encoding strategy. A self-organizing network (SON) is applied to check the homogeneity of the encoded contours, and then different classifiers, such as the SON, a multi-layer perceptron, and K-nearest neighbors, are used for recognizing expressions from contours. Experimental results show that the recognition accuracy of our algorithm is comparable to that of other algorithms in the literature on the Japanese female facial expression database. We also apply our algorithm to a Taiwanese facial expression image database to demonstrate its efficiency in recognizing facial expressions.
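One possible reading of the radial encoding strategy, sketched in Python with a K-nearest-neighbor classifier (one of the classifiers the paper compares); the angular binning and normalization below are assumptions for illustration, not the authors' exact encoding:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def radial_encode(contour, n_bins=36):
    """Encode a closed contour (an (M, 2) array of x, y points) as the mean distance
    from its centroid within each angular bin -- a simple radial signature."""
    center = contour.mean(axis=0)
    vecs = contour - center
    angles = np.arctan2(vecs[:, 1], vecs[:, 0])          # angle of each contour point
    dists = np.linalg.norm(vecs, axis=1)
    bins = ((angles + np.pi) / (2 * np.pi) * n_bins).astype(int).clip(0, n_bins - 1)
    code = np.zeros(n_bins)
    for b in range(n_bins):
        if np.any(bins == b):
            code[b] = dists[bins == b].mean()
    return code / (code.max() + 1e-8)                    # scale-normalized signature

def train_knn(contours, labels, k=3):
    """contours: list of (M, 2) arrays; labels: expression classes."""
    X = np.stack([radial_encode(c) for c in contours])
    clf = KNeighborsClassifier(n_neighbors=k)            # one of the compared classifiers
    clf.fit(X, labels)
    return clf
```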

11.
Facial expression recognition has recently become an important research area, and many efforts have been made in facial feature extraction and its classification to improve face recognition systems. Most researchers adopt a posed facial expression database in their experiments, but in a real-life situation the facial expressions may not be very obvious. This article describes the extraction of the minimum number of Gabor wavelet parameters for the recognition of natural facial expressions. The objective of our research was to investigate the performance of a facial expression recognition system with a minimum number of Gabor wavelet features. In this research, principal component analysis (PCA) is employed to compress the Gabor features. We also discuss the selection of the minimum number of Gabor features that perform best in a recognition task employing a multiclass support vector machine (SVM) classifier. The performance of facial expression recognition using our approach is compared with those obtained previously by other researchers using other approaches. Experimental results showed that our proposed technique is successful in recognizing natural facial expressions by using a small number of Gabor features, with an 81.7% recognition rate. In addition, we identify the relationship between human vision and computer vision in recognizing natural facial expressions.
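A compact sketch of a Gabor-plus-PCA-plus-SVM pipeline using OpenCV and scikit-learn; the filter-bank parameters, response pooling, and number of principal components are illustrative choices, not the minimal feature set identified in the paper:

```python
import cv2
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def gabor_features(img, scales=(7, 11, 15), orientations=4):
    """Filter the face image with a small Gabor bank and keep the mean/std of each
    response as a compact descriptor (the paper's parameter choices will differ)."""
    feats = []
    for ksize in scales:
        for i in range(orientations):
            theta = i * np.pi / orientations
            kernel = cv2.getGaborKernel((ksize, ksize), sigma=4.0, theta=theta,
                                        lambd=10.0, gamma=0.5)
            resp = cv2.filter2D(img.astype(np.float32), cv2.CV_32F, kernel)
            feats += [resp.mean(), resp.std()]
    return np.array(feats)

def train(images, labels, n_components=20):
    X = np.stack([gabor_features(im) for im in images])
    pca = PCA(n_components=n_components)   # compress the Gabor features, as in the paper
    Xp = pca.fit_transform(X)
    clf = SVC(kernel="rbf")                # multiclass SVM classifier
    clf.fit(Xp, labels)
    return pca, clf
```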

12.
Facial Expression Recognition Based on Wavelet Transform and Independent Component Analysis
An expression feature extraction method combining the two-dimensional discrete wavelet transform (2D-DWT) and independent component analysis (ICA) is proposed. The 2D-DWT first decomposes the input image into four sub-images: one corresponding to the main body of the original image (the low-pass part) and three corresponding to its details (the high-pass parts). ICA is applied to each sub-image for feature extraction, and the difference between the resulting expression vector and the neutral vector is used as the feature vector; on this basis, a support vector machine, chosen for its stable performance, is used to evaluate the recognition performance of each sub-band image. In addition, a simple and effective scheme is proposed to fuse the features extracted from the sub-images, and the fused result is used as the feature vector for recognition. Compared with other methods based on static image recognition, the proposed method achieves good recognition results and shows a degree of generalization and robustness.
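A minimal sketch of the 2D-DWT/ICA idea using PyWavelets and scikit-learn; the wavelet choice, the number of independent components, and the pairing of expression and neutral images are assumptions for illustration, not the paper's configuration:

```python
import numpy as np
import pywt
from sklearn.decomposition import FastICA
from sklearn.svm import SVC

def dwt_subbands(img):
    """One-level 2-D DWT: returns the approximation (low-pass) sub-image and the
    three detail (high-pass) sub-images."""
    cA, (cH, cV, cD) = pywt.dwt2(img, "haar")
    return [cA, cH, cV, cD]

def train_subband(images, neutral_images, labels, band=0, n_ica=16):
    """Extract ICA features from one sub-band, subtract the paired neutral-face feature
    (difference vector), and train an SVM on the result."""
    X = np.stack([dwt_subbands(im)[band].reshape(-1) for im in images])
    N = np.stack([dwt_subbands(im)[band].reshape(-1) for im in neutral_images])
    ica = FastICA(n_components=n_ica, random_state=0)
    Xe = ica.fit_transform(X) - ica.transform(N)   # expression minus neutral feature
    clf = SVC(kernel="rbf")
    clf.fit(Xe, labels)
    return ica, clf
```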

13.
Facial Expression Image Morphing Based on MPEG-4
To generate natural and realistic facial expressions in real time, a facial expression image morphing method based on the MPEG-4 facial animation framework is proposed. The method first uses a face alignment tool to extract 88 feature points from a face photograph; on this basis, the standard face mesh is calibrated and deformed to generate a triangular mesh for the specific face; the relevant key facial feature points and their neighboring associated points are then moved according to the facial animation parameters (FAPs), while the topology of the face triangle mesh is kept unchanged under the combined effect of multiple FAPs; finally, the texture of all deformed triangular regions is filled by affine transformation, producing the facial expression image defined by the FAPs. The input of the method is a neutral face photograph and a set of facial animation parameters, and the output is the corresponding expression image. To support subtle expression movements and talking-head synthesis, an algorithm for generating eye expression movements and intra-oral detail texture is also designed. A subjective evaluation based on a 5-point mean opinion score (MOS) shows that expression images generated by this morphing method achieve a naturalness score of 3.67. Talking-head synthesis experiments show that the method runs in real time, with an average processing speed of 66.67 fps on an ordinary PC, making it suitable for real-time video processing and facial animation generation.
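The per-triangle texture filling by affine transformation can be sketched with OpenCV as follows; the mesh construction, FAP handling, and 88-point alignment are omitted, and the helper below is a hypothetical illustration rather than the paper's implementation:

```python
import cv2
import numpy as np

def warp_triangle(src_img, dst_img, src_tri, dst_tri):
    """Affine-map the texture of one mesh triangle from the neutral photo onto its
    displaced position in the target expression image (writes into dst_img)."""
    src_tri = np.float32(src_tri); dst_tri = np.float32(dst_tri)
    r_src = cv2.boundingRect(src_tri)                 # bounding boxes of both triangles
    r_dst = cv2.boundingRect(dst_tri)
    src_crop = src_img[r_src[1]:r_src[1]+r_src[3], r_src[0]:r_src[0]+r_src[2]]
    src_off = src_tri - np.float32([r_src[0], r_src[1]])
    dst_off = dst_tri - np.float32([r_dst[0], r_dst[1]])
    M = cv2.getAffineTransform(src_off, dst_off)      # affine map between the triangles
    warped = cv2.warpAffine(src_crop, M, (r_dst[2], r_dst[3]),
                            flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT)
    mask = np.zeros((r_dst[3], r_dst[2]), dtype=np.uint8)
    cv2.fillConvexPoly(mask, np.int32(dst_off), 255)  # restrict the copy to the triangle
    roi = dst_img[r_dst[1]:r_dst[1]+r_dst[3], r_dst[0]:r_dst[0]+r_dst[2]]
    roi[mask > 0] = warped[mask > 0]
```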

14.
15.
To cope with variations in factors such as ethnicity, gender, and age that facial expression recognition faces in uncontrolled environments, a robust facial expression recognition method based on deep conditional random forests is proposed. Unlike traditional single-task expression recognition methods, a multi-task recognition model is designed in which facial expression recognition is the primary task and recognition of facial gender and age attributes are auxiliary tasks. The study finds that attributes such as gender and age have a certain influence on expression recognition; to capture the relationships between them, a deep conditional random forest method conditioned on the two attributes of gender and age is proposed. In the feature extraction stage, a multi-instance attention mechanism is used to extract facial features and reduce the effect of variations such as illumination, occlusion, and low resolution; in the recognition stage, multi-conditional random forests are applied according to the gender and age attribute factors to recognize the expression. Extensive experiments were conducted on the public CK+, ExpW, RAF-DB, and AffectNet facial expression databases: a recognition rate of 99% is achieved on the classic CK+ database, and 70.52% on the challenging in-the-wild combination of ExpW, RAF-DB, and AffectNet. The experimental results show that the method compares favorably with other approaches and is robust, to a degree, to occlusion, noise, and resolution changes in natural scenes.

16.
Bilinear Models for 3-D Face and Facial Expression Recognition
In this paper, we explore bilinear models for jointly addressing 3-D face and facial expression recognition. An elastically deformable model algorithm that establishes correspondence among a set of faces is proposed first and then bilinear models that decouple the identity and facial expression factors are constructed. Fitting these models to unknown faces enables us to perform face recognition invariant to facial expressions and facial expression recognition with unknown identity. A quantitative evaluation of the proposed technique is conducted on the publicly available BU-3DFE face database in comparison with our previous work on face recognition and other state-of-the-art algorithms for facial expression recognition. Experimental results demonstrate an overall 90.5% facial expression recognition rate and an 86% rank-1 face recognition rate.

17.
Most studies use facial expressions to recognize a user's emotion; however, gestures such as nodding, shaking the head, or stillness can also be indicators of the user's emotion. In our research, we use the facial expression and gestures to detect and recognize a user's emotion. The pervasive Microsoft Kinect sensor captures video data, from which several features representing facial expressions and gestures are extracted. An in-house extensible markup language-based genetic programming engine (XGP) evolves the emotion recognition module of our system. To improve the computational performance of the recognition module, we implemented and compared several approaches, including directed evolution, collaborative filtering via canonical voting, and a genetic algorithm, for an automated voting system. The experimental results indicate that XGP is feasible for evolving emotion classifiers. In addition, the obtained results verify that collaborative filtering improves the generality of recognition. From a psychological viewpoint, the results prove that different people might express their emotions differently, as the emotion classifiers that are evolved for particular users might not be applied successfully to other user(s).

18.
In this paper, features are extracted from facial expression images with the Gabor transform, and dimensionality reduction is performed with the family of locally linear embedding (LLE) algorithms. LLE is a nonlinear dimensionality reduction algorithm that preserves the original topological structure of the data after reduction and is widely used in facial expression recognition. Because LLE does not use the class information of the samples, supervised locally linear embedding (SLLE) was introduced; however, SLLE considers only the class labels and ignores the relationships among the different expressions. This paper therefore proposes an improved SLLE algorithm that treats the neutral expression as the center of all other expressions. Facial expression recognition experiments on the JAFFE database show that, compared with LLE and SLLE, the proposed algorithm achieves a better recognition rate and is an effective algorithm.
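A small sketch of the LLE baseline with scikit-learn, plus a toy version of the supervised-distance idea behind SLLE; scikit-learn does not implement SLLE, so the class-dependent distance inflation below only illustrates the concept and is not the paper's improved algorithm:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.neighbors import KNeighborsClassifier

def lle_reduce(features, n_neighbors=12, n_components=8):
    """Unsupervised LLE baseline: embed high-dimensional Gabor feature vectors into a
    low-dimensional space while preserving local neighborhood structure."""
    lle = LocallyLinearEmbedding(n_neighbors=n_neighbors, n_components=n_components)
    return lle, lle.fit_transform(features)

def supervised_distances(features, labels, alpha=0.3):
    """Sketch of the SLLE idea: inflate pairwise distances between samples of different
    classes so that same-class neighbors are preferred when building the graph."""
    D = squareform(pdist(features))
    diff = labels[:, None] != labels[None, :]
    return D + alpha * D.max() * diff

# Toy usage: X stands in for Gabor feature vectors, y for expression labels
X = np.random.rand(60, 40); y = np.random.randint(0, 7, 60)
lle, Z = lle_reduce(X)
clf = KNeighborsClassifier(n_neighbors=5).fit(Z, y)
```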

19.
This paper explores the use of a multisensory information fusion technique with dynamic Bayesian networks (DBN) for modeling and understanding the temporal behaviors of facial expressions in image sequences. Our facial feature detection and tracking based on active IR illumination provides reliable visual information under variable lighting and head motion. Our approach to facial expression recognition lies in the proposed dynamic and probabilistic framework based on combining the DBN with Ekman's facial action coding system (FACS) for systematically modeling the dynamic and stochastic behaviors of spontaneous facial expressions. The framework not only provides a coherent and unified hierarchical probabilistic framework to represent spatial and temporal information related to facial expressions, but also allows us to actively select the most informative visual cues from the available information sources to minimize the ambiguity in recognition. The recognition of facial expressions is accomplished by fusing not only the current visual observations but also the previous visual evidence. Consequently, the recognition becomes more robust and accurate through explicitly modeling the temporal behavior of facial expressions. In this paper, we present the theoretical foundation underlying the proposed probabilistic and dynamic framework for facial expression modeling and understanding. Experimental results demonstrate that our approach can accurately and robustly recognize spontaneous facial expressions from an image sequence under different conditions.

20.
In this paper, we propose a recursive framework to recognize facial expressions from images in real scenes. Unlike traditional approaches that typically focus on developing and refining algorithms for improving recognition performance on an existing dataset, we integrate three important components in a recursive manner: facial dataset generation, facial expression recognition model building, and interactive interfaces for testing and new data collection. To start with, we first create the candid images for facial expression (CIFE) dataset. We then apply a convolutional neural network (CNN) to CIFE and build a CNN model for web image expression classification. In order to increase the expression recognition accuracy, we also fine-tune the CNN model and thus obtain a better CNN facial expression recognition model. Based on the fine-tuned CNN model, we design a facial expression game engine and collect a new and more balanced dataset, GaMo. The images of this dataset are collected from the different expressions our game users make when playing the game. Finally, we run yet another recursive step: a self-evaluation of the quality of the data labeling, and we propose a self-cleansing mechanism to improve the quality of the data. We evaluate the GaMo and CIFE datasets and show that our recursive framework can help build a better facial expression model for dealing with real scene facial expression tasks.
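A generic PyTorch fine-tuning sketch of the kind described (pretrained backbone, new expression head, smaller learning rate for the pretrained layers); the backbone choice, class count, and hyperparameters are assumptions, not the paper's CNN:

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained backbone and replace the classifier head
# with one for 7 expression classes (an illustrative choice).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 7)

optimizer = torch.optim.SGD([
    {"params": model.fc.parameters(), "lr": 1e-2},        # new head: larger learning rate
    {"params": [p for n, p in model.named_parameters()
                if not n.startswith("fc")], "lr": 1e-4},   # pretrained layers: small rate
], momentum=0.9)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a batch of face crops (images: N x 3 x 224 x 224)."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```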
