Similar Documents
20 similar documents found (search time: 31 ms)
1.
The modern trend toward diversification and personalization has encouraged people to boldly express their individuality in many ways, one noticeable sign of which is the wide variety of hairstyles we observe today. Given the need for hairstyle customization, approaches and systems ranging from 2D to 3D, and from automatic to manual, have been proposed to digitally facilitate the choice of a hairstyle. However, nearly all existing approaches struggle to produce realistic hairstyle synthesis results. With 2D photos as input, the vividness of a re-synthesized hairstyle relies heavily on removing the original hairstyle, because the co-existence of the original and the newly synthesized hairstyle produces serious perceptual artifacts. We resolve this issue by extending the active shape model to extract the entire facial contour more precisely, which is then used to trim the hair away from the input photo. After hair removal, the facial skin of the revealed forehead must be recovered. Since the skin texture is non-stationary and little information remains, traditional texture synthesis and image inpainting approaches are not well suited to this problem. Our method yields a more plausible facial skin patch by first interpolating a base skin patch and then applying non-stationary texture synthesis. We also seek to minimize user assistance during this process: we have devised a new, user-friendly facial contour and hairstyle adjusting mechanism that makes it easy to manipulate and fit a desired hairstyle onto a face. In addition, our system can extract the hairstyle from a given photo, which makes the work more complete, and by extracting the face from the input photo it also allows users to exchange faces.
At the end of the paper we present re-synthesized results, make comparisons, and report user studies that further demonstrate the usefulness of our system.

2.
《Advanced Robotics》2013,27(6):585-604
We are introducing a 3D, realistic, human-like animated face robot into human-robot communication. The face robot can recognize human facial expressions as well as produce realistic facial expressions in real time. For the animated face robot to communicate interactively, we propose a new concept of an "active human interface", and we investigate the performance of real-time facial expression recognition by neural networks (NN) and the expressiveness of facial messages on the face robot. We find that both the NN recognition of facial expressions and the face robot's generation of facial expressions are at nearly human level. We also construct an artificial emotion model able to generate six basic emotions in accordance with the recognition of a given facial expression and the situational context. This implies a high potential for the animated face robot to engage in interactive communication with humans once these three component technologies are integrated into the robot.

3.
CEOs of large companies may travel frequently to convey their philosophies and policies to employees working at branches worldwide. Video technology makes it possible to deliver their lectures anywhere and anytime very easily; however, 2-dimensional video systems lack a sense of presence. If natural, realistic lectures could be given through humanoid robots, CEOs would not need to meet employees in person, saving travel time and money. We propose a substitute robot for a remote person: a humanoid robot that reproduces the lecturer's facial expressions and body movements, effectively sending the lecturer anywhere in the world instantaneously with the feeling of being at a live performance. Development involves two major tasks: facial expression recognition/reproduction and body language reproduction. For the former, we proposed a facial expression recognition method based on a neural network model, recognizing five emotions (surprise, anger, sadness, happiness, and no emotion) in real time. We also developed a facial robot to reproduce the recognized emotion on the robot's face; experiments showed that the robot could reproduce the speaker's emotions with its face. For the latter, we proposed a degradation control method to reproduce the lecturer's natural movement even when a robot rotary joint fails. As the fundamental stage of our research on this subsystem, we proposed a control method for the front-view movement model, i.e., a 2-dimensional model.

4.
In fields such as human-computer interaction and digital entertainment, traditional expression synthesis techniques struggle to stably generate realistic, personalized facial expression animations. To address this, a 3D face modeling and expression animation system based on a single image is proposed. The system automatically detects the position of the face in the image, automatically locates facial landmarks, reconstructs a personalized 3D face model from these landmarks and a morphable model, extends the reconstructed model into a complete face mesh, and drives the reconstructed model with an animation-data mapping method controlled by sparse landmarks to generate dynamic expression animations. Experimental results show that the method is robust and highly automated, and that the resulting face models and expression animations are fairly realistic.

5.
Synthesizing expressive facial animation is a very challenging topic within the graphics community. In this paper, we present an expressive facial animation synthesis system enabled by automated learning from facial motion capture data. Accurate 3D motions of markers on the face of a human subject are captured while he or she recites a predesigned corpus with specific spoken and visual expressions. We present a novel motion capture mining technique that "learns" speech coarticulation models for diphones and triphones from the recorded data. A phoneme-independent expression eigenspace (PIEES) enclosing the dynamic expression signals is constructed by motion signal processing (phoneme-based time-warping and subtraction) and principal component analysis (PCA) reduction. New expressive facial animations are synthesized as follows: first, the learned coarticulation models are concatenated to synthesize neutral visual speech according to novel speech input; then a texture-synthesis-based approach generates a novel dynamic expression signal from the PIEES model; and finally the synthesized expression signal is blended with the synthesized neutral visual speech to create the final expressive facial animation. Our experiments demonstrate that the system can effectively synthesize realistic expressive facial animation.
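The PIEES construction above is, at its core, a PCA basis fitted to time-warped, neutral-subtracted expression signals. A minimal numpy sketch of such an eigenspace is shown below; all array shapes, variable names, and the random data are illustrative, not taken from the paper:

```python
import numpy as np

def build_expression_eigenspace(signals, k):
    """PCA over preprocessed expression signals.

    signals: (n_samples, n_features) array, each row one expression
    signal after time-warping and neutral subtraction.
    Returns (mean, basis) where basis holds the top-k principal directions.
    """
    mean = signals.mean(axis=0)
    centered = signals - mean
    # SVD of the centered data matrix yields the principal components.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def project(signal, mean, basis):
    # Coordinates of a signal in the expression eigenspace.
    return (signal - mean) @ basis.T

def reconstruct(coeffs, mean, basis):
    # Synthesize a new expression signal from eigenspace coordinates.
    return mean + coeffs @ basis

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 12))           # 50 toy expression signals
mean, basis = build_expression_eigenspace(data, k=4)
coeffs = project(data[0], mean, basis)
approx = reconstruct(coeffs, mean, basis)
```

Sampling or perturbing `coeffs` and reconstructing is one simple way to generate novel expression signals from such a learned space.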

6.
As a popular area of human-computer interaction, emotion recognition has been applied in medicine, education, safe driving, e-commerce, and other fields. Emotions are expressed mainly through facial expressions, voice, and speech, and the facial muscles, tone, and intonation involved differ across emotions, so emotion judgments based on a single-modality feature tend to be unreliable. Since emotion is perceived primarily through vision and hearing, this paper proposes a multimodal expression recognition algorithm based on the audio-visual perception system. Emotional features are extracted from the speech and image modalities separately, and multiple classifiers are designed to perform emotion classification on each single feature, yielding several single-feature expression recognition models. In the multimodal experiments on speech and images, a late-fusion strategy is proposed for feature fusion; considering the weak dependence among the models, weighted voting is used for model fusion, yielding a fused expression recognition model built on multiple single-feature models. Experiments on the AFEW dataset compare the fused model with the single-feature models and verify that multimodal emotion recognition based on the audio-visual perception system outperforms recognition based on a single modality.
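The weighted-voting fusion step described in this abstract can be sketched in a few lines; the model names, labels, and weights below are illustrative placeholders, not values from the paper:

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Late fusion by weighted voting.

    predictions: {model_name: predicted_label}
    weights:     {model_name: float weight, e.g. validation accuracy}
    Returns the label with the highest total weight.
    """
    scores = defaultdict(float)
    for model, label in predictions.items():
        scores[label] += weights.get(model, 1.0)
    return max(scores, key=scores.get)

# Two single-feature models agree on "happy"; one says "angry".
preds = {"audio_mfcc": "happy", "face_cnn": "happy", "face_lbp": "angry"}
wts = {"audio_mfcc": 0.55, "face_cnn": 0.70, "face_lbp": 0.60}
fused = weighted_vote(preds, wts)  # "happy": 0.55 + 0.70 > 0.60
```

Weighting by each model's held-out accuracy is a common choice when, as here, the component models are only weakly dependent.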

7.
With technology allowing for increased realism in video games, realistic, human-like characters risk falling into the Uncanny Valley. The Uncanny Valley phenomenon implies that virtual characters approaching full human-likeness evoke a negative reaction from the viewer, due to aspects of the character's appearance and behavior that differ from the human norm. This study investigates whether "uncanniness" increases for a character with a perceived lack of facial expression in the upper parts of the face. More importantly, our study also investigates whether the magnitude of this increase varies depending on which emotion is being communicated. Individual parameters for each facial muscle in a 3D model were controlled for six emotions (anger, disgust, fear, happiness, sadness, and surprise) in addition to a neutral expression. The results indicate that even fully and expertly animated characters are rated as more uncanny than humans, and that, in virtual characters, a lack of facial expression in the upper parts of the face during speech exaggerates the uncanny by inhibiting effective communication of the perceived emotion, significantly so for fear, sadness, disgust, and surprise but not for anger and happiness. Based on our results, we consider the implications for virtual character design.

8.
Extracting and understanding emotion is highly important for interaction between humans and machine communication systems, and the most expressive way humans display emotion is through facial expressions. This paper proposes a multiple-emotion recognition system that can recognize combinations of up to three different emotions using an active appearance model (AAM), the proposed classification standard, and a k-nearest neighbor (k-NN) classifier in mobile environments. The AAM captures expression variations, which are scored by the proposed classification standard as human expressions change in real time. The proposed k-NN can classify the basic emotions (neutral, happy, sad, angry, surprised) as well as more ambiguous emotions formed by combining the basic ones in real time, and each recognized emotion carries a strength. Whereas most previous emotion recognition methods recognize only a single emotion at a time, this work recognizes compound emotions as combinations of the five basic ones. For easy interpretation, the recognized result is presented in three ways on the mobile camera screen. Experiments showed an average recognition rate of 85%, with optimized emotions in 40% of cases. The implemented system can also serve as an example of augmented reality, displaying a combination of live face video and virtual animation with the user's avatar.
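A k-NN step of the kind described, returning a mixture of basic emotions each with a strength, might look like the following minimal sketch. The feature vectors and labels are made up for illustration; the paper's AAM features and classification standard are not reproduced here:

```python
import numpy as np
from collections import Counter

def knn_emotion_mixture(query, samples, labels, k=5, max_emotions=3):
    """Return up to max_emotions (label, strength) pairs, where strength
    is the fraction of the k nearest neighbors carrying that label."""
    dists = np.linalg.norm(samples - query, axis=1)
    nearest = np.argsort(dists)[:k]
    counts = Counter(labels[i] for i in nearest)
    return [(lab, n / k) for lab, n in counts.most_common(max_emotions)]

# Toy 2-D "expression features" with basic-emotion labels.
samples = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [1.0, 1.0], [1.1, 1.0]])
labels = ["happy", "happy", "surprise", "sad", "sad"]
mix = knn_emotion_mixture(np.array([0.05, 0.05]), samples, labels, k=3)
# mix is dominated by "happy" with a weaker "surprise" component.
```

Reporting the per-label neighbor fractions, rather than only the majority label, is what allows a single face to be read as a blend of up to three emotions.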

9.
Based on the MPEG-4 standard, a method and system are implemented that generate facial animation jointly driven by color-ringback-tone speech and the emotion it conveys. An HMM is chosen as the classifier and trained to recognize five emotion classes in the speech corpus (annoyance, delight, cuteness, helplessness, and excitement), and a corresponding set of expression facial animation parameters (FAPs) is built for each class. A composite expression function obtained by analyzing speech intensity is used to blend the expression FAPs with lip-movement FAPs, achieving multi-source fusion of facial expression information; the composite FAPs then drive the face mesh to generate animation. Experimental results show that the emotion recognition rate for color-ringback-tone speech reaches 94.44%, and the facial animation generated by the system is fairly realistic.

10.
奚琰 《计算机系统应用》2022,31(11):175-183
Unlike laboratory settings, facial expression images in real life involve complex scenes. The most common problem, partial occlusion, markedly changes facial appearance, so the global features extracted by a model contain emotion-irrelevant redundant information that reduces their discriminative power. To address this, this paper proposes a facial expression recognition method that combines contrastive learning with a channel-spatial attention mechanism, learning salient local emotional features while attending to the relationship between local and global features. First, contrastive learning is introduced: a new positive/negative sample selection strategy is designed through specific data augmentation, and a large amount of easily obtained unlabeled emotion data is used for pre-training to learn occlusion-aware representations, which are then transferred to the downstream facial expression recognition task to improve performance. In the downstream task, expression analysis of each face image is recast as emotion detection over multiple local regions; a channel-spatial attention mechanism learns fine-grained attention maps for different local facial regions, and the weighted features are fused to suppress noise introduced by occluded content. Finally, a constraint loss is proposed for joint training, optimizing the fused features used for classification. Experimental results show that the proposed method achieves results comparable to existing state-of-the-art methods on both public unoccluded facial expression datasets (RAFDB and FER2013) and synthetically occluded facial expression datasets.

11.
An emotion interaction system based on facial expression features
徐红  彭力 《计算机应用研究》2012,29(3):1111-1115
An emotion interaction system based on facial expression features (an affective virtual human) is designed, with three key components: emotion recognition, emotion computation, and emotion synthesis and output. In the emotion recognition part, static facial expression images are first preprocessed with a feature-block method, features are then extracted using two-dimensional principal component analysis (2DPCA), and a multi-level quantum neural network classifier performs seven-class expression recognition. In the emotion computation part, a hidden Markov emotion model (HMM) is built, with its parameters estimated by an improved genetic algorithm. In the emotion synthesis and output stage, a 3D face mesh model is first built with an algorithm combining NURBS surfaces and patches, and keyframe techniques are then used to produce continuous expression animation consistent with human behavior. Finally, the design of the emotion interaction system based on facial expression features is completed.
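Unlike standard PCA, 2DPCA builds its covariance matrix directly from image matrices without flattening them into vectors. A minimal numpy sketch of this feature-extraction step follows; image sizes and the random data are illustrative only:

```python
import numpy as np

def two_d_pca(images, k):
    """2DPCA projection matrix.

    images: (n, h, w) stack of grayscale image matrices.
    Returns a (w, k) matrix whose columns are the top-k eigenvectors of
    the image covariance G = mean((A - Abar)^T (A - Abar)).
    """
    mean_img = images.mean(axis=0)
    g = np.zeros((images.shape[2], images.shape[2]))
    for a in images:
        d = a - mean_img
        g += d.T @ d
    g /= len(images)
    eigvals, eigvecs = np.linalg.eigh(g)  # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]        # keep the top-k columns

rng = np.random.default_rng(1)
imgs = rng.normal(size=(20, 8, 6))  # 20 toy 8x6 "face" images
proj = two_d_pca(imgs, k=2)
features = imgs[0] @ proj           # (8, 2) feature matrix per image
```

The per-image feature is a small matrix rather than a vector, which is the main practical difference from vector PCA and one reason 2DPCA is cheap on face images.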

12.
Person-centric visual understanding for complex scenes can raise the efficiency of intelligent social collaboration, accelerate the intelligent transformation of social governance, and show great vitality in serving economic activity and building smart cities, carrying major social and economic value. Its core techniques include real-time person recognition, individual behavior analysis and group interaction understanding, human-machine collaborative learning, expression and speech emotion recognition, and knowledge-guided visual understanding. In complex scenes, especially when the holistic "person-behavior-scene" association must be represented and understood, these problems become more challenging. Large-scale real-time person recognition in complex scenes centers on face detection, person feature understanding, and scene analysis, and is an important research foundation for person-centric visual understanding in such scenes. Individual behavior analysis and group interaction understanding center on video person re-identification, video action recognition, video question answering, and video dialogue, forming the key behavioral components of visual understanding; within them, machine learning paradigms that jointly exploit knowledge and priors have formed, with visual question answering and dialogue and vision-language navigation as two key research directions. Emotion recognition and synthesis center on facial expression recognition, speech emotion recognition and synthesis, and knowledge-guided visual analysis, and constitute the core technology of affective interaction. Around these key technologies, this paper surveys the research hotspots and application scenarios of person-centric visual understanding in complex scenes, summarizes related results and progress in China and abroad, and looks ahead to the frontier techniques and development trends of the field.

13.
Emotive audio-visual avatars are virtual computer agents with the potential to significantly improve the quality of human-machine interaction and human-human communication. However, the understanding of human communication has not yet advanced to the point where it is possible to make realistic avatars that interact with natural-sounding emotive speech and realistic-looking emotional facial expressions. In this paper, we propose the technical approaches of a novel multimodal framework leading to a text-driven emotive audio-visual avatar. Our primary work focuses on emotive speech synthesis, realistic emotional facial expression animation, and the co-articulation between speech gestures (i.e., lip movements) and facial expressions. A general framework for emotive text-to-speech (TTS) synthesis using a diphone synthesizer is designed and integrated into a generic 3D avatar face model, and under its guidance we developed a realistic 3D avatar prototype. A rule-based emotive TTS synthesis module based on the Festival-MBROLA architecture demonstrates the effectiveness of the framework design, and subjective listening experiments were carried out to evaluate the expressiveness of the synthetic talking avatar.

14.
The use of expressive Virtual Characters is an effective complementary means of communication for social networks offering a multi-user 3D-chatting environment. In such contexts, the facial expression channel offers a rich medium for translating the ongoing emotions conveyed by text-based exchanges. Until recently, however, only purely symmetric facial expressions have been considered for that purpose. In this article we examine human sensitivity to facial asymmetry in the expression of both basic and complex emotions. The rationale for introducing asymmetry into displayed facial expressions stems from two well-established observations in cognitive neuroscience: first, that the expression of basic emotions generally displays a small asymmetry; second, that more complex emotions such as ambivalent feelings may manifest as the partial display of different, potentially opposite, emotions on each side of the face. A frequent instance of the second case results from the conflict between the truly felt emotion and the one that should be displayed due to social conventions. Our main hypothesis is that a much larger expressive and emotional space can be automatically synthesized only by means of facial asymmetry when modeling emotions with a general Valence-Arousal-Dominance dimensional approach. We also want to explore general human sensitivity to the introduction of a small degree of asymmetry into the expression of basic emotions. We conducted an experiment presenting 64 pairs of static facial expressions, one symmetric and one asymmetric, illustrating eight emotions (three basic and five complex), alternately for a male and a female character. Each emotion was presented four times by swapping the symmetric and asymmetric positions and by mirroring the asymmetrical expression. Participants were asked to grade, on a continuous scale, the correctness of each facial expression with respect to a short definition.
Results confirm the potential of introducing facial asymmetry for a subset of the complex emotions. Guidelines are proposed for designers of embodied conversational agents and emotionally reflective avatars.

15.
Emotion recognition is a crucial application in human–computer interaction. It is usually conducted using facial expressions as the main modality, which might not be reliable. In this study, we proposed a multimodal approach that uses 2-channel electroencephalography (EEG) signals and eye modality in addition to the face modality to enhance the recognition performance. We also studied the use of facial images versus facial depth as the face modality and adapted the common arousal–valence model of emotions and the convolutional neural network, which can model the spatiotemporal information from the modality data for emotion recognition. Extensive experiments were conducted on the modality and emotion data, the results of which showed that our system has high accuracies of 67.8% and 77.0% in valence recognition and arousal recognition, respectively. The proposed method outperformed most state-of-the-art systems that use similar but fewer modalities. Moreover, the use of facial depth has outperformed the use of facial images. The proposed method of emotion recognition has significant potential for integration into various educational applications.  相似文献   

16.
A facial expression emotion recognition based human-robot interaction (FEER-HRI) system is proposed, for which a four-layer system framework is designed. The FEER-HRI system enables robots not only to recognize human emotions but also to generate facial expressions that adapt to them. A facial emotion recognition method based on 2D Gabor features, the uniform local binary pattern (LBP) operator, and a multiclass extreme learning machine (ELM) classifier is presented and applied to real-time facial expression recognition on robots. The robots' facial expressions are represented by simple cartoon symbols and displayed on an LED screen mounted on the robots, which humans can easily understand. Four scenarios (guiding, entertainment, home service, and scene simulation) are performed in the human-robot interaction experiment, in which smooth communication is realized through facial expression recognition of humans and facial expression generation by robots within 2 seconds. Prospective applications of the FEER-HRI system include home service, smart homes, safe driving, and so on.
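The uniform LBP operator mentioned in this abstract labels each pixel by thresholding its 8 neighbors against the center and keeping only patterns with at most two circular 0/1 transitions. A minimal sketch without interpolation (border pixels skipped; the tiny test image is illustrative):

```python
import numpy as np

def lbp8(image):
    """Basic 8-neighbor LBP codes for the interior pixels of a 2D array."""
    c = image[1:-1, 1:-1]
    # Neighbors clockwise from top-left; each contributes one bit.
    neighbors = [image[:-2, :-2], image[:-2, 1:-1], image[:-2, 2:],
                 image[1:-1, 2:], image[2:, 2:], image[2:, 1:-1],
                 image[2:, :-2], image[1:-1, :-2]]
    codes = np.zeros_like(c, dtype=np.uint8)
    for bit, nb in enumerate(neighbors):
        codes |= (nb >= c).astype(np.uint8) << bit
    return codes

def is_uniform(code):
    """True if the 8-bit circular pattern has at most two 0/1 transitions."""
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2

img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]], dtype=np.int32)
code = int(lbp8(img)[0, 0])  # single interior pixel of a 3x3 image
```

In practice, histograms of the uniform codes over face regions form the feature vector fed to a classifier such as the ELM used here.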

17.
《Graphical Models》2001,63(2):67-85
In this paper, we develop a hairstyle modeling and animation technique specifically designed for human hair, and we report several experimental results. Using a simplified cantilever beam model and one-dimensional projective differential equations of angular momenta, we give a practical solution to a problem of enormous complexity. Even though our hair animation algorithm is an approximate solution, it includes all the relevant dynamic elements: gravity, wind, inertia, air resistance, and hair-to-head and hair-to-hair friction forces. Collision is an important element in making a collection of hair strands look like hair, so we develop an accurate yet efficient hair-to-head and hair-to-hair collision detection and treatment algorithm. The algorithm produces quite realistic results, yet runs at interactive speed. An interesting contribution of our algorithm is that it unifies hairstyle modeling and animation into a single equation, so that (1) hairstyling can be done under the effects of gravity and other internal or external forces, and (2) the original hairstyle is more or less restored even after the hair is tangled by external forces or head movements.

18.
Physical simulation has long been the approach of choice for generating realistic hair animations in CG. A constant drawback of simulation, however, is the necessity to manually set the physical parameters of the simulation model in order to get the desired dynamic behavior. To alleviate this, researchers have begun to explore methods for reconstructing hair from the real world and even to estimate the corresponding simulation parameters through the process of inversion. So far, however, these methods have had limited applicability, because dynamic hair capture can only be played back without the ability to edit, and solving for simulation parameters can only be accomplished for static hairstyles, ignoring the dynamic behavior. We present the first method for capturing dynamic hair and automatically determining the physical properties for simulating the observed hairstyle in motion. Since our dynamic inversion is agnostic to the simulation model, the proposed method applies to virtually any hair simulation technique, which we demonstrate using two state-of-the-art hair simulation models. The output of our method is a fully simulation-ready hairstyle, consisting of both the static hair geometry and its physical properties. The hairstyle can be easily edited by adding external forces, changing the head motion, or re-simulating in completely different environments, all while remaining faithful to the captured hairstyle.

19.
Facial expression image warping based on MPEG-4
To generate natural and realistic facial expressions in real time, a facial expression image warping method based on the MPEG-4 facial animation framework is proposed. The method first uses a face alignment tool to extract 88 feature points from a face photograph; on this basis, the standard face mesh is aligned and deformed to generate a triangular mesh for the specific face; facial animation parameters (FAPs) then move the corresponding key facial feature points and their neighboring associated points, keeping the topology of the triangular face mesh unchanged under the action of multiple FAPs; finally, all deformed triangular regions are texture-filled via affine transformations, producing the facial expression image defined by the FAPs. The input is a neutral face photograph and a set of facial animation parameters; the output is the corresponding facial expression image. To synthesize subtle expression movements and a virtual talking head, generation algorithms for eye-gaze movements and inner-mouth texture details are also designed. A subjective evaluation based on a 5-point mean opinion score (MOS) shows that expression images generated by this warping method score 3.67 for naturalness. Virtual-talking-head experiments show that the method runs in real time, averaging 66.67 fps on an ordinary PC, making it suitable for real-time video processing and facial animation generation.
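The per-triangle affine texture filling used in this kind of mesh-based face warping reduces to solving, for each triangle, the 2x3 affine matrix that maps its source vertices to its deformed vertices. A minimal numpy sketch under illustrative coordinates (not the paper's mesh or data):

```python
import numpy as np

def triangle_affine(src, dst):
    """Solve for the 2x3 affine matrix M such that M @ [x, y, 1]^T maps
    each source triangle vertex onto the matching destination vertex."""
    src_h = np.hstack([np.asarray(src, float), np.ones((3, 1))])  # (3, 3)
    m = np.linalg.solve(src_h, np.asarray(dst, float))            # (3, 2)
    return m.T                                                    # (2, 3)

# One mesh triangle before and after FAP-driven deformation.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(2.0, 1.0), (3.0, 1.0), (2.0, 3.0)]
m = triangle_affine(src, dst)
midpoint = m @ np.array([0.5, 0.5, 1.0])  # interior point follows along
```

Applying the solved matrix to every pixel (or texture coordinate) inside the source triangle fills the deformed triangle with the original texture.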

20.
Psychological research suggests that humans rely on the combined visual channels of face and body more than any other channel when judging human communicative behavior. However, most existing systems for analyzing human nonverbal behavior are mono-modal and focus only on the face; research aiming to integrate gestures as an expressive means has emerged only recently. Accordingly, this paper presents an approach to automatic visual recognition of expressive face and upper-body gestures from video sequences, suitable for use in a vision-based affective multimodal framework. Face and body movements are captured simultaneously using two separate cameras. For each video sequence, single expressive frames of both face and body are selected manually for analysis and emotion recognition. First, individual classifiers are trained on the individual modalities; second, facial expression and affective body gesture information is fused at the feature level and at the decision level. In the experiments performed, emotion classification using the two modalities achieved better recognition accuracy than classification using the facial or bodily modality alone.
