期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

IEMOCAP: interactive emotional dyadic motion capture database

Carlos Busso Murtaza Bulut Chi-Chun Lee Abe Kazemzadeh Emily Mower Samuel Kim Jeannette N. Chang Sungbok Lee Shrikanth S. Narayanan 《Language Resources and Evaluation》2008,42(4):335-359

相似文献

2.

Creation of a new set of dynamic virtual reality faces for the assessment and training of facial emotion recognition ability 总被引：1，自引：0，他引：1

José Gutiérrez-Maldonado Mar Rus-Calafell Joan González-Conde 《Virtual Reality》2014,18(1):61-71

The ability to recognize facial emotions is target behaviour when treating people with social impairment. When assessing this ability, the most widely used facial stimuli are photographs. Although their use has been shown to be valid, photographs are unable to capture the dynamic aspects of human expressions. This limitation can be overcome by creating virtual agents with feasible expressed emotions. The main objective of the present study was to create a new set of dynamic virtual faces with high realism that could be integrated into a virtual reality (VR) cyberintervention to train people with schizophrenia in the full repertoire of social skills. A set of highly realistic virtual faces was created based on the Facial Action Coding System. Facial movement animation was also included so as to mimic the dynamism of human facial expressions. Consecutive healthy participants (n = 98) completed a facial emotion recognition task using both natural faces (photographs) and virtual agents expressing five basic emotions plus a neutral one. Repeated-measures ANOVA revealed no significant difference in participants’ accuracy of recognition between the two presentation conditions. However, anger was better recognized in the VR images, and disgust was better recognized in photographs. Age, the participant’s gender and reaction times were also explored. Implications of the use of virtual agents with realistic human expressions in cyberinterventions are discussed. 相似文献

3.

Evaluating the Sensitivity to Virtual Characters Facial Asymmetry in Emotion Synthesis

Nan Wang Junghyun Ahn Ronan Boulic 《Applied Artificial Intelligence》2017,31(2):103-118

The use of expressive Virtual Characters is an effective complementary means of communication for social networks offering multi-user 3D-chatting environment. In such contexts, the facial expression channel offers a rich medium to translate the ongoing emotions conveyed by the text-based exchanges. However, until recently, only purely symmetric facial expressions have been considered for that purpose. In this article we examine human sensitivity to facial asymmetry in the expression of both basic and complex emotions. The rationale for introducing asymmetry in the display of facial expressions stems from two well-established observations in cognitive neuroscience: first that the expression of basic emotions generally displays a small asymmetry, second that more complex emotions such as ambivalent feeling may reflect in the partial display of different, potentially opposite, emotions on each side of the face. A frequent occurrence of this second case results from the conflict between the truly felt emotion and the one that should be displayed due to social conventions. Our main hypothesis is that a much larger expressive and emotional space can only be automatically synthesized by means of facial asymmetry when modeling emotions with a general Valence-Arousal-Dominance dimensional approach. Besides, we want also to explore the general human sensitivity to the introduction of a small degree of asymmetry into the expression of basic emotions. We conducted an experiment by presenting 64 pairs of static facial expressions, one symmetric and one asymmetric, illustrating eight emotions (three basic and five complex ones) alternatively for a male and a female character. Each emotion was presented four times by swapping the symmetric and asymmetric positions and by mirroring the asymmetrical expression. Participants were asked to grade, on a continuous scale, the correctness of each facial expression with respect to a short definition. Results confirm the potential of introducing facial asymmetry for a subset of the complex emotions. Guidelines are proposed for designers of embodied conversational agent and emotionally reflective avatars. 相似文献

4.

Recognizing multiple emotion from ambiguous facial expressions on mobile platforms

Yong-Hwan Lee Wuri Han Youngseop Kim 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2016,20(5):1811-1819

Extracting and understanding of emotion is of high importance for the interaction between human and machine communication systems. The most expressive way to display the human’s emotion is through facial expression analysis. This paper proposes a multiple emotion recognition system that can recognize combinations of up to a maximum of three different emotions using an active appearance model (AAM), the proposed classification standard, and a k-nearest neighbor (k-NN) classifier in mobile environments. AAM can take the expression of variations that are calculated by the proposed classification standard according to changes in human expressions in real time. The proposed k-NN can classify basic emotions (normal, happy, sad, angry, surprise) as well as more ambiguous emotions by combining the basic emotions in real time, and each recognized emotion that can be subdivided has strength. Whereas most previous methods of emotion recognition recognize various kind of a single emotion, this paper recognizes various emotions with a combination of the five basic emotions. To be easily understood, the recognized result is presented in three ways on a mobile camera screen. The result of the experiment was an average 85 % recognition rate and a 40 % performance showed optimized emotions. The implemented system can be represented by one of the example for augmented reality on displaying combination of real face video and virtual animation with user’s avatar. 相似文献

5.

Facial expression recognition system based on rigid and non-rigid motion separation and 3D pose estimation

Te-Hsun Wang Author Vitae Author Vitae 《Pattern recognition》2009,42(5):962-977

This study presents a facial expression recognition system which separates the non-rigid facial expression from the rigid head rotation and estimates the 3D rigid head rotation angle in real time. The extracted trajectories of the feature points contain both rigid head motion components and non-rigid facial expression motion components. A 3D virtual face model is used to obtain accurate estimation of the head rotation angle such that the non-rigid motion components can be precisely separated to enhance the facial expression recognition performance. The separation performance of the proposed system is further improved through the use of a restoration mechanism designed to recover feature points lost during large pan rotations. Having separated the rigid and non-rigid motions, hidden Markov models (HMMs) are employed to recognize a prescribed set of facial expressions defined in terms of facial action coding system (FACS) action units (AUs). 相似文献

6.

Younger and older users׳ recognition of virtual agent facial expressions

《International journal of human-computer studies》2015

As technology advances, robots and virtual agents will be introduced into the home and healthcare settings to assist individuals, both young and old, with everyday living tasks. Understanding how users recognize an agent׳s social cues is therefore imperative, especially in social interactions. Facial expression, in particular, is one of the most common non-verbal cues used to display and communicate emotion in on-screen agents (Cassell et al., 2000). Age is important to consider because age-related differences in emotion recognition of human facial expression have been supported (Ruffman et al., 2008), with older adults showing a deficit for recognition of negative facial expressions. Previous work has shown that younger adults can effectively recognize facial emotions displayed by agents (Bartneck and Reichenbach, 2005, Courgeon et al., 2009, Courgeon et al., 2011, Breazeal, 2003); however, little research has compared in-depth younger and older adults’ ability to label a virtual agent׳s facial emotions, an import consideration because social agents will be required to interact with users of varying ages. If such age-related differences exist for recognition of virtual agent facial expressions, we aim to understand if those age-related differences are influenced by the intensity of the emotion, dynamic formation of emotion (i.e., a neutral expression developing into an expression of emotion through motion), or the type of virtual character differing by human-likeness. Study 1 investigated the relationship between age-related differences, the implication of dynamic formation of emotion, and the role of emotion intensity in emotion recognition of the facial expressions of a virtual agent (iCat). Study 2 examined age-related differences in recognition expressed by three types of virtual characters differing by human-likeness (non-humanoid iCat, synthetic human, and human). Study 2 also investigated the role of configural and featural processing as a possible explanation for age-related differences in emotion recognition. First, our findings show age-related differences in the recognition of emotions expressed by a virtual agent, with older adults showing lower recognition for the emotions of anger, disgust, fear, happiness, sadness, and neutral. These age-related difference might be explained by older adults having difficulty discriminating similarity in configural arrangement of facial features for certain emotions; for example, older adults often mislabeled the similar emotions of fear as surprise. Second, our results did not provide evidence for the dynamic formation improving emotion recognition; but, in general, the intensity of the emotion improved recognition. Lastly, we learned that emotion recognition, for older and younger adults, differed by character type, from best to worst: human, synthetic human, and then iCat. Our findings provide guidance for design, as well as the development of a framework of age-related differences in emotion recognition. 相似文献

7.

Mixed feelings: expression of non-basic emotions in a muscle-based talking head

Irene Albrecht Marc Schröder Jörg Haber Hans-Peter Seidel 《Virtual Reality》2005,8(4):201-212

We present an algorithm for generating facial expressions for a continuum of pure and mixed emotions of varying intensity. Based on the observation that in natural interaction among humans, shades of emotion are much more frequently encountered than expressions of basic emotions, a method to generate more than Ekmans six basic emotions (joy, anger, fear, sadness, disgust and surprise) is required. To this end, we have adapted the algorithm proposed by Tsapatsoulis et al. [1] to be applicable to a physics-based facial animation system and a single, integrated emotion model. A physics-based facial animation system was combined with an equally flexible and expressive text-to-speech synthesis system, based upon the same emotion model, to form a talking head capable of expressing non-basic emotions of varying intensities. With a variety of life-like intermediate facial expressions captured as snapshots from the system we demonstrate the appropriateness of our approach.

Hans-Peter SeidelEmail:

相似文献

8.

Humanoid Audio–Visual Avatar With Emotive Text-to-Speech Synthesis

《Multimedia, IEEE Transactions on》2008,10(6):969-981

Emotive audio–visual avatars are virtual computer agents which have the potential of improving the quality of human-machine interaction and human-human communication significantly. However, the understanding of human communication has not yet advanced to the point where it is possible to make realistic avatars that demonstrate interactions with natural- sounding emotive speech and realistic-looking emotional facial expressions. In this paper, We propose the various technical approaches of a novel multimodal framework leading to a text-driven emotive audio–visual avatar. Our primary work is focused on emotive speech synthesis, realistic emotional facial expression animation, and the co-articulation between speech gestures (i.e., lip movements) and facial expressions. A general framework of emotive text-to-speech (TTS) synthesis using a diphone synthesizer is designed and integrated into a generic 3-D avatar face model. Under the guidance of this framework, we therefore developed a realistic 3-D avatar prototype. A rule-based emotive TTS synthesis system module based on the Festival-MBROLA architecture has been designed to demonstrate the effectiveness of the framework design. Subjective listening experiments were carried out to evaluate the expressiveness of the synthetic talking avatar. 相似文献

9.

Affective computing with primary and secondary emotions in a virtual human

Christian Becker-Asano Ipke Wachsmuth 《Autonomous Agents and Multi-Agent Systems》2010,20(1):32-49

We introduce the WASABI ([W]ASABI [A]ffect [S]imulation for [A]gents with [B]elievable [I]nteractivity) Affect Simulation Architecture, in which a virtual human’s cognitive reasoning capabilities are combined with simulated embodiment to achieve the simulation of primary and secondary emotions. In modeling primary emotions we follow the idea of “Core Affect” in combination with a continuous progression of bodily feeling in three-dimensional emotion space (PAD space), that is subsequently categorized into discrete emotions. In humans, primary emotions are understood as onto-genetically earlier emotions, which directly influence facial expressions. Secondary emotions, in contrast, afford the ability to reason about current events in the light of experiences and expectations. By technically representing aspects of each secondary emotion’s connotative meaning in PAD space, we not only assure their mood-congruent elicitation, but also combine them with facial expressions, that are concurrently driven by primary emotions. Results of an empirical study suggest that human players in a card game scenario judge our virtual human MAX significantly older when secondary emotions are simulated in addition to primary ones. 相似文献

10.

State-of-the art in component technology for an animated face robot-its component technology development for interactive communication with humans

《Advanced Robotics》2013,27(6):585-604

We are attempting to introduce a 3D, realistic human-like animated face robot to human-robot communication. The face robot can recognize human facial expressions as well as produce realistic facial expressions in real time. For the animated face robot to communicate interactively, we propose a new concept of 'active human interface', and we investigate the performance of real time recognition of facial expressions by neural networks (NN) and the expressionability of facial messages on the face robot. We find that the NN recognition of facial expressions and the face robot's performance in generating facial expressions are of almost same level as that in humans. We also construct an artificial emotion model able to generate six basic emotions in accordance with the recognition of a given facial expression and the situational context. This implies a high potential for the animated face robot to undertake interactive communication with humans, when integrating these three component technologies into the face robot. 相似文献

11.

Modeling and animation of facial expressions based on B-Splines

Michael Hoch Georg Fleischmann Bernd Girod 《The Visual computer》1994,11(2):87-95

This paper describes a technique for the automatic adaptation of a canonical facial model to data obtained by a 3D laser scanner. The facial model is a B-spline surface with 13×16 control points. We introduce a technique by which this canonical model is fit to the scanned data and that takes into consideration the requirements for the animation of facial expressions. The animation of facial expressions is based on the facial action coding system (FACS). Using B-splines in combination with FACS, we automatically create the impression of a moving skin. To increase the realism of the animation we map textural information onto the B-spline surface. 相似文献

12.

A nonparametric regression model for virtual humans generation

Yun-Feng Chou Zen-Chung Shih 《Multimedia Tools and Applications》2010,47(1):163-187

In this paper, we propose a novel nonparametric regression model to generate virtual humans from still images for the applications of next generation environments (NG). This model automatically synthesizes deformed shapes of characters by using kernel regression with elliptic radial basis functions (ERBFs) and locally weighted regression (LOESS). Kernel regression with ERBFs is used for representing the deformed character shapes and creating lively animated talking faces. For preserving patterns within the shapes, LOESS is applied to fit the details with local control. The results show that our method effectively simulates plausible movements for character animation, including body movement simulation, novel views synthesis, and expressive facial animation synchronized with input speech. Therefore, the proposed model is especially suitable for intelligent multimedia applications in virtual humans generation. 相似文献

13.

Expressive Speech Animation Synthesis with Phoneme‐Level Controls

Z. Deng U. Neumann 《Computer Graphics Forum》2008,27(8):2096-2113

This paper presents a novel data‐driven expressive speech animation synthesis system with phoneme‐level controls. This system is based on a pre‐recorded facial motion capture database, where an actress was directed to recite a pre‐designed corpus with four facial expressions (neutral, happiness, anger and sadness). Given new phoneme‐aligned expressive speech and its emotion modifiers as inputs, a constrained dynamic programming algorithm is used to search for best‐matched captured motion clips from the processed facial motion database by minimizing a cost function. Users optionally specify ‘hard constraints’ (motion‐node constraints for expressing phoneme utterances) and ‘soft constraints’ (emotion modifiers) to guide this search process. We also introduce a phoneme–Isomap interface for visualizing and interacting phoneme clusters that are typically composed of thousands of facial motion capture frames. On top of this novel visualization interface, users can conveniently remove contaminated motion subsequences from a large facial motion dataset. Facial animation synthesis experiments and objective comparisons between synthesized facial motion and captured motion showed that this system is effective for producing realistic expressive speech animations. 相似文献

14.

Multimodal behavior realization for embodied conversational agents

Aleksandra Čereković Igor S. Pandžić 《Multimedia Tools and Applications》2011,54(1):143-164

Applications with intelligent conversational virtual humans, called Embodied Conversational Agents (ECAs), seek to bring human-like abilities into machines and establish natural human-computer interaction. In this paper we discuss realization of ECA multimodal behaviors which include speech and nonverbal behaviors. We devise RealActor, an open-source, multi-platform animation system for real-time multimodal behavior realization for ECAs. The system employs a novel solution for synchronizing gestures and speech using neural networks. It also employs an adaptive face animation model based on Facial Action Coding System (FACS) to synthesize face expressions. Our aim is to provide a generic animation system which can help researchers create believable and expressive ECAs. 相似文献

15.

The Role of Body Postures in the Recognition of Emotions in Contextually Rich Scenarios

Stéphanie Buisine Matthieu Courgeon Aurélien Charles Céline Clavel Jean-Claude Martin Ning Tan 《International journal of human-computer interaction》2014,30(1):52-62

In this article the role of different categories of postures in the detection, recognition, and interpretation of emotion in contextually rich scenarios, including ironic items, is investigated. Animated scenarios are designed with 3D virtual agents in order to test 3 conditions: In the “still” condition, the narrative content was accompanied by emotional facial expressions without any body movements; in the “idle” condition, emotionally neutral body movements were introduced; and in the “congruent” condition, emotional body postures congruent with the character's facial expressions were displayed. Those conditions were examined by 27 subjects, and their impact on the viewers’ attentional and emotional processes was assessed. The results highlight the importance of the contextual information to emotion recognition and irony interpretation. It is also shown that both idle and emotional postures improve the detection of emotional expressions. Moreover, emotional postures increase the perceived intensity of emotions and the realism of the animations. 相似文献

16.

Expressive facial animation synthesis by learning speech coarticulation and expression spaces 总被引：2，自引：0，他引：2

Deng Z Neumann U Lewis JP Kim TY Bulut M Narayanan S 《IEEE transactions on visualization and computer graphics》2006,12(6):1523-1534

Synthesizing expressive facial animation is a very challenging topic within the graphics community. In this paper, we present an expressive facial animation synthesis system enabled by automated learning from facial motion capture data. Accurate 3D motions of the markers on the face of a human subject are captured while he/she recites a predesigned corpus, with specific spoken and visual expressions. We present a novel motion capture mining technique that "learns" speech coarticulation models for diphones and triphones from the recorded data. A phoneme-independent expression eigenspace (PIEES) that encloses the dynamic expression signals is constructed by motion signal processing (phoneme-based time-warping and subtraction) and principal component analysis (PCA) reduction. New expressive facial animations are synthesized as follows: First, the learned coarticulation models are concatenated to synthesize neutral visual speech according to novel speech input, then a texture-synthesis-based approach is used to generate a novel dynamic expression signal from the PIEES model, and finally the synthesized expression signal is blended with the synthesized neutral visual speech to create the final expressive facial animation. Our experiments demonstrate that the system can effectively synthesize realistic expressive facial animation 相似文献

17.

A Linear Affect–Expression Space Model and Control Points for Mascot-Type Facial Robots

Hui Sung Lee Jeong Woo Park Myung Jin Chung 《Robotics, IEEE Transactions on》2007,23(5):863-873

A robot's face is its symbolic feature, and its facial expressions are the best method for interacting with people with emotional information. Moreover, a robot's facial expressions play an important role in human-robot emotional interactions. This paper proposes a general rule for the design and realization of expressions when some mascot-type facial robots are developed. Mascot-type facial robots are developed to enable friendly human feelings. The number and type of control points for six basic expressions or emotions were determined through a questionnaire. A linear affect-expression space model is provided to realize continuous and various expressions effectively, and the effects of the proposed method are shown through experiments using a simulator and an actual robot system. 相似文献

18.

Virtual agent multimodal mimicry of humans

George Caridakis Amaryllis Raouzaiou Elisabetta Bevacqua Maurizio Mancini Kostas Karpouzis Lori Malatesta Catherine Pelachaud 《Language Resources and Evaluation》2007,41(3-4):367-388

This work is about multimodal and expressive synthesis on virtual agents, based on the analysis of actions performed by human users. As input we consider the image sequence of the recorded human behavior. Computer vision and image processing techniques are incorporated in order to detect cues needed for expressivity features extraction. The multimodality of the approach lies in the fact that both facial and gestural aspects of the user’s behavior are analyzed and processed. The mimicry consists of perception, interpretation, planning and animation of the expressions shown by the human, resulting not in an exact duplicate rather than an expressive model of the user’s original behavior. 相似文献

19.

基于双相机捕获面部表情及人体姿态生成三维虚拟人动画

刘洁李毅朱江平《计算机应用》2021,41(3):839-844

为了生成表情丰富、动作流畅的三维虚拟人动画,提出了一种基于双相机同步捕获面部表情及人体姿态生成三维虚拟人动画的方法。首先,采用传输控制协议（TCP）网络时间戳方法实现双相机时间同步,采用张正友标定法实现双相机空间同步。然后,利用双相机分别采集面部表情和人体姿态。采集面部表情时,提取图像的2D特征点,利用这些2D特征点回归计算得到面部行为编码系统（FACS）面部行为单元,为实现表情动画做准备;以标准头部3D坐标值为基准,根据相机内参,采用高效n点投影（EPnP）算法实现头部姿态估计;之后将面部表情信息和头部姿态估计信息进行匹配。采集人体姿态时,利用遮挡鲁棒姿势图（ORPM）方法计算人体姿态,输出每个骨骼点位置、旋转角度等数据。最后,在虚幻引擎4（UE4）中使用建立的虚拟人体三维模型来展示数据驱动动画的效果。实验结果表明,该方法能够同步捕获面部表情及人体姿态,而且在实验测试中的帧率达到20 fps,能实时生成自然真实的三维动画。相似文献

20.

Avatar culture: cross-cultural evaluations of avatar facial expressions

Tomoko Koda Toru Ishida Matthias Rehm Elisabeth André 《AI & Society》2009,24(3):237-250

Avatars are increasingly used to express our emotions in our online communications. Such avatars are used based on the assumption that avatar expressions are interpreted universally among all cultures. This paper investigated cross-cultural evaluations of avatar expressions designed by Japanese and Western designers. The goals of the study were: (1) to investigate cultural differences in avatar expression evaluation and apply findings from psychological studies of human facial expression recognition, (2) to identify expressions and design features that cause cultural differences in avatar facial expression interpretation. The results of our study confirmed that (1) there are cultural differences in interpreting avatars’ facial expressions, and the psychological theory that suggests physical proximity affects facial expression recognition accuracy is also applicable to avatar facial expressions, (2) positive expressions have wider cultural variance in interpretation than negative ones, (3) use of gestures and gesture marks may sometimes cause counter-effects in recognizing avatar facial expressions. 相似文献