Similar Literature
20 similar documents found.
1.
Avatars are increasingly used to express our emotions in our online communications. Such avatars are used based on the assumption that avatar expressions are interpreted universally among all cultures. This paper investigated cross-cultural evaluations of avatar expressions designed by Japanese and Western designers. The goals of the study were: (1) to investigate cultural differences in avatar expression evaluation and apply findings from psychological studies of human facial expression recognition, (2) to identify expressions and design features that cause cultural differences in avatar facial expression interpretation. The results of our study confirmed that (1) there are cultural differences in interpreting avatars’ facial expressions, and the psychological theory that suggests physical proximity affects facial expression recognition accuracy is also applicable to avatar facial expressions, (2) positive expressions have wider cultural variance in interpretation than negative ones, (3) use of gestures and gesture marks may sometimes cause counter-effects in recognizing avatar facial expressions.

2.
This research explores and evaluates the contribution that facial expressions might make to improved comprehension and acceptability of sign language avatars. Focusing specifically on Irish sign language (ISL), the responsiveness of the Deaf community to sign language avatars is examined (the uppercase “D” in “Deaf” indicates Deaf as a culture, as opposed to “deaf” as a medical condition). The hypothesis of this research is as follows: augmenting an existing avatar with the seven widely accepted universal emotions identified by Ekman (Basic emotions: handbook of cognition and emotion. Wiley, London, 2005) to achieve underlying facial expressions will make that avatar more human-like and improve usability and understandability for the ISL user. Using human evaluation methods (Huenerfauth et al. in Trans Access Comput (ACM) 1:1, 2008), an augmented set of avatar utterances is compared against a baseline set, focusing on two key areas: comprehension and naturalness of facial configuration. The approach to the evaluation, including the choice of ISL participants, interview environment and evaluation methodology, is then outlined. The evaluation results reveal that in a comprehension test there was little difference between the baseline avatars and those augmented with emotional facial expression. It was also found that the avatars lack various linguistic attributes.

3.
The use of avatars with emotionally expressive faces is potentially highly beneficial to communication in collaborative virtual environments (CVEs), especially when used in a distance learning context. However, little is known about how, or indeed whether, emotions can effectively be transmitted through the medium of a CVE. Given this, an avatar head model with limited but human-like expressive abilities was built, designed to enrich CVE communication. Based on the facial action coding system (FACS), the head was designed to express, in a readily recognisable manner, the six universal emotions. An experiment was conducted to investigate the efficacy of the model. Results indicate that the approach of applying the FACS model to virtual face representations is not guaranteed to work for all expressions of a particular emotion category. However, given appropriate use of the model, emotions can effectively be visualised with a limited number of facial features. A set of exemplar facial expressions is presented.
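For illustration, the sketch below shows one way a FACS-driven avatar head might map the six universal emotions to action-unit activations. The AU combinations listed are commonly cited EMFACS-style groupings, not necessarily those used in this paper, and the helper function is hypothetical.

```python
# Minimal sketch (not the authors' implementation): map Ekman's six basic
# emotions to commonly cited FACS action-unit (AU) combinations and turn
# them into activation weights that a face rig could consume.
EMOTION_AUS = {
    "happiness": [6, 12],               # cheek raiser, lip corner puller
    "sadness":   [1, 4, 15],            # inner brow raiser, brow lowerer, lip corner depressor
    "surprise":  [1, 2, 5, 26],         # brow raisers, upper lid raiser, jaw drop
    "fear":      [1, 2, 4, 5, 20, 26],
    "anger":     [4, 5, 7, 23],
    "disgust":   [9, 15, 17],           # nose wrinkler, lip corner depressor, chin raiser
}

def au_weights(emotion: str, intensity: float = 1.0) -> dict[int, float]:
    """Return per-AU activation weights in [0, 1] for a given emotion."""
    if emotion not in EMOTION_AUS:
        raise ValueError(f"unknown emotion: {emotion}")
    intensity = max(0.0, min(1.0, intensity))
    return {au: intensity for au in EMOTION_AUS[emotion]}

if __name__ == "__main__":
    print(au_weights("happiness", 0.7))   # {6: 0.7, 12: 0.7}
```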

4.
Facial expressions are one of the most intuitive ways of expressing emotions and can facilitate human-computer interaction by letting users communicate with computers in more natural ways. In addition, hairstyle can be designed specifically to reinforce the expression of emotion. To visualize emotions from multiple aspects, we propose a realistic visual emotion synthesis system based on the combination of facial expression and hairstyle. First, facial expressions are synthesized with an anatomical model and a parameterized model. Second, a cartoon-style hairstyle that conveys emotion implicitly is synthesized with a mass-spring model and a cantilever beam model. Finally, the facial expression and hairstyle results are combined to produce a complete visual emotion synthesis. Experimental results demonstrate that the proposed system can synthesize realistic animations and that combining face and hair expresses emotion more effectively than face or hair alone.
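As a rough illustration of the hair side of such a system, the following sketch simulates a single strand as a mass-spring chain with a pinned root. The 2-D setting, integrator and parameter values are simplifications rather than the authors' implementation, and the cantilever beam component is omitted.

```python
# Illustrative sketch only: one hair strand as a chain of point masses
# connected by springs, settling under gravity (placeholder parameters).
import numpy as np

def simulate_strand(n=10, rest_len=0.05, k=200.0, mass=0.01, damping=0.98,
                    gravity=np.array([0.0, -9.81]), dt=1e-3, steps=2000):
    pos = np.stack([np.zeros(n), -rest_len * np.arange(n)], axis=1)  # hang downward
    vel = np.zeros_like(pos)
    root = pos[0].copy()                      # root is pinned to the scalp
    for _ in range(steps):
        force = np.tile(mass * gravity, (n, 1))
        for i in range(n - 1):                # spring force between neighbours
            d = pos[i + 1] - pos[i]
            length = np.linalg.norm(d) + 1e-9
            f = k * (length - rest_len) * (d / length)
            force[i] += f
            force[i + 1] -= f
        vel = damping * (vel + dt * force / mass)
        pos = pos + dt * vel
        pos[0], vel[0] = root, 0.0            # re-pin the root each step
    return pos

print(simulate_strand()[-1])                  # tip position after settling
```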

5.
This research investigates the impact of using anonymous avatars on social communication quality during small-screen mobile audio/visual communication. Elements of avatar behavioral and visual realism are defined, as is an elaborated three-component measure of communication quality called Social Copresence. In an experiment with 196 participants engaged in a social interaction using a simulated mobile device with varied levels of avatar visual and behavioral realism, higher levels of avatar Kinetic Conformity and Fidelity produced increased perceived Social Richness of Medium, while higher avatar Anthropomorphism produced higher levels of Psychological Copresence and Interactant Satisfaction with Communication. Increased levels of avatar Anonymity produced decreases in Social Copresence, but these were smaller when avatars possessed higher levels of visual and behavioral realism.

6.
To address the problems that existing virtual talking heads have rather monotonous facial expressions and that their expressions and movements are poorly coordinated, a method for building a realistic, emotional virtual human is proposed. The method first simulates dynamic facial expressions using three parameters (onset, hold and decay) and synthesizes complex expressions with blendshape fusion techniques. Eye and head movements are then designed on the basis of statistical data from human psychology to make the virtual human look more lifelike. Finally, the influence of external conditions such as camera position and lighting on the perceived realism of the virtual human is analyzed. Experimental results show that the virtual human built with this method is not only lifelike, natural and emotionally expressive, but also achieves good synchronization among speech, dynamic facial expressions, eye movements and head movements.
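A minimal sketch of the onset/hold/decay idea, assuming a blendshape-style rig: a piecewise-linear envelope drives the intensity of one expression over time, and that intensity blends a neutral mesh toward an expression target. The timing constants and toy vertex data are placeholders rather than values from the paper.

```python
# Hedged sketch: three-phase expression envelope plus linear morph-target blending.
import numpy as np

def expression_envelope(t, onset=0.3, hold=1.0, decay=0.5, peak=1.0):
    """Piecewise-linear intensity of one expression at time t (seconds)."""
    if t < onset:
        return peak * t / onset                            # rising phase
    if t < onset + hold:
        return peak                                        # sustained phase
    if t < onset + hold + decay:
        return peak * (1.0 - (t - onset - hold) / decay)   # fading phase
    return 0.0

def blend(neutral_verts, target_verts, weight):
    """Linear blendshape interpolation between neutral and expression meshes."""
    return (1.0 - weight) * neutral_verts + weight * target_verts

# Toy example with 3 vertices in 3D.
neutral = np.zeros((3, 3))
smile_target = np.array([[0.0, 0.2, 0.0], [0.1, 0.3, 0.0], [-0.1, 0.3, 0.0]])
for t in (0.1, 0.5, 2.0):
    w = expression_envelope(t)
    print(t, blend(neutral, smile_target, w)[1])
```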

7.
Synthesizing expressive facial animation is a very challenging topic within the graphics community. In this paper, we present an expressive facial animation synthesis system enabled by automated learning from facial motion capture data. Accurate 3D motions of the markers on the face of a human subject are captured while he/she recites a predesigned corpus, with specific spoken and visual expressions. We present a novel motion capture mining technique that "learns" speech coarticulation models for diphones and triphones from the recorded data. A phoneme-independent expression eigenspace (PIEES) that encloses the dynamic expression signals is constructed by motion signal processing (phoneme-based time-warping and subtraction) and principal component analysis (PCA) reduction. New expressive facial animations are synthesized as follows: First, the learned coarticulation models are concatenated to synthesize neutral visual speech according to novel speech input, then a texture-synthesis-based approach is used to generate a novel dynamic expression signal from the PIEES model, and finally the synthesized expression signal is blended with the synthesized neutral visual speech to create the final expressive facial animation. Our experiments demonstrate that the system can effectively synthesize realistic expressive facial animation.
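The following sketch illustrates the eigenspace-construction step in simplified form: subtract time-aligned neutral motion from expressive motion and apply PCA to the residual. The data are synthetic, and the time-warping and texture-synthesis stages described in the abstract are omitted.

```python
# Rough sketch of building a phoneme-independent expression eigenspace with PCA.
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_markers = 500, 30
expressive = rng.normal(size=(n_frames, n_markers * 3))   # expressive mocap frames
neutral    = rng.normal(size=(n_frames, n_markers * 3))   # time-aligned neutral frames

residual = expressive - neutral        # remove speech content, keep expression signal
mean = residual.mean(axis=0)
centered = residual - mean

# PCA via SVD: rows of Vt are the expression eigenvectors.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
k = 10                                  # number of retained expression components
basis = Vt[:k]

coeffs = centered @ basis.T             # low-dimensional dynamic expression signal
reconstructed = coeffs @ basis + mean   # back to marker space
print(coeffs.shape, reconstructed.shape)
```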

8.
Although avatars may resemble communicative interface agents, they have for the most part not profited from recent research into autonomous embodied conversational systems. In particular, even though avatars function within conversational environments (for example, chat or games), and even though they often resemble humans (with a head, hands, and a body) they are incapable of representing the kinds of knowledge that humans have about how to use the body during communication. Humans, however, do make extensive use of the visual channel for interaction management where many subtle and even involuntary cues are read from stance, gaze, and gesture. We argue that the modeling and animation of such fundamental behavior is crucial for the credibility and effectiveness of the virtual interaction in chat. By treating the avatar as a communicative agent, we propose a method to automate the animation of important communicative behavior, deriving from work in conversation and discourse theory. BodyChat is a system that allows users to communicate via text while their avatars automatically animate attention, salutations, turn taking, back-channel feedback, and facial expression. An evaluation shows that users found an avatar with autonomous conversational behaviors to be more natural than avatars whose behaviors they controlled, and to increase the perceived expressiveness of the conversation. Interestingly, users also felt that avatars with autonomous communicative behaviors provided a greater sense of user control.

9.
10.
This paper describes a behavioural model used to simulate realistic eye-gaze behaviour and body animations for avatars representing participants in a shared immersive virtual environment (IVE). The model was used in a study designed to explore the impact of avatar realism on the perceived quality of communication within a negotiation scenario. Our eye-gaze model was based on data and studies of eye-gaze behaviour during face-to-face communication. The technical features of the model are reported here. Information about the motivation behind the study, the experimental procedures and a full analysis of the results is given in [17].

11.
A multi-user 3-D virtual environment allows remote participants to communicate transparently, as if they were talking face-to-face. The sense of presence in such an environment can be established by representing each participant with a vivid human-like character called an avatar. We review several immersive technologies, including directional sound, eye gaze, hand gestures, lip synchronization and facial expressions, that facilitate multimodal interaction among participants in the virtual environment using speech processing and animation techniques. Interactive collaboration can be further encouraged by the ability to share and manipulate 3-D objects in the virtual environment. A shared whiteboard makes it easy for participants in the virtual environment to convey their ideas graphically. We survey various kinds of capture devices used to provide input for the shared whiteboard. Efficient storage of whiteboard sessions and precise archival retrieval at a later time raise interesting research topics in information retrieval.

12.
To overcome the communication barrier between people with speech impairments and unimpaired speakers, a neural-network-based method for converting sign language into emotional speech is proposed. First, a gesture corpus, a facial expression corpus and an emotional speech corpus are built. Deep convolutional neural networks are then used for gesture recognition and facial expression recognition, and, with Mandarin initials and finals as synthesis units, a speaker-adaptive deep neural network acoustic model and a speaker-adaptive hybrid long short-term memory network acoustic model for emotional speech are trained. Finally, context-dependent labels derived from the gesture semantics and the emotion labels corresponding to the facial expressions are fed into the emotional speech synthesis model to produce the corresponding emotional speech. Experimental results show that the method achieves gesture and facial expression recognition rates of 95.86% and 92.42%, respectively, and that the synthesized emotional speech obtains an EMOS score of 4.15, indicating a high degree of emotional expressiveness; the method can therefore support normal communication between people with speech impairments and unimpaired speakers.
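As a hypothetical illustration of only the recognition front end, a small convolutional classifier over gesture images might look like the sketch below. PyTorch is assumed; the class count, input size and architecture are placeholders, and the facial expression recognizer and acoustic models are not shown.

```python
# Hypothetical sketch of a gesture-image classifier (not the paper's network).
import torch
import torch.nn as nn

class GestureCNN(nn.Module):
    def __init__(self, n_classes=30):          # number of sign classes is a placeholder
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)

    def forward(self, x):                       # x: (batch, 3, 64, 64) RGB crops
        h = self.features(x)
        return self.classifier(h.flatten(1))

model = GestureCNN()
logits = model(torch.randn(4, 3, 64, 64))
print(logits.shape)                             # torch.Size([4, 30])
```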

13.
Photo-realistic talking-heads from image samples

14.
A novel model is presented to learn bimodally informative structures from audio–visual signals. The signal is represented as a sparse sum of audio–visual kernels. Each kernel is a bimodal function consisting of synchronous snippets of an audio waveform and a spatio–temporal visual basis function. To represent an audio–visual signal, the kernels can be positioned independently and arbitrarily in space and time. The proposed algorithm uses unsupervised learning to form dictionaries of bimodal kernels from audio–visual material. The basis functions that emerge during learning capture salient audio–visual data structures. In addition, it is demonstrated that the learned dictionary can be used to locate sources of sound in the movie frame. Specifically, in sequences containing two speakers, the algorithm can robustly localize a speaker even in the presence of severe acoustic and visual distracters.
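A toy sketch of the underlying idea in one dimension: greedy matching pursuit expresses a signal as a sparse sum of shift-invariant kernels placed at arbitrary positions. The paper's algorithm learns audio and video kernels jointly; here the kernels are fixed and the data synthetic.

```python
# Toy matching-pursuit decomposition with shift-invariant, unit-norm kernels.
import numpy as np

def matching_pursuit(signal, kernels, n_atoms=5):
    residual = signal.copy()
    atoms = []                                    # (kernel index, shift, coefficient)
    for _ in range(n_atoms):
        best = None
        for ki, k in enumerate(kernels):
            corr = np.correlate(residual, k, mode="valid")   # score every shift
            shift = int(np.argmax(np.abs(corr)))
            coef = corr[shift]
            if best is None or abs(coef) > abs(best[2]):
                best = (ki, shift, coef)
        ki, shift, coef = best
        residual[shift:shift + len(kernels[ki])] -= coef * kernels[ki]
        atoms.append(best)
    return atoms, residual

rng = np.random.default_rng(1)
kernels = [np.hanning(16), np.diff(np.hanning(17))]          # two 16-sample kernels
kernels = [k / np.linalg.norm(k) for k in kernels]
signal = np.zeros(128)
signal[20:36] += 2.0 * kernels[0]
signal[70:86] -= 1.5 * kernels[0]
atoms, res = matching_pursuit(signal + 0.01 * rng.normal(size=128), kernels)
print(atoms[:2], np.linalg.norm(res))
```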

15.
This research focuses on computer-mediated communication where users are represented by a graphical avatar. An avatar represents a user's self-identity and desire for self-disclosure. Therefore, the claim is made that there is a relationship between the characteristics of a medium and the choice of avatar. This study supports the claim by examining the difference between Internet Relay Chat (IRC) avatars and Instant Messenger (IM) avatars as of 2003, when both media had distinct characteristics and popular avatar services in Korea. Users of IRC are generally anonymous and involved in topic-based group discussions, whereas users of IM are known by their “real” names and communicate via one-on-one chitchat. We found that avatars, as symbols for users, can have different characteristics in terms of self-identity and self-disclosure in different media. Gender is found to have a significant moderating effect on avatar usage, whereas age is shown to have a mixed moderating effect.

16.
The use of 3D avatars is becoming more frequent with the development of computer technology and the internet. To meet users' requirements, some software allows users to customize the avatar. However, users can only customize the avatar with pre-defined accessories such as hair and clothing; that is, they have little opportunity to customize the avatar according to their own styles. It would interest users to be able to change the appearance of the avatar through their own designs, such as creating garments for avatars themselves. This paper provides an easy solution for dressing realistic 3D avatars, aimed at non-professional users and based on a sketch interface. After the user draws a 2D garment profile around the avatar, the prototype system generates an elaborate 3D geometric garment surface dressed on the avatar. The construction of the garment surface is constrained by key body features, and the garment shape is then optimized to remove artefacts. The proposed method generates a uniform mesh suitable for further processing such as mesh refinement and 3D decoration.

17.
In both online and offline interactions, the visual representation of people influences how others perceive them. In contrast to the offline body, an online visual representation of a person is consciously chosen and not stable. This paper reports the results of a two-step examination of the influence of avatars on the person perception process. Specifically, this project examines the reliance on visual characteristics during the online perception process, and the relative influence of androgyny, anthropomorphism and credibility. In the first step, 255 participants filled out a survey in which they rated a set of 30 static avatars on their credibility, androgyny, and anthropomorphism. The second step was a between-subjects experiment with 230 participants who interacted with partners represented by one of eight avatars (high or low androgyny and anthropomorphism, crossed with high or low credibility). Results show that the characteristics of the avatar are used in the person perception process. Causal modeling techniques revealed that perceptions of avatar androgyny influence perceptions of anthropomorphism, which in turn influence attributions of both avatar and partner credibility. Implications of these results for theory, future research, and users and designers of systems using avatars are discussed.

18.
A humanoid head robot based on emotional interaction
The goal of this research is to design a robot that can interact with people and assist them in daily life and in common environments. To accomplish these tasks, the robot must display emotions in a friendly way and exhibit friendly traits and personality. Based on bionics, a humanoid head robot was developed and a behavior decision model for the robot was established. The robot can produce the six basic human facial expressions and has capabilities including face detection, speech emotion recognition and synthesis, and affective behavior decision making, enabling it to carry out effective emotional interaction with people through machine vision, speech interaction and emotional expression.

19.
Research on facial expression image warping based on MPEG-4
To generate natural and realistic facial expressions in real time, a facial expression image warping method based on the MPEG-4 facial animation framework is proposed. The method first extracts 88 feature points from a face photograph with a face alignment tool. On this basis, a standard face mesh is calibrated and deformed to produce a triangular mesh for the specific face. Key facial feature points and their neighboring associated feature points are then moved according to the facial animation parameters (FAPs), while the topology of the face triangle mesh is kept unchanged under the combined action of multiple FAPs. Finally, facial texture is filled into all deformed triangular regions by affine transformation, producing the facial expression image defined by the FAPs. The input of the method is a neutral face photograph and a set of facial animation parameters; the output is the corresponding facial expression image. To support subtle expression movements and virtual talking head synthesis, an algorithm for generating eye-gaze movements and inner-mouth texture details is also designed. Subjective evaluation on a 5-point mean opinion score (MOS) scale shows that expression images generated by this warping method achieve a naturalness score of 3.67. Experiments on virtual talking head synthesis show that the method runs in real time, with an average processing speed of 66.67 fps on an ordinary PC, making it suitable for real-time video processing and facial animation generation.
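The per-triangle texture-warping step can be illustrated as follows: given a triangle's vertices before and after FAP-driven displacement, solve for the affine transform that maps source pixels into the deformed triangle. The vertex coordinates below are made up, and FAP parsing and pixel resampling are omitted.

```python
# Hedged sketch of solving for one triangle's affine warp from 3 point pairs.
import numpy as np

def triangle_affine(src_tri, dst_tri):
    """Return 2x3 affine matrix A such that dst = A @ [x, y, 1]^T for each vertex."""
    src = np.hstack([np.asarray(src_tri, float), np.ones((3, 1))])   # 3x3
    dst = np.asarray(dst_tri, float)                                  # 3x2
    # Solve src @ A.T = dst for the 2x3 affine matrix A.
    A_T = np.linalg.solve(src, dst)
    return A_T.T

src = [(0, 0), (10, 0), (0, 10)]
dst = [(1, 1), (12, 0), (0, 11)]           # vertices displaced by fictitious FAPs
A = triangle_affine(src, dst)
p = np.array([5.0, 5.0, 1.0])
print(A @ p)                               # warped position of an interior point
```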

20.
This paper presents a realistic visual speech synthesis approach based on a hybrid concatenation method. Unlike previous methods based on phoneme-level unit selection or hidden Markov models (HMMs), the hybrid concatenation method uses frame-level unit selection combined with a fused HMM, and is able to generate more expressive and stable facial animations. The fused HMM can explicitly model the loose synchronization of tightly coupled streams, with much better results than a normal HMM for audiovisual mapping. After the fused HMM is created, facial animation is generated via unit selection at the frame level using the fused HMM output probabilities. To accelerate unit selection on a large corpus, the paper also proposes a two-layer Viterbi search method in which only the subsets selected in the first layer are further checked in the second layer. Using this idea, the system has been successfully integrated into real-time applications. Furthermore, the paper proposes a mapping method to generate emotional facial expressions from neutral facial expressions based on Gaussian mixture models (GMMs). Final experiments show that the described method can output synthesized facial parameters with high quality. Compared with other audiovisual mapping methods, it performs better with respect to expressiveness, stability, and system running speed.
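A loose sketch of the final GMM-based mapping stage (neutral to emotional parameters), using the standard minimum mean-square-error regression from a joint GMM. The toy 2-D features and the scikit-learn model stand in for the paper's actual parameterization.

```python
# Sketch: fit a GMM on joint [neutral, emotional] vectors, then convert new
# neutral frames with per-component linear regression weighted by responsibilities.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
neutral = rng.normal(size=(1000, 2))
emotional = neutral * 1.3 + 0.5 + 0.05 * rng.normal(size=(1000, 2))  # fake pairing

joint = np.hstack([neutral, emotional])
gmm = GaussianMixture(n_components=4, covariance_type="full", random_state=0).fit(joint)
d = neutral.shape[1]

def convert(x):
    """MMSE mapping of one neutral frame x to its emotional counterpart."""
    out = np.zeros(d)
    resp = np.zeros(gmm.n_components)
    for m in range(gmm.n_components):          # responsibilities from the marginal over x
        mu_x = gmm.means_[m, :d]
        cov_xx = gmm.covariances_[m, :d, :d]
        diff = x - mu_x
        resp[m] = gmm.weights_[m] * np.exp(
            -0.5 * diff @ np.linalg.solve(cov_xx, diff)
        ) / np.sqrt(np.linalg.det(2 * np.pi * cov_xx))
    resp /= resp.sum()
    for m in range(gmm.n_components):          # weighted conditional means
        mu_x, mu_y = gmm.means_[m, :d], gmm.means_[m, d:]
        cov_xx = gmm.covariances_[m, :d, :d]
        cov_yx = gmm.covariances_[m, d:, :d]
        out += resp[m] * (mu_y + cov_yx @ np.linalg.solve(cov_xx, x - mu_x))
    return out

print(convert(np.array([0.2, -0.1])))          # should be near 1.3*x + 0.5
```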
