Similar Documents
20 similar documents were retrieved (search time: 78 ms)
1.
Affective computing conjoins the research topics of emotion recognition and sentiment analysis, and can be realized with unimodal or multimodal data, consisting primarily of physical information (e.g., text, audio, and visual) and physiological signals (e.g., EEG and ECG). Affect recognition based on physical information attracts more researchers due to the availability of multiple public databases, but it is challenging to reveal one's inner emotion when it is purposefully hidden behind facial expressions, audio tones, body gestures, etc. Physiological signals can generate more precise and reliable emotional results; yet, the difficulty in acquiring these signals hinders their practical application. Moreover, by fusing physical information and physiological signals, useful features of emotional states can be obtained to enhance the performance of affective computing models. While existing reviews focus on one specific aspect of affective computing, we provide a systematic survey of its important components: emotion models, databases, and recent advances. Firstly, we introduce two typical emotion models, followed by five kinds of commonly used databases for affective computing. Next, we survey and taxonomize state-of-the-art unimodal affect recognition and multimodal affective analysis in terms of their detailed architectures and performances. Finally, we discuss some critical aspects of affective computing and its applications, and conclude this review by pointing out some of the most promising future directions, such as the establishment of benchmark databases and fusion strategies. The overarching goal of this systematic review is to help academic and industrial researchers understand the recent advances as well as new developments in this fast-paced, high-impact domain.

2.
We developed and evaluated a multimodal affect detector that combines conversational cues, gross body language, and facial features. The multimodal affect detector uses feature-level fusion to combine the sensory channels and linear discriminant analyses to discriminate between naturally occurring experiences of boredom, engagement/flow, confusion, frustration, delight, and neutral. Training and validation data for the affect detector were collected in a study where 28 learners completed a 32-minute tutorial session with AutoTutor, an intelligent tutoring system with conversational dialogue. Classification results supported a channel × judgment type interaction, where the face was the most diagnostic channel for spontaneous affect judgments (i.e., at any time in the tutorial session), while conversational cues were superior for fixed judgments (i.e., every 20 s in the session). The analyses also indicated that the accuracy of the multichannel model (face, dialogue, and posture) was statistically higher than that of the best single-channel model for the fixed but not the spontaneous affect expressions. However, multichannel models reduced the discrepancy (i.e., the variance in the precision of the different emotions) of the discriminant models for both judgment types. The results also indicated that the combination of channels yielded superadditive effects for some affective states, but additive, redundant, and inhibitory effects for others. We explore the structure of the multimodal linear discriminant models and discuss the implications of some of our major findings.
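
As a rough illustration of the feature-level fusion and linear discriminant analysis pipeline described above, the following Python sketch concatenates hypothetical dialogue, posture, and facial feature vectors and cross-validates an LDA classifier over six affect labels; all feature names, dimensions, and data are placeholders, not the study's actual features.

```python
# Hedged sketch: feature-level fusion of conversational, posture, and facial
# features followed by linear discriminant analysis. Features are synthetic.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 300  # hypothetical number of labelled observations

# Per-channel feature matrices (placeholders for real extracted features).
dialogue_feats = rng.normal(size=(n, 12))   # e.g., response time, verbosity
posture_feats  = rng.normal(size=(n, 6))    # e.g., gross body movement
face_feats     = rng.normal(size=(n, 20))   # e.g., AU-based descriptors

# Feature-level fusion: concatenate the channels into one vector per sample.
X = np.hstack([dialogue_feats, posture_feats, face_feats])
y = rng.integers(0, 6, size=n)  # boredom, flow, confusion, frustration, delight, neutral

lda = LinearDiscriminantAnalysis()
print("CV accuracy:", cross_val_score(lda, X, y, cv=5).mean())
```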

3.
The potential importance of human affect during human-computer interaction (HCI) is becoming increasingly well recognised. However, measuring and analysing affective behaviour is problematic. Physiological indicators reveal only some, sometimes ambiguous, information. Video analysis and existing coding schemes are notoriously lengthy and complex, and examine only certain aspects of affect. This paper describes the development of a practical methodology to assess user affect, as displayed by emotional expressions. Interaction analysis techniques were used to identify discrete affective messages ('affectemes') and their components. This paper explains the rationale for this approach and demonstrates how it can be applied in practice. Preliminary evidence for its efficacy and reliability is also presented.

5.
As the technology in computer graphics advances, Animated Virtual Actors (AVAs) in Virtual Reality (VR) applications become increasingly rich and complex. The Cognitive Theory of Multimedia Learning (CTML) suggests that complex visual materials could hinder novice learners from attending to the lesson properly. On the other hand, previous studies have shown that visual complexity correlates with presence and may increase the perceived affective quality of the virtual world, towards an optimal experience or flow. Increasing these in VR applications may promote enjoyment and higher cognitive engagement for better learning outcomes. While visually complex materials could be motivating and pleasing to attend to, would they affect learning adversely? We developed a series of VR presentations to teach second-year psychology students about the navigational behaviour of Cataglyphis ants with flat, cartoon, or lifelike AVAs. To assess learning outcomes, we used Program Ratings, which measured perception of learning and perceived difficulty, as well as retention and transfer tests. The results from 200 students did not reveal any significant differences in presence, perceived affective quality, or learning outcomes as a function of the AVA’s visual complexity. While the results showed positive correlations between presence, perceived affective quality, and perception of learning, none of these correlated with perceived difficulty, retention, or transfer scores. Nevertheless, our simulation produced significant improvements in retention and transfer scores in all conditions. We discuss possible explanations and future research directions.

6.
Advances in computer processing power and emerging algorithms are allowing new ways of envisioning human-computer interaction. Although the benefit of audio-visual fusion is expected for affect recognition from both the psychological and engineering perspectives, most existing approaches to automatic human affect analysis are unimodal: the information processed by the computer system is limited to either face images or speech signals. This paper focuses on the development of a computing algorithm that uses both audio and visual sensors to detect and track a user's affective state to aid computer decision making. Using our multistream fused hidden Markov model (MFHMM), we analyzed coupled audio and visual streams to detect four cognitive states (interest, boredom, frustration, and puzzlement) and seven prototypical emotions (neutral, happiness, sadness, anger, disgust, fear, and surprise). The MFHMM allows the building of an optimal connection among multiple streams according to the maximum entropy principle and the maximum mutual information criterion. Person-independent experimental results from 20 subjects in 660 sequences show that the MFHMM approach outperforms face-only HMM, pitch-only HMM, energy-only HMM, and independent HMM fusion, under both clean and varying audio channel noise conditions.
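
The MFHMM itself couples the streams via maximum entropy and maximum mutual information; the sketch below only implements the simpler "independent HMM fusion" baseline mentioned in the abstract, training one GaussianHMM per affective state per stream (using the hmmlearn package, assumed available) and summing per-stream log-likelihoods at test time. State counts, features, and the synthetic demo data are illustrative.

```python
# Hedged sketch: "independent HMM fusion" baseline, not the MFHMM itself.
import numpy as np
from hmmlearn.hmm import GaussianHMM

def train_per_class(sequences_by_class, n_states=3):
    """Fit one HMM per class from a dict {label: list of (T_i, d) arrays}."""
    models = {}
    for label, seqs in sequences_by_class.items():
        X = np.vstack(seqs)
        lengths = [len(s) for s in seqs]
        m = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[label] = m
    return models

def classify(audio_models, visual_models, audio_seq, visual_seq):
    """Sum per-stream log-likelihoods and pick the best-scoring affective state."""
    scores = {c: audio_models[c].score(audio_seq) + visual_models[c].score(visual_seq)
              for c in audio_models}
    return max(scores, key=scores.get)

# Tiny synthetic demo: two affective states, two streams.
rng = np.random.RandomState(0)
make = lambda mu: [rng.normal(mu, 1.0, size=(40, 4)) for _ in range(5)]
audio = {"interest": make(0.0), "boredom": make(2.0)}
visual = {"interest": make(1.0), "boredom": make(-1.0)}
am, vm = train_per_class(audio), train_per_class(visual)
print(classify(am, vm, audio["interest"][0], visual["interest"][0]))
```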

7.
There is an increasing interest in developing intelligent human–computer interaction systems that can fulfill two functions—recognizing user affective states and providing the user with timely and appropriate assistance. In this paper, we present a general unified decision-theoretic framework based on influence diagrams for simultaneously modeling user affect recognition and assistance. Affective state recognition is achieved through active probabilistic inference from the available multi-modality sensory data. User assistance is automatically accomplished through a decision-making process that balances the benefits of keeping the user in productive affective states and the costs of performing user assistance. We discuss three theoretical issues within the framework, namely user affect recognition, active sensory action selection, and user assistance. Validation of the proposed framework via a simulation study demonstrates its capability for efficient user affect recognition as well as timely and appropriate user assistance. Beyond the theoretical contributions, we build a non-invasive real-time prototype system to recognize different user affective states (stress and fatigue) from four modalities of user measurements, namely physical appearance features, physiological measures, user performance, and behavioral data. The affect recognition component of the prototype system is subsequently validated through a real-world study involving human subjects.
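
A minimal sketch of the decision-theoretic idea, assuming a hypothetical belief over three affective states and a hand-made utility table: the assistance action maximizing expected utility is selected, trading off the benefit of keeping the user productive against the cost of intervening. This is not the paper's influence diagram, only the expected-utility step.

```python
# Hedged sketch: pick the assistance action with maximum expected utility
# under the current belief over user affective states. All numbers are
# illustrative placeholders.
belief = {"productive": 0.55, "stressed": 0.30, "fatigued": 0.15}

# utility[action][state]: benefit of keeping the user productive minus the
# cost (interruption, annoyance) of intervening.
utility = {
    "no_assist":     {"productive": 1.0,  "stressed": -0.5, "fatigued": -0.6},
    "offer_hint":    {"productive": 0.2,  "stressed":  0.7, "fatigued":  0.1},
    "suggest_break": {"productive": -0.3, "stressed":  0.4, "fatigued":  0.9},
}

def best_action(belief, utility):
    eu = {a: sum(belief[s] * u[s] for s in belief) for a, u in utility.items()}
    return max(eu, key=eu.get), eu

action, eu = best_action(belief, utility)
print(action, eu)
```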

8.
Automatic analysis of human facial expression is a challenging problem with many applications. Most of the existing automated systems for facial expression analysis attempt to recognize a few prototypic emotional expressions, such as anger and happiness. Instead of representing another approach to machine analysis of prototypic facial expressions of emotion, the method presented in this paper attempts to handle a large range of human facial behavior by recognizing the facial muscle actions that produce expressions. Virtually all of the existing vision systems for facial muscle action detection deal only with frontal-view face images and cannot handle the temporal dynamics of facial actions. In this paper, we present a system for automatic recognition of facial action units (AUs) and their temporal models from long, profile-view face image sequences. We exploit particle filtering to track 15 facial points in an input face-profile sequence, and we introduce facial-action-dynamics recognition from continuous video input using temporal rules. The algorithm performs both automatic segmentation of an input video into the facial expressions pictured and recognition of the temporal segments (i.e., onset, apex, offset) of 27 AUs occurring alone or in combination in the input face-profile video. A recognition rate of 87% is achieved.
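
A hedged sketch of rule-based temporal segmentation: given a one-dimensional activation signal (for example, the displacement of a tracked facial point), simple intensity and slope thresholds label each frame as neutral, onset, apex, or offset. The thresholds and the synthetic signal are illustrative, not the paper's temporal rules.

```python
# Hedged sketch: segment a facial action into onset/apex/offset phases from a
# 1-D activation signal using illustrative thresholds.
import numpy as np

def segment_phases(signal, active_thr=0.5, slope_thr=0.02):
    slope = np.gradient(np.asarray(signal, dtype=float))
    phases = []
    for x, dx in zip(signal, slope):
        if x < active_thr:
            phases.append("neutral")
        elif dx > slope_thr:
            phases.append("onset")
        elif dx < -slope_thr:
            phases.append("offset")
        else:
            phases.append("apex")
    return phases

t = np.linspace(0, 1, 60)
au_intensity = np.clip(np.sin(np.pi * t) * 1.2, 0, 1)  # synthetic rise-hold-fall
print(segment_phases(au_intensity)[::6])
```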

9.
A social robot should be able to autonomously interpret human affect and adapt its behavior accordingly in order for successful social human–robot interaction to take place. This paper presents a modular, non-contact, automated affect-estimation system that employs support vector regression over a set of novel facial expression parameters to estimate a person’s affective states using a valence-arousal two-dimensional model of affect. The proposed system captures complex and ambiguous emotions that are prevalent in real-world scenarios by utilizing a continuous two-dimensional model, rather than a traditional discrete categorical model of affect. As the goal is to incorporate this recognition system in robots, real-time estimation of spontaneous natural facial expressions in response to environmental and interactive stimuli is an objective. The proposed system can be combined with affect detection techniques using other modes, such as speech, body language, and/or physiological signals, in order to develop an accurate multi-modal affect estimation system for social HRI applications. Experiments presented herein demonstrate the system’s ability to successfully estimate the affect of a diverse group of unknown individuals exhibiting spontaneous natural facial expressions.
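
A small sketch of the regression step, assuming synthetic facial expression parameters: one support vector regressor per affect dimension predicts continuous valence and arousal values, mirroring the two-dimensional model described above.

```python
# Hedged sketch: SVR over (synthetic) facial-expression parameters to estimate
# continuous valence and arousal. Feature dimensions and targets are placeholders.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 16))                  # facial expression parameters
valence = np.tanh(X[:, 0] + 0.5 * X[:, 1])      # synthetic targets in [-1, 1]
arousal = np.tanh(X[:, 2] - 0.3 * X[:, 3])

# One regressor per affect dimension of the valence-arousal model.
val_model = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, valence)
aro_model = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, arousal)

new_frame = rng.normal(size=(1, 16))
print("valence:", val_model.predict(new_frame)[0],
      "arousal:", aro_model.predict(new_frame)[0])
```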

10.
In the context of affective human behavior analysis, we use the term continuous input to refer to naturalistic settings where explicit or implicit input from the subject is continuously available, and where, in a human–human or human–computer interaction setting, the subject plays the role of a producer or a recipient of the communicative behavior. As a result, the analysis and the response provided by the automatic system are also envisioned to be continuous over the course of time, within the boundaries of digital machine output. The term continuous affect analysis is used for analysis that is continuous in time as well as analysis that uses the affect phenomenon represented in a dimensional space. The former refers to acquiring and processing long, unsegmented recordings for detection of an affective state or event (e.g., nod, laughter, pain), and the latter refers to prediction of an affect dimension (e.g., valence, arousal, power). In line with the Special Issue on Affect Analysis in Continuous Input, this survey paper aims to put the continuity aspect of affect under the spotlight by investigating the current trends and to provide guidance towards possible future directions.

11.
What role can computers play in the study of strategic interpersonal behaviours, and in research on affective influences on social behaviour in particular? Despite intense recent interest in affective phenomena, the role of affect in social interaction has rarely been studied. This paper reviews past work on affective influences on interpersonal behaviour, with special emphasis on Michael Argyle’s pioneering studies in this field. We then discuss historical and contemporary theories of affective influences on social behaviour. A series of experiments using computer-mediated interaction tasks is described, investigating affective influences on interpersonal behaviours such as self-disclosure strategies and the production of persuasive arguments. It is suggested that computer-mediated interaction offers a reliable and valid technique for studying the cognitive, information-processing variables that facilitate or inhibit affective influences on interpersonal behaviour. These studies show that mild affective states produce significant differences in the way people perform in interpersonal situations, and can accentuate or attenuate (through affective priming) self-disclosure intimacy or persuasive argument quality. The implications of these studies for recent theories and affect-cognition models, and for our understanding of people’s everyday interpersonal strategies, are discussed.

12.
Affective states and their non-verbal expressions are an important aspect of human reasoning, communication, and social life. Automated recognition of affective states can be integrated into a wide variety of applications in various fields. Therefore, it is of interest to design systems that can infer the affective states of speakers from the non-verbal expressions in speech occurring in real scenarios. This paper presents such a system and the framework for its design and validation. The framework defines a representation method that comprises a set of affective-state groups, or archetypes, that often appear in everyday life. The inference system is designed to infer combinations of affective states that can occur simultaneously and whose level of expression can change over time. The framework also considers the validation and generalisation of the system. The system was built from 36 independent pair-wise comparison machines, with an average accuracy (tenfold cross-validation) of 75%. The accumulated inference system yielded a total accuracy of 83% and recognised combinations for different nuances within the affective-state groups. In addition to the ability to recognise these affective-state groups, the inference system was applied to the characterisation of a very large variety of affective state concepts (549 concepts) as combinations of the affective-state groups. The system was also applied to the annotation of affective states that were naturally evoked during sustained human–computer interactions and to multi-modal analysis of the interactions, to new speakers, and to a different language, with no additional training. The system provides a powerful tool for recognition, characterisation, annotation (interpretation), and analysis of affective states. In addition, the results inferred from speech in both English and Hebrew indicate that the vocal expressions of complex affective states such as thinking, certainty, and interest transcend language boundaries.
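
The sketch below illustrates an accumulated pair-wise comparison scheme with nine hypothetical affective-state groups (giving C(9, 2) = 36 binary machines) and synthetic vocal features; per-group scores are accumulated over all pairs, so several groups can score highly at once, loosely reflecting combinations of co-occurring states. The classifier choice, group names, and data are assumptions, not the paper's.

```python
# Hedged sketch: 36 pair-wise comparison machines (one per pair of 9 groups)
# with accumulated per-group scores. Groups, features, and labels are synthetic.
from itertools import combinations
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
groups = ["thinking", "certainty", "interest", "boredom", "stress",
          "joy", "anger", "sadness", "neutral"]        # hypothetical groups
X = rng.normal(size=(900, 24))                          # vocal features
y = rng.integers(0, len(groups), size=900)

machines = {}
for a, b in combinations(range(len(groups)), 2):
    mask = (y == a) | (y == b)
    clf = LogisticRegression(max_iter=500).fit(X[mask], (y[mask] == b).astype(int))
    machines[(a, b)] = clf
print(len(machines), "pair-wise machines")              # -> 36

def accumulated_scores(x):
    """Sum pairwise probabilities per group; high scores for several groups
    can be read as a combination of co-occurring affective states."""
    scores = np.zeros(len(groups))
    for (a, b), clf in machines.items():
        p_b = clf.predict_proba(x.reshape(1, -1))[0, 1]
        scores[a] += 1.0 - p_b
        scores[b] += p_b
    return dict(zip(groups, scores / (len(groups) - 1)))

print(accumulated_scores(X[0]))
```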

13.
Psychological research findings suggest that humans rely on the combined visual channels of face and body more than on any other channel when they make judgments about human communicative behavior. However, most of the existing systems attempting to analyze human nonverbal behavior are mono-modal and focus only on the face. Research that aims to integrate gestures as an expressive means has only recently emerged. Accordingly, this paper presents an approach to automatic visual recognition of expressive face and upper-body gestures from video sequences suitable for use in a vision-based affective multi-modal framework. Face and body movements are captured simultaneously using two separate cameras. For each video sequence, single expressive frames from both the face and the body are selected manually for analysis and recognition of emotions. Firstly, individual classifiers are trained from the individual modalities. Secondly, we fuse facial expression and affective body gesture information at the feature level and at the decision level. In the experiments performed, emotion classification using the two modalities achieved better recognition accuracy, outperforming classification using the individual facial or bodily modality alone.
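
To make the two fusion strategies concrete, the sketch below trains, on synthetic face and body features, one classifier on concatenated features (feature-level fusion) and two per-modality classifiers whose posteriors are averaged (decision-level fusion). Feature dimensions, labels, and the SVM choice are placeholders.

```python
# Hedged sketch: feature-level vs decision-level fusion of face and body
# features for emotion classification. Data are synthetic.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
n = 240
face_feats = rng.normal(size=(n, 30))
body_feats = rng.normal(size=(n, 15))
y = rng.integers(0, 6, size=n)   # six emotion classes, placeholder labels

# Feature-level fusion: concatenate, then train a single classifier.
fused_clf = SVC(probability=True).fit(np.hstack([face_feats, body_feats]), y)

# Decision-level fusion: average the per-modality posteriors.
face_clf = SVC(probability=True).fit(face_feats, y)
body_clf = SVC(probability=True).fit(body_feats, y)

def decision_level_predict(face_x, body_x):
    p = 0.5 * face_clf.predict_proba(face_x) + 0.5 * body_clf.predict_proba(body_x)
    return p.argmax(axis=1)

print(fused_clf.predict(np.hstack([face_feats[:3], body_feats[:3]])))
print(decision_level_predict(face_feats[:3], body_feats[:3]))
```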

14.
The computer graphics and vision communities have dedicated long-standing efforts to building computerized tools for reconstructing, tracking, and analyzing human faces based on visual input. Over the past years, rapid progress has been made, which has led to novel and powerful algorithms that obtain impressive results even in the very challenging case of reconstruction from a single RGB or RGB-D camera. The range of applications is vast and steadily growing as these technologies further improve in speed, accuracy, and ease of use. Motivated by this rapid progress, this state-of-the-art report summarizes recent trends in monocular facial performance capture and discusses its applications, which range from performance-based animation to real-time facial reenactment. We focus our discussion on methods where the central task is to recover and track a three-dimensional model of the human face using optimization-based reconstruction algorithms. We provide an in-depth overview of the underlying concepts of real-world image formation, and we discuss common assumptions and simplifications that make these algorithms practical. In addition, we extensively cover the priors that are used to better constrain the under-constrained monocular reconstruction problem, and discuss the optimization techniques that are employed to recover dense, photo-geometric 3D face models from monocular 2D data. Finally, we discuss a variety of use cases for the reviewed algorithms in the context of motion capture, facial animation, and image and video editing.
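
A minimal sketch of the optimization-based reconstruction idea surveyed here: fit the coefficients of a linear 3D face shape model to 2D landmarks by minimizing a reprojection residual under a scaled-orthographic camera with scipy. The shape basis, landmarks, and camera are random placeholders; real systems additionally estimate pose, add photometric terms, and use statistical priors.

```python
# Hedged sketch: least-squares fit of linear shape-model coefficients to 2D
# landmarks under a scaled-orthographic projection. All data are synthetic.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(4)
L, K = 68, 10                              # landmarks, shape-basis size
mean_shape = rng.normal(size=(L, 3))
basis = rng.normal(scale=0.1, size=(K, L, 3))
scale = 2.0

def project(coeffs):
    shape3d = mean_shape + np.tensordot(coeffs, basis, axes=1)  # (L, 3)
    return scale * shape3d[:, :2]          # scaled-orthographic projection

true_coeffs = rng.normal(size=K)
observed_2d = project(true_coeffs) + rng.normal(scale=0.01, size=(L, 2))

def residuals(coeffs):
    data = (project(coeffs) - observed_2d).ravel()
    prior = 0.1 * coeffs                   # simple regularizer on coefficients
    return np.concatenate([data, prior])

fit = least_squares(residuals, x0=np.zeros(K))
print("max coefficient error:", np.abs(fit.x - true_coeffs).max())
```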

15.
Psychologists have long explored the mechanisms with which humans recognize other humans' affective states from modalities such as voice and face display. This exploration has led to the identification of the main mechanisms, including the important role played in the recognition process by the modalities' dynamics. Constrained by human physiology, the temporal evolution of a modality appears to be well approximated by a sequence of temporal segments called onset, apex, and offset. Stemming from these findings, computer scientists have, over the past 15 years, proposed various methodologies to automate the recognition process. We note, however, two main limitations to date. The first is that much of the past research has focused on affect recognition from single modalities. The second is that even the few multimodal systems have not paid sufficient attention to the modalities' dynamics: the automatic determination of their temporal segments, their synchronization for the purpose of modality fusion, and their role in affect recognition are yet to be adequately explored. To address these limitations, this paper focuses on affective face and body display, proposes a method to automatically detect their temporal segments or phases, explores whether the detection of the temporal phases can effectively support recognition of affective states, and recognizes affective states based on phase synchronization/alignment. The experimental results obtained show the following: 1) affective face and body displays are simultaneous but not strictly synchronous; 2) explicit detection of the temporal phases can improve the accuracy of affect recognition; 3) recognition from fused face and body modalities performs better than that from the face or the body modality alone; and 4) synchronized feature-level fusion achieves better performance than decision-level fusion.
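
A hedged sketch of synchronized (apex-aligned) feature-level fusion: each modality's apex frame is detected from its activation curve, and features from windows centred on the two apexes are concatenated, since the face and body displays are simultaneous but not strictly synchronous. Apex detection here is simply the activation maximum, an assumption rather than the paper's phase detector.

```python
# Hedged sketch: align face and body feature sequences on their apex frames
# before feature-level fusion. Signals and features are synthetic.
import numpy as np

def apex_index(activation):
    return int(np.argmax(activation))

def apex_aligned_fusion(face_feats, face_act, body_feats, body_act, window=3):
    """Concatenate face and body features from windows centred on each
    modality's apex frame (the apexes may fall on different frames)."""
    fa, ba = apex_index(face_act), apex_index(body_act)
    f = face_feats[max(fa - window, 0): fa + window + 1].mean(axis=0)
    b = body_feats[max(ba - window, 0): ba + window + 1].mean(axis=0)
    return np.concatenate([f, b])

rng = np.random.default_rng(5)
face_feats, body_feats = rng.normal(size=(80, 20)), rng.normal(size=(80, 10))
face_act = np.sin(np.linspace(0, np.pi, 80))          # apex near frame 40
body_act = np.roll(face_act, 5)                       # body peaks slightly later
print(apex_aligned_fusion(face_feats, face_act, body_feats, body_act).shape)  # (30,)
```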

16.
Breakdowns in complex systems often occur as a result of system elements interacting in unanticipated ways. In systems with human operators, human–automation interaction associated with both normative and erroneous human behavior can contribute to such failures. Model-driven design and analysis techniques provide engineers with formal methods tools and techniques capable of evaluating how human behavior can contribute to system failures. This paper presents a novel method for automatically generating task analytic models encompassing both normative and erroneous human behavior from normative task models. The generated erroneous behavior is capable of replicating Hollnagel's zero-order phenotypes of erroneous action for omissions, jumps, repetitions, and intrusions. Multiple phenotypical acts can occur in sequence, thus allowing for the generation of higher-order phenotypes. The task behavior model pattern capable of generating erroneous behavior can be integrated into a formal system model so that system safety properties can be formally verified with a model checker. This allows analysts to prove that a human–automation interactive system (as represented by the model) will or will not satisfy safety properties with both normative and generated erroneous human behavior. We present benchmarks related to the size of the state space and the verification time of models to show how the erroneous human behavior generation process scales. We demonstrate the method with a case study: the operation of a radiation therapy machine. A potential problem resulting from a generated erroneous human action is discovered, and a design intervention that prevents this problem from occurring is presented. We discuss how our method could be used to evaluate larger applications and recommend future paths of development.
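
The sketch below generates Hollnagel-style zero-order phenotypes (omission, repetition, jump, intrusion) from a hypothetical normative action sequence; in the paper these variants are produced inside a formal task model and checked against safety properties with a model checker, which this toy enumeration does not do.

```python
# Hedged sketch: enumerate zero-order erroneous-behaviour phenotypes from a
# hypothetical normative action sequence. Action names are illustrative only.
import random

def erroneous_variants(actions, intrusions=("press_wrong_key",), seed=0):
    rng = random.Random(seed)
    variants = []
    for i in range(len(actions)):
        variants.append(("omission", actions[:i] + actions[i + 1:]))
        variants.append(("repetition", actions[:i + 1] + actions[i:]))
        j = rng.randrange(len(actions))
        if j != i:  # jump forward (skip steps) or backward (redo steps)
            variants.append(("jump", actions[:i] + actions[j:]))
        variants.append(("intrusion",
                         actions[:i] + [rng.choice(intrusions)] + actions[i:]))
    return variants

normative = ["select_mode", "enter_dose", "confirm", "start_beam"]
for kind, seq in erroneous_variants(normative)[:6]:
    print(kind, seq)
```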

17.
Affective computing is an emerging interdisciplinary research field bringing together researchers and practitioners from various fields, ranging from artificial intelligence and natural language processing to cognitive and social sciences. With the proliferation of videos posted online (e.g., on YouTube, Facebook, Twitter) for product reviews, movie reviews, political views, and more, affective computing research has increasingly evolved from conventional unimodal analysis to more complex forms of multimodal analysis. This is the primary motivation behind our first-of-its-kind, comprehensive literature review of the diverse field of affective computing. Furthermore, existing literature surveys lack a detailed discussion of the state of the art in multimodal affect analysis frameworks, which this review aims to address. Multimodality is defined by the presence of more than one modality or channel, e.g., visual, audio, text, gestures, and eye gaze. In this paper, we focus mainly on the use of audio, visual, and text information for multimodal affect analysis, since around 90% of the relevant literature appears to cover these three modalities. Following an overview of different techniques for unimodal affect analysis, we outline existing methods for fusing information from different modalities. As part of this review, we carry out an extensive study of different categories of state-of-the-art fusion techniques, followed by a critical analysis of the potential performance improvements of multimodal analysis over unimodal analysis. A comprehensive overview of these two complementary fields aims to form the building blocks for readers to better understand this challenging and exciting research field.

18.
We describe a computer vision system for observing facial motion by using an optimal-estimation optical flow method coupled with geometric, physical, and motion-based dynamic models describing the facial structure. Our method produces a reliable parametric representation of the face's independent muscle action groups, as well as an accurate estimate of facial motion. Previous efforts at analysis of facial expression have been based on the Facial Action Coding System (FACS), a representation developed to allow human psychologists to code expression from static pictures. To avoid use of this heuristic coding scheme, we have used our computer vision system to probabilistically characterize facial motion and muscle activation in an experimental population, thus deriving a new, more accurate representation of human facial expressions that we call FACS+. Finally, we show how this method can be used for coding, analysis, interpretation, and recognition of facial expressions.
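
Optical flow is the low-level ingredient of the system described above. The sketch below uses OpenCV's sparse Lucas-Kanade tracker to follow feature points across a (hypothetical) face video and report their frame-to-frame displacement; the paper's method instead uses an optimal-estimation flow coupled with physical muscle models, which is not reproduced here.

```python
# Hedged sketch: sparse Lucas-Kanade optical flow over a hypothetical face video,
# as a stand-in for the paper's optimal-estimation flow with physical models.
import cv2

cap = cv2.VideoCapture("face_video.mp4")        # hypothetical input file
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                              qualityLevel=0.01, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok or pts is None or len(pts) == 0:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None,
                                                  winSize=(15, 15), maxLevel=2)
    motion = new_pts[status == 1] - pts[status == 1]   # per-point facial motion
    print("mean displacement:", float(abs(motion).mean()))
    prev_gray, pts = gray, new_pts[status == 1].reshape(-1, 1, 2)
cap.release()
```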

19.
In intelligent virtual environments (IVEs), it is a challenging research issue to provide intelligent virtual actors (or avatars) with the ability of visual perception and rapid response to virtual-world events. Modeling an avatar’s cognitive and synthetic behavior appropriately is of paramount importance in IVEs. We propose a new cognitive and behavior modeling methodology that integrates two previously developed complementary approaches. We present expression cloning, walking synthetic behavior modeling, and an autonomous-agent cognitive model for driving an avatar’s behavior. Facial expressions are generated using our own rule-based state transition system and are further personalized for individuals by expression cloning. An avatar’s walking behavior is modeled using a skeleton model that is implemented with seven motion sequences and finite state machines (FSMs). We discuss experimental results demonstrating the benefits of our approach.
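
A minimal sketch of a finite state machine for walking behaviour, with illustrative states and events loosely inspired by the seven-motion-sequence skeleton model mentioned above; the actual motion data and transition set are assumptions.

```python
# Hedged sketch: a finite state machine driving an avatar's walking behaviour.
# States, events, and transitions are illustrative placeholders.
class WalkingFSM:
    TRANSITIONS = {
        ("idle", "start"): "walk_start",
        ("walk_start", "cycle_done"): "walk_loop",
        ("walk_loop", "cycle_done"): "walk_loop",      # loop the walk cycle
        ("walk_loop", "turn"): "turning",
        ("turning", "cycle_done"): "walk_loop",
        ("walk_loop", "stop"): "walk_end",
        ("walk_end", "cycle_done"): "idle",
    }

    def __init__(self):
        self.state = "idle"

    def handle(self, event):
        """Advance the FSM; unknown (state, event) pairs leave the state unchanged."""
        self.state = self.TRANSITIONS.get((self.state, event), self.state)
        return self.state

fsm = WalkingFSM()
for e in ["start", "cycle_done", "cycle_done", "turn", "cycle_done", "stop", "cycle_done"]:
    print(e, "->", fsm.handle(e))
```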

20.
This paper presents the theoretical and practical fundamentals of using physiology sensors to capture human emotion reactivity in a product or systems engineering context. We aim to underline the complexity of regulating (internal and external) effects on the human body and the highly individual nature of physiological (emotion) responses, and to provide a starting point for engineering researchers entering the field. Although great advances have been made in scenarios involving human-machine interactions, the critical elements—the actions and responses of the human—remain far beyond automatic control because of the irrational behavior of human subjects. These (re)actions, which cannot be satisfactorily modeled, stem mostly from the fact that human behavior is regulated by emotions. The physiological measurement of the latter can thus be a potential door to future advances for the community. In this paper, following a brief overview of the foundations and ongoing discussions in psychology and neuroscience, various emotion-related physiological responses are explained on the basis of a systematic review of the autonomic nervous system and its regulation of the human body. Based on sympathetic and parasympathetic nervous system responses, various sensor measurements that are relevant in an engineering context, such as electrocardiography, electroencephalography, electromyography, pulse oximetry, blood pressure measurements, respiratory transducers, body temperature measurements, galvanic skin response measurements, and others, are explained. After providing an overview of ongoing engineering and human-computer interaction projects, we discuss engineering-specific challenges and experimental setups in terms of their usability and appropriateness for data analysis. We identify current limitations associated with the use of physiology sensors and discuss developments in this area, such as software-based facial affect coding and near-infrared spectroscopy. The key to truly understanding user experience and designing systems and products that integrate emotional states dynamically lies in understanding and measuring physiology. This paper serves as a call for the advancement of affective engineering research.
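
As a starting point of the kind the paper calls for, the sketch below computes two common emotion-related physiological features from synthetic signals: heart-rate variability (RMSSD) from detected ECG R-peaks, and a crude tonic/phasic split of a galvanic skin response trace. Sampling rates, thresholds, and signals are illustrative.

```python
# Hedged sketch: HRV (RMSSD) from synthetic ECG R-peaks and a tonic/phasic
# split of a synthetic GSR trace. All parameters are illustrative.
import numpy as np
from scipy.signal import find_peaks

fs = 250                                           # ECG sampling rate, Hz
t = np.arange(0, 60, 1 / fs)
ecg = np.sin(2 * np.pi * 1.1 * t) ** 63            # crude synthetic R-peak train
peaks, _ = find_peaks(ecg, height=0.5, distance=fs // 2)
rr = np.diff(peaks) / fs                           # R-R intervals in seconds
rmssd = np.sqrt(np.mean(np.diff(rr) ** 2)) * 1000  # HRV (RMSSD) in ms
print("mean HR:", 60 / rr.mean(), "bpm; RMSSD:", rmssd, "ms")

fs_gsr = 4                                         # GSR sampling rate, Hz
gsr = 2 + 0.3 * np.random.default_rng(6).random(240)
tonic = np.convolve(gsr, np.ones(20) / 20, mode="same")   # slow baseline level
phasic = gsr - tonic                               # event-related fluctuations
print("mean tonic level:", tonic.mean(), "phasic std:", phasic.std())
```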
