期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

The USC CreativeIT database of multimodal dyadic interactions: from speech and full body motion capture to continuous emotional annotations

Angeliki?Metallinou Zhaojun?Yang Email author Chi-chun?Lee Carlos?Busso Sharon?Carnicke Shrikanth?Narayanan 《Language Resources and Evaluation》2016,50(3):497-521

Improvised acting is a viable technique to study expressive human communication and to shed light into actors’ creativity. The USC CreativeIT database provides a novel, freely-available multimodal resource for the study of theatrical improvisation and rich expressive human behavior (speech and body language) in dyadic interactions. The theoretical design of the database is based on the well-established improvisation technique of Active Analysis in order to provide naturally induced affective and expressive, goal-driven interactions. This database contains dyadic theatrical improvisations performed by 16 actors, providing detailed full body motion capture data and audio data of each participant in an interaction. The carefully engineered data collection, the improvisation design to elicit natural emotions and expressive speech and body language, as well as the well-developed annotation processes provide a gateway to study and model various aspects of theatrical performance, expressive behaviors and human communication and interaction. 相似文献

2.

Expressive Speech Animation Synthesis with Phoneme‐Level Controls

Z. Deng U. Neumann 《Computer Graphics Forum》2008,27(8):2096-2113

This paper presents a novel data‐driven expressive speech animation synthesis system with phoneme‐level controls. This system is based on a pre‐recorded facial motion capture database, where an actress was directed to recite a pre‐designed corpus with four facial expressions (neutral, happiness, anger and sadness). Given new phoneme‐aligned expressive speech and its emotion modifiers as inputs, a constrained dynamic programming algorithm is used to search for best‐matched captured motion clips from the processed facial motion database by minimizing a cost function. Users optionally specify ‘hard constraints’ (motion‐node constraints for expressing phoneme utterances) and ‘soft constraints’ (emotion modifiers) to guide this search process. We also introduce a phoneme–Isomap interface for visualizing and interacting phoneme clusters that are typically composed of thousands of facial motion capture frames. On top of this novel visualization interface, users can conveniently remove contaminated motion subsequences from a large facial motion dataset. Facial animation synthesis experiments and objective comparisons between synthesized facial motion and captured motion showed that this system is effective for producing realistic expressive speech animations. 相似文献

3.

What is the effective key length for a block cipher: an attack on every practical block cipher

HUANG JiaLin LAI XueJia 《中国科学:信息科学(英文版)》2014,57(7):1-11

Recently,several important block ciphers are considered to be broken by the brute-force-like cryptanalysis,with a time complexity faster than the exhaustive key search by going over the entire key space but performing less than a full encryption for each possible key.Motivated by this observation,we describe a meetin-the-middle attack that can always be successfully mounted against any practical block ciphers with success probability one.The data complexity of this attack is the smallest according to the unicity distance.The time complexity can be written as 2k(1-),where>0 for all practical block ciphers.Previously,the security bound that is commonly accepted is the length k of the given master key.From our result we point out that actually this k-bit security is always overestimated and can never be reached because of the inevitable loss of the key bits.No amount of clever design can prevent it,but increments of the number of rounds can reduce this key loss as much as possible.We give more insight into the problem of the upper bound of effective key bits in block ciphers,and show a more accurate bound.A suggestion about the relationship between the key size and block size is given.That is,when the number of rounds is fixed,it is better to take a key size equal to the block size.Also,effective key bits of many well-known block ciphers are calculated and analyzed,which also confirms their lower security margins than thought before.The results in this article motivate us to reconsider the real complexity that a valid attack should compare to. 相似文献

4.

TC-Net: A Modest & Lightweight Emotion Recognition System Using Temporal Convolution Network

Muhammad Ishaq Mustaqeem Khan Soonil Kwon 《计算机系统科学与工程》2023,46(3):3355-3369

Speech signals play an essential role in communication and provide an efficient way to exchange information between humans and machines. Speech Emotion Recognition (SER) is one of the critical sources for human evaluation, which is applicable in many real-world applications such as healthcare, call centers, robotics, safety, and virtual reality. This work developed a novel TCN-based emotion recognition system using speech signals through a spatial-temporal convolution network to recognize the speaker’s emotional state. The authors designed a Temporal Convolutional Network (TCN) core block to recognize long-term dependencies in speech signals and then feed these temporal cues to a dense network to fuse the spatial features and recognize global information for final classification. The proposed network extracts valid sequential cues automatically from speech signals, which performed better than state-of-the-art (SOTA) and traditional machine learning algorithms. Results of the proposed method show a high recognition rate compared with SOTA methods. The final unweighted accuracy of 80.84%, and 92.31%, for interactive emotional dyadic motion captures (IEMOCAP) and berlin emotional dataset (EMO-DB), indicate the robustness and efficiency of the designed model. 相似文献

5.

Unconscious emotions: quantifying and logging something we are not aware of

Leonid Ivonin Huang-Ming Chang Wei Chen Matthias Rauterberg 《Personal and Ubiquitous Computing》2013,17(4):663-673

Lifelogging tools aim to precisely capture daily experiences of people from the first-person perspective. Although there have been numerous lifelogging tools developed for users to record the external environment around them, the internal part of experience characterized by emotions seems to be neglected in the lifelogging field. However, the internal experiences of people are important and, therefore, lifelogging tools should be able to capture not only the environmental data, but also emotional experiences, thereby providing a more complete archive of past events. Moreover, there are implicit emotions that cannot be consciously experienced, but still influence human behaviors and memories. It has been proven that conscious emotions can be recognized from physiological signals of the human body. This fact may be used to enhance life-logs with information about unconscious emotions, which otherwise would remain hidden. On the other hand, it is not clear if unconscious emotions can be recognized from physiological signals and differentiated from conscious emotions. Therefore, an experiment was designed to elicit emotions (both conscious and unconscious) with visual and auditory stimuli and to record cardiovascular responses of 34 participants. The experimental results showed that heart rate responses to the presentation of the stimuli are unique for every category of the emotional stimuli and allow differentiation between various emotional experiences of the participants. 相似文献

6.

Tracking continuous emotional trends of participants during affective dyadic interactions using body language and speech information

Angeliki Metallinou Athanasios Katsamanis Shrikanth Narayanan 《Image and vision computing》2013

相似文献

7.

Emotional empathy transition patterns from human brain responses in interactive communication situations

Tomasz M. Rutkowski Andrzej Cichocki Danilo P. Mandic Toyoaki Nishida 《AI & Society》2011,26(3):301-315

The paper reports our research aiming at utilization of human interactive communication modeling principles in application to a novel interaction paradigm designed for brain–computer/machine-interfacing (BCI/BMI) technologies as well as for socially aware intelligent environments or communication support systems. Automatic procedures for human affective responses or emotional states estimation are still a hot topic of contemporary research. We propose to utilize human brain and bodily physiological responses for affective/emotional as well as communicative interactivity estimation, which potentially could be used in the future for human–machine/environment interaction design. As a test platform for such an intelligent human–machine communication application, an emotional stimuli paradigm was chosen to evaluate brain responses to various affective stimuli in an emotional empathy mode. Videos with moving faces expressing various emotional displays as well as speech stimuli with similarly emotionally articulated sentences are presented to the subjects in order to further analyze different affective responses. From information processing point of view, several challenges with multimodal signal conditioning and stimuli dynamic response extraction in time frequency domain are addressed. Emotions play an important role in human daily life and human-to-human communication. This is why involvement of affective stimuli principles to human–machine communication or machine-mediated communication with utilization of multichannel neurophysiological and periphery physiological signals monitoring techniques, allowing real-time subjective brain responses evaluation, is discussed. We present our preliminary results and discuss potential applications of brain/body affective responses estimation for future interactive/smart environments. 相似文献

8.

Expressive facial animation synthesis by learning speech coarticulation and expression spaces 总被引：2，自引：0，他引：2

Deng Z Neumann U Lewis JP Kim TY Bulut M Narayanan S 《IEEE transactions on visualization and computer graphics》2006,12(6):1523-1534

Synthesizing expressive facial animation is a very challenging topic within the graphics community. In this paper, we present an expressive facial animation synthesis system enabled by automated learning from facial motion capture data. Accurate 3D motions of the markers on the face of a human subject are captured while he/she recites a predesigned corpus, with specific spoken and visual expressions. We present a novel motion capture mining technique that "learns" speech coarticulation models for diphones and triphones from the recorded data. A phoneme-independent expression eigenspace (PIEES) that encloses the dynamic expression signals is constructed by motion signal processing (phoneme-based time-warping and subtraction) and principal component analysis (PCA) reduction. New expressive facial animations are synthesized as follows: First, the learned coarticulation models are concatenated to synthesize neutral visual speech according to novel speech input, then a texture-synthesis-based approach is used to generate a novel dynamic expression signal from the PIEES model, and finally the synthesized expression signal is blended with the synthesized neutral visual speech to create the final expressive facial animation. Our experiments demonstrate that the system can effectively synthesize realistic expressive facial animation 相似文献

9.

Expressive motions recognition and analysis with learning and statistical methods

Ajili Insaf Ramezanpanah Zahra Mallem Malik Didier Jean-Yves 《Multimedia Tools and Applications》2019,78(12):16575-16600

相似文献

10.

Decisions in software development projects management. An exploratory study

《Behaviour & Information Technology》2012,31(11):1077-1085

Given the importance of software in today's world, the development of software systems is a key activity that requires complex management scenarios. This article explores the implications of hard decisions in the context of software development projects (SDPs). More in deep, it focuses on the emotional consequences of making hard decisions in IT organisations. Complex SDPs involve a great variety of actors. This fact entails morale, feelings and emotions, which play an important role for communication, interaction and, ultimately, decision making. The aim of the article is twofold. First (Study 1), to identify which are the most important hard decisions in SDPS. Second (Study 2), to study the influence of emotions on decision-making processes (Study 2). Findings show the complex emotional consequences and difficulties that managers must face in hard decision-making processes. 相似文献

11.

Modeling appraisal in theory of mind reasoning

Mei Si Stacy C. Marsella David V. Pynadath 《Autonomous Agents and Multi-Agent Systems》2010,20(1):14-31

Cognitive appraisal theories, which link human emotional experience to their interpretations of events happening in the environment, are leading approaches to model emotions. Cognitive appraisal theories have often been used both for simulating “real emotions” in virtual characters and for predicting the human user’s emotional experience to facilitate human–computer interaction. In this work, we investigate the computational modeling of appraisal in a multi-agent decision-theoretic framework using Partially Observable Markov Decision Process-based (POMDP) agents. Domain-independent approaches are developed for five key appraisal dimensions (motivational relevance, motivation congruence, accountability, control and novelty). We also discuss how the modeling of theory of mind (recursive beliefs about self and others) is realized in the agents and is critical for simulating social emotions. Our model of appraisal is applied to three different scenarios to illustrate its usages. This work not only provides a solution for computationally modeling emotion in POMDP-based agents, but also illustrates the tight relationship between emotion and cognition—the appraisal dimensions are derived from the processes and information required for the agent’s decision-making and belief maintenance processes, which suggests a uniform cognitive structure for emotion and cognition. 相似文献

12.

An objective and subjective study of the role of semantics and prosodic features in building corpora for emotional TTS

Navas E. Hernaez I. Iker Luengo 《IEEE transactions on audio, speech, and language processing》2006,14(4):1117-1127

Building a text corpus suitable to be used in corpus-based speech synthesis is a time-consuming process that usually requires some human intervention to select the desired phonetic content and the necessary variety of prosodic contexts. If an emotional text-to-speech (TTS) system is desired, the complexity of the corpus generation process increases. This paper presents a study aiming to validate or reject the use of a semantically neutral text corpus for the recording of both neutral and emotional (acted) speech. The use of this kind of texts would eliminate the need to include semantically emotional texts into the corpus. The study has been performed for Basque language. It has been made by performing subjective and objective comparisons between the prosodic characteristics of recorded emotional speech using both semantically neutral and emotional texts. At the same time, the performed experiments allow for an evaluation of the capability of prosody to carry emotional information in Basque language. Prosody manipulation is the most common processing tool used in concatenative TTS. Experiments of automatic recognition of the emotions considered in this paper (the "Big Six emotions") show that prosody is an important emotional indicator, but cannot be the only manipulated parameter in an emotional TTS system-at least not for all the emotions. Resynthesis experiments transferring prosody from emotional to neutral speech have also been performed. They corroborate the results and support the use of a neutral-semantic-content text in databases for emotional speech synthesis. 相似文献

13.

Motion capture employing an uncalibrated camera

Joo Kooi Tan Seiji Ishikawa Kensuke Kouno Hirofumi Ohbuchi Hyoungseop Kim 《Artificial Life and Robotics》2008,13(1):311-314

This paper describes a novel optical motion capture technique. Motion capture is a technique for producing a 3-D model of a human motion or action. Unlike the existent methods of motion capture, the presented technique neither performs camera calibration nor employs markers. Instead it makes use of a motion database, and, by retrieving the database, it recognizes the unknown motion of an observed human and it reproduces the motion by use of a digital human model. This new motion capture system may provide a simple and easy-to-use motion capture system. Performance of the proposed technique is shown experimentally. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008 相似文献

14.

Interaction of robot with humans by communicating simulated emotional states through expressive movements

Sara Baber Sial Muhammad Baber Sial Yasar Ayaz Syed Irtiza Ali Shah Aleksandar Zivanovic 《Intelligent Service Robotics》2016,9(3):231-255

This paper presents a non-verbal and non-facial method for effective communication of a “mechanoid robot” by conveying the emotions through gestures. This research focuses on human–robot interaction using a mechanoid robot that does not possess any anthropomorphic facial features for conveying gestures. Another feature of this research is the use of human-like smooth motion of this mechanoid robot in contrast to the traditional trapezoidal velocity profile for its communication. For conveying gestures, the connection between motion of robot and perceived emotions is established by varying the velocity and acceleration of the mechanoid structure. The selected motion parameters are changed systematically to observe the variation in perceived emotions. The perceived emotions have been further investigated using three different emotional behavior models: Russell’s circumplex model of affect, Tellegen–Watson–Clark model and PAD model. Results obtained show that the designated motion parameters are linked with the change of emotions. Moreover, the emotions perceived by the user are same through all three models, validating the reliability of all the three emotional scale models and also of the emotions perceived by the user. 相似文献

15.

The Rovereto Emotion and Cooperation Corpus: a new resource to investigate cooperation and emotions

Federica Cavicchio Massimo Poesio 《Language Resources and Evaluation》2012,46(1):117-130

The Rovereto Emotion and Cooperation Corpus (RECC) is a new resource collected to investigate the relationship between cooperation and emotions in an interactive setting. Previous attempts at collecting corpora to study emotions have shown that this data are often quite difficult to classify and analyse, and coding schemes to analyse emotions are often found not to be reliable. We collected a corpus of task-oriented (MapTask-style) dialogues in Italian, in which the segments of emotional interest are identified using psycho-physiological indexes (Heart Rate and Galvanic Skin Conductance) which are highly reliable. We then annotated these segments in accordance with novel multimodal annotation schemes for cooperation (in terms of effort) and facial expressions (an indicator of emotional state). High agreement was obtained among coders on all the features. The RECC corpus is to our knowledge the first resource with psycho-physiological data aligned with verbal and nonverbal behaviour data. 相似文献

16.

Glances,glares, and glowering: how should a virtual human express emotion through gaze?

Brent Lance Stacy Marsella 《Autonomous Agents and Multi-Agent Systems》2010,20(1):50-69

Gaze is an extremely powerful expressive signal that is used for many purposes, from expressing emotion to regulating human interaction. The use of gaze as a signal has been exploited to strong effect in hand-animated characters, greatly enhancing the believability of the character’s simulated life. However, virtual humans animated in real-time have been less successful at using expressive gaze. One reason for this is that we lack a model of expressive gaze in virtual humans. A gaze shift towards any specific target can be performed in many different ways, using many different expressive manners of gaze, each of which can potentially imply a different emotional or cognitive internal state. However, there is currently no mapping that describes how a user will attribute these internal states to a virtual character performing a gaze shift in a particular manner. In this paper, we begin to address this by providing the results of an empirical study that explores the mapping between an observer’s attribution of emotional state to gaze. The purpose of this mapping is to allow for an interactive virtual human to generate believable gaze shifts that a user will attribute a desired emotional state to. We have generated a set of animations by composing low-level gaze attributes culled from the nonverbal behavior literature. Then, subjects judged the animations displaying these attributes. While the results do not provide a complete mapping between gaze and emotion, they do provide a basis for a generative model of expressive gaze. 相似文献

17.

A blog emotion corpus for emotional expression analysis in Chinese

Changqin Quan Fuji Ren 《Computer Speech and Language》2010,24(4):726-749

Weblogs are increasingly popular modes of communication and they are frequently used as mediums for emotional expression in the ever changing online world. This work uses blogs as object and data source for Chinese emotional expression analysis. First, a textual emotional expression space model is described, and based on this model, a relatively fine-grained annotation scheme is proposed for manual annotation of an emotion corpus. In document and paragraph levels, emotion category, emotion intensity, topic word and topic sentence are annotated. In sentence level, emotion category, emotion intensity, emotional keyword and phrase, degree word, negative word, conjunction, rhetoric, punctuation, objective or subjective, and emotion polarity are annotated. Then, using this corpus, we explore these linguistic expressions that indicate emotion in Chinese, and present a detailed data analysis on them, involving mixed emotions, independent emotion, emotion transfer, and analysis on words and rhetorics for emotional expression. 相似文献

18.

Fiction support for realistic portrayals of fear-type emotional manifestations

C. Clavel I. Vasilescu L. Devillers 《Computer Speech and Language》2011,25(1):63-83

The present paper aims at filling the lack that currently exists with respect to databases containing emotional manifestations. Emotions, such as strong emotions, are indeed difficult to collect in real-life. They occur during contexts, which are generally unpredictable, and some of them such as anger are less frequent in public life than in private. Even though such emotions are not so present in existing databases, the need for applications, which target them (crisis management, surveillance, strategic intelligence, etc.), and the need for emotional recordings is even more acute. We propose here to use fictional media to compensate for the difficulty of collecting strong emotions. Emotions in realistic fictions are portrayed by skilled actors in interpersonal interactions. The mise-en-scene of the actors tends to stir genuine emotions. In addition, fiction offers an overall view of emotional manifestations in various real-life contexts: face-to-face interactions, phone calls, interviews, emotional event reporting vs. in situ emotional manifestations. A fear-type emotion recognition system has been developed, that is based on acoustic models learnt from the fiction corpus. This paper aims at providing an in-depth analysis of the various factors that may influence the system behaviour: the annotation issue and the acoustic features behaviour. These two aspects emphasize the main feature of fiction: the variety of the emotional manifestations and of their context. 相似文献

19.

System Integration for Cognitive Model of a Robot Partner

Jinseok Woo János Botzheim Naoyuki Kubota 《Intelligent Automation and Soft Computing》2018,24(4):829-841

This paper introduces the integrated system of a smart-device-based cognitive robot partner called iPhonoid-C. Interaction with a robot partner requires many elements, including verbal communication, nonverbal communication, and embodiment as well. A robot partner should be able to understand human sentences, as well as nonverbal information such as human gestures. In the proposed system, the robot has an emotional model connecting the input information from the human with the robot’s behavior. Since emotions are involved in human natural communication, and emotion has a significant impact on humans’ actions, it is important to develop an emotional model for the robot partner to enhance human robot interaction. In our proposed system, human sentences and gestures influence the robot’s emotional state, and then the robot will perform gestural and facial expressions and generate sentences according to its emotional state. The proposed cognitive method is validated using a real robot partner. 相似文献

20.

Human motion generation with multifactor models

Gengdai Liu Mingliang Xu Zhigeng Pan Abdennour El Rhalibi 《Computer Animation and Virtual Worlds》2011,22(4):351-359

To generate human motions with various specific attributes is a difficult task because of high dimensionality and complexity of human motions. This paper presents a novel human motion model for generating and editing motions with multiple factors. A set of motions performed by several actors with various styles was captured for constructing a well‐structured motion database. Subsequently, MICA (multilinear independent component analysis) model that combines ICA and conventional multilinear framework was adopted for the construction of a multifactor model. With this model, new motions can be synthesized by interpolation and through solving optimization problems for the specific factors. Our method offers a practical solution to edit stylistic human motions in a parametric space learnt with MICA model. We demonstrated the power of our method by generating and editing sideways stepping, reaching, and striding over obstructions using different actors with various styles. The experimental results show that our method can be used for interactive stylistic motion synthesis and editing. Copyright © 2011 John Wiley & Sons, Ltd. 相似文献