Similar literature (20 records)
1.
This paper proposes a two-stage speech emotion recognition approach based on speaking rate. The emotions considered in this study are anger, disgust, fear, happiness, neutral, sadness, sarcasm and surprise. In the first stage, the eight emotions are categorized by speaking rate into three broad groups: active (fast), normal and passive (slow). In the second stage, these three broad groups are further classified into individual emotions using vocal tract characteristics. Gaussian mixture models (GMMs) are used for developing the emotion models. Emotion classification performance at the broader level, based on speaking rate, is found to be around 99% for speaker- and text-dependent cases. Overall emotion classification performance is observed to improve with the proposed two-stage approach. Along with spectral features, formant features are explored in the second stage to achieve robust emotion recognition in the speaker-, gender- and text-independent cases.
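As a rough illustration of the second-stage modelling described above, the sketch below fits one Gaussian mixture model per emotion and scores a test utterance only against the emotions in the speaking-rate group chosen by the first stage. It is a minimal sketch, assuming MFCCs from librosa as a stand-in for the paper's spectral and formant features and scikit-learn's GaussianMixture; the file lists, group composition and model sizes are hypothetical, not the authors' configuration.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_features(path, sr=16000, n_mfcc=13):
    """Frame-level MFCCs as a stand-in for the paper's spectral/formant features."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T   # shape: (frames, n_mfcc)

def train_emotion_gmms(files_by_emotion, n_components=8):
    """Fit one GMM per emotion on pooled frame-level features (second stage)."""
    models = {}
    for emotion, paths in files_by_emotion.items():
        X = np.vstack([mfcc_features(p) for p in paths])
        models[emotion] = GaussianMixture(
            n_components=n_components, covariance_type="diag", random_state=0
        ).fit(X)
    return models

def classify_within_group(path, models, group):
    """Score only the emotions in the speaking-rate group chosen by stage 1 and
    pick the one whose GMM gives the highest average log-likelihood."""
    X = mfcc_features(path)
    scores = {e: models[e].score(X) for e in group}
    return max(scores, key=scores.get)

# Hypothetical usage:
# models = train_emotion_gmms({"anger": anger_files, "happiness": happy_files, ...})
# label = classify_within_group("test.wav", models, group=["anger", "happiness", "surprise"])
```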

2.
A ZCPA-Based VAP Model for Speech Emotion
This paper analyses the principle of a psychology-based emotion space model. It studies the distribution of seven emotions (neutral, joy, anger, surprise, fear, sadness and disgust) along the valence-activation-power (VAP) dimensions in speech emotion recognition. Using the maximum, minimum, mean and absolute variance sum of the zero-crossings with peak amplitudes (ZCPA) feature, the relationship between the dimensional levels and the ZCPA prosodic features is analysed in the three-dimensional VAP space. Experimental results show that this emotion space model helps to describe and distinguish different speech emotions.
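To make the prosodic statistics concrete, the following sketch collects peak amplitudes between upward zero crossings of the waveform and summarises them with the four statistics named above (maximum, minimum, mean, absolute variance). It is a deliberate simplification of ZCPA, which in full form first filters the signal into auditory sub-bands and accumulates a frequency histogram.

```python
import numpy as np

def zcpa_stats(signal, frame_len=400, hop=160):
    """Per-frame peak amplitudes between successive upward zero crossings,
    summarised by simple statistics (a simplified stand-in for ZCPA features)."""
    peaks = []
    for start in range(0, len(signal) - frame_len, hop):
        frame = signal[start:start + frame_len]
        zc = np.where((frame[:-1] < 0) & (frame[1:] >= 0))[0]   # upward zero crossings
        for a, b in zip(zc[:-1], zc[1:]):
            peaks.append(np.max(np.abs(frame[a:b])))
    peaks = np.asarray(peaks)
    return {
        "max": float(peaks.max()),
        "min": float(peaks.min()),
        "mean": float(peaks.mean()),
        "abs_var": float(np.var(np.abs(peaks))),
    }
```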

3.
As technology advances, robots and virtual agents will be introduced into the home and healthcare settings to assist individuals, both young and old, with everyday living tasks. Understanding how users recognize an agent's social cues is therefore imperative, especially in social interactions. Facial expression, in particular, is one of the most common non-verbal cues used to display and communicate emotion in on-screen agents (Cassell et al., 2000). Age is important to consider because age-related differences in emotion recognition of human facial expression have been documented (Ruffman et al., 2008), with older adults showing a deficit for recognition of negative facial expressions. Previous work has shown that younger adults can effectively recognize facial emotions displayed by agents (Bartneck and Reichenbach, 2005; Courgeon et al., 2009; Courgeon et al., 2011; Breazeal, 2003); however, little research has compared in depth younger and older adults' ability to label a virtual agent's facial emotions, an important consideration because social agents will be required to interact with users of varying ages. If such age-related differences exist for recognition of virtual agent facial expressions, we aim to understand whether those differences are influenced by the intensity of the emotion, the dynamic formation of the emotion (i.e., a neutral expression developing into an expression of emotion through motion), or the type of virtual character, differing by human-likeness. Study 1 investigated the relationship between age-related differences, the implication of dynamic formation of emotion, and the role of emotion intensity in recognition of the facial expressions of a virtual agent (iCat). Study 2 examined age-related differences in recognition of expressions by three types of virtual characters differing by human-likeness (non-humanoid iCat, synthetic human, and human). Study 2 also investigated the role of configural and featural processing as a possible explanation for age-related differences in emotion recognition. First, our findings show age-related differences in the recognition of emotions expressed by a virtual agent, with older adults showing lower recognition for the emotions of anger, disgust, fear, happiness, sadness, and neutral. These age-related differences might be explained by older adults having difficulty discriminating similarity in the configural arrangement of facial features for certain emotions; for example, older adults often mislabeled the similar emotion of fear as surprise. Second, our results did not provide evidence for dynamic formation improving emotion recognition, but, in general, the intensity of the emotion improved recognition. Lastly, we learned that emotion recognition, for older and younger adults, differed by character type, from best to worst: human, synthetic human, and then iCat. Our findings provide guidance for design, as well as the development of a framework of age-related differences in emotion recognition.

4.
The classification of facial expressions by cascade-correlation neural networks [1] is described. A success rate of 100% over the training data for each of six categories of emotion (happiness, sadness, anger, surprise, fear and disgust), and of up to 87.5% over the same categories for the test data, has been achieved. By using a single-emotion net for each category, together with a Net for Resolution, the results represent a 12.5% success rate beyond what was achieved by a single net classifying over all six emotion categories. Face data in the form of 10 hand measurements made on 94 well-validated full-face photographs [2] provided the input data after normalisation. These measures, among others, had previously been shown to discriminate between emotions [3].

5.
With technology allowing for increased realism in video games, realistic, human-like characters risk falling into the Uncanny Valley. The Uncanny Valley phenomenon implies that virtual characters approaching full human-likeness will evoke a negative reaction from the viewer, due to aspects of the character's appearance and behavior differing from the human norm. This study investigates whether "uncanniness" is increased for a character with a perceived lack of facial expression in the upper parts of the face. More importantly, our study also investigates whether the magnitude of this increased uncanniness varies depending on which emotion is being communicated. Individual parameters for each facial muscle in a 3D model were controlled for six emotions (anger, disgust, fear, happiness, sadness and surprise) in addition to a neutral expression. The results indicate that even fully and expertly animated characters are rated as more uncanny than humans and that, in virtual characters, a lack of facial expression in the upper parts of the face during speech exaggerates the uncanny by inhibiting effective communication of the perceived emotion, significantly so for fear, sadness, disgust, and surprise but not for anger and happiness. Based on our results, we consider the implications for virtual character design.

6.
Human emotion expressed in social media plays an increasingly important role in shaping policies and decisions. However, the process by which emotion produces influence in online social media networks is relatively unknown. Previous works focus largely on sentiment classification and polarity identification but do not adequately consider the way emotion affects user influence. This research developed a novel framework, a theory-based model, and a proof-of-concept system for dissecting emotion and user influence in social media networks. The system models emotion-triggered influence and facilitates analysis of emotion-influence causality in the context of U.S. border security (using 5,327,813 tweets posted by 1,303,477 users). Motivated by a theory of emotion spread, the model was integrated into an influence-computation method, called the interaction modeling (IM) approach, which was compared with a benchmark using a user centrality (UC) approach based on social positions. IM was found to identify influential users who are more broadly related to U.S. cultural issues. Influential users tended to express intense emotions of fear, anger, disgust, and sadness. The emotion of trust distinguishes influential users from others, whereas anger and fear contributed significantly to causing user influence. The research contributes to incorporating human emotion into the data-information-knowledge-wisdom model of knowledge management and to providing new information systems artifacts and new causality findings for emotion-influence analysis.

7.
Context-Independent Multilingual Emotion Recognition from Speech Signals
This paper presents and discusses an analysis of multilingual emotion recognition from speech with database-specific emotional features. Recognition was performed on the English, Slovenian, Spanish, and French InterFace emotional speech databases. The InterFace databases include several neutral speaking styles and six emotions: disgust, surprise, joy, fear, anger and sadness. Speech features for emotion recognition were determined in two steps: in the first step, low-level features were defined, and in the second, high-level features were calculated from the low-level features. Low-level features are composed of pitch, the derivative of pitch, energy, the derivative of energy, and the duration of speech segments. High-level features are statistical representations of the low-level features. Database-specific emotional features were selected from the high-level features that contain the most information about emotions in speech. Speaker-dependent and monolingual emotion recognisers were defined, as well as multilingual recognisers. Emotion recognition was performed using artificial neural networks. The achieved recognition accuracy was highest for speaker-dependent emotion recognition, lower for monolingual emotion recognition, and lowest for multilingual recognition. The database-specific emotional features are most convenient for use in multilingual emotion recognition. Among speaker-dependent, monolingual, and multilingual emotion recognition, the difference between emotion recognition with all high-level features and emotion recognition with database-specific emotional features is smallest for multilingual emotion recognition, at 3.84%.
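The two-step feature scheme described here (low-level contours, then statistics as high-level features) can be sketched with common tools. The sketch below is illustrative only: it assumes librosa's pYIN pitch tracker and RMS energy for the low-level contours, a handful of simple statistics as high-level features, and a scikit-learn MLP as the neural network; the actual contours, statistics and network in the paper differ.

```python
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def low_level(y, sr):
    """Low-level contours: pitch, delta pitch, energy, delta energy."""
    f0, _, _ = librosa.pyin(y, fmin=60, fmax=400, sr=sr)
    f0 = f0[~np.isnan(f0)]                       # keep voiced frames only
    energy = librosa.feature.rms(y=y)[0]
    return [f0, np.diff(f0), energy, np.diff(energy)]

def high_level(y, sr):
    """High-level features: simple statistics of each low-level contour."""
    feats = []
    for contour in low_level(y, sr):
        feats += [contour.mean(), contour.std(), contour.min(),
                  contour.max(), contour.max() - contour.min()]
    return np.array(feats)

# Hypothetical training loop (train_files and train_labels are assumed to exist):
# X = np.vstack([high_level(*librosa.load(f, sr=16000)) for f in train_files])
# clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X, train_labels)
```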

8.
Emotion recognition has broad application prospects in artificial intelligence. Emotion recognition based on facial expressions is already relatively mature, whereas research on recognizing emotion from human body movements remains limited. This work extracts body-movement features from image sequences in the three-dimensional spatio-temporal domain using the VLBP and LBP-TOP operators, analyses the characteristics of seven natural emotions (anger, boredom, disgust, fear, happiness, puzzlement and sadness), and classifies them with a parameter-optimized support vector machine, achieving a recognition rate of up to 77.0%. Experimental results show that the VLBP and LBP-TOP operators are robust and can effectively recognize human emotion from body movements.
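For readers unfamiliar with LBP-TOP, the sketch below outlines the idea: uniform LBP histograms are accumulated over slices of the spatio-temporal volume along the XY, XT and YT planes and concatenated into one descriptor, which can then feed an SVM. It uses scikit-image's local_binary_pattern and is a simplified illustration of the operator named above (VLBP is not shown, and the SVM parameters are placeholders rather than the tuned values).

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_top(volume, P=8, R=1):
    """LBP-TOP descriptor for a (T, H, W) grayscale image sequence: uniform LBP
    histograms accumulated over all XY, XT and YT slices, then concatenated."""
    n_bins = P + 2                                # 'uniform' LBP yields P + 2 codes

    def plane_hist(slices):
        h = np.zeros(n_bins)
        for img in slices:
            codes = local_binary_pattern(img, P, R, method="uniform")
            h += np.histogram(codes, bins=n_bins, range=(0, n_bins))[0]
        return h / h.sum()

    T, H, W = volume.shape
    xy = plane_hist(volume[t] for t in range(T))          # appearance
    xt = plane_hist(volume[:, y, :] for y in range(H))    # horizontal motion
    yt = plane_hist(volume[:, :, x] for x in range(W))    # vertical motion
    return np.concatenate([xy, xt, yt])

# Hypothetical usage; in practice the SVM parameters would be tuned (e.g. grid search):
# from sklearn.svm import SVC
# X = np.vstack([lbp_top(seq) for seq in sequences])
# clf = SVC(kernel="rbf", C=10, gamma="scale").fit(X, labels)
```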

9.
With the changes in the interface paradigm, a user may not be satisfied using only a behavior-based interface such as a mouse and keyboard. In this paper, we propose a real-time user interface with emotion recognition to support a shift in the interface paradigm to one that is more human-centered. The proposed emotion recognition technology may provide services that meet the need to recognize emotions while using content. Until now, most studies on emotion recognition interfaces have used a single signal, which was difficult to apply because of low accuracy. In this study, we developed a complex biological-signal emotion recognition system that blends an ECG ratio reflecting the autonomic nervous system with the relative power values of the EEG bands (theta, alpha, beta, and gamma) to improve on this low accuracy. The system creates a data map that stores user-specific probabilities to recognize six kinds of feelings (amusement, fear, sadness, joy, anger, and disgust), and it updates weights to improve the accuracy of the emotion corresponding to the brain waves of each channel. In addition, we compared the results of the complex biological-signal data set and the EEG-only data set to verify the accuracy of the complex biological signal, and found that the accuracy increased by 35.78%. The proposed system can be utilized as a high-accuracy interface for controlling games and smart spaces.
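The two signal-level quantities blended here can be approximated with standard spectral tools. The sketch below computes relative EEG band power with Welch's method and an LF/HF ratio from RR intervals as a crude autonomic index; both are common formulations and are only assumptions about what the paper's "ECG ratio" and "relative power value" denote, and the data map and weight-update stages are not shown.

```python
import numpy as np
from scipy.signal import welch

EEG_BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def relative_band_power(eeg, fs=256):
    """Relative power of theta/alpha/beta/gamma for one EEG channel (Welch PSD)."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)
    in_range = (freqs >= 4) & (freqs <= 45)
    total = np.trapz(psd[in_range], freqs[in_range])
    return {band: np.trapz(psd[(freqs >= lo) & (freqs < hi)],
                           freqs[(freqs >= lo) & (freqs < hi)]) / total
            for band, (lo, hi) in EEG_BANDS.items()}

def lf_hf_ratio(rr_intervals_s, fs_interp=4.0):
    """LF/HF ratio from an RR-interval series: a common autonomic-nervous-system
    index, used here as a guess at the paper's ECG-derived ratio."""
    t = np.cumsum(rr_intervals_s)
    grid = np.arange(t[0], t[-1], 1.0 / fs_interp)
    rr_even = np.interp(grid, t, rr_intervals_s)          # resample to an even grid
    freqs, psd = welch(rr_even - rr_even.mean(), fs=fs_interp,
                       nperseg=min(256, len(rr_even)))
    lf = np.trapz(psd[(freqs >= 0.04) & (freqs < 0.15)],
                  freqs[(freqs >= 0.04) & (freqs < 0.15)])
    hf = np.trapz(psd[(freqs >= 0.15) & (freqs < 0.40)],
                  freqs[(freqs >= 0.15) & (freqs < 0.40)])
    return lf / hf
```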

10.
Social media such as Weibo micro-blogs provide an important platform for people to express emotion, and analysing the emotional tendency of micro-blog posts has significant commercial value and social relevance. This paper proposes a lexicon-based rule method to recognize six emotions expressed in micro-blogs: happiness, sadness, anger, fear, disgust and surprise. Because emoticons are an important cue for emotion expression, an emoticon lexicon was generated using mutual information and combined with a traditional emotion lexicon, and rules for negation usage were formulated to analyse the posts. The first manually annotated micro-blog data set covering the six emotions was constructed. Experiments show that although traditional emotion lexicons contain a large vocabulary, their accuracy and coverage for social media text analysis are low; applying the emoticon lexicon significantly improves the precision and coverage of micro-blog emotion analysis.
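One plausible realisation of the emoticon-lexicon construction and the negation rule is sketched below: pointwise mutual information between tokens (here, emoticons) and emotion labels is computed over a labelled corpus, and a token immediately preceded by a negator has its contribution flipped at classification time. The negator list, thresholds and scoring rule are illustrative assumptions, not the paper's exact formulation.

```python
import math
from collections import Counter, defaultdict

def build_emoticon_lexicon(posts, min_count=5):
    """Score each token (e.g. an emoticon) against each emotion label by pointwise
    mutual information over a labelled micro-blog corpus.
    `posts` is an iterable of (tokens, emotion_label) pairs; which tokens count as
    emoticons is assumed to be decided upstream."""
    n = 0
    emo_count, tok_count = Counter(), Counter()
    joint = defaultdict(Counter)
    for tokens, label in posts:
        n += 1
        emo_count[label] += 1
        for tok in set(tokens):
            tok_count[tok] += 1
            joint[tok][label] += 1
    lexicon = defaultdict(dict)
    for tok, per_label in joint.items():
        if tok_count[tok] < min_count:
            continue
        for label, c in per_label.items():
            lexicon[tok][label] = math.log((c * n) / (tok_count[tok] * emo_count[label]))
    return lexicon

NEGATORS = {"不", "没", "没有", "别"}   # illustrative negation cues

def classify(tokens, lexicon):
    """Rule-based scoring: sum lexicon scores, flipping the contribution of a word
    that directly follows a negator (a much-simplified negation rule)."""
    scores = Counter()
    for i, tok in enumerate(tokens):
        sign = -1 if i > 0 and tokens[i - 1] in NEGATORS else 1
        for label, w in lexicon.get(tok, {}).items():
            scores[label] += sign * w
    return scores.most_common(1)[0][0] if scores else "none"
```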

11.
To address the speaker-independent emotion recognition problem, a three-level speech emotion recognition model is proposed to classify six speech emotions (sadness, anger, surprise, fear, happiness and disgust) from coarse to fine. For each level, appropriate features are selected from 288 candidates using the Fisher rate, and the selected features serve as input to a Support Vector Machine (SVM). To evaluate the proposed system, principal component analysis (PCA) for dimension reduction and an artificial neural network (ANN) for classification are adopted to design four comparative experiments: Fisher + SVM, PCA + SVM, Fisher + ANN, and PCA + ANN. The experimental results show that Fisher is better than PCA for dimension reduction, and that SVM is more expansible than ANN for speaker-independent speech emotion recognition. The average recognition rates for the three levels are 86.5%, 68.5% and 50.2%, respectively.
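The Fisher rate used for feature selection above is, per feature, the ratio of between-class scatter of the class means to the mean within-class variance. The sketch below computes it directly and keeps the top-k features for a scikit-learn SVM; the value of k, the kernel and the coarse-to-fine level structure are assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import SVC

def fisher_rate(X, y):
    """Fisher discriminant ratio per feature: between-class variance of the class
    means divided by the pooled within-class variance."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)

def select_and_train(X, y, k=30):
    """Keep the k features with the highest Fisher rate and fit an SVM on them."""
    idx = np.argsort(fisher_rate(X, y))[::-1][:k]
    clf = SVC(kernel="rbf", gamma="scale").fit(X[:, idx], y)
    return clf, idx
```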

12.
For emotion recognition based on physiological signals, feature selection is performed with a genetic algorithm incorporating a simulated-annealing mechanism, a max-min ant colony algorithm and particle swarm optimization, and a Fisher classifier is used to classify six emotions: happiness, surprise, disgust, sadness, anger and fear. High recognition rates were obtained, feature combinations that perform well for building the emotion recognition model were identified, and a recognition system capable of predicting the six emotion classes was established.
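As one concrete reading of the wrapper-style feature selection described above, the sketch below runs a plain genetic algorithm over binary feature masks and scores each mask by the cross-validated accuracy of a Fisher (linear discriminant) classifier. The simulated-annealing mechanism and the ant colony and particle swarm variants are omitted, and scikit-learn's LDA stands in for the Fisher classifier; population size, mutation rate and generation count are arbitrary.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    """Cross-validated accuracy of an LDA (Fisher) classifier on the selected features."""
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(LinearDiscriminantAnalysis(),
                           X[:, mask.astype(bool)], y, cv=5).mean()

def ga_select(X, y, pop_size=20, generations=30, p_mut=0.05):
    """Plain genetic algorithm over binary feature masks (no annealing schedule)."""
    n = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n))
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]   # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                                # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n) < p_mut                            # bit-flip mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = np.vstack([parents, children])
    best = pop[np.argmax([fitness(ind, X, y) for ind in pop])]
    return best.astype(bool)
```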

13.
Robin H. Kay, Computers & Education, 2008, 50(4): 1269-1283
Most computer users have to deal with major software upgrades every 6-18 months. Given the pressure of having to adjust so quickly and so often, it is reasonable to assume that users will express emotional reactions such as anger, desperation, anxiety, or relief during the learning process. To date, the primary emotion studied with respect to computer knowledge has been anxiety or fear. The purpose of the following study was to explore the relationship among a broader range of emotions (anger, anxiety, happiness, and sadness) and the acquisition of nine computer-related skills. Pre- and post-surveys were given to 184 preservice education students enrolled in an 8-month integrated laptop program. Happiness was expressed most of the time; anxiety, anger, and sadness were reported sometimes. Anxiety and anger levels decreased significantly, while computer knowledge increased. All four emotions were significantly correlated with all nine computer knowledge areas at the beginning of the program, but happiness and anxiety were the only emotions significantly related to change in computer knowledge.

14.
Recently, researchers have tried to better understand human behaviors so as to let robots act in more human-like ways, which means a robot may have its own emotions as defined by its designers. To achieve this goal, in this study we designed and simulated a robot, named Shiau_Lu, which is empowered with six universal human emotions: happiness, anger, fear, sadness, disgust and surprise. When we input a sentence to Shiau_Lu through voice, it recognizes the sentence by invoking the Google speech recognition method running on an Android system and outputs a sentence that reveals its current emotional state. Each input sentence affects the strength of the six emotional variables used to represent the six emotions, one variable per emotion, after which the emotional variables change into new states. The consequent fuzzy inference process determines the most significant emotion as the primary emotion, with which an appropriate output sentence, as a response to the input, is chosen from the robot's Output-sentence database. With the new states of the six emotional variables, when the robot encounters another sentence, the above process repeats and another output sentence is selected and replied. Artificial intelligence and psychological theories of human behaviors have been applied to the robot to simulate how emotions are influenced by the outside world through language. The robot may also help autistic children to interact more with the world around them and relate themselves well to the outside world.

15.
This paper explores the excitation source features of the speech production mechanism for characterizing and recognizing emotions from the speech signal. The excitation source signal is obtained from the speech signal using linear prediction (LP) analysis, and it is also known as the LP residual. The glottal volume velocity (GVV) signal is also used to represent the excitation source, and it is derived from the LP residual signal. The speech signal has a high signal-to-noise ratio around the instants of glottal closure (GC), which are also known as epochs. In this paper, the following excitation source features are proposed for characterizing and recognizing emotions: the sequence of LP residual samples and their phase information, parameters of epochs and their dynamics at the syllable and utterance levels, and samples of the GVV signal and its parameters. Auto-associative neural networks (AANN) and support vector machines (SVM) are used for developing the emotion recognition models. Telugu and Berlin emotion speech corpora are used to evaluate the developed models. Anger, disgust, fear, happiness, neutral and sadness are the six emotions considered in this study. An average emotion recognition performance of about 42% to 63% is observed using the different excitation source features. Further, the combination of excitation source and spectral features is shown to improve the emotion recognition performance up to 84%.
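Extracting the LP residual mentioned above amounts to inverse-filtering each frame with its own LP coefficients. Below is a minimal Python sketch using librosa's LPC (Burg) estimate and scipy's lfilter; non-overlapping frames and a fixed order are simplifying assumptions, and the epoch and GVV parameterisations built on top of the residual are not shown.

```python
import numpy as np
import librosa
from scipy.signal import lfilter

def lp_residual(y, order=10, frame_len=400):
    """Non-overlapping frame-wise LP residual via inverse filtering; the residual
    approximates the excitation-source signal described in the paper."""
    residual = np.zeros_like(y)
    for start in range(0, len(y) - frame_len + 1, frame_len):
        frame = y[start:start + frame_len]
        a = librosa.lpc(frame, order=order)            # LP coefficients, a[0] == 1
        residual[start:start + frame_len] = lfilter(a, [1.0], frame)
    return residual

# Hypothetical usage:
# y, sr = librosa.load("utterance.wav", sr=16000)
# r = lp_residual(y)      # further epoch/GVV features would be derived from r
```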

16.
Representation of facial expressions using continuous dimensions has been shown to be inherently more expressive and psychologically meaningful than using categorized emotions, and it has thus gained increasing attention over recent years. Many sub-problems have arisen in this new field that remain only partially understood. A comparison of the regression performance of different texture and geometric features, and the investigation of the correlations between continuous dimensional axes and basic categorized emotions, are two of these. This paper presents empirical studies addressing these problems, and it reports results from an evaluation of different methods for detecting spontaneous facial expressions within the arousal-valence (AV) dimensional space. The evaluation compares the performance of texture features (SIFT, Gabor, LBP) against geometric features (FAP-based distances), and the fusion of the two. It also compares the prediction of arousal and valence, obtained using the best fusion method, to the corresponding ground truths. Spatial distribution, shift, similarity, and correlation are considered for the six basic categorized emotions (anger, disgust, fear, happiness, sadness, surprise). Using the NVIE database, results show that the fusion of LBP and FAP features performs best. The results from the NVIE and FEEDTUM databases reveal novel findings about the correlations of the arousal and valence dimensions to each of the six basic emotion categories.
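With dimensional rather than categorical labels, the learning problem reduces to regression. The sketch below shows feature-level fusion (concatenation) of texture and geometric descriptors feeding one support vector regressor each for arousal and valence; the feature extractors themselves (LBP histograms, FAP distances) are assumed to exist upstream, and the array names are placeholders.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def fuse(texture_feats, geometric_feats):
    """Feature-level fusion: concatenate texture (e.g. LBP) and geometric
    (e.g. FAP-distance) descriptors per sample."""
    return np.hstack([texture_feats, geometric_feats])

def train_av_regressors(X_tex, X_geo, arousal, valence):
    """Fit separate SVR models for arousal and valence on the fused features.
    X_tex, X_geo: (n_samples, d) arrays; arousal, valence: continuous labels."""
    X = fuse(X_tex, X_geo)
    reg_a = make_pipeline(StandardScaler(), SVR(kernel="rbf")).fit(X, arousal)
    reg_v = make_pipeline(StandardScaler(), SVR(kernel="rbf")).fit(X, valence)
    return reg_a, reg_v
```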

17.
To overcome the low recognition accuracy caused by the shortcomings of traditional speech emotion recognition models, process neural networks are introduced into speech emotion recognition. Fundamental frequency, amplitude and voice-quality parameters are extracted as speech emotion features, wavelet analysis is used for denoising, and principal component analysis (PCA) removes redundancy; a process neural network then recognizes four emotions: anger, happiness, sadness and surprise. Experimental results show that, compared with traditional recognition models, the process neural network achieves better recognition performance.
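The preprocessing chain described above (wavelet denoising followed by PCA) can be sketched as follows; the wavelet, decomposition level and thresholding rule are assumptions, and the process neural network classifier itself, which has no off-the-shelf implementation in standard libraries, is not shown.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

def wavelet_denoise(x, wavelet="db4", level=4):
    """Soft-threshold wavelet denoising with the universal threshold, as one
    plausible realisation of the paper's wavelet-based noise reduction."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745        # noise estimate from finest detail
    thr = sigma * np.sqrt(2 * np.log(len(x)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(x)]

def reduce_features(X, n_components=0.95):
    """PCA retaining 95% of the variance to remove redundancy among the
    F0/amplitude/voice-quality feature vectors (rows of X)."""
    return PCA(n_components=n_components).fit_transform(X)
```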

18.
We present an algorithm for generating facial expressions for a continuum of pure and mixed emotions of varying intensity. Based on the observation that in natural interaction among humans, shades of emotion are encountered much more frequently than expressions of basic emotions, a method is required to generate more than Ekman's six basic emotions (joy, anger, fear, sadness, disgust and surprise). To this end, we have adapted the algorithm proposed by Tsapatsoulis et al. [1] to be applicable to a physics-based facial animation system and a single, integrated emotion model. The physics-based facial animation system was combined with an equally flexible and expressive text-to-speech synthesis system, based upon the same emotion model, to form a talking head capable of expressing non-basic emotions of varying intensities. With a variety of life-like intermediate facial expressions captured as snapshots from the system, we demonstrate the appropriateness of our approach.
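The idea of a continuum of pure and mixed emotions can be illustrated, in a much-simplified form, as a linear blend of full-intensity expression parameter vectors around a neutral pose. This is only a stand-in for the interpolation scheme adapted from Tsapatsoulis et al., not the authors' algorithm, and the parameter vectors and intensities below are hypothetical.

```python
import numpy as np

def blend_expression(basic_expressions, intensities, neutral):
    """Linear blend of basic-emotion parameter vectors, weighted by intensity.
    basic_expressions: emotion name -> full-intensity parameter vector
                       (e.g. muscle activations or morph-target weights).
    intensities:       emotion name -> value in [0, 1].
    neutral:           parameter vector of the neutral expression."""
    out = neutral.astype(float).copy()
    for emotion, w in intensities.items():
        out += np.clip(w, 0.0, 1.0) * (basic_expressions[emotion] - neutral)
    return out

# Hypothetical example: a mostly-sad expression with a touch of fear.
# params = blend_expression(basic, {"sadness": 0.7, "fear": 0.2}, neutral)
```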

19.
Emotion recognition using physiological signals has gained momentum in the field of human-computer interaction. This work focuses on developing a user-independent emotion recognition system that classifies five emotions (happiness, sadness, fear, surprise and disgust) and the neutral state. The various stages, such as the design of the emotion elicitation protocol, data acquisition, pre-processing, feature extraction and classification, are discussed. Emotional data were obtained from 30 undergraduate students using emotional video clips. Power and entropy features were obtained in three ways: by decomposing and reconstructing the signal using empirical mode decomposition, by using a Hilbert-Huang transform, and by applying a discrete Fourier transform to the intrinsic mode functions (IMFs). Statistical analysis using analysis of variance indicates significant differences among the six emotional states (p < 0.001). Classification results indicate that applying the discrete Fourier transform instead of the Hilbert transform to the IMFs provides comparatively better accuracy for all six classes, with an overall accuracy of 52%. Although the accuracy is low, it reveals the possibility of developing a system that could identify the six emotional states in a user-independent manner using electrocardiogram signals. The accuracy of the system can be improved by investigating the power and entropy of the individual IMFs.
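The third feature variant (DFT applied to the IMFs) can be sketched as follows, assuming the PyEMD package (PyPI name "EMD-signal") for empirical mode decomposition; the number of IMFs kept, the spectral-entropy definition and the absence of any pre-filtering are assumptions made for illustration.

```python
import numpy as np
from PyEMD import EMD   # assumed tooling: the "EMD-signal" package

def imf_power_entropy(ecg, max_imfs=5):
    """Per-IMF power and spectral entropy, with the entropy computed from the
    DFT magnitude spectrum of each intrinsic mode function."""
    imfs = EMD().emd(ecg)[:max_imfs]
    feats = []
    for imf in imfs:
        spectrum = np.abs(np.fft.rfft(imf)) ** 2
        p = spectrum / spectrum.sum()                 # normalised power spectrum
        power = np.mean(imf ** 2)
        entropy = -np.sum(p * np.log2(p + 1e-12))     # Shannon entropy of the spectrum
        feats += [power, entropy]
    return np.array(feats)
```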

20.
A Speech Emotion Recognition Classifier Based on HMM and ANN
罗毅, 《微计算机信息》, 2007, 23(34): 218-219, 296
To address the inherently weak discriminative ability of hidden Markov models (HMMs) when used in isolation for speech emotion recognition, this paper proposes a method that combines an HMM with a radial basis function (RBF) neural network to recognize five speech emotions: surprise, anger, joy, sadness and disgust. The HMM is used to normalize the speech emotion feature vectors, and the RBF network serves as the final decision classifier. Experimental results show that, under the experimental conditions of this paper, the method outperforms an isolated HMM, with a markedly improved recognition rate for disgust.
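A rough reading of this pipeline is sketched below: one Gaussian HMM per emotion turns a variable-length utterance into a fixed-length vector of per-emotion log-likelihoods, which a second-stage classifier then labels. hmmlearn is assumed for the HMMs, and because a radial basis function network has no off-the-shelf implementation here, an SVM with an RBF kernel stands in for the paper's final decision layer; this substitution and all parameters are assumptions.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM     # assumed tooling
from sklearn.svm import SVC

EMOTIONS = ["surprise", "anger", "joy", "sadness", "disgust"]

def train_hmms(features_by_emotion, n_states=5):
    """One Gaussian HMM per emotion, trained on frame-level feature sequences."""
    models = {}
    for emo in EMOTIONS:
        seqs = features_by_emotion[emo]              # list of (frames, dim) arrays
        X = np.vstack(seqs)
        lengths = [len(s) for s in seqs]
        models[emo] = GaussianHMM(n_components=n_states, covariance_type="diag",
                                  n_iter=20).fit(X, lengths)
    return models

def hmm_score_vector(seq, models):
    """Map one utterance to a fixed-length vector of per-emotion, length-normalised
    HMM log-likelihoods (the 'HMM normalises the feature vector' role)."""
    return np.array([models[e].score(seq) / len(seq) for e in EMOTIONS])

# Second stage (stand-in for the RBF network), hypothetical usage:
# X2 = np.vstack([hmm_score_vector(s, models) for s in train_seqs])
# clf = SVC(kernel="rbf").fit(X2, train_labels)
```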
