期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

龙英潮丁美荣林桂锦刘鸿业曾碧卿《计算机系统应用》2021,30(12):218-225

情绪识别作为人机交互的热门领域,其技术已经被应用于医学、教育、安全驾驶、电子商务等领域.情绪主要由面部表情、声音、话语等进行表达,不同情绪表达时的面部肌肉、语气、语调等特征也不相同,使用单一模态特征确定的情绪的不准确性偏高,考虑到情绪表达主要通过视觉和听觉进行感知,本文提出了一种基于视听觉感知系统的多模态表情识别算法,分别从语音和图像模态出发,提取两种模态的情感特征,并设计多个分类器为单特征进行情绪分类实验,得到多个基于单特征的表情识别模型.在语音和图像的多模态实验中,提出了晚期融合策略进行特征融合,考虑到不同模型间的弱依赖性,采用加权投票法进行模型融合,得到基于多个单特征模型的融合表情识别模型.本文使用AFEW数据集进行实验,通过对比融合表情识别模型与单特征的表情识别模型的识别结果,验证了基于视听觉感知系统的多模态情感识别效果要优于基于单模态的识别效果. 相似文献

2.

Speech emotion recognition based on hierarchical attributes using feature nets

Huijuan Zhao Ning Ye Ruchuan Wang 《International Journal of Parallel, Emergent and Distributed Systems》2020,35(3):354-364

Speech emotion recognition is a challenging topic and has many important applications in our real life, especially in terms of human-computer interaction. Traditional methods are based on the pipeline of pre-processing, feature extraction, dimensionality reduction and emotion classification. Previous studies have focussed on emotion recognition based on two different models: discrete model and continuous model. Both the speaker's age and gender affect the speech emotion recognition in the two models. Moreover, investigation results shown that the dimensional attributes of emotion such as arousal, valence and dominance are related to each other. Based on these observations, we propose a new attributes recognition model using Feature Nets, aims to improve the emotion recognition performance and generalisation capabilities. The method utilises the corpus to train the age and gender classification model, which will be transferred to the main model: a hierarchical deep learning model, using age and gender as the high level attributes of the emotion. The public databases EMO-DB and IEMOCAP have been conducted to evaluate the performance both in the classification task and regression task. Experiment results show that the proposed approach based on attributes transferring can improve the recognition accuracy, no matter transferring age or gender. 相似文献

3.

Multimodal mood classification of Hindi and Western songs

Braja Gopal Patra Dipankar Das Sivaji Bandyopadhyay 《Journal of Intelligent Information Systems》2018,51(3):579-596

Music mood classification is one of the most interesting research areas in music information retrieval, and it has many real-world applications. Many experiments have been performed in mood classification or emotion recognition of Western music; however, research on mood classification of Indian music is still at initial stage due to scarcity of digitalized resources. In the present work, a mood taxonomy is proposed for Hindi and Western songs; both audio and lyrics were annotated using the proposed mood taxonomy. Differences in mood were observed during the annotation of the audio and lyrics for Hindi songs only. The detailed studies on mood classification of Hindi and Western music are presented for the requirement of the recommendation system. LibSVM and Feed-forward neural networks have been used to develop mood classification systems based on audio, lyrics, and a combination of them. The multimodal mood classification systems using Feed-forward neural networks for Hindi and Western songs obtained the maximum F-measures of 0.751 and 0.835, respectively. 相似文献

4.

多文化场景下的多模态情感识别

陈师哲王帅金琴《软件学报》2018,29(4):1060-1070

自动情感识别是一个非常具有挑战性的课题,并且有着广泛的应用价值.本文探讨了在多文化场景下的多模态情感识别问题.我们从语音声学和面部表情等模态分别提取了不同的情感特征,包括传统的手工定制特征和基于深度学习的特征,并通过多模态融合方法结合不同的模态,比较不同单模态特征和多模态特征融合的情感识别性能.我们在CHEAVD中文多模态情感数据集和AFEW英文多模态情感数据集进行实验,通过跨文化情感识别研究,我们验证了文化因素对于情感识别的重要影响,并提出3种训练策略提高在多文化场景下情感识别的性能,包括：分文化选择模型、多文化联合训练以及基于共同情感空间的多文化联合训练,其中基于共同情感空间的多文化联合训练通过将文化影响与情感特征分离,在语音和多模态情感识别中均取得最好的识别效果. 相似文献

5.

Multimodal object oriented user interfaces in mobile affective interaction 总被引：1，自引：1，他引：0

Efthymios Alepis Maria Virvou 《Multimedia Tools and Applications》2012,59(1):41-63

In this paper, we investigate an object oriented (OO) architecture for multimodal emotion recognition in interactive applications through mobile phones or handheld devices. Mobile phones are different from desktop computers since mobile phones are not performing any processing involving emotion recognition whereas desktop computers can perform such processing. In fact, in our approach, mobile phones have to pass all data collected to a server and then perform emotion recognition. The object oriented architecture that we have created, combines evidence from multiple modalities of interaction, namely the mobile device’s keyboard and the mobile device’s microphone, as well as data from emotion stereotypes. Moreover, the OO method classifies them into well structured objects with their own properties and methods. The resulting emotion detection server is capable of using and handling transmitted information from different mobile sources of multimodal data during human-computer interaction. As a test bed for the affective mobile interaction we have used an educational application that is incorporated into the mobile system. 相似文献

6.

Assessing the effects of different multimedia materials on emotions and learning performance for visual and verbal style learners

Chih-Ming Chen Ying-Chun Sun 《Computers & Education》2012

Multimedia materials are now increasingly used in curricula. However, individual preferences for multimedia materials based on visual and verbal cognitive styles may affect learners' emotions and performance. Therefore, in-depth studies that investigate how different multimedia materials affect learning performance and the emotions of learners with visual and verbal cognitive styles are needed. Additionally, many education scholars have argued that emotions directly affect learning performance. Therefore, a further study that confirms the relationships between learners' emotions and performance for learners with visual and verbal cognitive styles will provide useful knowledge in terms of designing an emotion-based adaptive multimedia learning system for supporting personalized learning. To investigate these issues, the study applies the Style of Processing (SOP) scale to identify verbalizers and visualizers. Moreover, the emotion assessment instrument emWave, which was developed by HeartMath, is applied to assess variations in emotional states for verbalizers and visualizers during learning processes. Three different multimedia materials, static text and image-based multimedia material, video-based multimedia material, and animated interactive multimedia material, were presented to verbalizers and visualizers to investigate how different multimedia materials affect individual learning performance and emotion, and to identify relationships between learning performance and emotion. Experimental results show that video-based multimedia material generates the best learning performance and most positive emotion for verbalizers. Moreover, dynamic multimedia materials containing video and animation are more appropriate for visualizers than static multimedia materials containing text and image. Finally, a partial correlation exists between negative emotion and learning performance; that is, negative emotion and pretest scores considered together and negative emotion alone can predict learning performance of visualizers who use video-based multimedia material for learning. 相似文献

7.

Assessing the effects of different multimedia materials on emotions and learning performance for visual and verbal style learners

《Computers & Education》2013,60(4):1273-1285

Multimedia materials are now increasingly used in curricula. However, individual preferences for multimedia materials based on visual and verbal cognitive styles may affect learners' emotions and performance. Therefore, in-depth studies that investigate how different multimedia materials affect learning performance and the emotions of learners with visual and verbal cognitive styles are needed. Additionally, many education scholars have argued that emotions directly affect learning performance. Therefore, a further study that confirms the relationships between learners' emotions and performance for learners with visual and verbal cognitive styles will provide useful knowledge in terms of designing an emotion-based adaptive multimedia learning system for supporting personalized learning. To investigate these issues, the study applies the Style of Processing (SOP) scale to identify verbalizers and visualizers. Moreover, the emotion assessment instrument emWave, which was developed by HeartMath, is applied to assess variations in emotional states for verbalizers and visualizers during learning processes. Three different multimedia materials, static text and image-based multimedia material, video-based multimedia material, and animated interactive multimedia material, were presented to verbalizers and visualizers to investigate how different multimedia materials affect individual learning performance and emotion, and to identify relationships between learning performance and emotion. Experimental results show that video-based multimedia material generates the best learning performance and most positive emotion for verbalizers. Moreover, dynamic multimedia materials containing video and animation are more appropriate for visualizers than static multimedia materials containing text and image. Finally, a partial correlation exists between negative emotion and learning performance; that is, negative emotion and pretest scores considered together and negative emotion alone can predict learning performance of visualizers who use video-based multimedia material for learning. 相似文献

8.

Machines Outperform Laypersons in Recognizing Emotions Elicited by Autobiographical Recollection

Joris H. Janssen Paul Tacken J.J.G. de Vries Egon L. van den Broek Joyce H.D.M. Westerink Pim Haselager 《Human-Computer Interaction》2013,28(6):479-517

Over the last decade, an increasing number of studies have focused on automated recognition of human emotions by machines. However, performances of machine emotion recognition studies are difficult to interpret because benchmarks have not been established. To provide such a benchmark, we compared machine with human emotion recognition. We gathered facial expressions, speech, and physiological signals from 17 individuals expressing 5 different emotional states. Support vector machines achieved an 82% recognition accuracy based on physiological and facial features. In experiments with 75 humans on the same data, a maximum recognition accuracy of 62.8% was obtained. As machines outperformed humans, automated emotion recognition might be ready to be tested in more practical applications. 相似文献

9.

情感计算与理解研究发展概述

下载免费PDF全文

姚鸿勋邓伟洪刘洪海洪晓鹏王甦菁杨巨峰赵思成《中国图象图形学报》2022,27(6):2008-2035

情感在感知、决策、逻辑推理和社交等一系列智能活动中起到核心作用,是实现人机交互和机器智能的重要元素。近年来,随着多媒体数据爆发式增长及人工智能的快速发展,情感计算与理解引发了广泛关注。情感计算与理解旨在赋予计算机系统识别、理解、表达和适应人的情感的能力来建立和谐人机环境,并使计算机具有更高、更全面的智能。根据输入信号的不同,情感计算与理解包含不同的研究方向。本文全面回顾了多模态情感识别、孤独症情感识别、情感图像内容分析以及面部表情识别等不同情感计算与理解方向在过去几十年的研究进展并对未来的发展趋势进行展望。对于每个研究方向,首先介绍了研究背景、问题定义和研究意义;其次从不同角度分别介绍了国际和国内研究现状,包括情感数据标注、特征提取、学习算法、部分代表性方法的性能比较和分析以及代表性研究团队等;然后对国内外研究进行了系统比较,分析了国内研究的优势和不足;最后讨论了目前研究存在的问题及未来的发展趋势与展望,例如考虑个体情感表达差异问题和用户隐私问题等。相似文献

10.

音乐情感识别研究综述

下载免费PDF全文

康健王海龙苏贵斌柳林《计算机工程与应用》2022,58(4):64-72

音乐是表达情感的重要载体,音乐情感识别广泛应用于各个领域.当前音乐情感研究中,存在音乐情感数据集稀缺、情感量化难度大、情感识别精准度有限等诸多问题,如何借助人工智能方法对音乐的情感趋向进行有效的、高质量的识别成为当前研究的热点与难点.总结目前音乐情感识别的研究现状,从音乐情感数据集、音乐情感模型、音乐情感分类方法三方面... 相似文献

11.

Cultural dependency analysis for understanding speech emotion 总被引：1，自引：0，他引：1

Norhaslinda Kamaruddin 《Expert systems with applications》2012,39(5):5115-5133

Speech has been one of the major communication medium for years and will continue to do so until video communication becomes widely available and easily accessible. Although numerous technologies have been developed to improve the effectiveness of speech communication system, human interaction with machines and robots are still far from ideal. It is acknowledged that human can communicate effectively with each other through the telephony system. This situation motivates many researchers to study in depth the human communication system, with emphasis on its ability to express and infer emotion for effective social communication. Understanding the interlocutors’ emotion and recognizing the listeners’ perception is the key to boost communication effectiveness and interaction. Nonetheless, the perceived emotion is subjective and very much dependent on culture, environment and the pre-emotional state of the listener. Attempts have been made to understand the influence of culture in speech emotion and researchers have reported mixed findings that lead us to believe there are some common acoustical characteristics that enable similar emotion to be discriminated universally across culture. Yet there are unique speech attributes that facilitate exclusive emotion recognition of a particular culture. Understanding culture dependency is thus important to the performance of the speech emotion recognition system.In this paper three different speech emotion databases; namely: Berlin Emo-db, NTU_American and NTU_Asian dataset were selected to represent three different cultures of European, American and Asian respectively focusing on three basic emotions of anger, happiness and sadness with neutral acting as a reference. Different data arrangements with accordance to varying degree of culture dependency were designed for the experimental setup to provide better understanding of inter-cultural and intra-cultural effect in recognizing the speech emotion. Features were extracted using Mel Frequency Cepstral Co-efficient (MFCC) method and classified with neural network (Multi Layer Perceptron (MLP)) and fuzzy neural networks; namely: Adaptive Network Fuzzy Inference System (ANFIS) and Generic Self-Organizing Fuzzy Neural Network (GenSOFNN) representing precise and linguistic fuzzy rule conjuncts respectively. From the experimental results, it can be observed that culture influences the speech emotion recognition accuracy. 75% accuracy performance was recorded for generalized homogeneous intra-cultural experiments whereas the accuracy performance dropped to almost as low as chance probability (25% for 4 classes) for both homogeneous and heterogeneous mixed-cultural inter-culture experiments. The two-stage culture-sensitive speech emotion recognition approach was subsequently proposed to discriminate culture and speech emotion. Results of the analysis show potential of using the proposed technique to recognize culture-influenced speech emotion, which can be extended in many applications, for instance call center and intelligent vehicle. Such analysis may help us to better understand the culture dependency of speech emotion and as a result the accuracy performance of the speech emotion recognition system can be boosted. 相似文献

12.

On measuring and employing texture directionality for image classification

Maskey Manil Newman Timothy S. 《Pattern Analysis & Applications》2021,24(4):1649-1665

Pattern Analysis and Applications - Directionality is useful in many computer vision, pattern recognition, visualization, and multimedia applications since it is considered as an important... 相似文献

13.

A portable HCI system‐oriented EEG feature extraction and channel selection for emotion recognition

Xiangwei Zheng Xiaofeng Liu Yuang Zhang Lizhen Cui Xiaomei Yu 《国际智能系统杂志》2021,36(1):152-176

Emotion recognition has become an important component of human–computer interaction systems. Research on emotion recognition based on electroencephalogram (EEG) signals are mostly conducted by the analysis of all channels' EEG signals. Although some progresses are achieved, there are still several challenges such as high dimensions, correlation between different features and feature redundancy in the realistic experimental process. These challenges have hindered the applications of emotion recognition to portable human–computer interaction systems (or devices). This paper explores how to find out the most effective EEG features and channels for emotion recognition so as to only collect data as less as possible. First, discriminative features of EEG signals from different dimensionalities are extracted for emotion classification, including the first difference, multiscale permutation entropy, Higuchi fractal dimension, and discrete wavelet transform. Second, relief algorithm and floating generalized sequential backward selection algorithm are integrated as a novel channel selection method. Then, support vector machine is employed to classify the emotions for verifying the performance of the channel selection method and extracted features. At last, experimental results demonstrate that the optimal channel set, which are mostly located at the frontal, has extremely high similarity on the self‐collected data set and the public data set and the average classification accuracy is achieved up to 91.31% with the selected 10‐channel EEG signals. The findings are valuable for the practical EEG‐based emotion recognition systems. 相似文献

14.

Modular neural-SVM scheme for speech emotion recognition using ANOVA feature selection method

Mansour Sheikhan Mahdi Bejani Davood Gharavian 《Neural computing & applications》2013,23(1):215-227

The speech signal consists of linguistic information and also paralinguistic one such as emotion. The modern automatic speech recognition systems have achieved high performance in neutral style speech recognition, but they cannot maintain their high recognition rate for spontaneous speech. So, emotion recognition is an important step toward emotional speech recognition. The accuracy of an emotion recognition system is dependent on different factors such as the type and number of emotional states and selected features, and also the type of classifier. In this paper, a modular neural-support vector machine (SVM) classifier is proposed, and its performance in emotion recognition is compared to Gaussian mixture model, multi-layer perceptron neural network, and C5.0-based classifiers. The most efficient features are also selected by using the analysis of variations method. It is noted that the proposed modular scheme is achieved through a comparative study of different features and characteristics of an individual emotional state with the aim of improving the recognition performance. Empirical results show that even by discarding 22% of features, the average emotion recognition accuracy can be improved by 2.2%. Also, the proposed modular neural-SVM classifier improves the recognition accuracy at least by 8% as compared to the simulated monolithic classifiers. 相似文献

15.

Low-power architectures for compressed domain video codingco-processor

Jie Chen Liu K.J.R. 《Multimedia, IEEE Transactions on》2000,2(2):111-128

Low power as a de facto is one of the most important criteria for many signal-processing system designs, particularly in multimedia cellular applications and multimedia system on chip design. There have been many approaches to achieve this design goal at many different implementation levels ranging from very-large-scale-integration fabrication technology to system design. In this paper, the multirate low-power design technique will be used along with other methods such as look-ahead, pipelining in designing cost-effective low-power architectures of compressed domain video coding co-processor. Our emphasis is on optimizing power consumption by minimizing computational units along the data path. We demonstrate both low-power and high-speed can be accomplished at algorithm/architecture level. Based on the calculation and simulation results, the design can achieve significant power savings in the range of 60%-80% or speedup factor of two at the needs of users 相似文献

16.

基于个性和OCC的机器人情感建模研究 总被引：1，自引：3，他引：1

王超王志良《微计算机信息》2005,21(3):181-182

机器人不仅要具有简单的机械作业和逻辑推理能力，还应当具有类似人类的情感能力．本文将个性与情绪、情感、理解、表达相结合，采用OCC模型作为评价标准，建立了符合人类情感规律的、可用于情感机器人的情感模型。通过一个应用上述模型的虚拟人情感交互系统．验证了此模型可以很好的对人类的情感进行仿真．可以应用于情感机器人和人性化计算机、游戏等许多领域。相似文献

17.

跨库语音情感识别研究进展

张石清刘瑞欣赵小明《计算机系统应用》2022,31(11):31-48

语音情感识别在人机交互过程中发挥极为重要的作用,近年来备受关注.目前,大多数的语音情感识别方法主要在单一情感数据库上进行训练和测试.然而,在实际应用中训练集和测试集可能来自不同的情感数据库.由于这种不同情感数据库的分布存在巨大差异性,导致大多数的语音情感识别方法取得的跨库识别性能不尽人意.为此,近年来不少研究者开始聚焦跨库语音情感识别方法的研究.本文系统性综述了近年来跨库语音情感识别方法的研究现状与进展,尤其对新发展起来的深度学习技术在跨库语音情感识别中的应用进行了重点分析与归纳.首先,介绍了语音情感识别中常用的情感数据库,然后结合深度学习技术,从监督、无监督和半监督学习角度出发,总结和比较了现有基于手工特征和深度特征的跨库语音情感识别方法的研究进展情况,最后对当前跨库语音情感识别领域存在的挑战和机遇进行了讨论与展望. 相似文献

18.

基于Markov决策过程的交互虚拟人情感计算模型 总被引：1，自引：0，他引：1

王国江王志良陈锋军王玉洁祝长生陶伟《计算机科学》2006,33(12):135-138

情感在生物体的交流和适应性方面起到了关键作用。同样,交互虚拟人也需要有恰如其分的表达情感的能力。由于具有情感交互能力的虚拟人在虚拟现实、电子教育、娱乐等领域均有着广阔的应用前景,当前,在虚拟人中加入情感成分的研究受到了越来越多的重视。本文提出了一个人工心理的情感计算模型,模型用马尔可夫过程来描述情感的变化过程,并且使用马尔可夫决策过程建立了情感、个性与环境之间的联系,并且我们把该模型应用到了一个交互虚拟人系统中。研究结果表明,模型能够构建具有不同性格特征的虚拟人,使之产生较为自然的情感反应。相似文献

19.

基于声学特征的语言情感识别

金琴陈师哲李锡荣杨刚许洁萍《计算机科学》2015,42(9):24-28

语音情感识别是语音处理领域中一个具有挑战性和广泛应用前景的研究课题。探索了语音情感识别中的关键问题之一:生成情感识别的有效的特征表示。从4个角度生成了语音信号中的情感特征表示:(1)低层次的声学特征,包括能量、基频、声音质量、频谱等相关的特征,以及基于这些低层次特征的统计特征;(2)倒谱声学特征根据情感相关的高斯混合模型进行距离转化而得出的特征;(3)声学特征依据声学词典进行转化而得出的特征;(4)声学特征转化为高斯超向量的特征。通过实验比较了各类特征在情感识别上的独立性能,并且尝试了将不同的特征进行融合,最后比较了不同的声学特征在几个不同语言的情感数据集上的效果(包括IEMOCAP英语情感语料库、CASIA汉语情感语料库和Berlin德语情感语料库)。在IEMOCAP数据集上,系统的正确识别率达到了71.9%,超越了之前在此数据集上报告的最好结果。相似文献

20.

Emotion modeling from speech signal based on wavelet packet transform

Varsha N. Degaonkar Shaila D. Apte 《International Journal of Speech Technology》2013,16(1):1-5

The recognition of emotion in human speech has gained increasing attention in recent years due to the wide variety of applications that benefit from such technology. Detecting emotion from speech can be viewed as a classification task. It consists of assigning, out of a fixed set, an emotion category e.g. happiness, anger, to a speech utterance. In this paper, we have tackled two emotions namely happiness and anger. The parameters extracted from speech signal depend on speaker, spoken word as well as emotion. To detect the emotion, we have kept the spoken utterance and the speaker constant and only the emotion is changed. Different features are extracted to identify the parameters responsible for emotion. Wavelet packet transform (WPT) is found to be emotion specific. We have performed the experiments using three methods. Method uses WPT and compares the number of coefficients greater than threshold in different bands. Second method uses energy ratios of different bands using WPT and compares the energy ratios in different bands. The third method is a conventional method using MFCC. The results obtained using WPT for angry, happy and neutral mode are 85 %, 65 % and 80 % respectively as compared to results obtained using MFCC i.e. 75 %, 45 % and 60 % respectively for the three emotions. Based on WPT features a model is proposed for emotion conversion namely neutral to angry and neutral to happy emotion. 相似文献