Similar Documents
20 similar documents found
1.
Images are an important tool for conveying emotion, and human emotions vary with different visual stimuli. This work adopts a data-augmentation scheme for small datasets and fuses hand-crafted low-level image features (colour and texture) with automatically extracted high-level features (object-category features and deep emotional features) to recognise compound image emotions, finally outputting high-level semantic descriptive phrases covering both the image and its objects. Experiments on the public IAPS and GAPED datasets compare the method with traditional hand-crafted extraction and with two existing models, VGG16 and a fine-tuned AlexNet; it outperforms the other recognition methods in test performance, reaching an emotion-recognition accuracy of 66.54%.
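A minimal NumPy sketch of the fusion idea in this abstract: a hand-crafted colour histogram (low-level feature) is concatenated with a deep feature vector. The `deep_features` argument stands in for CNN activations, and the function name and 8-bin histogram are illustrative assumptions, not details from the paper.

```python
import numpy as np

def fuse_features(image_rgb, deep_features, bins=8):
    """Concatenate a hand-crafted colour histogram (low-level) with a
    deep feature vector (high-level). `deep_features` is a placeholder
    for CNN activations."""
    hist = [np.histogram(image_rgb[..., c], bins=bins, range=(0, 256))[0]
            for c in range(3)]                      # one histogram per channel
    hist = np.concatenate(hist).astype(float)
    total = hist.sum()
    if total > 0:
        hist /= total                               # normalise the colour part
    return np.concatenate([hist, deep_features])
```

In practice the two halves would usually be normalised to comparable scales before being fed to a classifier.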

2.
The detection and monitoring of emotions are important in various applications, e.g., to enable naturalistic and personalised human-robot interaction. Emotion detection often requires modelling various data inputs from multiple modalities, including physiological signals (e.g., EEG and GSR), environmental data (e.g., audio and weather), videos (e.g., for capturing facial expressions and gestures) and, more recently, motion and location data. Many traditional machine learning algorithms have been used to capture the diversity of multimodal data at the sensor and feature levels for human emotion classification. While the feature-engineering processes often embedded in these algorithms are beneficial for emotion modelling, they inherit some critical limitations which may hinder the development of reliable and accurate models. In this work, we adopt a deep learning approach to emotion classification through an iterative process of adding and removing large numbers of sensor signals from different modalities. Our dataset was collected in a real-world study from smartphones and wearable devices. It merges the local interactions of three sensor modalities (on-body, environmental and location) into a global model that represents signal dynamics along with the temporal relationships within each modality. Our approach employs a series of learning algorithms, including a hybrid Convolutional Neural Network and Long Short-Term Memory recurrent neural network (CNN-LSTM) applied to the raw sensor data, eliminating the need for manual feature extraction and engineering. The results show that deep-learning approaches are effective for human emotion classification when a large number of sensor inputs is used (average accuracy 95%, F-measure 95%), and that the hybrid models outperform traditional fully connected deep neural networks (average accuracy 73%, F-measure 73%). Furthermore, the hybrid models outperform previously developed ensemble algorithms that rely on feature engineering (average accuracy 83%, F-measure 82%).
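The raw-signal input stage described above can be sketched as a windowing step: the (time, channels) sensor array is cut into overlapping segments that a CNN-LSTM stack would consume directly, with no manual feature engineering. Function and parameter names are assumptions for illustration.

```python
import numpy as np

def make_windows(signals, win, step):
    """Segment a (T, C) multichannel sensor array into overlapping
    (N, win, C) windows -- the raw input shape a Conv1D+LSTM stack
    consumes, with no hand-crafted features."""
    T = signals.shape[0]
    starts = range(0, T - win + 1, step)
    return np.stack([signals[s:s + win] for s in starts])
```

Each window then passes through convolutional layers (local patterns) before the LSTM models the temporal relationships between windows.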

3.
In recent years, emotion-recognition research has moved beyond facial and speech recognition, and recognition based on physiological signals such as EEG has become increasingly popular. However, incomplete feature extraction and ill-suited classification models often lead to poor classification results. This paper therefore proposes a hybrid model (DE-CNN-GRU) combining differential entropy (DE), a convolutional neural network (CNN) and a gated recurrent unit (GRU) for EEG-based emotion recognition. The pre-processed EEG signal is split into five frequency bands, whose DE features are extracted as preliminary features and fed into the CNN-GRU model for deep feature extraction, followed by a Softmax classifier. Validated on the SEED dataset, the hybrid model's average accuracy is 5.57% and 13.82% higher than using the CNN or GRU alone, respectively.
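Under the usual Gaussian assumption, the differential entropy of a band-limited signal is 0.5·ln(2πeσ²). A minimal NumPy sketch of the per-band DE step follows; the band edges and dictionary keys are common EEG conventions, assumed here rather than taken from the paper.

```python
import numpy as np

EEG_BANDS = {  # typical band edges in Hz (assumed convention)
    "delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
    "beta": (14, 31), "gamma": (31, 50),
}

def band_differential_entropy(signal, fs):
    """Per-band DE features: DE = 0.5*ln(2*pi*e*var) of the
    band-limited signal, with bands isolated by an FFT mask."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    feats = {}
    for name, (lo, hi) in EEG_BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(np.where(mask, spectrum, 0), n=len(signal))
        feats[name] = 0.5 * np.log(2 * np.pi * np.e * np.var(band))
    return feats
```

The five DE values per channel would then form the preliminary feature map fed to the CNN-GRU.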

4.
陈晨  任南 《计算机系统应用》2023,32(10):284-292
Affective computing is a key problem in modern human-computer interaction, and with the development of artificial intelligence, emotion recognition based on electroencephalogram (EEG) signals has become an important research direction. To improve classification accuracy, this study introduces a stacked auto-encoder (SAE) for deep feature extraction from multi-channel EEG signals and proposes a generalized normal distribution optimization based support vector machine (GNDO-SVM) emotion-recognition model. Experimental results show that, compared with SVM models optimised by the genetic algorithm, particle swarm optimisation and the sparrow search algorithm, the proposed GNDO-SVM classifies better: emotion-recognition accuracy with SAE deep features reaches 90.94%, indicating that the SAE effectively mines the deep correlations between EEG channels. SAE deep features combined with the GNDO-SVM model therefore enable effective EEG emotion recognition.

5.
Face recognition based on local binary patterns and deep learning
张雯  王文伟 《计算机应用》2015,35(5):1474-1478
To address the problem that deep learning ignores local structural features when extracting face features directly, a face-recognition method combining block-wise local binary patterns (LBP) with deep learning is proposed. First, the face image is divided into blocks and a uniform LBP operator extracts the LBP histogram of each block; the histograms are concatenated in order to form the LBP texture feature of the whole face. Second, the LBP features are fed into a deep belief network (DBN), which is trained layer by layer with a classification layer formed on top. Finally, the trained DBN recognises face samples. Experiments on the ORL, YALE and FERET face databases show that, compared with a support vector machine (SVM) approach, the proposed algorithm performs well in small-sample face recognition.
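A sketch of the basic 3×3 LBP operator behind the block histograms. Note the paper uses the uniform-pattern variant (59 bins); the plain 256-bin operator is shown here for simplicity.

```python
import numpy as np

def lbp_histogram(gray, bins=256):
    """Basic 3x3 LBP: threshold the 8 neighbours against the centre
    pixel, read the bits as a byte code, then histogram the codes."""
    g = gray.astype(int)
    c = g[1:-1, 1:-1]                       # centre pixels (border dropped)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code += (nb >= c).astype(np.int64) << bit   # set bit if neighbour >= centre
    hist, _ = np.histogram(code, bins=bins, range=(0, bins))
    return hist
```

In the block-wise scheme, one such histogram per block is computed and the histograms are concatenated before entering the DBN.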

6.

Speech emotion recognition (SER) systems identify emotions from the human voice in areas such as smart healthcare, driving, call centres, automatic translation systems, and human-machine interaction. In the classical SER process, discriminative acoustic feature extraction is the most important and challenging step, because discriminative features influence classifier performance and reduce computational time. Nonetheless, current handcrafted acoustic features suffer from limited capability and accuracy when building SER systems for real-time deployment. To overcome these limitations, a variety of deep learning techniques have been proposed in recent years for automatic feature extraction in emotion prediction from speech signals. However, to the best of our knowledge, no in-depth review is available that critically appraises and summarises the existing deep learning techniques together with their strengths and weaknesses for SER. Hence, this study presents a comprehensive review of deep learning techniques for SER, their uniqueness, benefits and limitations. It also covers speech-processing techniques, performance measures and publicly available emotional speech databases, discusses the significance of the findings of the primary studies, and presents open research issues and challenges that require significant research effort and enhancement in the field of SER systems.


7.
Different physiological signals have different origins and may describe different functions of the human body. This paper studies respiration (RSP) signals alone to determine their ability to detect psychological activity. A deep learning framework is proposed to extract and recognise the emotional information in respiration. Arousal-valence theory helps recognise emotions by mapping them into a two-dimensional space. The framework comprises a sparse auto-encoder (SAE) to extract emotion-related features and two logistic-regression classifiers, one for arousal and one for valence. The international Dataset for Emotion Analysis using Physiological signals (DEAP) is adopted for model building. To further evaluate the method on other subjects, the affect database established by Augsburg University in Germany is then used. The accuracies for valence and arousal classification on DEAP are 73.06% and 80.78% respectively, and the mean accuracy on the Augsburg dataset is 80.22%. This study demonstrates the potential of using respiration collected from wearable devices to recognise human emotions.
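The two classification heads can be sketched as independent logistic regressions over SAE features, mapping each sample into a quadrant of the arousal-valence plane. The weights are assumed to be already trained, and all names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def classify_arousal_valence(features, w_a, b_a, w_v, b_v):
    """Two independent logistic-regression heads over SAE features:
    one for arousal, one for valence. Each head thresholds its
    probability at 0.5, placing the sample in a quadrant."""
    arousal = sigmoid(features @ w_a + b_a) >= 0.5
    valence = sigmoid(features @ w_v + b_v) >= 0.5
    quadrant = {(True, True): "high-arousal / positive",
                (True, False): "high-arousal / negative",
                (False, True): "low-arousal / positive",
                (False, False): "low-arousal / negative"}
    return quadrant[(bool(arousal), bool(valence))]
```

Keeping the two heads separate matches the arousal-valence theory: the two axes are treated as independent binary decisions rather than one four-class problem.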

8.
For emotion recognition, this paper proposes extracting signal features with a wavelet transform filtered by principal component analysis (PCA) combined with an autoregressive model, and classifying emotions with gradient-boosted classification trees. Feature extraction focuses on changes in the EEG signal and in its wavelet components as EEG features. Using DEAP, the multimodal benchmark database for analysing human emotional states proposed by Koelstra et al., EEG data from 14 channels covering each brain region are extracted for eight positive and negative emotions. Results show an average pairwise classification accuracy of 95.76% across the eight emotions, with a maximum of 98.75%, which can support emotion-recognition work.

9.

Emotion is considered a physiological state that appears whenever an individual observes a transformation in their environment or body. The literature shows that combining the electrical activity of the brain with other physiological signals for the accurate analysis of human emotions has yet to be explored in depth. Based on physiological signals, this work proposes a model using machine learning approaches to calibrate music mood against human emotion. The model consists of three phases: (a) predicting the mood of a song from its audio signals; (b) predicting human emotion from physiological signals captured with EEG, GSR, ECG and a pulse detector; and (c) mapping music mood to human emotion and classifying them in real time. Extensive experiments were conducted on different music-mood datasets and human emotions for influential feature extraction, training, testing and performance evaluation. An effort has been made to observe and measure human emotions with a certain degree of accuracy and efficiency by recording a person's bio-signals in response to music. To test the applicability of the work, playlists are generated from the user's real-time emotion, determined from features of the different physiological sensors, and the mood depicted by musical excerpts. This work could help improve mental and physical health by scientifically analysing physiological signals.


10.
Emotion plays a central role in perception, decision-making, logical reasoning, social interaction and other intelligent activities, and is a key element of human-computer interaction and machine intelligence. In recent years, with the explosive growth of multimedia data and the rapid development of artificial intelligence, affective computing and understanding has attracted wide attention. It aims to give computer systems the ability to recognise, understand, express and adapt to human emotions, building a harmonious human-machine environment and endowing computers with higher, more comprehensive intelligence. Depending on the input signal, the field covers several research directions. This paper comprehensively reviews progress over the past decades in multimodal emotion recognition, autism emotion recognition, affective image content analysis and facial expression recognition, and looks ahead to future trends. For each direction, it first introduces the research background, problem definition and significance; it then surveys international and domestic work from several angles, including emotion data annotation, feature extraction, learning algorithms, performance comparisons and analyses of representative methods, and representative research groups; it then systematically compares domestic and international research, analysing the strengths and weaknesses of the former; finally, it discusses open problems and future trends, such as individual differences in emotional expression and user privacy.

11.
Objective: Deep belief networks learn and extract features from data automatically, giving them a marked advantage in feature learning. Polarimetric SAR image classification suffers from low utilisation of its massive feature sets and from subjective feature selection. To address this, a polarimetric SAR classification method based on a deep belief network is proposed. Method: First, a large set of classification features is extracted, forming a feature set of four types: polarimetric, radiometric, spatial and sub-aperture features. Samples are then selected from this set to build feature vectors that are fed into the deep belief network. Finally, the network abstracts the massive classification features layer by layer to obtain effective features for classification. Results: Experiments on AIRSAR data achieve a classification accuracy of 91.06%. Comparison with classical supervised Wishart classification and logistic-regression classification demonstrates the deep belief network's advantage in feature learning and verifies the method's applicability. Conclusion: A new classification method for the selection and use of massive polarimetric SAR features is proposed, offering a new approach to polarimetric SAR image classification and a useful exploration towards wider application of deep belief networks.

12.
Multi-modal emotion recognition lacks an explicit mapping between emotion states and audio/image features, so extracting effective emotion information from audio-visual data remains a challenge. In addition, the modelling of noise and data redundancy is not well solved, so emotion-recognition models often suffer from low efficiency. Deep neural networks (DNNs) perform excellently at feature extraction and highly non-linear feature fusion, and cross-modal noise modelling has great potential for handling data pollution and redundancy. Inspired by this, our paper proposes a deep weighted fusion method for audio-visual emotion recognition. First, we perform cross-modal noise modelling on the audio and video data, which removes most of the data pollution in the audio channel and the redundancy in the visual channel: noise is handled by voice activity detection (VAD), and visual redundancy is removed by aligning the speech region in the audio and visual streams. We then extract audio emotion features and visual expression features with two feature extractors: the audio extractor, audio-net, is a 2D CNN that accepts image-based Mel-spectrograms as input, while the facial-expression extractor, visual-net, is a 3D CNN fed with facial-expression image sequences. To train the two convolutional neural networks efficiently on the small dataset, we adopt transfer learning. Next, we employ a deep belief network (DBN) for highly non-linear fusion of the multi-modal emotion features, training the feature extractors and the fusion network synchronously. Finally, a support vector machine classifies emotions from the fusion network's output. With cross-modal feature fusion, denoising and redundancy removal taken into account, our fusion method shows excellent performance on the selected dataset.
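A minimal energy-threshold sketch of the VAD step: frames whose energy exceeds a fraction of the peak frame energy are marked as speech. This is the simplest form of voice-activity detection; the detector actually used in the paper may differ, and the threshold ratio is an assumed parameter.

```python
import numpy as np

def energy_vad(signal, frame_len, hop, threshold_ratio=0.1):
    """Mark frames as speech when their energy exceeds a fixed
    fraction of the maximum frame energy."""
    energies = np.array([np.sum(signal[s:s + frame_len] ** 2)
                         for s in range(0, len(signal) - frame_len + 1, hop)])
    threshold = threshold_ratio * energies.max()
    return energies > threshold   # boolean speech/non-speech mask per frame
```

The resulting frame mask both discards silent audio segments and locates the speech region used to align the visual stream.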

13.
蔡军  胡洋揆  张毅  尹春林 《机器人》2018,40(4):510-517
To address the low recognition rate of deep belief networks (DBNs) on EEG signals, a multi-band frequency-domain deep belief network (multi-band FDBN) algorithm for feature extraction is proposed. Frequency bands differ across individuals and do not contribute equally to classification, so band-pass filters split the raw EEG signal into several bands, the fast Fourier transform (FFT) converts each band from the time domain to the frequency domain, the spectra are normalised, and the frequency-domain data of each band is fed into a DBN for training and recognition. Offline experiments show that, compared with the single-band FDBN algorithm, the multi-band FDBN improves average accuracy by 3.25%, with a smaller standard deviation and better robustness. Finally, on a smart-wheelchair platform, the multi-band FDBN algorithm used left/right-hand motor-imagery EEG signals to steer the wheelchair along a figure-of-eight path, demonstrating the algorithm's effectiveness for EEG feature extraction.
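The front-end of the multi-band pipeline (band-pass filtering, FFT, normalisation) can be sketched with an FFT mask standing in for the band-pass filters; the resulting normalised magnitude spectra are what each band's DBN would consume. The band edges and min-max normalisation are illustrative assumptions.

```python
import numpy as np

def multiband_frequency_features(signal, fs, bands):
    """For each (lo, hi) band: select the FFT bins in the band
    (a stand-in for band-pass filtering), take the magnitude
    spectrum, and min-max normalise it."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    out = []
    for lo, hi in bands:
        mask = (freqs >= lo) & (freqs < hi)
        mag = np.abs(spectrum[mask])
        rng = mag.max() - mag.min()
        out.append((mag - mag.min()) / rng if rng > 0 else np.zeros_like(mag))
    return out
```

Each normalised band spectrum then becomes the input vector of that band's DBN.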

14.
Physiological signals currently used for acrophobia (fear-of-heights) classification mainly include EEG, ECG and skin conductance. Considering the limitations of EEG acquisition and processing and the problem of fusing multimodal signals, a dynamic weighted decision-fusion algorithm based on six peripheral physiological signals is proposed. First, virtual reality induces different degrees of acrophobia in subjects while six peripheral signals are recorded synchronously: ECG, pulse, EMG, skin conductance, skin temperature and respiration. Second, statistical and event-related features are extracted to build an acrophobia affect dataset. Third, a dynamic weighted decision-fusion algorithm based on classification performance, modality and cross-modal information integrates the multimodal signals effectively to improve recognition accuracy. Finally, the results are compared with previous related work and validated on the open-source WESAD affect dataset. The conclusions show that multimodal peripheral physiological signals improve acrophobia classification, and the proposed dynamic weighted decision-fusion algorithm significantly improves classification performance and model robustness.
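A simplified sketch of decision-level fusion with performance-based weights: each modality's class-probability vector is scaled by its validation accuracy and the weighted vectors are summed. The paper's dynamic weighting also uses modality and cross-modal information; only the accuracy-based core is shown here, and all names are illustrative.

```python
import numpy as np

def weighted_decision_fusion(probas, accuracies):
    """Decision-level fusion: weight each modality's class-probability
    vector by that modality's (normalised) validation accuracy, sum
    the weighted vectors, and pick the arg-max class."""
    w = np.asarray(accuracies, dtype=float)
    w = w / w.sum()                                   # normalise weights
    fused = sum(wi * np.asarray(p) for wi, p in zip(w, probas))
    return int(np.argmax(fused)), fused
```

Weighting by per-modality performance lets a strong modality (e.g. skin conductance) dominate a weak one instead of averaging them blindly.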

15.
Objective: Most current video emotion-recognition methods rely only on facial expressions and ignore the emotional information carried by the physiological signals hidden in facial video. This paper proposes a bimodal video emotion-recognition method based on facial expressions and the blood volume pulse (BVP) physiological signal. Method: The video is first pre-processed to obtain the facial video; LBP-TOP and HOG-TOP spatio-temporal expression features are then extracted from it, the BVP signal is recovered with video colour magnification, and physiological emotion features are extracted from it; the two feature sets are fed into BP classifiers to train classification models; finally, decision-level fusion with the fuzzy integral yields the emotion-recognition result. Results: On a self-built facial-video emotion database, the expression-only and physiological-only modalities average 80% and 63.75% recognition respectively, while the fused result reaches 83.33%, higher than either single modality, demonstrating the effectiveness of the bimodal fusion. Conclusion: The proposed bimodal spatio-temporal feature fusion makes fuller use of the emotional information in video and effectively improves classification performance; comparative experiments with similar video emotion-recognition algorithms confirm its superiority. In addition, the fuzzy-integral decision-level fusion effectively reduces the interference of unreliable decision information, yielding better recognition accuracy.

16.
A face-recognition method based on Gabor features and a deep belief network (DBN) is proposed. Gabor face images at different scales are extracted and fused by convolution, and the fused feature maps serve as the DBN's input. Multiple layers are trained to obtain increasingly abstract feature representations; cross-entropy is used to fine-tune the DBN throughout training, and a Softmax regression classifier on the top layer classifies the extracted features. Experiments on the AR face database show an accuracy of up to 92.7%. Compared with other shallow learning models, the DBN learns high-level features of the data while reducing feature dimensionality, improving the classifier's precision and ultimately the face-recognition rate.
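The Gabor feature maps start from 2-D Gabor kernels: a Gaussian envelope modulating an oriented sinusoid. A sketch of the real part of such a kernel follows; the parameterisation is a common convention, assumed here rather than taken from the paper.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a 2-D Gabor filter: a Gaussian envelope (aspect
    ratio `gamma`) modulating a cosine of the given wavelength,
    oriented at angle `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)
```

Convolving the face image with a bank of such kernels at several scales and orientations produces the multi-scale Gabor faces that are fused and fed to the DBN.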

17.
Mandarin speech emotion recognition based on PCA and SVM
蒋海华  胡斌 《计算机科学》2015,42(11):270-273
In speech emotion recognition, the selection and extraction of emotion features is a key step, and no highly effective speech emotion features have yet been established. Therefore, on a Mandarin emotional corpus covering six emotions, and in view of how Mandarin differs from Western languages, several effective emotion features are selected, including Mel-frequency cepstral coefficients, fundamental frequency, short-time energy, short-time average zero-crossing rate and the first formant; these are extracted and various statistics computed. Principal component analysis (PCA) then reduces the features, and a support vector machine (SVM) based speech emotion-recognition system performs classification. Experimental results show that, compared with other important published results, the method achieves a high average emotion-recognition rate, and that the feature selection, extraction and modelling are reasonable and effective.
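Two of the selected features, short-time energy and zero-crossing rate, can be sketched per frame as follows. The ZCR here is normalised to [0, 1], and the frame sizes are illustrative; statistics over these frame values would then pass through PCA into the SVM.

```python
import numpy as np

def frame_features(signal, frame_len, hop):
    """Per-frame short-time energy and normalised zero-crossing rate:
    energy = sum of squared samples; ZCR = fraction of adjacent
    sample pairs whose signs differ."""
    feats = []
    for s in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[s:s + frame_len]
        energy = float(np.sum(frame ** 2))
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
        feats.append((energy, zcr))
    return feats
```

Statistics such as the mean, variance and range of these frame-level values are what typically form the utterance-level feature vector.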

18.
With the essential demand for understanding human emotional behaviour and for human-machine interaction in recent electronic applications, speaker emotion recognition is a key component that has attracted a great deal of attention among researchers. Even though a handful of works on speaker emotion classification are available in the literature, important challenges such as distinct emotions, low-quality recordings and independent affective states still need to be addressed with a good classifier and discriminative features. Accordingly, a new classifier called the fractional deep belief network (FDBN) is developed by combining a deep belief network (DBN) with fractional calculus. This classifier is trained on multiple features such as tonal power ratio, spectral flux, pitch chroma and Mel-frequency cepstral coefficients (MFCC) to make the emotional classes more separable through their spectral characteristics. The proposed FDBN classifier with integrated feature vectors is tested on two databases: the Berlin database of emotional speech and a real-time Telugu database. The performance of the proposed FDBN and existing DBN classifiers is validated using false acceptance rate (FAR), false rejection rate (FRR) and accuracy. The experimental results show that the proposed FDBN achieves accuracies of 98.39% and 95.88% on the Berlin and Telugu databases respectively.

19.
To address the over-fitting, curse of dimensionality and difficult parameter selection of traditional back-propagation (BP) neural networks and support vector machines (SVMs), an aero-engine sensor fault-detection method based on deep learning is proposed. Multidimensional data collected by the engine flight-data recorder are pre-processed, a fault-detection model based on a deep belief network (DBN) is built and trained on the pre-processed data, and layer-by-layer feature learning in the DBN fault-detection model achieves sensor fault detection. Simulation results show that, both with and without manual feature extraction, the DBN-based fault-detection accuracy exceeds that of the BP neural-network and SVM models.

20.
To address the difficulty of extracting face features suitable for classification and the low accuracy of face recognition under unconstrained conditions, a deep-neural-network face-recognition method with weighted feature fusion (DLWF) is proposed. First, an active shape model (ASM) locates the main facial landmarks, and regions of the different facial organs are sampled around them. Each sampled block is then fed into its own deep belief network (DBN) for training to obtain optimal network parameters. Finally, Softmax regression produces a similarity vector for each region, and the weighted fusion of the multi-region similarity vectors gives an overall similarity score for recognition. On the ORL and WFL face databases, the DLWF algorithm reaches recognition accuracies of 97% and 88.76% respectively; compared with the traditional approaches of principal component analysis (PCA), support vector machines (SVM), DBN and FIP + linear discriminant analysis (LDA), the recognition rate improves under both constrained and unconstrained conditions. The experimental results show the algorithm's strong face-recognition capability.
