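Speech Emotion Recognition Based on an Improved Supervised Manifold Learning Algorithm
Cite this article: Zhang Shi-qing, Li Le-min, Zhao Zhi-jin. Speech Emotion Recognition Based on an Improved Supervised Manifold Learning Algorithm[J]. Journal of Electronics & Information Technology, 2010, 32(11): 2724-2729.
Authors: Zhang Shi-qing  Li Le-min  Zhao Zhi-jin
Affiliations: 1. School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu 610054; School of Physics and Electronic Engineering, Taizhou University, Taizhou 318000
2. School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu 610054
3. School of Telecommunication, Hangzhou Dianzi University, Hangzhou 310018
Abstract: To effectively improve the performance of speech emotion recognition, speech feature data lying on a nonlinear manifold embedded in the high-dimensional acoustic feature space must undergo nonlinear dimensionality reduction. Supervised locally linear embedding (SLLE) is a typical supervised manifold learning algorithm for nonlinear dimensionality reduction. To address the shortcomings of SLLE, this paper proposes an improved SLLE algorithm that enhances the discriminating power of the low-dimensional embedded data and has optimal generalization ability. The algorithm is applied to 48-dimensional speech emotion feature data, comprising prosodic and voice quality features, and the resulting low-dimensional embedded discriminative features are used to recognize four emotions: anger, joy, sadness, and neutral. Experiments on a natural emotional speech database show that the algorithm achieves its highest recognition accuracy of 90.78% using only 9 embedded features, 15.65% higher than SLLE. The algorithm therefore improves speech emotion recognition results when used for nonlinear dimensionality reduction of speech emotion feature data.

Keywords: Speech emotion recognition    Nonlinear dimensionality reduction    Manifold learning    Supervised locally linear embedding
Received: 2009-11-06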

Speech Emotion Recognition Based on an Improved Supervised Manifold Learning Algorithm
Zhang Shi-qing, Li Le-min, Zhao Zhi-jin. Speech Emotion Recognition Based on an Improved Supervised Manifold Learning Algorithm[J]. Journal of Electronics & Information Technology, 2010, 32(11): 2724-2729.
Authors:Zhang Shi-qing  Li Le-min  Zhao Zhi-jin
Affiliation:(School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China)
(School of Telecommunication, Hangzhou Dianzi University, Hangzhou 310018, China)
(School of Physics and Electronic Engineering, Taizhou University, Taizhou 318000, China)
Abstract: To effectively improve the performance of speech emotion recognition, it is necessary to perform nonlinear dimensionality reduction on speech feature data lying on a nonlinear manifold embedded in a high-dimensional acoustic feature space. Supervised Locally Linear Embedding (SLLE) is a typical supervised manifold learning algorithm for nonlinear dimensionality reduction. To address the drawbacks of SLLE, this paper proposes an improved version of SLLE that enhances the discriminating power of the low-dimensional embedded data and possesses optimal generalization ability. The proposed algorithm is used to perform nonlinear dimensionality reduction on 48-dimensional speech emotional feature data, comprising prosody and voice quality features, and to extract low-dimensional embedded discriminating features for recognizing four emotions: anger, joy, sadness, and neutral. Experimental results on a natural emotional speech database demonstrate that the proposed algorithm achieves its highest accuracy of 90.78% with only 9 embedded features, a 15.65% improvement over SLLE. The proposed algorithm can therefore significantly improve speech emotion recognition results when applied to reducing the dimensionality of speech emotional feature data.
Keywords:Speech emotion recognition  Nonlinear dimensionality reduction  Manifold learning  Supervised locally linear embedding
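
The abstract does not describe the improved algorithm itself, so the following is only a minimal sketch of the supervised step in the original SLLE as it is commonly described: pairwise distances between samples of different classes are inflated as D' = D + alpha * max(D) * Lambda (Lambda is 1 for differing labels, 0 otherwise) before neighbors are selected. The function names, the toy data, and the `alpha` values are assumptions for illustration; the later LLE steps (reconstruction weights and the eigen-decomposition that yields the embedding) are omitted.

```python
import math


def slle_distances(X, labels, alpha):
    """Pairwise Euclidean distances, inflated between different-class samples.

    Implements D' = D + alpha * max(D) * Lambda, where Lambda[i][j] is 1 when
    labels[i] != labels[j] and 0 otherwise. alpha = 0 gives plain LLE
    distances; alpha = 1 is the fully supervised case.
    """
    n = len(X)
    D = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    d_max = max(max(row) for row in D)
    return [
        [D[i][j] + (alpha * d_max if labels[i] != labels[j] else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def nearest_neighbors(D, k):
    """Indices of the k nearest neighbors of each sample, excluding itself."""
    n = len(D)
    return [
        sorted((j for j in range(n) if j != i), key=lambda j: D[i][j])[:k]
        for i in range(n)
    ]


# Hypothetical toy data: two classes interleaved along one axis.
X = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
labels = ["anger", "joy", "anger", "joy"]

unsupervised = nearest_neighbors(slle_distances(X, labels, alpha=0.0), k=1)
supervised = nearest_neighbors(slle_distances(X, labels, alpha=1.0), k=1)
```

With `alpha=0.0` each sample's nearest neighbor is simply the spatially closest point, which here always belongs to the other class; with `alpha=1.0` every sample's nearest neighbor shares its label. This is the mechanism by which SLLE pulls same-class samples together in the embedding.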
This article is indexed in Wanfang Data and other databases.