New Method of Speech Emotion Recognition Fusing Functional Paralanguages


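Cite this article:	ZHAO Xiaolei, MAO Qirong, ZHAN Yongzhao. New Method of Speech Emotion Recognition Fusing Functional Paralanguages[J]. 计算机科学与探索 (Journal of Frontiers of Computer Science and Technology), 2014(2): 186-199.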

Authors:	ZHAO Xiaolei, MAO Qirong, ZHAN Yongzhao

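Affiliations:	[1] School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013; [2] Xinhua College of Sun Yat-sen University, Guangzhou 510520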

Funding:	National Natural Science Foundation of China; Natural Science Foundation of Jiangsu Province; Jiangsu University Senior Talent Fund.

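Abstract:	Sound burst features such as laughter, crying, and sighing (referred to here as functional paralanguages) carry a large amount of emotional information, yet utterances containing them tend to have a low overall emotion recognition rate because the bursts interfere with recognition. To address this problem, a speech emotion recognition method that fuses functional paralanguages is proposed. The method first automatically detects functional paralanguages in the utterance to be recognized and, based on the detection results, separates them from the utterance, yielding two relatively pure types of signal: the functional paralanguage signal and the traditional speech signal. Finally, the emotional information from the two signal types is fused using an adaptive weight fusion method, with the aim of improving both the emotion recognition rate for the utterance and the robustness of the system. Experiments on an emotion corpus containing six functional paralanguages and six typical emotions show that, in the speaker-independent case, the method achieves an average emotion recognition rate of 67.41%, which is 4.2%, 2.8%, and 2.4% higher than linear weighted fusion, Dempster-Shafer (DS) evidence theory, and Bayesian fusion, respectively, and 8.08% higher than the average recognition rate before fusion. The method therefore offers good robustness and recognition accuracy for speaker-independent speech emotion recognition.
Keywords:	speech emotion recognition; functional paralanguage; automatic detection; adaptive weight fusion recognition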
New Method of Speech Emotion Recognition Fusing Functional Paralanguages

ZHAO Xiaolei, MAO Qirong, ZHAN Yongzhao. New Method of Speech Emotion Recognition Fusing Functional Paralanguages[J]. Journal of Frontiers of Computer Science and Technology, 2014(2): 186-199.

Authors:	ZHAO Xiaolei, MAO Qirong, ZHAN Yongzhao

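Affiliation:	[1] School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013; [2] Xinhua College of Sun Yat-sen University, Guangzhou 510520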

Abstract:	Sound burst features (laughter, cries, sighs, and the like, called functional paralanguages) contain a great deal of emotional information, but sentences containing them have lower emotion recognition accuracy because of the interference the bursts introduce. To address this problem, this paper proposes a method of speech emotion recognition that fuses functional paralanguages. First, functional paralanguages are automatically detected in each sentence. The functional paralanguages are then separated from the sentence based on the detection results, which yields two relatively pure types of signal: functional paralanguage and traditional speech. Finally, the emotional information of the functional paralanguage and the traditional speech is adaptively fused. Experimental results on a speaker-independent emotion corpus containing six functional paralanguages and six typical emotions show that the average recognition rate of the proposed method is 67.41%, which is higher than the results of linear weighted fusion, Dempster-Shafer (DS) evidence theory, Bayesian fusion, and recognition before fusion by 4.2%, 2.8%, 2.4%, and 8.08%, respectively. The method thus provides better robustness and recognition accuracy for speaker-independent speech emotion recognition.

Keywords:	speech emotion recognition; functional paralanguage; automatic detection; adaptive weight fusion recognition
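
The abstract describes separating detected functional paralanguage from the rest of the sentence and then fusing the two channels with adaptive weights, but it does not give the weighting rule. The following is a minimal Python sketch of those two steps under stated assumptions, not the authors' implementation: the detector is assumed to have already produced index intervals, and the weights are assumed to come from each channel's confidence, defined here as one minus the normalized entropy of its class posterior. The function names `split_by_intervals`, `confidence`, and `adaptive_fuse`, the three-class posteriors, and the sample values are all hypothetical.

```python
import math


def split_by_intervals(samples, intervals):
    """Split a sample sequence into (paralanguage, speech) parts.

    `intervals` holds half-open (start, end) index pairs that a detector
    (not shown; assumed to exist) has flagged as functional paralanguage.
    """
    flagged = [False] * len(samples)
    for start, end in intervals:
        for i in range(max(0, start), min(len(samples), end)):
            flagged[i] = True
    paralanguage = [s for s, f in zip(samples, flagged) if f]
    speech = [s for s, f in zip(samples, flagged) if not f]
    return paralanguage, speech


def confidence(posterior):
    """Return 1 - normalized entropy (assumed confidence measure).

    A one-hot posterior scores 1.0; a uniform posterior scores about 0.0.
    Requires at least two classes.
    """
    entropy = -sum(p * math.log(p) for p in posterior if p > 0)
    return max(0.0, 1.0 - entropy / math.log(len(posterior)))


def adaptive_fuse(p_para, p_speech):
    """Fuse two class posteriors with per-utterance confidence weights."""
    c_para, c_speech = confidence(p_para), confidence(p_speech)
    total = c_para + c_speech
    # If both channels are (almost) maximally uncertain, use equal weights.
    w_para = 0.5 if total < 1e-12 else c_para / total
    w_speech = 1.0 - w_para
    fused = [w_para * a + w_speech * b for a, b in zip(p_para, p_speech)]
    return fused, (w_para, w_speech)


# Hypothetical usage: samples 2..4 were flagged as paralanguage.
para, speech = split_by_intervals(list(range(10)), [(2, 5)])

# Hypothetical posteriors over three emotion classes for each channel.
fused, weights = adaptive_fuse([0.9, 0.05, 0.05], [0.4, 0.35, 0.25])
predicted_class = fused.index(max(fused))
```

With weights recomputed per utterance, the sharper posterior dominates the fused decision. A fixed linear weighted fusion, one of the baselines named in the abstract, would instead use the same weights for every utterance.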
This article is indexed in CNKI, VIP (维普), and other databases.
