
A Speech Emotion Recognition Based on Unsupervised Autoencoder in the Intervention of Autism

Citation:	GE Lei, QIANG Yan, ZHAO Juan-Juan. A Speech Emotion Recognition Based on Unsupervised Autoencoder in the Intervention of Autism[J]. Journal of Software, 2016, 27(S2): 130-136.

Authors:	GE Lei, QIANG Yan, ZHAO Juan-Juan

Affiliation:	College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China (all three authors)

Funding:	National Natural Science Foundation of China (61540007, 61373100); Open Fund of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (BUAA-VR-15KF02, BUAA-VR-16KF13)

Abstract:	Speech emotion recognition is an important research topic in human-computer interaction (HCI), and a speech emotion recognition system used in intervention therapy for autistic children can aid their rehabilitation. However, the emotional features in speech signals are numerous and heterogeneous, and extracting them is itself a challenging task, which limits the recognition performance of the whole system. To address this problem, this paper proposes a speech emotion feature extraction algorithm that uses an unsupervised autoencoder network to learn emotional features from speech signals automatically. A 3-layer autoencoder network is constructed to extract the features, and the high-level features it learns are fed to an extreme learning machine classifier for classification. The resulting recognition rate is 84.14%, which is higher than that of traditional recognition methods based on hand-crafted features.

Keywords:	speech emotion recognition; extreme learning machine; unsupervised autoencoder; human-computer interaction

Received:	5/1/2016
Revised:	2016/11/21
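
The pipeline outlined in the abstract (unsupervised autoencoder features fed to an extreme learning machine) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Everything beyond that outline is an assumption made for illustration: the layer sizes, the sigmoid units, the greedy layer-wise training with plain gradient descent, the ridge-regularized output weights, the reading of "3-layer" as three stacked encoders, and the synthetic data that stands in for real speech features.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder_layer(X, n_hidden, epochs=300, lr=0.1, seed=0):
    """Train one autoencoder (sigmoid encoder, linear decoder) by gradient
    descent on the mean squared reconstruction error; return the encoder."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, (d, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, (n_hidden, d))
    b_dec = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W_enc + b_enc)            # hidden code
        err = (H @ W_dec + b_dec - X) / n         # gradient w.r.t. reconstruction
        dH = (err @ W_dec.T) * H * (1.0 - H)      # backprop through the sigmoid
        W_dec -= lr * (H.T @ err)
        b_dec -= lr * err.sum(axis=0)
        W_enc -= lr * (X.T @ dH)
        b_enc -= lr * dH.sum(axis=0)
    return W_enc, b_enc

def fit_stacked_encoder(X, layer_sizes=(16, 12, 8)):
    """Greedy layer-wise training; layer_sizes are hypothetical."""
    layers, A = [], X
    for i, size in enumerate(layer_sizes):
        W, b = train_autoencoder_layer(A, size, seed=i)
        layers.append((W, b))
        A = sigmoid(A @ W + b)                    # input to the next layer
    return layers

def encode(layers, X):
    """Map inputs to the top-level (high-level) features."""
    A = X
    for W, b in layers:
        A = sigmoid(A @ W + b)
    return A

def fit_elm(F, y, n_classes, n_hidden=40, reg=1e-3, seed=0):
    """ELM: fixed random hidden layer, output weights solved in closed form."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(F.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = sigmoid(F @ W + b)
    T = np.eye(n_classes)[y]                      # one-hot targets
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict_elm(model, F):
    W, b, beta = model
    return np.argmax(sigmoid(F @ W + b) @ beta, axis=1)

# Synthetic two-class data standing in for per-utterance speech features.
rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=60)
X = rng.normal(size=(60, 20)) + 2.0 * y[:, None]
layers = fit_stacked_encoder(X)
features = encode(layers, X)
model = fit_elm(features, y, n_classes=2)
print("training accuracy:", np.mean(predict_elm(model, features) == y))
```

The autoencoder stage never sees the labels, which is what makes the feature learning unsupervised. Only the ELM output weights are fitted to the emotion classes, and they come from a single linear solve rather than iterative training.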
