Speech Emotion Recognition Model Based on Parameter Transfer and Convolutional Recurrent Neural Network

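Cite this article:	MIAO Yuqing, ZOU Wei, LIU Tonglai, ZHOU Ming, CAI Guoyong. Speech Emotion Recognition Model Based on Parameter Transfer and Convolutional Recurrent Neural Network[J]. Computer Engineering and Applications, 2019, 55(10): 135-140.

Authors:	MIAO Yuqing, ZOU Wei, LIU Tonglai, ZHOU Ming, CAI Guoyong

Affiliations:	School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China; Guilin Hivision Technology Co. Ltd., Guilin, Guangxi 541004, China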

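Funding:	National Natural Science Foundation of China; Guangxi Natural Science Foundation; Guangxi Zhuang Autonomous Region Higher Education Institutions Project; Key Project of the Guangxi Natural Science Foundation; Guilin University of Electronic Technology Graduate Education Innovation Project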

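Abstract:	In speech emotion recognition research, most existing deep learning methods do not model features across both the time and frequency domains of speech, and they suffer from long network training times and limited recognition accuracy. A spectrogram is a special image, obtained by transforming the speech signal, that carries both time-domain and frequency-domain information. To fully extract the emotional features in both domains of the spectrogram, a speech emotion recognition model based on parameter transfer and a convolutional recurrent neural network is proposed. The model takes the spectrogram as the network input, introduces the AlexNet network model and transfers the weights of its pre-trained convolutional layers, then reshapes the feature maps output by the convolutional neural network and feeds them into an LSTM (Long Short-Term Memory) network for training. Experimental results show that the proposed method speeds up network training and improves the accuracy of emotion recognition.
Keywords:	spectrogram; deep learning; parameter transfer; convolutional recurrent neural network; speech emotion recognition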
Speech Emotion Recognition Model Based on Parameter Transfer and Convolutional Recurrent Neural Network

MIAO Yuqing, ZOU Wei, LIU Tonglai, ZHOU Ming, CAI Guoyong. Speech Emotion Recognition Model Based on Parameter Transfer and Convolutional Recurrent Neural Network[J]. Computer Engineering and Applications, 2019, 55(10): 135-140.

Authors:	MIAO Yuqing, ZOU Wei, LIU Tonglai, ZHOU Ming, CAI Guoyong

Affiliation:	1. School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China; 2. Guilin Hivision Technology Co. Ltd., Guilin, Guangxi 541004, China

Abstract:	In speech emotion recognition, most deep learning methods do not model the time-frequency characteristics of speech. Moreover, the network models take a long time to train and their recognition accuracy is not high. The spectrogram is a special image, obtained by converting the speech signal, that spans both the time and frequency domains. To fully extract the emotional features in the time-frequency domain of the spectrogram, this paper proposes a speech emotion recognition model based on parameter transfer and a convolutional recurrent neural network. The proposed model uses the spectrogram as the network input, introduces the AlexNet network model, and transfers the weights of its pre-trained convolutional layers. The feature maps output by the convolutional neural network are reshaped and then fed into a long short-term memory (LSTM) network for training. The experimental results show that the proposed method trains faster and achieves higher emotion recognition accuracy.

Keywords:	spectrogram; deep learning; parameter transfer; convolutional recurrent neural network; speech emotion recognition
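
The abstract describes reshaping the convolutional feature maps before passing them to the LSTM, but this record does not give the layer sizes or the exact reshaping. The following is a minimal sketch of that step under stated assumptions: the convolutional stack uses the hyperparameters of the commonly distributed torchvision-style AlexNet (the paper's own configuration may differ), the feature-map axes are taken to be channel, frequency, and time, and each time column becomes one LSTM step. All function and constant names are hypothetical.

```python
# Sketch only. Layer hyperparameters follow the torchvision-style AlexNet,
# which is an assumption; the paper's exact configuration is not given here.
ALEXNET_FEATURE_LAYERS = [
    # (kind, out_channels, kernel, stride, padding)
    ("conv", 64, 11, 4, 2),
    ("pool", None, 3, 2, 0),
    ("conv", 192, 5, 1, 2),
    ("pool", None, 3, 2, 0),
    ("conv", 384, 3, 1, 1),
    ("conv", 256, 3, 1, 1),
    ("conv", 256, 3, 1, 1),
    ("pool", None, 3, 2, 0),
]


def out_size(size, kernel, stride, padding):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_shape(height, width, channels=3):
    """Shape (channels, height, width) after the convolutional stack."""
    for kind, out_channels, kernel, stride, padding in ALEXNET_FEATURE_LAYERS:
        height = out_size(height, kernel, stride, padding)
        width = out_size(width, kernel, stride, padding)
        if kind == "conv":
            channels = out_channels
    return channels, height, width


def to_sequence(feature_map):
    """Reshape a [channel][freq][time] feature map into one vector per time step.

    Treating the last axis as time is an assumption made for illustration.
    """
    n_ch = len(feature_map)
    n_freq = len(feature_map[0])
    n_time = len(feature_map[0][0])
    return [
        [feature_map[c][f][t] for c in range(n_ch) for f in range(n_freq)]
        for t in range(n_time)
    ]


def lstm_input_shape(height, width):
    """(sequence length, features per step) that the LSTM would receive."""
    channels, h, w = feature_map_shape(height, width)
    return w, channels * h
```

Under these assumptions, a 224 x 224 spectrogram image yields a 256 x 6 x 6 feature map, which `lstm_input_shape(224, 224)` turns into a sequence of 6 time steps with 1536 features each. A wider spectrogram gives a longer sequence while the per-step feature size stays the same.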
This article is indexed in Wanfang Data and other databases.