
Speech Emotion Recognition Model Based on Parameter Transfer and Convolutional Recurrent Neural Network
Cite this article: MIAO Yuqing, ZOU Wei, LIU Tonglai, ZHOU Ming, CAI Guoyong. Speech Emotion Recognition Model Based on Parameter Transfer and Convolutional Recurrent Neural Network[J]. Computer Engineering and Applications (计算机工程与应用), 2019, 55(10): 135-140.
Authors: MIAO Yuqing, ZOU Wei, LIU Tonglai, ZHOU Ming, CAI Guoyong
Affiliation: 1. School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China; 2. Guilin Hivision Technology Co. Ltd., Guilin, Guangxi 541004, China
Funding: National Natural Science Foundation of China; Guangxi Natural Science Foundation; Guangxi Zhuang Autonomous Region Higher Education Institutions Project; Key Project of the Guangxi Natural Science Foundation; Guilin University of Electronic Technology Graduate Education Innovation Project
Abstract: In speech emotion recognition, most existing deep-learning methods do not model the time-domain and frequency-domain features of speech, and they suffer from long training times and limited recognition accuracy. A spectrogram is a special image, obtained by transforming the speech signal, that carries both time-domain and frequency-domain information. To fully extract the emotional features in both domains of the spectrogram, this paper proposes a speech emotion recognition model based on parameter transfer and a convolutional recurrent neural network. The model takes the spectrogram as the network input, introduces the AlexNet network model and transfers the weights of its pre-trained convolutional layers, then reshapes the feature maps output by the convolutional neural network and feeds them into an LSTM (Long Short-Term Memory) network for training. Experimental results show that the proposed method speeds up network training and improves the accuracy of emotion recognition.

Keywords: spectrogram; deep learning; parameter transfer; convolutional recurrent neural network; speech emotion recognition
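
The abstract says that the convolutional feature maps are reshaped before being fed to the LSTM, but it does not say how. The following is a minimal sketch of one plausible reshaping, assuming that the width axis of the feature maps corresponds to time and that each time step's vector concatenates all channels and frequency rows. The function name `feature_maps_to_sequence`, the axis ordering, and the toy dimensions are illustrative assumptions, not details taken from the paper.

```python
from typing import List

# Assumed layout (not from the paper): fmaps[channel][freq][time]
FeatureMaps = List[List[List[float]]]


def feature_maps_to_sequence(fmaps: FeatureMaps) -> List[List[float]]:
    """Reshape CNN feature maps [C][H][W] into a sequence [W][C*H].

    Each column along the width (time) axis becomes one feature vector
    that concatenates the frequency column of every channel.
    """
    channels = len(fmaps)
    height = len(fmaps[0])
    width = len(fmaps[0][0])
    sequence = []
    for t in range(width):
        step = [fmaps[c][f][t] for c in range(channels) for f in range(height)]
        sequence.append(step)
    return sequence


# Toy example: 2 channels, 3 frequency bins, 4 time steps.
# Each value encodes its own position as 100*channel + 10*freq + time.
toy = [[[100 * c + 10 * f + t for t in range(4)] for f in range(3)]
       for c in range(2)]
seq = feature_maps_to_sequence(toy)
print(len(seq), len(seq[0]))  # 4 time steps, each with 2 * 3 = 6 features
```

Under these assumptions, a recurrent layer would consume `seq` one time step at a time, so the convolutional layers capture local time-frequency patterns while the LSTM models how they evolve along the time axis.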
This article is indexed in Wanfang Data (万方数据) and other databases.