Research Advance of Cross-corpus Speech Emotion Recognition
Cite this article:ZHANG Shi-Qing, LIU Rui-Xin, ZHAO Xiao-Ming. Research Advance of Cross-corpus Speech Emotion Recognition[J]. Computer Systems & Applications (计算机系统应用), 2022, 31(11): 31-48.
Authors:ZHANG Shi-Qing (张石清)  LIU Rui-Xin (刘瑞欣)  ZHAO Xiao-Ming (赵小明)
Affiliation:School of Science, Zhejiang University of Science and Technology, Hangzhou 310023, China; Institute of Intelligent Information Processing, Taizhou University, Taizhou 317000, China
Funding:National Natural Science Foundation of China (61976149); Zhejiang Provincial Natural Science Foundation (LZ20F020002)
Abstract:Speech emotion recognition (SER) plays an extremely important role in human-computer interaction (HCI) and has attracted much attention in recent years. At present, most SER methods are trained and tested on a single emotion corpus. In practical applications, however, the training set and the test set may come from different emotion corpora. Because the distributions of different emotion corpora differ greatly, most SER methods achieve unsatisfactory cross-corpus recognition performance. To address this issue, many researchers have turned to cross-corpus SER methods in recent years. This paper systematically reviews the research status and progress of cross-corpus SER methods in recent years, with a particular focus on analyzing and summarizing the application of newly developed deep learning techniques to cross-corpus SER. First, the emotion corpora commonly used in SER are introduced. Then, in combination with deep learning techniques, the research progress of existing cross-corpus SER methods based on hand-designed features and deep features is summarized and compared from the perspectives of supervised, unsupervised, and semi-supervised learning. Finally, the challenges and opportunities in the field of cross-corpus SER are discussed, and an outlook on future work is given.

Keywords:speech emotion recognition  cross-corpus  deep learning  hand-designed features  deep features  speech emotion
Received:2022/3/5
Revised:2022/4/2