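Speaker Verification Method Based on Adaptive Mutual Information Estimation
Cite this article: CHEN Chen, JI Chaoqun, LI Wenwen, CHEN Deyun, WANG Lili, YANG Hailu. Speaker Verification Method Based on Adaptive Mutual Information Estimation[J]. Journal of University of Electronic Science and Technology of China (Natural Science Edition), 2023, 52(1): 125-131. DOI: 10.12178/1001-0548.2022174
Authors: CHEN Chen, JI Chaoqun, LI Wenwen, CHEN Deyun, WANG Lili, YANG Hailu
Affiliations: School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080; Postdoctoral Research Station of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080; School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080
Funding: National Natural Science Foundation of China (62101163); Natural Science Foundation of Heilongjiang Province (LH2021F029); China Postdoctoral Science Foundation (2021M701020); Heilongjiang Provincial Postdoctoral Special Fund (LBH-Z20020); Fundamental Research Funds for Heilongjiang Provincial Universities (2020-KYYWF-0341)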
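Abstract: To measure the relationships between features more accurately, an objective function representation method based on adaptive mutual information estimation is proposed. A metric with adaptive properties is introduced into the objective function, which aims to maximize intra-class similarity and minimize inter-class similarity and can dynamically adjust the similarity according to the actual distribution of the deep features, so that the deep neural network is optimized toward stronger discriminability. The adaptive metric is also used for feature selection: it updates parameters in a targeted way according to the characteristics of the features, so that the selected features are representative and the objective function gives better guidance to the optimization direction of the deep neural network. Experimental results show that, compared with other deep neural network methods, the proposed method reduces the equal error rate by up to 28% in relative terms, significantly improving the performance of the speaker verification system.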
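The abstract does not give the form of the adaptive estimator, so the following is only a rough sketch of the general idea of a mutual-information-based objective that raises same-speaker similarity and lowers different-speaker similarity. It uses a generic InfoNCE-style contrastive bound with cosine similarity as a stand-in. The function names, the `temperature` parameter, and the toy embeddings are all assumptions for illustration, and the adaptive adjustment described in the paper is not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def infonce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log softmax of the positive pair's score.

    `temperature` is a hypothetical fixed scale; the paper's adaptive
    adjustment of similarity is NOT modeled here.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max before exponentiating, for stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]

def mi_lower_bound(loss, num_candidates):
    """Per-anchor estimate of the InfoNCE bound: I(X; Y) >= log(K) - loss."""
    return math.log(num_candidates) - loss

# Toy 2-D embeddings (made up): one same-speaker pair, two other speakers.
anchor = [1.0, 0.0]
same_speaker = [0.9, 0.1]
other_speakers = [[0.0, 1.0], [-1.0, 0.2]]
loss = infonce_loss(anchor, same_speaker, other_speakers)
bound = mi_lower_bound(loss, 1 + len(other_speakers))
```

Minimizing this loss pushes the same-speaker similarity up relative to the different-speaker similarities, which is the intra-class versus inter-class goal stated in the abstract. The estimate can never exceed log(K), where K is the number of candidates per anchor.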

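Keywords: mutual information estimation; objective function; adaptive learning; feature representation learning; speaker verification
Received: 2022-06-08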

Mutual Information Adaptive Estimation for Speaker Verification
Affiliation: 1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080; 2. Postdoctoral Research Station of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080
Abstract: To measure the relationship between features more accurately, an objective function representation method based on adaptive mutual information estimation is proposed for speaker verification systems. The objective function incorporates an adaptive metric learning method, and its optimization goal is to maximize intra-class similarity and minimize inter-class similarity. The objective function can also dynamically adjust the similarity according to the actual distribution of the deep features, so that the deep neural network is optimized toward stronger discrimination. In addition, the adaptive metric is used for feature sampling, updating the parameters according to the characteristics of the features. The sampled features are therefore more representative, which strengthens the guidance that the objective function gives to the optimization direction of the deep neural network. Experimental results show that, compared with other deep neural network methods, the proposed method reduces the equal error rate by up to 28% in relative terms and significantly improves the performance of the speaker verification system.
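The headline figure is a relative reduction in equal error rate (EER), so a short worked example may help readers interpret it. The sketch below approximates EER by sweeping a decision threshold over the observed scores and then computes a relative reduction. The helper names and all scores are hypothetical, and the 5.0% and 3.6% values are made-up numbers chosen only so that the arithmetic lands on 28%; they are not results from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: the point where false acceptance ~= false rejection."""
    best_gap, eer = float("inf"), None
    # Try every observed score as the accept threshold (accept if score >= t).
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction, e.g. 0.28 means a 28% relative improvement."""
    return (baseline_eer - new_eer) / baseline_eer

# Hypothetical trial scores: same-speaker trials and different-speaker trials.
eer = equal_error_rate([0.9, 0.8, 0.4], [0.5, 0.2, 0.1])

# Hypothetical EERs in percent, NOT taken from the paper.
reduction = relative_reduction(5.0, 3.6)
```

A relative reduction is measured against the baseline EER rather than in absolute percentage points, so the same 28% figure corresponds to a smaller absolute gain when the baseline is already low.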
This article is indexed in Wanfang Data and other databases.