
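Speech enhancement based on the modified phase using signal-to-noise ratio information and time-frequency characteristics
Cite this article: JIA Hairong, WANG Weimei, JI Huifang. Speech enhancement based on the modified phase using signal-to-noise ratio information and time-frequency characteristics [J]. Journal of Xidian University, 2019, 46(5): 162-170.
Authors: JIA Hairong, WANG Weimei, JI Huifang
Affiliation: College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China
Funding: National Natural Science Foundation of China (61371193); Natural Science Foundation of Shanxi Province (201701D121058)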
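Abstract: Harmonic-model-based phase-spectrum speech enhancement algorithms reconstruct the phase only in voiced segments, which causes speech distortion and auditory discontinuity. To address this problem, a new method is proposed that improves phase reconstruction using signal-to-noise ratio (SNR) information and time-frequency features. First, time-frequency features related to phase distortion are introduced and a decision threshold is calculated. Next, the SNR information is used to calculate the phase deviation between the noisy speech and the clean speech; comparing the two then yields phase estimates for both unvoiced and voiced segments, which effectively improves the continuity of the speech. Finally, the reconstructed phase is combined with the amplitude estimate from an improved binary hypothesis model to perform speech enhancement. Experiments on different utterances under different noise backgrounds show that the phase difference obtained by the new algorithm is closer to that of the original signal. Compared with the comparison algorithms, the SNR of the enhanced speech increases by 2.39 dB on average and the perceptual evaluation of speech quality score increases by 0.12 on average, which effectively reduces speech distortion and improves speech intelligibility.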

Keywords: phase reconstruction; SNR information; time-frequency characteristics; decision threshold; phase deviation
Received: 2019-06-12

Speech enhancement based on the modified phase using signal-to-noise ratio information and time-frequency characteristics
JIA Hairong, WANG Weimei, JI Huifang. Speech enhancement based on the modified phase using signal-to-noise ratio information and time-frequency characteristics [J]. Journal of Xidian University, 2019, 46(5): 162-170.
Authors: JIA Hairong, WANG Weimei, JI Huifang
Affiliation: College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China
Abstract: Harmonic-model-based phase-spectrum speech enhancement reconstructs the phase only in voiced segments, which leads to speech distortion and auditory discontinuity. To address this problem, a new method is proposed that improves phase reconstruction by using signal-to-noise ratio (SNR) information and time-frequency features. First, time-frequency characteristics related to phase distortion are introduced and a decision threshold is calculated. Then the phase deviation between the noisy speech and the clean speech is calculated from the SNR information. Comparing the two allows the phase of both voiced and unvoiced segments to be estimated, which effectively improves the continuity of the speech. Finally, the reconstructed phase is combined with the amplitude estimate of an improved binary hypothesis model to perform speech enhancement. Experiments on different utterances in different noise backgrounds show that the phase difference produced by the new algorithm is closer to that of the original signal. Compared with the comparison algorithms, the SNR of the enhanced speech increases by 2.39 dB on average and the perceptual evaluation of speech quality (PESQ) score increases by 0.12 on average, which effectively reduces speech distortion and improves speech intelligibility.
Keywords: phase reconstruction; SNR information; time-frequency characteristics; decision threshold; phase deviation
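
The abstract does not give the formulas used for the SNR-based phase deviation, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the standard additive model Y = S + N in a single time-frequency bin, for which simple geometry bounds the angle between the noisy and clean coefficients by arcsin(|N|/|S|) whenever |N| < |S|. The function names (`max_phase_deviation`, `constrain_phase`, `enhance_bin`), the clamping rule, and the per-bin inputs are all hypothetical choices made for illustration.

```python
import cmath
import math


def max_phase_deviation(snr_linear: float) -> float:
    """Upper bound (radians) on |angle(Y) - angle(S)| for Y = S + N.

    snr_linear is |S|^2 / |N|^2 in this bin. If the noise is at least as
    strong as the speech, the noisy phase can be anywhere, so return pi.
    """
    if snr_linear <= 1.0:
        return math.pi
    return math.asin(1.0 / math.sqrt(snr_linear))


def wrap(angle: float) -> float:
    """Wrap an angle to the principal range [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def constrain_phase(noisy_phase: float, candidate_phase: float,
                    snr_linear: float) -> float:
    """Clamp a candidate phase to within the SNR-dependent bound of the
    noisy phase (an illustrative rule, assumed here, not from the paper)."""
    bound = max_phase_deviation(snr_linear)
    diff = wrap(candidate_phase - noisy_phase)
    diff = max(-bound, min(bound, diff))
    return wrap(noisy_phase + diff)


def enhance_bin(noisy_coeff: complex, magnitude_estimate: float,
                candidate_phase: float, snr_linear: float) -> complex:
    """Combine a magnitude estimate with the constrained phase."""
    phase = constrain_phase(cmath.phase(noisy_coeff), candidate_phase,
                            snr_linear)
    return cmath.rect(magnitude_estimate, phase)
```

In this sketch, `candidate_phase` stands in for whatever phase a reconstruction model proposes (for example, a harmonic model in voiced frames), and `magnitude_estimate` for the output of any amplitude estimator. At high SNR the bound is tight and the result stays close to the noisy phase; at low SNR the candidate phase is accepted almost unchanged.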