Speech enhancement method based on the perceptual joint optimization deep neural network |
| |
Authors: | YUAN Wenhao LOU Yingxi LIANG Chunyan WANG Zhiqiang |
| |
Affiliation: | College of Computer Science and Technology, Shandong University of Technology, Zibo 255000, China |
| |
Abstract: | When speech enhancement models based on the deep neural network (DNN) are trained, the mean square error is generally adopted as the cost function, although it is not tailored to the speech enhancement problem. To address this, a speech enhancement method based on the joint optimization DNN is proposed. Its cost function accounts for two factors: the correlation between adjacent frames of the network's output, and the presence of the speech component in each time-frequency unit, which is captured by a perceptual coefficient. Experimental results show that, compared with the speech enhancement method based on the mean square error, the proposed method significantly improves the quality and intelligibility of the enhanced speech. |
| |
Keywords: | speech enhancement; deep neural network; cost function; correlation |
|
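The abstract names two ingredients of the cost function but gives no formulas, so the following is only a minimal sketch of what such a cost could look like, not the authors' definition. It assumes that `pred` and `target` are (frames x frequency bins) feature arrays, that `presence` is a hypothetical per-unit weight in [0, 1] standing in for the perceptual coefficient, and that adjacent-frame correlation is approximated by matching frame-to-frame differences with a hypothetical trade-off weight `lam`.

```python
import numpy as np


def plain_mse(pred, target):
    """Baseline cost: mean square error over all time-frequency units."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))


def perceptual_joint_loss(pred, target, presence, lam=0.5):
    """Illustrative cost (an assumption, not the paper's formula).

    pred, target: arrays of shape (frames, bins), with at least two frames.
    presence: assumed weight per time-frequency unit, same shape, in [0, 1].
    lam: assumed trade-off between the two terms.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    presence = np.asarray(presence, dtype=float)

    # Term 1: squared error weighted by speech presence in each unit.
    weighted_term = np.mean(presence * (pred - target) ** 2)

    # Term 2: error between frame-to-frame differences, which ties each
    # output frame to its neighbour instead of treating frames independently.
    delta_pred = np.diff(pred, axis=0)
    delta_target = np.diff(target, axis=0)
    delta_term = np.mean((delta_pred - delta_target) ** 2)

    return float(weighted_term + lam * delta_term)
```

With `presence` set to all ones and `lam=0`, this sketch reduces to the plain mean square error, which makes the two added terms easy to compare against the baseline.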
| The original abstract information is available from the Journal of Xidian University (《西安电子科技大学学报》) |
|
|