Monaural speech enhancement algorithm based on region-aware multi-scale convolution
Cite this article: Wang Yixiang, Lü Yilan, Tai Wenxin, Sun Jianqiang, Lan Tian. Monaural speech enhancement algorithm based on region-aware multi-scale convolution[J]. Application Research of Computers, 2021, 38(11): 3264-3267. DOI: 10.19734/j.issn.1001-3695.2021.03.0131
Authors: Wang Yixiang, Lü Yilan, Tai Wenxin, Sun Jianqiang, Lan Tian
Affiliation: School of Information & Software Engineering, University of Electronic Science & Technology of China, Chengdu 610054, China
Funding: National Natural Science Foundation of China (U19B2028, 61772117); Science and Technology Commission Innovation Special Zone Project (19-163-21-TS-001-042-01); Key Project of the National Engineering Laboratory for Big Data Application Technologies for Improving Government Governance Capabilities (10-2018039); Fundamental Research Funds for the Central Universities (ZYGX2019J077)
Abstract: The receptive field of a convolutional neural network depends on the size of its convolution kernels. Conventional convolution uses kernels of a fixed size, which limits the feature perception ability of the model. In addition, because of parameter sharing, a convolutional neural network applies the same feature extraction to every sample point in a spatial region. In a noisy spectrogram, however, the noise and the clean speech follow different distributions, especially in complex noise environments, so conventional convolution struggles to extract and filter speech features with high quality. To address these problems, this paper proposes a multi-scale region-adaptive convolution module. The module uses multi-scale information to improve the feature perception ability of the model, and it assigns region convolution weights adaptively according to the feature values of the corresponding sampling points. This realizes region-adaptive convolution and improves the model's ability to filter noise. Experiments on the public TIMIT dataset show that the proposed algorithm achieves better results on the evaluation metrics of speech quality and intelligibility.

Keywords: speech enhancement; convolutional neural network; multi-scale convolution; region-aware
Received: 2021-03-05
Revised: 2021-06-29
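
The two ideas in the abstract (several kernel scales, and convolution weights chosen per sampling point from its feature value) can be illustrated with a small sketch. It is not the authors' module: the abstract gives no architecture details, so the softmax gating, the mean-filter kernels, and every name below (conv2d_same, region_adaptive_multiscale, gate_scale, gate_bias) are assumptions made for illustration only.

```python
import math


def conv2d_same(x, kernel):
    """Zero-padded 2-D cross-correlation that keeps the input shape (odd kernel sizes)."""
    rows, cols = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    r, c = i + u - ph, j + v - pw
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += kernel[u][v] * x[r][c]
            out[i][j] = acc
    return out


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def region_adaptive_multiscale(x, kernels, gate_scale, gate_bias):
    """Blend several kernel scales with weights chosen per time-frequency point.

    Assumed gating (hypothetical, not from the paper): the score of branch k
    at point (i, j) is gate_scale[k] * x[i][j] + gate_bias[k], and a softmax
    turns the scores into blending weights for that point.
    """
    branches = [conv2d_same(x, k) for k in kernels]  # one output per scale
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            scores = [s * x[i][j] + b for s, b in zip(gate_scale, gate_bias)]
            weights = softmax(scores)
            out[i][j] = sum(w * br[i][j] for w, br in zip(weights, branches))
    return out


def mean_kernel(size):
    """Placeholder size x size averaging kernel; a real model would learn its kernels."""
    return [[1.0 / (size * size)] * size for _ in range(size)]


# Toy 4 x 4 magnitude spectrogram (made-up values).
spectrogram = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.1],
    [0.1, 0.8, 0.9, 0.2],
    [0.0, 0.1, 0.2, 0.1],
]
kernels = [mean_kernel(1), mean_kernel(3), mean_kernel(5)]
enhanced = region_adaptive_multiscale(
    spectrogram, kernels, gate_scale=[4.0, 0.0, -4.0], gate_bias=[0.0, 0.0, 0.0]
)
```

With these made-up gate values, high-energy points lean toward the 1 x 1 branch and low-energy points toward the 5 x 5 branch, so the blend differs from point to point rather than being shared across the whole spectrogram. In a trained network the kernels and the gating would both be learned.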
This article is indexed in Wanfang Data and other databases.