
Research on Audio Super-resolution Method Based on FFTNet-GAN
Original title: 基于FFTNet-GAN的音频超分辨率方法研究
Cite this article: 徐峰,李平.基于FFTNet-GAN的音频超分辨率方法研究[J].信号处理,2021,37(1):59-65.
Authors: XU Feng (徐峰), LI Ping (李平)
Affiliation: Academy of Information Science and Engineering, Huaqiao University (华侨大学信息科学与工程学院)
Funding: Fujian Provincial Major Science and Technology Special Project (2020HZ02014); Natural Science Foundation of Fujian Province (2018J01095); Fujian Provincial Major Science and Technology Project for University-Industry-Research Cooperation (2013H6016); Huaqiao University Science and Technology Innovation Funding Program for Young and Middle-aged Teachers (ZQN-PY509)

Abstract: This paper proposes a generative adversarial network (GAN) model based on FFTNet for extreme audio super-resolution. The generator is a parallel, non-causal FFTNet with non-local operations and a three-way split-sum structure. This shallow model is fast and accurate: it better captures the long-term correlation structure of time-domain audio and extracts features at the desired resolution, which improves reconstruction performance. A discriminator with matching performance is designed so that it adapts stably to the generative adversarial architecture. A frequency-domain perceptual loss is combined with a sample-space loss under fixed weights to reduce reconstruction distortion and improve perceptual quality. In a systematic subjective and objective evaluation, the proposed method outperforms the baseline model. The 2x/4x/6x restoration results show that the model can reconstruct high frequencies even at extreme factors, which helps improve the temporal resolution of audio signals.

Keywords: audio super-resolution; bandwidth extension; FFTNet; generative adversarial network; high-frequency reconstruction
Received: 2020-10-08
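
The abstract states that a frequency-domain perceptual loss is combined with a sample-space loss under fixed weights, but it does not give the formulas or the weight values. The following is only an illustrative sketch of that general idea, not the authors' implementation. It assumes a mean squared error on waveforms for the sample-space term and a mean squared distance between log-magnitude spectrograms for the frequency-domain term. The function names, the frame and hop sizes, and the weights `w_sample` and `w_spectral` are all hypothetical.

```python
import numpy as np


def stft_magnitude(x, frame_len=512, hop=128):
    """Magnitude spectrogram from Hann-windowed frames, shape (frames, bins).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def sample_space_loss(pred, target):
    """Assumed sample-space term: mean squared error between waveforms."""
    return float(np.mean((pred - target) ** 2))


def spectral_loss(pred, target, eps=1e-8):
    """Assumed frequency-domain term: MSE between log-magnitude spectrograms."""
    log_pred = np.log(stft_magnitude(pred) + eps)
    log_target = np.log(stft_magnitude(target) + eps)
    return float(np.mean((log_pred - log_target) ** 2))


def combined_loss(pred, target, w_sample=1.0, w_spectral=0.1):
    """Fixed-weight sum of the two terms; the weights are hypothetical."""
    return (w_sample * sample_space_loss(pred, target)
            + w_spectral * spectral_loss(pred, target))
```

As a usage example, `combined_loss(pred, target)` can be called on two equal-length 1-D NumPy arrays of at least 512 samples, such as a reference high-resolution waveform and a reconstructed one. In a GAN setting a term like this would typically be added to the adversarial loss of the generator, though the abstract does not specify how the terms are balanced.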