
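Voice Conversion with Non-parallel Corpora Using an Enhanced Variational Auto-encoder
Cite this article: Huang Guojie (黄国捷), Jin Hui (金慧), Yu Yibiao (俞一彪). Voice conversion with non-parallel corpora using an enhanced variational auto-encoder [J]. Journal of Signal Processing (信号处理), 2018, 34(10): 1246-1251.
Authors: Huang Guojie (黄国捷), Jin Hui (金慧), Yu Yibiao (俞一彪)
Affiliation: School of Electronic and Information Engineering, Soochow University
Abstract: A new method is proposed that uses an enhanced variational auto-encoder for voice conversion with non-parallel corpora. The source speech first passes through an encoder network, which produces a speech code that follows a Gaussian distribution. A decoder network then reconstructs this code into speech of the specified target speaker, and finally an enhancement network refines the generated target speech. The enhancement network maps each input to exactly one output, which gives the overall conversion system good denoising ability. A cyclic training method is also introduced to improve how strongly the converted speech leans toward the target speaker. Experimental results show that, compared with the baseline voice conversion system, the proposed system reduces spectral distortion, the objective evaluation metric, by 10.3% in cross-gender voice conversion, and also improves the subjective metrics of similarity and clarity. These results indicate that the proposed method gives the converted speech a good tendency toward the target speaker while also achieving good conversion quality.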
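
The three stages named in the abstract (encoder, decoder, enhancement network) can be sketched roughly as below. This is only a minimal illustration with untrained random weights, not the authors' implementation: the layer sizes, the one-hot speaker conditioning of the decoder, the residual form of the enhancement step, and every function name are assumptions introduced here, since the abstract does not give these details.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 24    # hypothetical number of spectral features per frame
LATENT_DIM = 8   # hypothetical size of the speech code
N_SPEAKERS = 2   # hypothetical number of speakers


def dense(in_dim, out_dim):
    """Random weights and zero bias for one linear layer (untrained, illustration only)."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


enc_mu = dense(FEAT_DIM, LATENT_DIM)
enc_logvar = dense(FEAT_DIM, LATENT_DIM)
dec = dense(LATENT_DIM + N_SPEAKERS, FEAT_DIM)
enh = dense(FEAT_DIM, FEAT_DIM)


def encode(x):
    """Map frames to the mean and log-variance of a Gaussian speech code."""
    mu = x @ enc_mu[0] + enc_mu[1]
    logvar = x @ enc_logvar[0] + enc_logvar[1]
    return mu, logvar


def sample(mu, logvar):
    """Reparameterization: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def decode(z, speaker_id):
    """Reconstruct frames from the code, conditioned on a one-hot target speaker (assumed)."""
    onehot = np.zeros((z.shape[0], N_SPEAKERS))
    onehot[:, speaker_id] = 1.0
    h = np.concatenate([z, onehot], axis=1)
    return h @ dec[0] + dec[1]


def enhance(y):
    """Frame-wise refinement: each input frame yields exactly one output frame."""
    return y + np.tanh(y @ enh[0] + enh[1])


def convert(x, target_id):
    """Encoder -> sampled speech code -> decoder -> enhancement network."""
    mu, logvar = encode(x)
    return enhance(decode(sample(mu, logvar), target_id))


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), summed over dimensions and averaged over frames."""
    return float(np.mean(0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)))


source = rng.standard_normal((5, FEAT_DIM))  # five synthetic frames, not real speech
converted = convert(source, target_id=1)
```

The KL term is the standard regularizer that pushes a variational auto-encoder's code toward a Gaussian prior; whether the paper uses exactly this prior or loss weighting is not stated in the abstract.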

Keywords: voice conversion; enhanced variational auto-encoder network; non-parallel corpora
Received: 2018-05-04

Voice Conversion Using Non-parallel Corpora Based on Enhanced Variational Auto-encoder
Affiliation:School of Electronic and Information Engineering, Soochow University
Abstract: This paper proposes a novel enhanced variational auto-encoder (EVAE) for voice conversion using non-parallel corpora. First, the encoder maps the source speech to a speech code that follows a Gaussian distribution; the decoder then reconstructs this code as speech of the specified target speaker. Finally, the generated target speech is refined by an enhancement network. The enhancement network maps one input to one output, which gives the proposed system better denoising ability. A cyclic training method is also introduced to improve the target orientation of the converted speech. Experimental results show that, compared with the baseline variational auto-encoder voice conversion system without an enhancement network, the enhanced system lowers the objective spectral distortion measure by about 10.3% in inter-gender voice conversion. The subjective measures of similarity and clarity also improve. These results show that the proposed method gives the converted speech good target orientation together with better voice quality.
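
The abstract reports a relative drop in spectral distortion but does not say which distortion measure was used. As a rough illustration only, the sketch below assumes mel-cepstral distortion (MCD), a measure commonly used in voice conversion, computed on time-aligned cepstral frames. The function names are hypothetical, and the numbers in the usage lines are made up to show the arithmetic; they are not results from the paper.

```python
import numpy as np


def mel_cepstral_distortion(ref, conv):
    """Mean MCD in dB between two time-aligned (frames, dims) cepstral arrays."""
    ref = np.asarray(ref, dtype=float)
    conv = np.asarray(conv, dtype=float)
    if ref.shape != conv.shape:
        raise ValueError("inputs must be time-aligned and equally shaped")
    # Per-frame squared Euclidean distance between cepstral vectors.
    sq_dist = np.sum((ref - conv) ** 2, axis=1)
    # MCD = (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2), averaged over frames.
    return float(np.mean((10.0 / np.log(10.0)) * np.sqrt(2.0 * sq_dist)))


def relative_reduction(baseline, proposed):
    """Fractional drop of `proposed` relative to `baseline`."""
    return (baseline - proposed) / baseline


# Made-up distortion values, only to show how a relative reduction is computed.
example_baseline = 10.0
example_proposed = 8.97
example_drop = relative_reduction(example_baseline, example_proposed)
```

A relative reduction is unit-free, so a figure such as "about 10.3% lower" can be read without knowing the absolute distortion values, although it does not reveal how large the baseline distortion was.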