
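Research on a low-bit-rate, variable-rate speech coding algorithm based on the local cosine transform
Cite this article: DONG En-qing, CAI Guang-yue, LI Yong-li. Study on low bit and variable rate speech coding based on local cosine transform[J]. Journal on Communications, 2005, 26(5): 122-127
Authors: DONG En-qing  CAI Guang-yue  LI Yong-li
Affiliations: 1. School of Electronic Information, Soochow University, Suzhou, Jiangsu 215021
2. School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049
Funding: Natural Science Research Program of Jiangsu Higher Education Institutions (03KJB510127)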
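Abstract: This paper proposes applying the local cosine transform (LCT) to speech coding and designs a low-bit-rate, variable-rate speech coder with an average bit rate of about 1.6 kbit/s. The variable-rate design uses a support vector machine (SVM) for voice activity detection (VAD). Active speech frames are classified with the speech modes of the GSM half-rate codec, except that the strongly voiced and moderately voiced modes are merged into a single moderately-and-strongly voiced mode. The LCT coefficients of each speech mode and of silent frames (background noise) are quantized by split vector quantization, with codebooks designed by the LBG algorithm, and the codebook search during encoding uses a fast tree-structured search. Informal subjective listening tests indicate that the reconstructed speech has a MOS of about 3.15, comparable to that of the 2.4 kbit/s U.S. Federal Standard MELP vocoder. The coder is also robust, which makes it suitable for coding speech in the presence of various kinds of environmental noise.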

Keywords: local cosine transform  speech coding  variable-rate coding  vector quantization  support vector machine
Article ID: 1000-436X(2005)05-0122-06
Revised: 2004-04-18
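
As a rough illustration of the cosine stage that underlies a local cosine transform, the sketch below computes an orthonormal DCT-IV of one block. It assumes the block has already been windowed and folded. The function name, block length, and sample values are hypothetical, since the abstract does not specify the window or the segment size used by the authors.

```python
import math


def dct_iv(x):
    """Orthonormal DCT-IV of a sequence; the transform is its own inverse."""
    n = len(x)
    scale = math.sqrt(2.0 / n)
    return [
        scale * sum(
            x[j] * math.cos(math.pi / n * (j + 0.5) * (k + 0.5))
            for j in range(n)
        )
        for k in range(n)
    ]


# Hypothetical 8-sample block standing in for one folded speech segment.
block = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
coeffs = dct_iv(block)      # transform coefficients to be quantized
restored = dct_iv(coeffs)   # applying the transform again recovers the block
```

Because the orthonormal DCT-IV matrix is symmetric and orthogonal, the same routine serves as both the forward and the inverse transform, and it preserves signal energy.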

Study on low bit and variable rate speech coding based on local cosine transform
DONG En-qing,CAI Guang-yue,LI Yong-li. Study on low bit and variable rate speech coding based on local cosine transform[J]. Journal on Communications, 2005, 26(5): 122-127
Authors:DONG En-qing  CAI Guang-yue  LI Yong-li
Affiliation:DONG En-qing (1), CAI Guang-yue (1), LI Yong-li (2)
Abstract:A low-bit-rate, variable-rate speech coder for two-way conversational speech, based on the local cosine transform (LCT) and operating at an average of about 1.6 kbit/s, was designed for the first time. The variable-rate design adopts a voice activity detector (VAD) based on a support vector machine (SVM) and, for active speech, the speech-mode classification of the GSM half-rate standard. The classification was modified slightly: the moderately voiced and strongly voiced modes were combined into a single mode, named the moderately and strongly voiced mode. The local cosine transform coefficients of each speech mode and of silence frames (background noise) were quantized with split (segment) vector quantizers, and the LBG algorithm was used to design the codebooks. A fast tree search was used to select the codevector of local cosine transform coefficients for each segment. Informal subjective listening tests indicated that the quality of the synthesized speech was comparable to that of the U.S. Federal Standard MELP vocoder (2.4 kbit/s). The subjective mean opinion score (MOS) of the designed coder was about 3.15. The new coder is more robust than the MELP vocoder and is suitable for coding speech in the presence of various kinds of environmental noise.
Keywords:LCT  speech coding  variable rate coding  vector quantization  SVM
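
The quantization step described in the abstract can be sketched as follows: each coefficient vector is cut into segments, and each segment is quantized with its own codebook trained by LBG splitting. This is a minimal sketch under stated assumptions, not the authors' implementation. The segment boundaries, codebook size, and random training data are hypothetical, and a plain full search replaces the tree search mentioned in the abstract.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codeword closest to v (full search, not tree search)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def train_lbg(vectors, size, iters=20, eps=0.01):
    """Train `size` codewords (a power of two) by LBG splitting."""
    dim = len(vectors[0])
    # Start from the centroid of all training vectors.
    codebook = [[sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]]
    while len(codebook) < size:
        # Split every codeword into a perturbed pair.
        codebook = [[x + s for x in c] for c in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign each vector to its nearest codeword, then recompute centroids.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(codebook, v)].append(v)
            codebook = [
                [sum(v[d] for v in cell) / len(cell) for d in range(dim)]
                if cell else cw  # keep the old codeword if its cell is empty
                for cw, cell in zip(codebook, cells)
            ]
    return codebook


def encode(v, codebooks, bounds):
    """Quantize each segment of v with its own codebook; return the indices."""
    return [nearest(cb, v[a:b]) for cb, (a, b) in zip(codebooks, bounds)]


def decode(indices, codebooks):
    """Concatenate the selected codewords into a reconstructed vector."""
    out = []
    for cb, i in zip(codebooks, indices):
        out.extend(cb[i])
    return out


# Hypothetical setup: 4-dimensional vectors split into two 2-dimensional segments.
random.seed(0)
train = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]
bounds = [(0, 2), (2, 4)]
codebooks = [train_lbg([v[a:b] for v in train], size=4) for a, b in bounds]
indices = encode(train[0], codebooks, bounds)
approx = decode(indices, codebooks)
```

Splitting the vector keeps each codebook small: two 4-entry codebooks cost 2 + 2 bits per vector here, whereas a single codebook with the same bit budget would need 16 entries of full dimension.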
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.