Study on low bit and variable rate speech coding based on local cosine transform

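Cite this article:	DONG En-qing, CAI Guang-yue, LI Yong-li. Study on low bit and variable rate speech coding based on local cosine transform[J]. Journal on Communications, 2005, 26(5): 122-127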

Authors:	DONG En-qing, CAI Guang-yue, LI Yong-li

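Affiliations:	1. School of Electronics and Information, Soochow University, Suzhou, Jiangsu 215021; 2. School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049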

Funding:	Supported by the Natural Science Research Program of Jiangsu Higher Education Institutions (03KJB510127)

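Abstract:	The local cosine transform (LCT) is applied to speech coding, and a low-bit-rate, variable-rate speech coder with an average bit rate of about 1.6 kbit/s is designed. The variable-rate design uses a support vector machine (SVM) for voice activity detection (VAD). Active speech frames are classified with the speech-mode partition of the GSM half-rate codec, except that the strongly voiced and moderately voiced modes are merged into a single moderately-and-strongly voiced mode. The LCT coefficients of each speech mode and of silent frames (background noise) are quantized by split vector quantization, with codebooks designed by the LBG algorithm. Codebook search during encoding uses a fast tree-structured search. Informal subjective listening tests indicate that the reconstructed speech has a MOS of about 3.15, comparable to that of the 2.4 kbit/s U.S. Federal Standard MELP vocoder. The coder is also fairly robust and is suited to coding speech in the presence of various environmental noises.
Keywords:	local cosine transform; speech coding; variable rate coding; vector quantization; support vector machine
Article ID:	1000-436X(2005)05-0122-06
Revised:	2004-04-18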

DONG En-qing,CAI Guang-yue,LI Yong-li. Study on low bit and variable rate speech coding based on local cosine transform[J]. Journal on Communications, 2005, 26(5): 122-127

Authors:	DONG En-qing, CAI Guang-yue, LI Yong-li

Affiliation:	DONG En-qing (1), CAI Guang-yue (1), LI Yong-li (2)

Abstract:	A low-bit-rate, variable-rate speech coder with an average bit rate of about 1.6 kbit/s, based on the local cosine transform (LCT), was designed for two-way conversational speech for the first time. The variable-rate design adopts a voice activity detector (VAD) based on a support vector machine (SVM) and, for active speech, the speech-mode classification of the GSM half-rate standard. One modification was made to that classification: the moderately voiced and strongly voiced modes were merged into a single mode, named the moderately-and-strongly voiced mode. The LCT coefficients of each speech mode and of silent frames (background noise) were quantized with several split (segment) vector quantizers, and the LBG algorithm was used to design the codebooks. A fast tree-structured search was used to select the codevector of LCT coefficients for each segment. Informal subjective listening tests indicated that the quality of the synthesized speech was comparable to that of the U.S. Federal Standard MELP vocoder (2.4 kbit/s), with a subjective mean opinion score (MOS) of about 3.15. The new coder is more robust than the MELP vocoder and is suitable for coding speech in the presence of various kinds of environmental noise.

Keywords:	LCT; speech coding; variable rate coding; vector quantization; SVM
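
The abstract names two quantization steps without giving details: the coefficient vector is split into segments, and each segment has its own codebook designed by the LBG algorithm. The sketch below illustrates that general idea only. All function names, the segment boundaries, the codebook sizes, and the random training data are assumptions made for illustration and are not taken from the paper. It also uses an exhaustive nearest-codeword search, not the tree-structured search the paper describes.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, v):
    """Index of the codeword closest to v (exhaustive search, lowest index on ties)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def lbg(training, size, iters=20, eps=1e-3):
    """Design a codebook with `size` codewords (a power of two) by LBG splitting."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split each codeword into two slightly perturbed copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Lloyd iteration: assign vectors to cells, then recompute centroids.
            cells = [[] for _ in codebook]
            for v in training:
                cells[nearest(codebook, v)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook

def train_split_vq(training, bounds, sizes):
    """Train one codebook per segment; bounds holds (start, end) index pairs."""
    return [lbg([v[a:b] for v in training], size)
            for (a, b), size in zip(bounds, sizes)]

def encode(v, bounds, codebooks):
    """Quantize each segment of v independently and return the index list."""
    return [nearest(cb, v[a:b]) for (a, b), cb in zip(bounds, codebooks)]

def decode(indices, codebooks):
    """Concatenate the selected codewords back into one vector."""
    out = []
    for i, cb in zip(indices, codebooks):
        out.extend(cb[i])
    return out

# Hypothetical data: 200 random 8-dimensional "coefficient" vectors.
random.seed(0)
train = [[random.gauss(0, 1) for _ in range(8)] for _ in range(200)]
bounds = [(0, 4), (4, 8)]        # assumed split into two 4-dimensional segments
codebooks = train_split_vq(train, bounds, sizes=[8, 4])
idx = encode(train[0], bounds, codebooks)
rec = decode(idx, codebooks)
```

With these assumed sizes each vector costs 3 + 2 = 5 bits, which shows why splitting helps: a single 5-bit codebook over all 8 dimensions would need the same bits but a joint search, while split codebooks keep both training and search small at some cost in distortion.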