1.
2.
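Based on the inherent periodicity of speech and audio signals, this paper builds a unified analysis/synthesis model suited to both, and implements high-quality layered coding of wideband speech and audio at 24 kbps and 32 kbps. First, the input signal, whose period varies over time, is warped into a signal with a fixed period, and a warped matrix is constructed from the resulting periodic signal. Next, the modulated lapped transform (MLT) and the discrete cosine transform (DCT) are applied to the rows and columns of the warped matrix, respectively, to make it sparse. Finally, the elements of the sparse matrix are quantized and coded using sub-band quantization and vector Huffman coding. Subjective and objective tests show that the proposed method codes speech, audio, and mixed signals with better quality than the ITU-T G.722.1 and AMR-WB coders at the same bit rate.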
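To illustrate why a fixed-period matrix becomes sparse under a transform across periods, here is a minimal sketch. It assumes a perfectly periodic synthetic signal, skips the time-warping and MLT stages entirely, and uses a plain unnormalised DCT-II; the names `dct2` and `column_dct` are hypothetical and not taken from the paper.

```python
import math

def dct2(values):
    """Unnormalised DCT-II of a list of numbers."""
    n = len(values)
    return [sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
            for k in range(n)]

def column_dct(matrix):
    """Apply the DCT down each column, i.e. across successive pitch periods."""
    columns = [dct2(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]

# Assumed input: 4 identical periods of 8 samples, one period per matrix row.
period = [math.sin(2 * math.pi * i / 8) for i in range(8)]
matrix = [period[:] for _ in range(4)]

coeffs = column_dct(matrix)
# Largest magnitude outside the first (DC) row of the transformed matrix.
residual = max(abs(c) for row in coeffs[1:] for c in row)
```

With identical rows, everything except the first transformed row is numerically zero. Real speech only approximately repeats from period to period, so in practice the energy would concentrate in a few coefficients rather than vanish.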
3.
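This paper studies pitch detection for low-bit-rate WI speech coding. Pitch detectors are prone to voiced/unvoiced misclassification under different noise types and signal-to-noise ratios, so a speech detection algorithm based on DCT sub-band spectral entropy is placed in front of the pitch detector to separate speech segments from non-speech segments. To give the pitch detector an input that more accurately reflects the actual variation of the pitch period, an improved DCT-domain speech decomposition algorithm based on the harmonic-plus-noise model is proposed. Then, exploiting the peaks that a variant of the MCAMDF (Modified Circular Average Magnitude Difference Function) shares with the NCCF (Normalized Cross-Correlation Function), and combining the two front-end techniques above, a combined MCAMDF-NCCF pitch detection algorithm is proposed. To meet the WI coder's need for high-precision pitch detection in different environments, and to recover the phase track more accurately at the synthesis end, a high-precision MCAMDF-NCCF-FRAC algorithm built on MCAMDF-NCCF is further proposed to compute fractional pitch. The algorithms are applied to a 2 kb/s WI coder. Subjective A/B listening tests show that, at low SNR, the proposed pitch detection clearly suppresses pitch doubling, pitch halving, and voiced/unvoiced misclassification, yields excellent pitch detection results, and produces synthesized speech whose quality fully meets the requirements that low-rate WI coders place on pitch detection.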
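The idea that a difference function and a correlation function point to the same pitch lag can be sketched as follows. This is a hedged illustration only: it uses the textbook circular AMDF and NCCF on a clean synthetic frame, not the paper's modified MCAMDF, its front-end processing, or its fractional-pitch refinement. The function names and the lag range are assumptions.

```python
import math

def nccf(frame, lag):
    """Normalised cross-correlation between the frame and itself shifted by lag."""
    n = len(frame) - lag
    a, b = frame[:n], frame[lag:lag + n]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0

def circular_amdf(frame, lag):
    """Average magnitude difference with circular (wrap-around) indexing."""
    n = len(frame)
    return sum(abs(frame[(i + lag) % n] - frame[i]) for i in range(n)) / n

def pitch_lag_nccf(frame, min_lag, max_lag):
    """Lag with the highest NCCF value."""
    return max(range(min_lag, max_lag + 1), key=lambda k: nccf(frame, k))

def pitch_lag_amdf(frame, min_lag, max_lag):
    """Lag with the lowest circular AMDF value."""
    return min(range(min_lag, max_lag + 1), key=lambda k: circular_amdf(frame, k))

# Synthetic voiced frame: 100 Hz fundamental plus one harmonic, 8 kHz sampling.
fs = 8000
frame = [math.sin(2 * math.pi * 100 * t / fs) + 0.5 * math.sin(2 * math.pi * 200 * t / fs)
         for t in range(400)]

lag_nccf = pitch_lag_nccf(frame, 20, 147)
lag_amdf = pitch_lag_amdf(frame, 20, 147)
pitch_hz = fs / lag_nccf
```

On this noise-free frame both estimators select the same lag. Noisy speech is where they can disagree, which is the situation a combined decision rule is meant to handle.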
4.
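Based on the G.729.1 coding standard of the International Telecommunication Union's standardization sector (ITU-T), this paper proposes an embedded variable-bit-rate stereo speech and audio coding method. The algorithm uses G.729.1 and an improved Modulated Lapped Transform (MLT) coding technique to code the mid and side information of the input signal in layers, forming a bitstream with an embedded structure. The coder handles wideband and super-wideband stereo signals, with a maximum bit rate of 48 kb/s for wideband stereo and 64 kb/s for super-wideband stereo. Implementation results show that the coding quality meets ITU-T's requirements for G.EV-VBR stereo coding.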
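The mid and side signals mentioned above can be illustrated with the conventional sum/difference downmix. The abstract does not give the exact formula the coder uses, so the scaling by one half and the function names below are assumptions.

```python
def to_mid_side(left, right):
    """Conventional downmix: mid is the channel average, side is half the difference."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Inverse mapping: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Hypothetical three-sample stereo input.
left = [0.5, -0.25, 1.0]
right = [0.25, -0.25, -1.0]

mid, side = to_mid_side(left, right)
left_out, right_out = from_mid_side(mid, side)
```

When the two channels are similar, the side signal is small, so a layered coder can spend most of its bits on the mid signal in the core layer and refine the side signal in enhancement layers.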
5.
6.
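To address the problem that existing single-model coders cannot provide high-quality universal coding of speech and audio at low and medium bit rates, this paper exploits the harmonic structure of speech and audio signals to establish a unified coding method for both. First, empirical mode decomposition (EMD) is used to extract the harmonic components of the input signal. Second, a perceptual matching pursuit algorithm, combined with sinusoidal parametric modeling, extracts and quantizes the parameters of the harmonic components. Third, the residual left after harmonic quantization is coded with dithered lattice vector quantization to improve the subjective quality of the reconstructed audio, yielding a universal wideband speech and audio coder that operates at 24 kbps and 32 kbps. Finally, the proposed algorithm is evaluated with objective PESQ/PEAQ measures and subjective A/B tests and compared with the ITU-T G.722.1 and G.722.2 coders. The results show that the proposed coder codes both speech and audio with better quality than the reference coders.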
7.
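This article presents a hybrid speech coding technique that combines a frequency-domain parametric coder (for stationary voiced and stationary unvoiced speech) with a time-domain waveform coder (for transitional speech). Subjective listening tests show that the quality of this 4 kbit/s hybrid coding scheme is comparable to that of low-rate CELP coders.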
8.
9.
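This paper proposes a new BSP excitation signal for LPC speech coders. Following the principle of speech production, a sine wave whose amplitude is modulated by a binomial, the BSP (Binomial Sine Pulse), serves as the LPC excitation source; the binomial reflects the trend of the excitation signal within one pitch period. The paper also derives methods for computing and refining the BSP excitation parameters. Experiments show that a BSP speech codec built on this basis has low complexity and low delay, and delivers fairly high synthesized speech quality at bit rates as low as 2.65 kb/s.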
10.
李斯伟 《中国数据通信网络》2000,(5):11-14
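CE-LPC, code-excited linear predictive coding, belongs to the vocoder class. Coders of this class extract the important features from the time waveform and are best suited to low-bit-rate coding. By discussing the characteristics, system composition, and coding principle of CE-LPC, this article explains how a civil aviation voice switching system that uses CE-LPC coding can transmit high-quality voice signals at 4.8 kbit/s.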
11.
12.
13.
In wireless commercial and military communications systems, where bandwidth is at a premium, robust low-bit-rate speech coders are essential. They operate at fixed bit rates, and those bit rates cannot be altered without major modifications to the vocoder design. A novel approach to vocoders is proposed to reduce the bit rate required to transmit a speech signal. While traditional low-bit-rate vocoders code the original input speech, the proposed procedure operates on the time-scale modified signal. The proposed method offers any bit rate from 2400 b/s downwards without modifying the principal vocoder structure, which is the new NATO standard, Stanag 4591, Mixed Excitation Linear Prediction (MELP) vocoder. We consider the application of transmitting MELP-encoded speech over noisy communication channels by applying different modulation techniques after time-scale compression. Three time-scale modification algorithms have been evaluated, and the waveform similarity overlap and add (WSOLA) algorithm has been selected for time-scale modification. Computer simulation results, both source and channel, are presented in terms of objective speech quality metrics and informal subjective listening tests. Design parameters such as codec complexity and delay are also investigated. Simulation results lead to a possible wireless communications system whose performance might be enhanced by using the spared bits offered by the procedure.
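The arithmetic behind "any bit rate from 2400 b/s downwards" can be sketched under one assumption: if speech is time-compressed before a fixed-rate vocoder, the bits spent per second of original speech scale with the compression ratio. The function name and the ratio definition below are hypothetical, and the sketch ignores any side information a real system might need to transmit.

```python
def effective_bit_rate(codec_rate_bps, compression_ratio):
    """Bits per second of ORIGINAL speech for a fixed-rate codec.

    compression_ratio is (compressed duration / original duration),
    so 1.0 means no time-scale compression (assumed definition).
    """
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    return codec_rate_bps * compression_ratio

# A 2400 b/s vocoder fed with speech compressed to 75% and 50% of its duration.
rate_three_quarters = effective_bit_rate(2400, 0.75)
rate_half = effective_bit_rate(2400, 0.5)
```

The receiver would then need to expand the decoded speech back to its original duration, which is where the choice of time-scale modification algorithm affects quality.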
14.
A pitch synchronous differential predictive encoding system (p.s.d.p.e.) is described, which reduces the dynamic range of voiced speech to a value similar to that of unvoiced speech. As a consequence, the signal encoded has a smaller dynamic range than the speech signal and results in an improvement in the signal/noise ratio for a given transmitted number of bits per sample. This improvement is approximately 8 dB compared with an a.d.p.c.m. codec, when the p.s.d.p.e. system uses an adaptive p.c.m. encoder and the transmission rate is 3 bit/sample.
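Why differencing across one pitch period shrinks the dynamic range of voiced speech can be shown with a small sketch. It assumes a known, constant pitch period and an open-loop difference with no quantiser, so it is not the p.s.d.p.e. system itself; the function names and the synthetic signal are hypothetical.

```python
import math

def pitch_sync_difference(signal, period):
    """Subtract the sample one pitch period earlier (zero before the first period)."""
    return [x - (signal[i - period] if i >= period else 0.0)
            for i, x in enumerate(signal)]

def reconstruct(diff, period):
    """Invert the differencing by adding back the previously rebuilt sample."""
    out = []
    for i, d in enumerate(diff):
        out.append(d + (out[i - period] if i >= period else 0.0))
    return out

# Synthetic "voiced" signal: 80-sample period with a slowly growing envelope.
period = 80
signal = [(1 + 0.001 * n) * math.sin(2 * math.pi * n / period) for n in range(400)]

residual = pitch_sync_difference(signal, period)
restored = reconstruct(residual, period)

peak_signal = max(abs(x) for x in signal)
peak_residual = max(abs(d) for d in residual[period:])  # skip the start-up period
range_reduction_db = 20 * math.log10(peak_signal / peak_residual)
```

The reduction measured here belongs to this synthetic signal only and should not be compared with the figures quoted in the abstract, which come from real speech and a complete codec.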
15.
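The goal of speech enhancement is to separate clean speech from noisy speech, improving speech quality and intelligibility. In recent years, deep neural networks trained with supervised learning have become the mainstream approach to speech enhancement. The convolutional recurrent network is a recent neural network architecture with three main modules, an encoder, an intermediate layer, and a decoder, and it has already achieved good results on speech enhancement tasks. The time-frequency attention mechanism is a simple network module made of several connected convolutional layers joined by skip connections; during training it can compute non-local correlations in the feature maps of the speech magnitude spectrum, which helps the network attend to the harmonic structure of speech. This paper introduces the time-frequency attention mechanism into the encoder and decoder of the convolutional recurrent network. Experiments show that, across different SNR conditions, the method further improves speech quality and intelligibility over the baseline convolutional recurrent network, and that the enhanced speech retains more spectral harmonic information with less speech distortion.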
16.
17.
A system called p.s.f.o.l.d. is described which exploits the correlation between successive pitch periods of a speech signal. This system is a differential one and can employ various types of encoders. We describe a p.s.f.o.l.d. system using a 1st-order d.p.c.m. encoder and show that for a speech utterance this system has a peak signal/noise ratio which is 6 dB larger, and has an increase in dynamic range of 13 dB, compared with a 1st-order d.p.c.m. codec.
18.
19.
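The Hilbert-Huang transform is a fully data-driven, adaptive time-frequency analysis method for non-stationary signals, but in strong noise the Hilbert energy spectrum of a speech signal fluctuates considerably, which seriously affects speech endpoint detection. This paper proposes a detection method based on the Hilbert-Huang transform and order-statistic filtering. The method applies empirical mode decomposition to the noisy speech signal, obtains the signal's Hilbert energy spectrum by adaptively weighting the intrinsic mode functions, and smooths the energy spectrum of each frame with an order-statistic filter to form the feature that discriminates speech from non-speech. Experiments show that the method is suited to endpoint detection in complex noise environments, still detects speech effectively at low SNR, and reduces the false detection rate.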
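The smoothing step can be sketched with a generic order-statistic filter. This is an assumption-laden illustration: it filters a made-up sequence of per-frame energies rather than a Hilbert energy spectrum, uses the median as the chosen rank, and applies a fixed hypothetical threshold. The decomposition and adaptive weighting stages are not shown.

```python
def order_statistic_filter(values, window, rank):
    """Replace each value by the rank-th smallest value in a centred window.

    Windows are truncated at the edges; rank is clamped to the window size.
    """
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        neighbourhood = sorted(values[lo:hi])
        out.append(neighbourhood[min(rank, len(neighbourhood) - 1)])
    return out

# Hypothetical per-frame energies: a noise spike at index 2, a dropout at index 8.
energies = [1, 1, 9, 1, 1, 8, 8, 8, 1, 8, 8]

smoothed = order_statistic_filter(energies, window=3, rank=1)  # rank 1 of 3 = median
threshold = 4  # assumed decision threshold
is_speech = [e > threshold for e in smoothed]
```

After smoothing, the isolated spike no longer triggers a false speech decision and the single-frame dropout no longer splits the speech segment.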
20.
A simple but effective method to reduce overload noise and improve the overall performance in delta-modulation (DM) systems is presented in this paper. Two identical DM encoders operate on a different time-basis in such a manner that a running average of the error signal from the first encoder is taken into consideration by the second one. The resulting operation is a time adaptation process yielding diminished overload noise during overload bursts. Results obtained by computer simulation for speech signals show an overall signal-to-quantization noise ratio (SQNR) improvement up to 2.5 dB at optimum step size over classical schemes.
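For context, the classical scheme that the paper improves on can be sketched as a single fixed-step delta modulator. This is only the baseline, not the two-encoder method described above; the test tone, step sizes, and function names are assumptions chosen to make slope overload visible.

```python
import math

def delta_modulate(signal, step):
    """Classical linear delta modulation: one bit per sample, fixed step size."""
    bits, decoded, estimate = [], [], 0.0
    for x in signal:
        bit = 1 if x >= estimate else 0      # compare input with running estimate
        estimate += step if bit else -step   # staircase moves one step up or down
        bits.append(bit)
        decoded.append(estimate)
    return bits, decoded

def sqnr_db(signal, decoded):
    """Signal-to-quantisation-noise ratio in dB."""
    signal_power = sum(x * x for x in signal)
    noise_power = sum((x - y) ** 2 for x, y in zip(signal, decoded))
    return 10 * math.log10(signal_power / noise_power)

# Test tone with 200 samples per cycle; its maximum slope is about 0.031 per sample.
signal = [math.sin(2 * math.pi * n / 200) for n in range(400)]

bits_ok, decoded_ok = delta_modulate(signal, step=0.05)      # step exceeds max slope
bits_slow, decoded_slow = delta_modulate(signal, step=0.01)  # step too small: overload

sqnr_tracking = sqnr_db(signal, decoded_ok)
sqnr_overload = sqnr_db(signal, decoded_slow)
```

When the step is smaller than the largest per-sample change of the input, the staircase cannot keep up and the SQNR falls sharply. That overload noise is what an adaptive scheme tries to reduce.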