Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
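A method is proposed for removing additive noise from speech in the autocorrelation domain, using autocorrelation values as parameters and the linear prediction error of the one-sided autocorrelation sequence. The method first applies one-sided autocorrelation processing to the noisy speech, replacing the speech signal sequence with its one-sided autocorrelation sequence. Linear prediction analysis of that sequence then yields the linear prediction coefficients and the linear prediction error. A subtraction factor u is determined from the ratio of error energy to signal energy, and the prediction error, scaled according to u, is subtracted from the noisy speech to suppress the noise error energy. Experiments show that, even at low signal-to-noise ratios, the method preserves the spectral structure of the speech signal well, so that sound quality does not degrade.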

2.
Burg's method of maximum entropy spectral analysis is used to analyse voiced speech signals, and its performance is compared with that of the autocorrelation and covariance methods of linear prediction using the following three criteria: (1) normalized total-squared linear prediction error, (2) error in estimating the power spectrum and (3) errors in estimating the first three formant frequencies and bandwidths. Results of pitch-synchronous and pitch-asynchronous analyses when applied to synthetic vowel signals are discussed.

3.
A simple method is presented to compensate for noise effects before performing linear prediction analysis of speech signals in the presence of white noise with unknown variance. The method determines a suitable bias that should be subtracted from the zero-lag autocorrelation function, rather than deriving the exact noise variance. The resulting linear prediction filter is guaranteed to be stable, since the bias used is always smaller than the minimum eigenvalue of the autocorrelation matrix. In addition to a comparison with other methods, the proposed method is examined from various viewpoints, including the degree of formant intensity, signal-to-noise ratio (SNR), deviation of compensated spectra and objective distortion measures. The improvements observed across a data set, consisting of four sentences uttered by six speakers, indicate that the compensated spectra measured with low SNRs are comparable to the uncompensated counterparts measured with approximately 5 dB higher SNRs.
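The idea of subtracting a bias from the zero-lag autocorrelation can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the helper names `levinson_durbin` and `compensate`, the toy autocorrelation values, and the bias value, which is picked by hand. The abstract does not state the rule the authors use to choose the bias, so this sketch only shows the effect of a bias that is known to be small enough.

```python
def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations by Levinson-Durbin recursion.

    Returns (a, reflection, err) with a[0] == 1, where the predictor is
    s_hat[n] = -sum(a[i] * s[n - i] for i in 1..order).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    reflection = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        reflection.append(k)
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return a, reflection, err


def compensate(r, bias):
    """Subtract a bias from the zero-lag term only (hypothetical helper)."""
    return [r[0] - bias] + list(r[1:])


# Toy values (assumed): a clean autocorrelation and the same sequence with
# white noise of variance 0.5, which only raises the zero-lag term.
r_clean = [1.0, 0.5, 0.25]
r_noisy = [1.5, 0.5, 0.25]

a_clean, _, _ = levinson_durbin(r_clean, 2)
a_noisy, _, _ = levinson_durbin(r_noisy, 2)
# Hand-picked bias below the noise variance, hence below the minimum
# eigenvalue of the noisy autocorrelation matrix.
a_comp, refl_comp, _ = levinson_durbin(compensate(r_noisy, 0.4), 2)
```

In this toy case the compensated coefficients lie closer to the clean ones than the uncompensated coefficients do, and all reflection coefficients stay inside the unit interval, which is the usual stability check for the all-pole filter.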

4.
A new method for representation of speech spectra based on a pole-zero decomposition technique is proposed in this paper. In this method the parameters of a pole-zero model for the smoothed short-time spectrum of speech are determined by adopting a cepstral matching criterion. The cepstral coefficients of the impulse response of the model are equal to the cepstral coefficients of the signal up to a specified number, which determines the order of the model system. This is analogous to autocorrelation matching in linear prediction analysis. It is shown that the model spectrum represents both peaks and valleys of the smoothed spectrum equally well, unlike the all-pole model of linear prediction analysis, where only the peaks are well represented. The pole and zero parameters are derived in an identical manner by approximately deconvolving the pole and zero contributions in the cepstral domain. The residual from the inverse pole-zero system can be used to obtain information about the excitation signal.

5.
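Speech synthesis is a key technology for human-machine speech communication. This paper introduces a speech synthesis method based on linear prediction analysis of speech signals, explains what linear prediction coefficients are and how to extract them, and then uses the overlap-save method to synthesize speech from the prediction coefficients. The synthesis method can be applied to speech signal transmission, where it reduces the transmission bandwidth and increases the transmission rate.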

6.
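Pitch detection algorithm based on wavelet transform and time-domain waveform   Total citations: 1, self-citations: 0, citations by others: 1
To detect the pitch period of speech signals accurately, a method combining the wavelet transform with the time-domain waveform is adopted. Pitch detection is performed on clean speech and on noisy speech at various signal-to-noise ratios (SNRs) using the traditional autocorrelation method, the average magnitude difference function (AMDF) method, and the proposed algorithm. Experiments show that the autocorrelation method is prone to half-frequency errors and the AMDF method to double-frequency errors, and that for both methods the number of erroneous frames tends to increase as the SNR decreases. The proposed algorithm, by contrast, produces relatively few double-frequency and half-frequency errors, and its pitch contour is clear and smooth with no large jumps, consistent with the slowly time-varying nature of speech signals, which improves pitch detection accuracy.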
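The two baseline estimators named in the abstract can be sketched as follows. This is a minimal illustration of the textbook autocorrelation and AMDF pitch estimators, not the paper's wavelet-based algorithm, and it does not reproduce the reported error behaviour. The function names, the synthetic test signal, and the lag search range are all assumptions.

```python
import math


def autocorr_pitch_lag(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    window = len(x) - max_lag  # same number of terms for every lag

    def r(lag):
        return sum(x[n] * x[n + lag] for n in range(window))

    return max(range(min_lag, max_lag + 1), key=r)


def amdf_pitch_lag(x, min_lag, max_lag):
    """Return the lag with the smallest average magnitude difference."""
    window = len(x) - max_lag

    def d(lag):
        return sum(abs(x[n] - x[n + lag]) for n in range(window)) / window

    return min(range(min_lag, max_lag + 1), key=d)


def synthetic_voiced(period, length):
    """Hypothetical test signal: a fundamental plus a weaker second harmonic."""
    w = 2.0 * math.pi / period
    return [math.sin(w * n) + 0.5 * math.sin(2.0 * w * n) for n in range(length)]


# Period of 80 samples, e.g. 100 Hz at an assumed 8 kHz sampling rate.
x = synthetic_voiced(80, 360)
lag_acf = autocorr_pitch_lag(x, 40, 120)
lag_amdf = amdf_pitch_lag(x, 40, 120)
```

On this clean, strictly periodic signal both estimators recover the true period. The errors discussed in the abstract arise on real and noisy speech, where secondary peaks or valleys at multiples or fractions of the period can win the search.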

7.
Discrete all-pole modeling   Total citations: 3, self-citations: 0, citations by others: 3
A method for parametric modeling of spectral envelopes when only a discrete set of spectral points is given is introduced. This method, called discrete all-pole (DAP) modeling, uses a discrete version of the Itakura-Saito distortion measure as its error criterion. One result is an autocorrelation matching condition that overcomes the limitations of linear prediction and produces better fitting spectral envelopes for spectra that are representable by a relatively small discrete set of values, such as in voiced speech. An iterative algorithm for DAP modeling that is shown to converge to a unique global minimum is presented. Results of applying DAP modeling to real and synthetic speech are also presented. DAP modeling is extended to allow frequency-dependent weighting of the error measure, so that spectral accuracy can be enhanced in certain frequency regions.

8.
A simple method of proof is presented for the minimum-phase property of the all-pole model obtained in the autocorrelation method of linear prediction. The proof does not require knowledge of Levinson's recursion and extends easily to some special cases of the covariance method of linear prediction.

9.
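Compressed sensing of speech signals based on autocorrelation measurements   Total citations: 1, self-citations: 0, citations by others: 1
季云云, 杨震. 《信号处理》, 2011, 27(2): 207-214
Based on compressed sensing and the characteristics of speech signals, this paper proposes a truncated circulant autocorrelation matrix, built on the autocorrelation property, as the measurement matrix. On this basis, and from a practical standpoint, it further proposes an approximate truncated circulant autocorrelation matrix based on template matching as the measurement matrix and proves that it satisfies the restricted isometry property (RIP). Measurements of the speech signal are formed with the truncated circulant autocorrelation matrix, the approximate truncated circulant autocorrelation matrix, and a Gaussian random matrix, and the original speech signal is reconstructed with the BP algorithm. Experiments show that the approximate truncated circulant autocorrelation matrix formed as a linear combination of two template elements reconstructs the original speech signal about as well as the truncated circulant autocorrelation matrix and better than the classical Gaussian random matrix. Moreover, at the same reconstruction performance its compression ratio is far higher than that of the Gaussian random measurement matrix, a marked improvement in speech compression performance.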

10.
For linear predictive coding (LPC) of speech, the speech waveform is modeled as the output of an all-pole filter. The waveform is divided into many short intervals (10–30 ms) during which the speech signal is assumed to be stationary. For each interval the constant coefficients of the all-pole filter are estimated by linear prediction by minimizing a squared prediction error criterion. This paper investigates a modification of LPC, called time-varying LPC, which can be used to analyze nonstationary speech signals. In this method, each coefficient of the all-pole filter is allowed to be time-varying by assuming it is a linear combination of a set of known time functions. The coefficients of the linear combination of functions are obtained by the same least squares error technique used by the LPC. Methods are developed for measuring and assessing the performance of time-varying LPC and results are given from the time-varying LPC analysis of both synthetic and real speech.
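The model described in this abstract can be written out as a short worked formulation. The symbols are assumptions introduced here for illustration and are not taken from the paper: p is the filter order, u_k(n) are the known time functions, q + 1 is their number, a_{ik} are the unknown weights, and the minus sign follows one common predictor convention.

```latex
% Time-varying all-pole predictor: each coefficient is a weighted sum
% of known time functions u_k(n).
s(n) \approx -\sum_{i=1}^{p} a_i(n)\, s(n-i),
\qquad
a_i(n) = \sum_{k=0}^{q} a_{ik}\, u_k(n)

% Squared prediction error, minimized over the constant weights a_{ik}.
E = \sum_{n} \Bigl( s(n) + \sum_{i=1}^{p} \sum_{k=0}^{q} a_{ik}\, u_k(n)\, s(n-i) \Bigr)^{2}
```

Because the error inside the square is linear in the weights a_{ik}, setting the partial derivatives of E to zero gives a set of linear equations, which is why the same least squares technique as in ordinary LPC applies. With a single constant function u_0(n) = 1 the formulation reduces to conventional LPC.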

11.
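方杰, 李英, 钱红. 《电声技术》, 2006, (8): 46-49
Building on a study of the double-threshold comparison method, a three-pass search method with a fixed threshold is proposed for speech endpoint detection. The method consists of three parts: multi-word detection, endpoint repair, and missed-point detection. It effectively solves the threshold-setting problem that the double-threshold comparison method faces when detecting the endpoints of continuous words. Provided the speech signal is normalized, it can accurately detect the endpoints of the speech signal with a single threshold. At relatively low signal-to-noise ratios, endpoint detection based on the short-time average magnitude of the short-time relative autocorrelation sequence of the speech signal achieves high detection accuracy.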

12.
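To further reduce the bit rate, a variable-order method is used in linear prediction (LP) speech coding, in which the order of the LP filter is chosen according to the properties of the current speech frame. If the prediction order is too small, however, the large dynamic range of the speech spectrum may prevent LP analysis from correctly matching the higher formants. A frequency-domain technique for speech coding is discussed that improves the performance of low-order LP in modeling the formants of voiced speech.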

13.
To reduce the computational complexity of algebraic code-excited linear prediction (ACELP) coders, an efficient codebook search mechanism based on a simplified correlation matrix (SCM) of the vocal impulse response is proposed. In the proposed approach, the statistical characteristics of the vocal impulse response are identified such that only a small proportion of the total number of correlation coefficients in the correlation matrix need be calculated before the ACELP search procedure is carried out. Furthermore, the proposed joint scheme, by combining the SCM method and a pulse position prediction scheme, not only decreases the arithmetic complexity of pre-computing the autocorrelation matrix but also reduces the number of pulse position combinations. The simulation and experimental results show that the proposed method provides an effective reduction in the computational load of the ACELP codebook search procedure with no discernible degradation of the speech quality.

14.
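Speech production involves nonlinear processes, and traditional linear prediction cannot handle these nonlinear components well. Local linear prediction is a high-accuracy prediction algorithm, but its computational complexity is high. To speed up nonlinear prediction, an adaptive recursive local linear prediction algorithm is proposed; the steps of the algorithm are designed and its complexity is analyzed. Simulation results show that the algorithm is more accurate than the linear prediction algorithm and is an effective nonlinear prediction method for speech signals.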

15.
Multipulse LPC analysis can substantially eliminate the pitch-related bias in the LPC filter parameters. However, the procedure is computationally intensive. We present a more efficient algorithm, based on the autocorrelation method of linear prediction, which has application in voice synthesis and vocal-tract area-function recovery.

16.
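Linear prediction speech coding algorithm based on neural networks   Total citations: 1, self-citations: 0, citations by others: 1
李浩, 陈跃. 《电子工程师》, 2004, 30(8): 15-16, 20
Speech compression is an important part of multimedia communication technology, linear predictive coding (LPC) is an important part of parametric coding, and linear prediction is one of the most effective methods in speech signal processing. Starting from the principle of LPC, the paper describes the computation of the optimal LPC coefficients. Because current approaches such as the autocorrelation and covariance methods suffer from estimation error, a neural-network-based linear prediction algorithm is proposed. Experimental data show that the method both improves the accuracy of the solution and guarantees the stability of the system.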

17.
In this paper, we present a comparison of Khasi speech representations with four different spectral features and a novel extension toward the development of Khasi speech corpora. These four features include linear predictive coding (LPC), linear prediction cepstrum coefficient (LPCC), perceptual linear prediction (PLP), and Mel frequency cepstral coefficient (MFCC). Ten hours of speech data were used for training and three hours for testing. For each spectral feature, different hidden Markov model (HMM) based recognizers with variations in HMM states and different Gaussian mixture models (GMMs) were built. The performance was evaluated by using the word error rate (WER). The experimental results show that MFCC provides a better representation for Khasi speech compared with the other three spectral features.

18.
Traditional speech processing methods for laryngeal pathology assessment assume linear speech production with measures derived from an estimated glottal flow waveform. They normally require the speaker to achieve complete glottal closure, which for many vocal fold pathologies cannot be accomplished. To address this issue, a nonlinear signal processing approach is proposed which does not require direct glottal flow waveform estimation. This technique is motivated by earlier studies of airflow characterization for human speech production. The proposed nonlinear approach employs a differential Teager energy operator and the energy separation algorithm to obtain formant AM and FM modulations from filtered speech recordings. A new speech measure is proposed based on parameterization of the autocorrelation envelope of the AM response. This approach is shown to achieve impressive detection performance for a set of muscular tension dysphonias. Unlike flow characterization using numerical solutions of Navier-Stokes equations, this method is extremely computationally attractive, requiring only a small time window of speech samples. The new noninvasive method shows that a fast, effective digital speech processing technique can be developed for vocal fold pathology assessment without the need for direct glottal flow estimation or complete glottal closure by the speaker. The proposed method also confirms that alternative nonlinear methods can begin to address the limitations of previous linear approaches for speech pathology assessment.
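As a minimal sketch of the kind of operator the abstract refers to, the standard discrete Teager-Kaiser energy operator is shown here. This is an assumption: the paper's differential form and its energy separation step are not specified in the abstract and are not reproduced, and the function name and test tone are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input because the operator
    needs one neighbour on each side.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# For a pure tone A*cos(omega*n + phi) the operator output is the constant
# A^2 * sin(omega)^2, so it tracks amplitude and frequency jointly.
amplitude, omega, phase = 2.0, 0.3, 0.7
tone = [amplitude * math.cos(omega * n + phase) for n in range(50)]
psi = teager_energy(tone)
expected = amplitude ** 2 * math.sin(omega) ** 2
```

The operator uses only three adjacent samples per output value, which is consistent with the abstract's remark that the approach needs only a small time window of speech samples.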

19.
Palomar D.P., Price M., Sandler M. Electronics Letters, 1999, 35(13): 1058-1059
A method for optimising LPC filters in linear prediction based speech coders is described. The optimisation process compensates for errors incurred through coding the excitation signal, providing an improvement in the quality of the decoded speech, with no increase in bit rate.

20.
Techniques for improving the performance of CELP-type speech coders   Total citations: 1, self-citations: 0, citations by others: 1
Techniques for improving the performance of CELP (code excited linear prediction)-type speech coders while maintaining reasonable computational complexity are explored. A harmonic noise weighting function, which enhances the perceptual quality of the processed speech, is introduced. The combination of harmonic noise weighting and subsample pitch lag resolution significantly improves the coder performance for voiced speech. Strategies for reducing the speech coder's data rate, while maintaining speech quality, are presented. These include a method for efficient encoding of the long-term predictor lags, utilization of multiple gain vector quantizers, and a multimode definition of the speech coder frame. A 5.9-kb/s VSELP speech coder that incorporates these features is described. Complexity reduction techniques which allow the coder to be implemented using a single fixed-point DSP (digital signal processor) are discussed.
