1.

Ekman L.A. Kleijn W.B. Murthi M.N. 《IEEE transactions on audio, speech, and language processing》2008,16(1):65-73

All-pole spectral envelope estimates based on linear prediction (LP) for speech signals often exhibit unnaturally sharp peaks, especially for high-pitch speakers. In this paper, regularization is used to penalize rapid changes in the spectral envelope, which improves the spectral envelope estimate. Based on extensive experimental evidence, we conclude that regularized linear prediction outperforms bandwidth-expanded linear prediction. The regularization approach gives lower spectral distortion on average, and fewer outliers, while maintaining a very low computational complexity.

2.

Research on Linear Prediction of Speech Based on a Vector Model

钱正祥李玉阁施清苑刘传强《数据采集与处理》2000,15(4):462-466

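This paper studies parallel processing for linear prediction of speech parameters. Grouping adjacent samples of the speech source sequence yields a vector autoregressive speech sequence that is stationary in the mean-square sense. Applying the orthogonal projection principle in Hilbert space, the authors derive a predictive coding strategy with a high degree of parallelism, from which a parallel adaptive algorithm for parametric linear prediction follows. Compared with the traditional lattice algorithm, this algorithm markedly reduces computational complexity and storage requirements. Simulations are used to test the algorithm's performance.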

3.

Speech Signal Recognition Based on Blind Source Separation and Noise Suppression

刘晶《计算机测量与控制》2018,26(12):140-144

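To recognize different speech signals more accurately in noisy environments, the paper proposes a self-optimizing voice activity detection (VAD) algorithm for ubiquitous speech environments. The algorithm uses the speech signals of a personalized automatic voice-command recognition system and can effectively separate an individual speaker's voice from the mixed speech of several speakers. It detects the speaker's speech by tracking the higher-magnitude part of the speech power spectrum and adaptively suppressing noise. An automatic speech recognition (ASR) system that handles multi-speaker tasks is designed and implemented; it needs no prior estimate of the variation of clean speech and instead derives the speech/non-speech decision threshold directly from the noise itself to complete the self-optimization process. Evaluation on the NOIZEUS speech database shows that the proposed blind source separation and noise suppression method requires no additional computation and effectively reduces the computational load.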

4.

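Support Vector Machine Prediction of Shallow-Sea Reverberation Time Series (total citations: 3; self-citations: 1; citations by others: 2)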

高伟王宁《计算机工程》2008,34(6):25-27

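The support vector machine (SVM), which is based on the structural risk minimization principle, is applied to reverberation time-series prediction, and its results are compared with those of a radial basis function (RBF) neural network. Predictions on reverberation data from sea trials show that the SVM outperforms the RBF neural network and predicts reverberation time series well.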

5.

Speech Recognition Using Linear Dynamic Models

Joe Frankel Simon King 《IEEE transactions on audio, speech, and language processing》2007,15(1):246-256

The majority of automatic speech recognition systems rely on hidden Markov models, in which Gaussian mixtures model the output distributions associated with sub-phone states. This approach, whilst successful, models consecutive feature vectors (augmented to include derivative information) as statistically independent. Furthermore, spatial correlations present in speech parameters are frequently ignored through the use of diagonal covariance matrices. This paper continues the work of Digalakis and others who proposed instead a first-order linear state-space model which has the capacity to model underlying dynamics, and furthermore give a model of spatial correlations. This paper examines the assumptions made in applying such a model and shows that the addition of a hidden dynamic state leads to increases in accuracy over otherwise equivalent static models. We also propose a time-asynchronous decoding strategy suited to recognition with segment models. We describe implementation of decoding for linear dynamic models and present TIMIT phone recognition results.

6.

Speech Pitch Detection Based on Linear Prediction

王琛姜占才《电脑开发与应用》2015,(3)

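Pitch detection in speech is easily disturbed by additive background noise and by formants. To address this, the paper proposes a pitch detection algorithm for speech signals based on linear prediction. After Wiener filtering, the algorithm uses linear prediction (LPC) to obtain the prediction residual, then applies the autocorrelation function (ACF) and the average magnitude difference function (AMDF) to the residual to obtain the pitch estimate. Detection is markedly better than with the autocorrelation method or the AMDF method alone.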
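
The ACF and AMDF steps named in this abstract can be sketched as follows. This is a minimal illustration on a synthetic sine, not the authors' algorithm: it omits the Wiener filtering and the LPC residual computation, and the function names, the lag search range, and the test signal are all assumptions made for the example.

```python
import math


def acf(x, lag):
    """Unnormalized autocorrelation of x at the given lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


def amdf(x, lag):
    """Average magnitude difference of x at the given lag."""
    n = len(x) - lag
    return sum(abs(x[i] - x[i + lag]) for i in range(n)) / n


def pitch_lag_acf(x, lag_min, lag_max):
    """Pitch period in samples: the lag that maximizes the ACF."""
    return max(range(lag_min, lag_max + 1), key=lambda lag: acf(x, lag))


def pitch_lag_amdf(x, lag_min, lag_max):
    """Pitch period in samples: the lag that minimizes the AMDF."""
    return min(range(lag_min, lag_max + 1), key=lambda lag: amdf(x, lag))


# Hypothetical test signal: a 160 Hz sine sampled at 8 kHz,
# which has a period of exactly 8000 / 160 = 50 samples.
fs = 8000
f0 = 160.0
x = [math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]

lag_a = pitch_lag_acf(x, 20, 80)
lag_d = pitch_lag_amdf(x, 20, 80)
print(lag_a, lag_d, fs / lag_a)
```

The search range is kept below two pitch periods here so that the AMDF minimum at twice the period cannot be picked by mistake; a real detector would need an explicit guard against such octave errors.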

7.

Linear Prediction Analysis in Connected-Word Speech Recognition (total citations: 1; self-citations: 0; citations by others: 1)

李永恒严家明揭峰《计算机仿真》2010,27(11)

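Feature extraction is key to the performance of a speech recognition system, and linear prediction analysis is currently a widely used feature extraction method. Because conventional linear prediction coefficients no longer meet the feature extraction requirements of connected-word and continuous speech recognition systems, the paper studies three main parameters derived from linear prediction, namely reflection coefficients, line spectrum pair (LSP) coefficients, and linear prediction cepstral coefficients (LPCC), applies them in a connected-word speech recognition system, and runs computer simulations. The simulations show that, with the same input speech corpus and signal-to-noise ratio, LPCC gives the highest recognition rate. This indicates that LPCC captures semantic feature information and speaker characteristics better than LSP coefficients and reflection coefficients.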
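
As an illustration of how linear prediction cepstral coefficients are derived from LPC coefficients, the standard recursion can be sketched as below. The abstract does not give the authors' formulation, so the sign convention (A(z) = 1 + a1 z^-1 + ... + ap z^-p, with the all-pole model 1/A(z)), the function name, and the omission of the gain term c0 are assumptions of this sketch.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c[1..n_ceps].

    a: [1.0, a1, ..., ap], the coefficients of A(z) = 1 + sum a_k z^-k.
    The cepstrum is that of the all-pole model 1 / A(z); c[0] (the gain
    term) is not computed.
    """
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = 0.0
        for k in range(1, n):
            if n - k <= p:  # a[n - k] is zero beyond the model order
                acc += k * c[k] * a[n - k]
        a_n = a[n] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]


# Single-pole check: for 1 / (1 - 0.5 z^-1) the cepstrum is 0.5**n / n.
ceps = lpc_to_cepstrum([1.0, -0.5], 3)
print(ceps)
```

The single-pole case makes a convenient sanity check because its cepstrum has the closed form 0.5**n / n.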

8.

A New Look at Linear Prediction for Speech

徐静波冉崇森《计算机工程与科学》2004,26(5):93-95

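This paper proposes a linear prediction analysis method. The spectral envelope is obtained from estimated frequency samples and is estimated on a normalized frequency axis. The envelope is specified on the mel frequency scale, sampled autocorrelation estimates are extracted by the IDFT, and spectral envelope cepstral coefficients (SEC) are finally obtained from the sampled autocorrelation. HMM (Hidden Markov Model) recognition experiments show that SEC clearly improves recognition performance over other algorithms at low signal-to-noise ratios.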

9.

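Forward Linear Prediction Filtering of Silicon Micro-Gyroscope Signals (total citations: 1; self-citations: 1; citations by others: 0)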

马从兵《传感技术学报》2008,21(2):350-352

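The paper introduces the basic principle of the forward linear prediction (FLP) filtering algorithm and proposes a method for determining the parameters of the adaptive filtering process. Static drift signals and real dynamic signals from a silicon micro-gyroscope are processed. The Allan variance and standard deviation of the static drift signal before and after filtering are given, the error magnitude and error distribution before and after filtering are analyzed, and the results are compared with wavelet median filtering. The results show that forward linear prediction filtering clearly outperforms wavelet median filtering in both denoising and real-time performance.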
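
A forward linear prediction filter of this general kind can be sketched as below. The abstract does not say which adaptation rule or parameter values the author used, so the LMS weight update, the zero initial weights, the default order and step size, and the function name are all assumptions of this sketch rather than the paper's method.

```python
def flp_lms_filter(x, order=2, mu=0.1):
    """Forward linear prediction filter with an LMS weight update.

    Each output sample is the prediction of x[n] from the previous
    `order` input samples. The first `order` samples are passed through
    unchanged because no full history exists for them yet.
    Returns (filtered_signal, final_weights).
    """
    w = [0.0] * order
    out = list(x[:order])
    for n in range(order, len(x)):
        past = [x[n - k] for k in range(1, order + 1)]
        pred = sum(wk * pk for wk, pk in zip(w, past))
        err = x[n] - pred
        # LMS update; mu must be small relative to the input power,
        # otherwise the weights diverge.
        w = [wk + mu * err * pk for wk, pk in zip(w, past)]
        out.append(pred)
    return out, w


# Noise-free check: on a constant input the prediction converges to it.
filtered, weights = flp_lms_filter([1.0] * 200, order=2, mu=0.1)
print(filtered[-1], weights)
```

On a constant unit input with these settings the prediction error shrinks by a factor of 0.8 per step, so the output settles on the input value. The filter order and step size are the parameters that have to be tuned for a real signal.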

10.

Integrated Speech Enhancement Method Using Noise Suppression and Dereverberation

Yoshioka T. Nakatani T. Miyoshi M. 《IEEE transactions on audio, speech, and language processing》2009,17(2):231-246

This paper proposes a method for enhancing speech signals contaminated by room reverberation and additive stationary noise. The following conditions are assumed. 1) Short-time spectral components of speech and noise are statistically independent Gaussian random variables. 2) A room's convolutive system is modeled as an autoregressive system in each frequency band. 3) A short-time power spectral density of speech is modeled as an all-pole spectrum, while that of noise is assumed to be time-invariant and known in advance. Under these conditions, the proposed method estimates the parameters of the convolutive system and those of the all-pole speech model based on the maximum likelihood estimation method. The estimated parameters are then used to calculate the minimum mean square error estimates of the speech spectral components. The proposed method has two significant features. 1) The parameter estimation part performs noise suppression and dereverberation alternately. 2) Noise-free reverberant speech spectrum estimates, which are transferred by the noise suppression process to the dereverberation process, are represented in the form of a probability distribution. This paper reports the experimental results of 1500 trials conducted using 500 different utterances. The reverberation time RT₆₀ was 0.6 s, and the reverberant signal to noise ratio was 20, 15, or 10 dB. The experimental results show the superiority of the proposed method over the sequential performance of the noise suppression and dereverberation processes.

11.

Precise Dereverberation Using Multichannel Linear Prediction

Delcroix M. Hikichi T. Miyoshi M. 《IEEE transactions on audio, speech, and language processing》2007,15(2):430-440

In this paper, we discuss the numerical problems posed by the previously reported LInear-predictive Multi-input Equalization (LIME) algorithm when dealing with dereverberation of long room transfer functions (RTF). The LIME algorithm consists of two steps. First, a speech residual is calculated using multichannel linear prediction. The residual is free from the room reverberation effect but it is also excessively whitened because the average speech characteristics have been removed. In the second step, LIME estimates such average speech characteristics to compensate for the excessive whitening. When multiple microphones are used, the speech characteristics are common to all microphones whereas the room reverberation differs for each microphone. LIME estimates the average speech characteristics as the characteristics that are common to all the microphones. Therefore, LIME relies on the hypothesis that there are no zeros common to all channels. However, it is known that RTFs have a large number of zeros close to the unit circle on the z-plane. Consequently, the zeros of the RTFs are distributed in the same regions of the z-plane and, if an insufficient number of microphones are used, the channels would present numerically overlapping zeros. In such a case, the dereverberation algorithm would perform poorly. We discuss the influence of overlapping zeros on the dereverberation performance of LIME. Spatial information can be used to deal with the problem of overlapping zeros. By increasing the number of microphones, the number of overlapping zeros decreases and the dereverberation performance is improved. We also examine the use of cepstral mean normalization for post-processing to reduce the remaining distortions caused by the overlapping zeros.

12.

Linear Predictive Coding in Speech Compression (total citations: 5; self-citations: 0; citations by others: 5)

王尚武《微机发展》2002,12(6):40-42

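Speech compression is an important part of multimedia information compression, and linear predictive coding is an important parametric coding technique. Starting from the concept of linear predictive coding, the paper analyzes the technique and the autocorrelation method for solving the LPC normal equations.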
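
The autocorrelation method for the LPC normal equations is commonly solved with the Levinson-Durbin recursion, sketched below. Whether the paper uses this exact recursion is not stated in the abstract; the function names and the sign convention (prediction error filter A(z) = 1 + a1 z^-1 + ... + ap z^-p) are assumptions of this sketch.

```python
def autocorrelation(x, max_lag):
    """Biased (unnormalized) autocorrelation r[0..max_lag] of a frame x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err, refl): a = [1, a1, ..., ap] for the prediction
    error filter A(z), err is the final prediction error power, and
    refl holds the reflection coefficients. Assumes r[0] > 0 and a
    positive definite autocorrelation sequence.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    refl = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err
        refl.append(k_m)
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= 1.0 - k_m * k_m
    return a, err, refl


# Autocorrelation of a first-order process with pole 0.5: r[k] = 0.5**k.
a, err, refl = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(a, err, refl)
```

For this first-order example the recursion should return a1 = -0.5 with a second coefficient of zero, since a single pole already explains the autocorrelation sequence. In practice `autocorrelation` would be applied to a windowed speech frame first.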

13.

Linear Predictive Coding in Speech Compression

王尚武《计算机技术与发展》2002,12(6)

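Speech compression is an important part of multimedia information compression, and linear predictive coding is an important parametric coding technique. Starting from the concept of linear predictive coding, the paper analyzes the technique and the autocorrelation method for solving the LPC normal equations.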

14.

A Method for Plotting Three-Dimensional Speech Spectrograms Based on Linear Prediction Analysis

田岚白树忠《控制与决策》1997,12(4):365-368

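The paper describes the design of software for plotting three-dimensional speech spectrograms on a general-purpose PC. Using linear prediction analysis and two output modes, the software draws the extracted spectral envelopes on a display and on a printer. Its plotting capability and quality are comparable to those of a plotter, and it is convenient and flexible.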

15.

Linear-Lexicon Dynamic Programming for Continuous Speech Recognition

林生佑金一庆《计算机应用研究》2001,18(1):27-29

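Starting from hidden Markov models and the Bayes decision rule, the paper introduces a basic algorithm in speech recognition, the linear-lexicon dynamic programming search algorithm. It implements a spoken-digit recognition system and describes the implementation in some detail.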

16.

Dereverberation and Denoising Using Multichannel Linear Prediction

Delcroix M. Hikichi T. Miyoshi M. 《IEEE transactions on audio, speech, and language processing》2007,15(6):1791-1801

Reverberation in a room severely degrades the characteristics and auditory quality of speech captured by distant microphones, thus posing a severe problem for many speech applications. Several dereverberation techniques have been proposed with a view to solving this problem. There are, however, few reports of dereverberation methods working under noisy conditions. In this paper, we propose an extension of a dereverberation algorithm based on multichannel linear prediction that achieves both the dereverberation and noise reduction of speech in an acoustic environment with a colored noise source. The method consists of two steps. First, the speech residual is estimated from the observed signals by employing multichannel linear prediction. When we use a microphone array, and assume, roughly speaking, that one of the microphones is closer to the speaker than the noise source, the speech residual is unaffected by the room reverberation or the noise. However, the residual is degraded because linear prediction removes an average of the speech characteristics. In a second step, the average of the speech characteristics is estimated and used to recover the speech. Simulations were conducted for a reverberation time of 0.5 s and an input signal-to-noise ratio of 0 dB. With the proposed method, the reverberation was suppressed by more than 20 dB and the noise level reduced to -18 dB.

17.

Fast Sparse Decomposition of LFM Signals Using the FFT

欧国建张淑芳邓剑勋蒋清平《数据采集与处理》2018,33(5):865-871

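Conventional sparse decomposition algorithms require a huge number of atoms in the redundant dictionary. To address this, the paper proposes a fast sparse decomposition algorithm for linear frequency modulated (LFM) signals. The algorithm builds the atoms from the characteristics of the LFM signal itself, constructing two redundant dictionaries that are cascaded to perform the decomposition. Analysis shows that cascading makes the total number of atoms far smaller than in a single redundant dictionary. While decomposing with the first dictionary, the algorithm uses the fast Fourier transform to search for the maximum and thereby obtains the best-matching atom in the other dictionary at the same time. Experiments confirm that, compared with three other sparse decomposition algorithms that use a single redundant dictionary, the algorithm is faster and converges better.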

18.

Parameter Extraction for Linear Frequency Modulated Signals Based on Fractional Autocorrelation

闫哲孙婉君王奇伟《测控技术》2016,35(2):32-35

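The parameters of a linear frequency modulated (LFM) signal include the instantaneous frequency, the starting frequency, and the initial phase. The paper proposes a parameter extraction algorithm for LFM signals based on fractional autocorrelation. The algorithm first computes the fractional autocorrelation of the LFM signal at rotation angle α together with the corresponding integral of order p, so that for a particular angle α a peak appears in the fractional Fourier transform (FrFT) domain. The signal is then detected from this peak and its parameters are extracted at the same time. Finally, simulations verify the effectiveness of the algorithm.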

19.

Reverberation Simulation for a Uniform Linear Array in Active Sonar

张蔚严胜刚刘建国《计算机仿真》2012,(8):383-386,412

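Reverberation is one of the main factors limiting active sonar performance. To study in depth how ocean reverberation forms, the paper derives the mathematical relationship between the Doppler frequency of ocean reverberation and the spatial cone angle, and proposes a small-scale active sonar array model based on the scattering characteristics of individual scattering elements. Taking a standard uniform linear array with co-located transmitter and receiver as an example, simulation yields the equivalent plane-wave reverberation level, the reverberation time series, the spectrum, and the autocorrelation waveform of ocean reverberation. Statistics confirm that the instantaneous value of ocean reverberation follows a Gaussian distribution and its envelope a Rayleigh distribution. Finally, space-time spectra of frequency versus bearing are plotted for the received reverberation when the angle δ between the array axis and the direction of motion is 0°, 90°, and 45°, and simulations verify the accuracy and effectiveness of the proposed reverberation model.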

20.

MLLR and MAP for Speech Recognition in Far-Field Noisy and Reverberant Conditions

娄英丹徐静林黄丽霞张雪英《计算机工程与应用》2020,56(10):122-126

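Adaptation techniques can adjust acoustic model parameters with a small amount of data and thereby achieve good recognition results; they are mostly used to adapt to accented speech. This paper applies maximum likelihood linear regression (MLLR) and maximum a posteriori (MAP) adaptation in far-field noisy and reverberant environments and analyzes their recognition performance there. Under simulated conditions with a wall reflection coefficient of 0.6, MAP gives the best adaptation performance across the noise environments: at signal-to-noise ratios (SNR) of 5 dB, 10 dB, and 15 dB, MAP reduces the word error rate (WER) of far-field continuous speech by 1.51%, 12.82%, and 2.95% on average, respectively. Under real conditions, the largest WER reduction achieved by MAP is 37.13%. The results further confirm the good asymptotic behavior of MAP: with 1,000 adaptation sentences, the WER for far-field noisy reverberant continuous speech obtained with MAP acoustic model adaptation is on average 12.5% lower than before adaptation.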