1.

宋倩倩于凤芹《电声技术》2009,33(8):60-63

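The accuracy of speech endpoint detection directly affects the computational complexity and recognition capability of a speech recognition system. In endpoint detection algorithms based on short-time energy and zero-crossing rate, the energy computation is not entirely sound, and detection performance degrades sharply at low signal-to-noise ratios. To address this, a speech endpoint detection algorithm based on empirical mode decomposition and an improved double-threshold method is proposed. Simulation results show that the proposed algorithm detects endpoints better at low signal-to-noise ratios, demonstrating its advantage.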

2.

One Dictionary vs. Two Dictionaries in Sparse Coding Based Denoising

XIE Yining HUANG Jinjie HE Yongjun 《电子学报:英文版》2017,26(2)

As a promising technique, sparse coding can be widely used for representation, compression, denoising and separation of signals. This technique has been introduced into noisy speech processing, where enhancing speech itself or speech features remains a challenge. Unlike other fields where noises are dense, the noises in speech are often sparse or partly sparse over the speech dictionary, resulting in performance degradation. It is necessary to understand the noise conditions of speech environments and the applicable range of sparse coding. This paper analyzes the assumptions of sparse coding and provides bounds on the reconstruction error for two widely used sparse coding methods. Based on this analysis, the performance of the two methods under different conditions is compared. The results show that the performance of sparse coding can be improved by a well-prepared noise dictionary. Experiments on speech enhancement and recognition are conducted, and the results agree well with the theoretical analysis.

3.

Effects of Two Improved LPC Feature Parameters on Speech Recognition Performance

卢珞先刘建辉黄涛《电声技术》2005,(10):51-53

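Two improved speech recognition methods are introduced: the one-sided autocorrelation LPC coefficient method and the linear prediction error method. Compared with conventional linear predictive coding (LPC), both methods are more robust to noise, that is, they still achieve a high recognition rate in strongly noisy environments. The three methods are each applied to endpoint detection and speech recognition, and experimental data illustrate the marked noise robustness of the two improved methods.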

4.

Performance Evaluation of Silence-Feature Normalization Model using Cepstrum Features of Noise Signals

SangYeob Oh Kyungyong Chung 《Wireless Personal Communications》2018,98(4):3287-3297

Speech enhancement algorithms play an important role in speech signal processing, and many have been studied over the past several decades. A speech enhancement algorithm uses a noise removal method and a statistical model filter to analyze the speech signal in the frequency domain. Spectral subtraction and Wiener filters are representative algorithms. These algorithms enhance speech well, but their performance deteriorates under specific noise types or in low signal-to-noise ratio (SNR) environments. In addition, when the noise is estimated incorrectly, noise remains in the voice signal, so that the spectrum corresponding to the voice signal is distorted or a frame corresponding to the voice signal cannot be retrieved, and voice recognition performance deteriorates. The deterioration in speech recognition performance arises from the mismatch between the recognition conditions and the training model. We use a silence-feature normalization model to improve the recognition rate affected by this mismatch in noisy environments. Conventional silence-feature normalization has the problem that the energy of the silent part increases, which blurs the boundaries used to categorize the voice and so degrades recognition performance. In this study, we use the cepstrum feature of the noise signals in the silence-feature normalization model to improve its performance on low-SNR signals by setting a reference value for voiced and unvoiced classification. The resulting recognition rates improve on those of other methods.

5.

Application of a Noise-Robust Wavelet Spectral Compression Feature Extraction Algorithm to Speech Recognition

付丽辉《量子电子学报》2009,26(4):398-404

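To address noise in practical speech recognition applications, a new noise-robust feature extraction algorithm is presented. The speech signal is first decomposed into wavelet subbands by the wavelet transform; then, following the auditory masking effect of the human ear, spectral compression is applied to the subband speech signals to extract the corresponding speech features. An experimental platform was built in MATLAB, and simulation results show that the feature achieves a high recognition rate in noisy environments. The new feature parameters make full use of the noise robustness of wavelets and also effectively reduce the mismatch between training and recognition environments, which makes them noise-robust.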

6.

Context-adaptive pre-processing scheme for robust speech recognition in fast-varying noise environment

Iosif Mporas Todor Ganchev Otilia Kocsis Nikos Fakotakis 《Signal processing》2011,91(8):2101-2111

Based on the observation that dissimilar speech enhancement algorithms perform differently for different types of interference and noise conditions, we propose a context-adaptive speech pre-processing scheme, which adaptively selects the most advantageous speech enhancement algorithm for each condition. The selection process is based on an unsupervised clustering of the acoustic feature space and a subsequent mapping function that identifies the most appropriate speech enhancement channel for each audio input, corresponding to unknown environmental conditions. Experiments performed on the MoveOn motorcycle speech and noise database validate the practical value of the proposed scheme for speech enhancement and demonstrate a significant improvement in speech recognition accuracy compared with that of the best-performing individual speech enhancement algorithm. This is expressed as an accuracy gain of 3.3% in terms of word recognition rate. The advance offered in the present work reaches beyond the specifics of the present application, and can be beneficial to spoken interfaces operating in fast-varying noise environments.

7.

Comparison of some noise-compensation methods for speech recognition in adverse environments

Milner B.P. Vaseghi S.V. 《Vision, Image and Signal Processing, IEE Proceedings -》1994,141(5):280-288

A comparative study is presented of three noise-compensation schemes, namely spectral subtraction, Wiener filters, and noise adaptation, for hidden-Markov-model-based speech recognition in adverse environments. The noise-compensation methods are evaluated on a spoken-digit database, in the presence of car noise and helicopter noise at different signal-to-noise ratios. Experimental results demonstrate that the noise-compensation methods achieve a substantial improvement in recognition accuracy across a wide range of signal-to-noise ratios. At a signal-to-noise ratio of -6 dB the recognition accuracy is improved from 11% to 83%. The use of cepstral-time matrices as an improved speech representation is also considered, and their combination with the noise-compensation methods is shown. Experiments show that the cepstral-time matrix is a more robust feature than a vector of identical size composed of a combination of cepstral and differential cepstral features.
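Of the three schemes named in this abstract, spectral subtraction is the simplest to illustrate. The following is a minimal sketch and not the authors' implementation: it assumes magnitude spectra are already available as lists of floats, estimates the noise spectrum by averaging frames assumed to contain noise only, and subtracts that estimate with a spectral floor. The function names and the over_subtraction and floor parameters are illustrative assumptions.

```python
def estimate_noise(noise_frames):
    """Average the magnitude spectra of frames assumed to contain only noise."""
    n_bins = len(noise_frames[0])
    return [sum(frame[k] for frame in noise_frames) / len(noise_frames)
            for k in range(n_bins)]


def spectral_subtract(noisy_mag, noise_mag, over_subtraction=1.0, floor=0.1):
    """Subtract the noise estimate bin by bin.

    Each output bin is kept at or above floor * noisy magnitude so that
    the result never becomes negative (an assumed, commonly used safeguard).
    """
    return [max(y - over_subtraction * n, floor * y)
            for y, n in zip(noisy_mag, noise_mag)]


# Hypothetical three-bin spectra, for illustration only.
noise_estimate = estimate_noise([[1.0, 2.0, 1.0], [3.0, 2.0, 1.0]])
enhanced = spectral_subtract([10.0, 2.5, 0.5], noise_estimate)
```

In the last bin the noise estimate exceeds the noisy magnitude, so the floor takes over instead of producing a negative magnitude.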

8.

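Research on Double-Threshold Speech Endpoint Detection in High Impulse Noise Environments (total citations: 1; self-citations: 0; citations by others: 1)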

刘超庄圣贤《电子科技》2013,26(4):116-118,123

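Speech endpoint detection is a key technique for identifying valid speech segments; accurate endpoint detection reduces the computation required for subsequent processing of the speech signal and effectively saves resources. Most current speech endpoint detection techniques, such as the energy-frequency value, spectral entropy, and wavelet energy entropy transforms, can detect valid speech segments accurately. This paper presents a double-threshold endpoint detection method that performs double-threshold detection using the short-time average zero-crossing rate and short-time average energy, and then sets a minimum-duration threshold, so that Mandarin Chinese speech is recognized accurately in high impulse noise environments. Comparative experiments with other methods show that the double-threshold technique effectively improves the speech recognition rate under short-duration high impulse noise. Simulation results show an endpoint detection accuracy of 93%.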
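The double-threshold idea described in this abstract can be sketched roughly as follows. This is an illustrative reading and not the paper's algorithm: a high energy threshold finds the core of a segment, a low energy threshold and a zero-crossing-rate threshold widen it, and a minimum-duration check discards short bursts such as impulses. All threshold values, the frame size, and the function names are assumptions.

```python
import math


def short_time_features(signal, frame_len, hop):
    """Return per-frame short-time average energy and zero-crossing rate."""
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
        sign_changes = sum(1 for a, b in zip(frame, frame[1:])
                           if (a >= 0) != (b >= 0))
        zcrs.append(sign_changes / (frame_len - 1))
    return energies, zcrs


def double_threshold_endpoints(signal, frame_len=80, hop=80,
                               energy_high=0.05, energy_low=0.01,
                               zcr_threshold=0.2, min_frames=3):
    """Return (start_sample, end_sample) pairs, end exclusive."""
    energies, zcrs = short_time_features(signal, frame_len, hop)
    n = len(energies)
    segments = []
    i = 0
    while i < n:
        if energies[i] <= energy_high:
            i += 1
            continue
        # Core frame found: widen with the low energy threshold.
        start, end = i, i + 1
        while start > 0 and energies[start - 1] > energy_low:
            start -= 1
        while end < n and energies[end] > energy_low:
            end += 1
        # Widen further over neighbouring frames with a high zero-crossing rate.
        while start > 0 and zcrs[start - 1] > zcr_threshold:
            start -= 1
        while end < n and zcrs[end] > zcr_threshold:
            end += 1
        # Minimum-duration threshold: drop short bursts such as impulses.
        if end - start >= min_frames:
            segments.append((start * hop, (end - 1) * hop + frame_len))
        i = end
    return segments


# Synthetic test signal: silence, one large impulse, then a tone standing in for speech.
signal = [0.0] * 2400
signal[100] = 3.0
for n in range(800, 1600):
    signal[n] = 0.5 * math.sin(2 * math.pi * (n - 800) / 16)

segments = double_threshold_endpoints(signal)
```

With the default min_frames=3 the single-frame impulse is rejected and only the tone is reported; with min_frames=1 the impulse would be reported as a segment too, which is the failure the minimum-duration threshold is meant to prevent.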

9.

A Study of Speaker Recognition Performance in Noisy Environments

张飞云蔡子亮盛胜我《电声技术》2007,31(6):41-43

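To improve the performance of speaker recognition systems in noisy environments, a speech enhancement technique based on the auditory masking effect is used as a preprocessor that first denoises the speech signal and raises the signal-to-noise ratio of the input. Experiments show that feeding the denoised speech signal into the speaker recognition system improves its recognition performance.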

10.

An Improved MFCC Feature Extraction Algorithm Based on the Auditory Masking Effect

鲁五一吴德华谢志明刘建《电子工程师》2009,35(9):16-18

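At present, research on speech recognition is still largely confined to laboratory conditions, whereas real speech always coexists with noise and interference. Humans can correctly recognize speech at very low signal-to-noise ratios, and even in the presence of interfering sounds, mainly by relying on binaural input. This paper imitates the auditory masking effect of the human ear to mask the noise signal and proposes an improved MFCC (Mel-frequency cepstral coefficient) extraction algorithm. The algorithm better reduces the influence of noise on the clean speech signal and thereby improves the speech recognition rate. Experiments show that the improved algorithm yields a relative performance gain of roughly 4.43% to 8.42% over the conventional MFCC extraction algorithm.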

11.

Research on Robust Feature Extraction Methods for Speech Recognition

魏勋耿志辉王晓攀《无线电工程》2010,40(8):59-61

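Mismatch between training and test environments is the main cause of degraded speech recognition performance in practice. Building on an in-depth study of noisy speech recognition environments and of the Mel-frequency cepstral coefficient (MFCC) pipeline, and on the idea of cumulative distribution function matching, three robust feature extraction methods are presented that improve a system's adaptability across environments by reducing the mismatch between training and test conditions. Their theoretical basis and basic algorithms are analyzed, and they are implemented on the Aurora2.0 database, which verifies their effectiveness and provides a reference for choosing a speech recognition system in practical applications.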
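The abstract does not say which three methods are used, so the following is only a generic sketch of cumulative distribution function matching, assuming the target distribution is a standard Gaussian and that one feature dimension is processed at a time. Each value is replaced by the Gaussian quantile of its empirical CDF position; the function name is an assumption.

```python
from statistics import NormalDist


def cdf_match_to_gaussian(values):
    """Map one feature dimension onto a standard normal via its empirical CDF."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    target = NormalDist()  # assumed reference distribution: mean 0, std 1
    matched = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        matched[idx] = target.inv_cdf((rank + 0.5) / n)
    return matched


# Hypothetical values of one cepstral coefficient over three frames.
clean = [5.0, 1.0, 3.0]
# The same frames after an assumed gain and offset distortion.
distorted = [2.0 * v + 10.0 for v in clean]

matched_clean = cdf_match_to_gaussian(clean)
matched_distorted = cdf_match_to_gaussian(distorted)
```

Because the mapping depends only on rank order, the clean and distorted sequences produce identical outputs here, which is the sense in which CDF matching reduces mismatch between training and test conditions.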

12.

Representation of hidden Markov model for noise adaptive speech recognition

Lee L.-M. Wang H.-C. 《Electronics letters》1995,31(8):616-617

The state parameters of the hidden Markov model are represented by the autocorrelation coefficients of a context window that can be adaptively transformed to cepstral and delta cepstral coefficients according to the environmental noise. Experimental results show that it can significantly improve the speech recognition rate in noisy environments.

13.

An Improved Endpoint Detection Algorithm Based on MFCC Cosine Value

Danyang Cao Xue Gao Lei Gao 《Wireless Personal Communications》2017,95(3):2073-2090

Endpoint detection is one of the most important steps in speech recognition. In a high-SNR environment, an algorithm based on short-time energy and zero-crossing rate can be used, but when the SNR is low this method may not be accurate. Some researchers have proposed an algorithm based on MFCC Euclidean distance, which performs better in noisy environments. However, that algorithm needs two thresholds to find the start and end points, and when the two threshold values are unsuitable the detected result can be extremely poor. In this paper, we propose an improved algorithm based on the MFCC cosine value. This method can reduce errors, since it needs only a single threshold. The benefit of the improved algorithm is that the result is sure to contain the real voice component. According to the experimental data, the improved algorithm can improve the speech recognition rate by 10% even in a noisy environment (SNR = 0). This shows that the improved method is more robust.
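The abstract gives no formula, so the sketch below is one plausible reading of a single-threshold cosine test rather than the authors' method. It assumes per-frame feature vectors (standing in for MFCCs) are already computed, builds a noise reference from the first few frames, and marks a frame as speech when its cosine similarity to that reference falls below the threshold. The names, the choice of reference, and the threshold value are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cosine_endpoints(features, n_noise_frames=2, threshold=0.9):
    """Return (first, last) speech frame indices, or None if none is found."""
    dims = len(features[0])
    # Noise reference: mean of the leading frames, assumed to be speech-free.
    reference = [sum(f[k] for f in features[:n_noise_frames]) / n_noise_frames
                 for k in range(dims)]
    speech = [i for i, f in enumerate(features)
              if cosine(f, reference) < threshold]
    return (speech[0], speech[-1]) if speech else None


# Hypothetical 3-dimensional feature vectors for five frames.
features = [
    [1.0, 0.0, 0.0],
    [1.0, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.2, 1.0, 0.5],
    [1.0, 0.0, 0.05],
]
endpoints = cosine_endpoints(features)
```

Only one threshold has to be tuned here, which matches the motivation stated in the abstract; how the reference vector is chosen in the paper is not stated and may differ.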

14.

Frame-synchronous noise compensation for hands-free speech recognition in car environments

Chien J.-T. Lin M.-S. 《Vision, Image and Signal Processing, IEE Proceedings -》2000,147(6):508-515

It has become increasingly important to develop hands-free speech recognition techniques for the human-computer interface in car environments. However, severe car noise degrades the speech recognition performance substantially. To compensate for the performance loss, it is necessary to adapt the original speech hidden Markov models (HMMs) to meet changing car environments. A novel frame-synchronous adaptation mechanism for in-car speech recognition is presented. This mechanism is intended to perform unsupervised model adaptation efficiently on a frame-by-frame basis, instead of relying on batch adaptation data and supervision information as a conventional adaptation algorithm does. The proposed adaptation scheme is performed during frame likelihood calculation, where an optimal equalisation factor is first computed to equalise the model mean vector and the input frame vector. This equalisation factor then serves as a reference index to retrieve an additional bias vector for model mean adaptation. As a result, a rapid and flexible algorithm is exploited to establish a new robust likelihood measure. In experiments on hands-free in-car speech recognition with the microphone far from the talker, this framework is found to be effective in terms of recognition rate and computational cost under various driving speeds.

15.

Application of an Auditory Model to Speech Recognition and Comparison with Conventional Methods

黄泰翼高雨青《电子学报》1993,21(10):1-6

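Building on the peripheral auditory system and part of the central auditory nervous system established in reference (1), this paper constructs a speech recognizer. It uses the auditory model as the acoustic front-end processor for speech (that is, for feature extraction) and a neural network with a tonotopic organization as the recognition classifier. Extensive experiments show that the feature parameters extracted by the auditory model not only represent the distinctive characteristics of speech well, but also show good robustness in representing speech features in noisy environments. Speech recognition experiments show that, in the presence of noise, the recognition rate of the recognizer that uses the auditory model parameters is markedly …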

16.

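An Isolated-Word Endpoint Detection Algorithm Based on Vowel Detection (total citations: 2; self-citations: 0; citations by others: 2)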

邝航宇张军韦岗《电声技术》2005,(3):40-43,48

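An endpoint detection algorithm based on vowel detection is proposed. The endpoints of the vowels in the speech are detected first, and these vowel endpoints are then used as reference endpoints to locate the true endpoints of the speech. The new method is applied in endpoint detection and recognition experiments on the T146 data set under five noise types from NoiseX-92, and is compared with an energy-based endpoint detection algorithm. The two kinds of experiments show that the proposed isolated-word endpoint detection algorithm improves endpoint detection accuracy across signal-to-noise ratios and markedly improves the recognition rate of a speech recognition system at low signal-to-noise ratios.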

17.

A Noise-Robust Front-End Algorithm for Mandarin Speech Recognition and Its Performance Analysis

林建臻孙甲松王作英《电声技术》2004,(3):45-48,52

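The noise-robust front-end feature extraction algorithm proposed by the European Telecommunications Standards Institute (ETSI) for distributed speech recognition systems, which combines several noise-robustness techniques, is discussed. Taking the characteristics of Mandarin speech into account, the algorithm is implemented within an overall Mandarin speech recognition framework, and experiments and analysis are carried out. Recognition results in typical noisy environments show a substantial improvement in robustness over the baseline MFCC feature extraction algorithm.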

18.

Multi-model approach for noisy speech recognition

Cun-Tai Guan Shu-Hung Leung Wing-Hong Lan 《Electronics letters》1998,34(1):30-32

A multi-model approach for noisy speech recognition is proposed. This approach comprises an SVD-based preprocessing front-end and a multi-model HMM recognition structure. It can provide a high recognition rate over a large range of SNRs for speech recognition in wide-band additive noise.

19.

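A Parallel Subband HMM Maximum a Posteriori Adaptive Nonlinear Class Estimation Algorithm (total citations: 1; self-citations: 0; citations by others: 1)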

孙暐吴镇扬刘海滨周琳《电路与系统学报》2005,10(6):20-24

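At present, automatic speech recognition (ASR) systems achieve high recognition rates in laboratory conditions, but in real environments recognition performance deteriorates sharply because of background noise and the transmission channel. Based on auditory experiments, this paper proposes a new nonlinear class estimation algorithm using independent-subband parallel maximum a posteriori probability to improve the robustness of recognition systems. The algorithm exploits the differences between the power spectra of various noises and of the content to be recognized, as well as the differing effects of noise on the HMM in different frequency bands, and uses a multilayer perceptron (MLP) to map the maximum a posteriori probability nonlinearly in noisy environments, reducing the performance loss caused by environmental mismatch. Experiments show that the algorithm clearly outperforms the maximum a posteriori linear regression algorithm and the subband speech recognition algorithm proposed by Sangita.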

20.

Robust Voice Activity Detection Using Subband Long-Term Signal Variability Features

蔡铁唐飞龙志军《电视技术》2014,38(19)

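To improve the accuracy of voice activity detection (VAD) at low signal-to-noise ratios, a VAD algorithm based on subband long-term signal variability features is proposed. The speech signal is transformed to the frequency domain and decomposed into several non-overlapping subbands; long-term signal variability features are extracted from each subband signal; speech and non-speech models are then built online with GMMs, and the VAD decision is made from the likelihood ratio of the models. Experimental results show that the algorithm significantly improves VAD accuracy at low signal-to-noise ratios and is robust across a variety of noise environments and SNR conditions. Experiments with a speech recognition system show that the algorithm effectively improves the recognition rate in noisy environments.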
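The final decision step, a likelihood ratio between a speech model and a non-speech model, can be sketched as below. This is a simplified illustration under stated assumptions: the feature is one-dimensional, the GMM parameters are fixed by hand rather than built online as in the paper, and the subband long-term variability features are not computed. All names and numbers are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def gmm_log_likelihood(x, components):
    """Log-likelihood under a GMM given as (weight, mean, variance) tuples."""
    logs = [math.log(w) + log_gaussian(x, m, v) for w, m, v in components]
    peak = max(logs)
    # Log-sum-exp keeps the computation stable for small likelihoods.
    return peak + math.log(sum(math.exp(l - peak) for l in logs))


def likelihood_ratio_vad(features, speech_gmm, noise_gmm, threshold=0.0):
    """Mark a frame as speech when its log-likelihood ratio exceeds the threshold."""
    return [gmm_log_likelihood(x, speech_gmm) - gmm_log_likelihood(x, noise_gmm) > threshold
            for x in features]


# Hand-picked models and feature values, for illustration only.
speech_gmm = [(0.5, 4.0, 1.0), (0.5, 6.0, 1.0)]
noise_gmm = [(1.0, 0.0, 1.0)]
decisions = likelihood_ratio_vad([0.1, 5.0, -0.5, 4.2], speech_gmm, noise_gmm)
```

Raising the threshold above 0.0 trades missed speech for fewer false alarms; the abstract does not state how the paper sets it.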