Similar literature
 20 similar documents found (search time: 687 ms)
1.
This letter proposes an effective and robust speech feature extraction method for speaker identification, based on statistical analysis of Pitch Frequency Distributions (PFD). Compared with the conventional cepstrum, PFD is relatively insensitive to Additive White Gaussian Noise (AWGN), but it does not perform well for speaker identification, even in clean environments. To compensate for this shortcoming, PFD and the conventional cepstrum are combined to make the final decision, instead of relying on a single kind of feature. Experimental results indicate that the hybrid approach gives a marked improvement for text-independent speaker identification in noisy environments corrupted by AWGN.

2.
This letter proposes an effective and robust speech feature extraction method for speaker identification, based on statistical analysis of Pitch Frequency Distributions (PFD). Compared with the conventional cepstrum, PFD is relatively insensitive to Additive White Gaussian Noise (AWGN), but it does not perform well for speaker identification, even in clean environments. To compensate for this shortcoming, PFD and the conventional cepstrum are combined to make the final decision, instead of relying on a single kind of feature. Experimental results indicate that the hybrid approach gives a marked improvement for text-independent speaker identification in noisy environments corrupted by AWGN.

3.
Voice activity detection (VAD) is used to detect speech and non-speech periods in observed speech signals. It is an important front-end technique for many speech technology applications. Many VAD methods have been proposed. However, most of them have been applied under clean or noisy conditions. Only a few methods have been proposed for reverberant conditions, particularly noisy reverberant conditions. We therefore need to understand the ill effects of noise and reverberation on speech in order to design an accurate and robust VAD method for noisy reverberant conditions. These ill effects can be described by the modulation transfer function (MTF) under noisy and reverberant conditions. Our study therefore uses the MTF concept to reduce the ill effects of noise and reverberation on speech, and we propose a robust VAD method on that basis. In this method, noise reduction and dereverberation are first applied to the temporal power envelope of the speech signal to restore it. Then, power thresholding on the restored temporal power envelope is used as the VAD decision. A method of estimating the signal-to-noise ratio (SNR) is also proposed so that the SNR can be estimated accurately in the noise reduction stage. Experiments under both artificial and realistic noisy reverberant conditions were carried out to evaluate the proposed VAD method and to compare it with conventional VAD methods. The results revealed that the proposed method significantly outperformed the conventional methods under artificial and realistic noisy reverberant conditions.

4.
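Endpoint detection of noisy speech based on cepstral features   Cited by: 44 (self-citations: 0, by others: 44)
Hu Guangrui, Wei Xiaodong. Acta Electronica Sinica, 2000, 28(10): 95-97
One cause of recognition errors in speech recognition systems is inaccurate endpoint detection. At high signal-to-noise ratio (SNR), locating the endpoints of speech correctly is not difficult. Most practical speech recognition systems, however, must operate at low SNR, where conventional endpoint detection methods, such as energy-based detection, do not work effectively. This paper uses cepstral features to detect speech endpoints and proposes two algorithms for endpoint detection in noisy speech. The first uses cepstral distance, rather than short-time energy, as the decision criterion; the second improves hidden Markov model (HMM) based speech detection so that it adapts to changes in the noise. Experimental results show that the method detects the endpoints of noisy speech with high accuracy.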
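
The first algorithm in the abstract above replaces short-time energy with a cepstral distance. The sketch below is one way that idea could look, not the authors' implementation: it assumes a real cepstrum computed with a naive DFT and a commonly used form of the cepstral distance in dB. The function names, the cepstral order, and the threshold are hypothetical.

```python
import cmath
import math


def real_cepstrum(frame, order):
    """Return real cepstral coefficients c[0..order] of one frame (naive DFT)."""
    n = len(frame)
    log_mag = []
    for k in range(n):
        spectrum_k = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        # A small floor keeps log() finite for empty spectral bins.
        log_mag.append(math.log(abs(spectrum_k) + 1e-12))
    # The log-magnitude spectrum of a real frame is real and even,
    # so the inverse DFT reduces to a cosine sum.
    return [
        sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
        for q in range(order + 1)
    ]


def cepstral_distance(c_a, c_b):
    """Cepstral distance in dB between two coefficient lists of equal length."""
    first = (c_a[0] - c_b[0]) ** 2
    rest = sum((a - b) ** 2 for a, b in zip(c_a[1:], c_b[1:]))
    return (10.0 / math.log(10.0)) * math.sqrt(first + 2.0 * rest)


def is_speech_frame(frame, noise_cepstrum, threshold_db):
    """Hypothetical decision rule: speech if the frame is far from the noise cepstrum."""
    order = len(noise_cepstrum) - 1
    distance = cepstral_distance(real_cepstrum(frame, order), noise_cepstrum)
    return distance > threshold_db
```

In use, `noise_cepstrum` would typically be averaged over leading frames assumed to contain only noise, and each later frame would be compared against it.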

5.
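Lei Jing, He Peiyu, Xu Zili. Journal of Signal Processing, 2020, 36(8): 1205-1211
Traditional speech endpoint detection methods locate the start and end points of speech by exploiting the difference between speech and noise in a single parameter. At low SNR, however, individual parameters behave inconsistently across noise environments and show poor robustness. This paper therefore proposes an endpoint detection method that fuses four parameters: uniform sub-band spectral variance, energy-to-entropy ratio, Mel cepstral distance, and likelihood ratio. The method adaptively adjusts the threshold of each parameter and selects the voting decision mechanism by monitoring the energy-to-entropy ratio of noise segments in real time, and then decides the speech endpoints. Experimental results show that, at low SNR, the method achieves higher detection accuracy and robustness than commonly used endpoint detection methods, and that it offers a useful reference for subsequent speech signal processing.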

6.
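Zhang Yao, Fu Jin, Wu Jianguo. Acta Electronica Sinica, 2015, 43(12): 2381-2387
To address multipath time-delay estimation for narrowband signals in underwater acoustic channels, this paper studies the complex-cepstrum time-delay estimation method and, on that basis, proposes a time-delay estimation algorithm based on log-domain homomorphic filtering. Combining the ideas of the complex cepstrum and homomorphic filtering, the received narrowband signal is first transformed to the log domain; the locally stored signal is then removed from it by spectral subtraction; the difference is filtered to eliminate residual signal and noise components; and the result is finally transformed back to the time domain to obtain the multipath delay estimates. Compared with conventional matched filtering/correlation processing and complex cepstrum analysis, the algorithm offers high delay-estimation accuracy and strong noise suppression. Simulations and the processing of lake-trial data demonstrate the effectiveness of the method.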

7.
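Pitch detection method in noisy environments   Cited by: 4 (self-citations: 0, by others: 4)
Pitch detection in noisy environments plays an important role in speech signal analysis and recognition. The autocorrelation method and the average magnitude difference function (AMDF) are two commonly used pitch detection methods. By combining the two, an effective pitch detection method for noisy environments is proposed. Experiments show that the method is feasible and more robust than conventional methods, and that it is particularly suitable for low SNR.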
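
The abstract above does not say how the autocorrelation and the AMDF are combined. As an illustration only, the sketch below assumes one well-known combination: the autocorrelation at each lag is divided by the AMDF at that lag, so that a peak in the former and a valley in the latter reinforce each other. The function names and the `eps` regularizer are hypothetical.

```python
import math


def autocorrelation(x, lag):
    """Unnormalized short-time autocorrelation of x at the given lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


def amdf(x, lag):
    """Average magnitude difference function of x at the given lag."""
    count = len(x) - lag
    return sum(abs(x[i] - x[i + lag]) for i in range(count)) / count


def estimate_pitch_lag(x, min_lag, max_lag, eps=1e-3):
    """Return the lag (in samples) that maximizes autocorrelation / (AMDF + eps).

    Assumed combination rule, not taken from the paper.
    """
    return max(
        range(min_lag, max_lag + 1),
        key=lambda lag: autocorrelation(x, lag) / (amdf(x, lag) + eps),
    )
```

The pitch frequency then follows as the sampling rate divided by the returned lag; restricting `min_lag` and `max_lag` to the expected pitch range keeps the search from locking onto multiples of the period.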

8.
Signal subspace approach for narrowband noise reduction in speech   Cited by: 2 (self-citations: 0, by others: 2)
A signal subspace method is proposed for speech enhancement in the presence of narrowband noise. A fundamental assumption in subspace methods for noise reduction is that the noise covariance matrix is positive definite. However, this is not always the case, especially when the noise has narrowband characteristics. Based on the eigenvalue decomposition of the rank-deficient noise covariance matrix, it is shown how to formulate the enhancement algorithm by decomposing the vector space of the noisy signal into a signal-plus-noise subspace and a noise-free subspace. The proposed subspace partition differs from conventional subspace approaches in that the noise reduction algorithm is implemented using the whitening approach exclusively in the signal-plus-noise subspace. The enhancement is performed by estimating the clean speech from the signal-plus-noise subspace and adding the components in the noise-free subspace. An explicit form of the estimator is presented, and examples are given to validate the effectiveness of the proposed method.

9.
In this work, a curvelet-based nonlocal means denoising method is proposed. The curvelet transform is first applied to the noisy image to produce reconstructed images. Then the similarity of two pixels in the noisy image is computed from these reconstructed images, which contain complementary image features, at relatively high noise levels, or from both the reconstructed images and the noisy image at relatively low noise levels. Finally, the pixel similarity and the noisy image are used to obtain the final denoised result with the nonlocal means method. Quantitative and visual comparisons demonstrate that the proposed method outperforms state-of-the-art nonlocal means denoising methods in terms of noise removal and detail preservation.

10.
Conventional methods for noise parameter measurement for linear noisy two-ports have been improved by introducing a computational method for evaluating measured admittance errors. Derivation and comparison with a conventional method are given. Noise parameters of a packaged 0.5-μm gate-length GaAs MESFET (NE38806) were successfully measured using the proposed technique.

11.
This report describes scream detection systems that can detect screams under noisy conditions and describes techniques for increasing their noise-robustness. More specifically, spectral entropy, which expresses the difference in frequency distribution between a scream and noise, is used as a detection feature. Furthermore, a method is presented that improves scream detection accuracy in noisy environments by limiting the frequency band of the spectral entropy used as the detection feature. Evaluation experiments in noisy environments demonstrated that the proposed method has better scream detection capability than a conventional method. The proposed method can detect screams with equal error rates of 0.3% and 0.8% under 0 and −5 dB conditions, respectively. Furthermore, the cost of likelihood calculation is about 1/12th that of the conventional method. The proposed method can thus be used to develop a scream detection system that is sufficiently accurate in an actual environment.
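
As a rough illustration of the band-limited spectral entropy feature described above, the sketch below normalizes the power spectrum inside a chosen range of DFT bins and computes its entropy. The band limits, frame length, and function names are assumptions; the abstract does not state which band the authors use.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum of a real frame for bins 0..n/2 (naive DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]


def band_spectral_entropy(frame, k_low, k_high):
    """Entropy (in nats) of the normalized power spectrum over bins k_low..k_high-1."""
    band = power_spectrum(frame)[k_low:k_high]
    total = sum(band)
    if total <= 0.0:
        return 0.0
    probs = [p / total for p in band]
    # Bins with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

A frame whose energy is concentrated in a few bins of the band gives an entropy near zero, while a flat spectrum gives the maximum, the logarithm of the number of bins in the band.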

12.
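Cheng Shuai, Zhang Haijian, Sun Hong. Journal of Signal Processing, 2019, 35(4): 601-608
This paper proposes a speech enhancement method that combines robust time-varying filtering with time-frequency masking. First, in the time-frequency domain of the noisy speech, initial instantaneous-frequency information is estimated with the help of image processing methods. Then, based on this instantaneous-frequency information, a robust time-varying filtering algorithm is used to construct the denoised speech signal. Finally, a time-frequency mask is predicted from the time-frequency features of the reconstructed speech. In the time-frequency domain of the noisy speech, this mask effectively preserves speech components and suppresses noise components, thereby enhancing the speech. Experimental results show that, under several common background noises, the proposed algorithm performs well in suppressing background noise and improving overall speech quality, with a clear advantage at low SNR.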

13.
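Robust feature extraction based on the wavelet transform, and speaker recognition   Cited by: 4 (self-citations: 0, by others: 4)
One of the main difficulties that speaker recognition systems face in practice is robustness: a system that achieves a high recognition rate on clean speech degrades markedly on noisy speech. One way to address this problem is to find robust feature parameters. Drawing on the wavelet transform, with its multiresolution analysis property, this paper proposes a wavelet-based robust feature extraction algorithm to improve the performance of speaker recognition systems in noise. Recognition experiments on SUDA2002-D2, a speech corpus of 40 speakers, under additive white Gaussian noise show that the proposed feature extraction algorithm effectively improves recognition performance in noisy environments.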

14.

The paper proposes a method to improve the performance of speech communication systems in highly noisy industrial environments. The input data set consists of speech signals from different environments, such as car noise, railway station noise, babble noise, and street noise, corrupted with additional noise. This database is processed using suitable filters that remove the effect of noise to some extent. Different algorithms have been proposed to minimize the effect of noise up to a certain limit; the denoising algorithms are generally wavelet thresholding methods that remove the noise from the speech signal. Many researchers have worked on soft and hard thresholding for image processing. The proposed hybrid thresholding method comprises both soft and hard thresholding and performs comparatively better than previous methods. The method can be applied to non-stationary noise, and it also removes the problem of edges. Unlike the traditional approach of using a single value, different values are used in the adaptive filtering to remove the edges. In the experiments, the IIIT-H dataset was used together with a set of noisy files from the Noizeus and AURORA databases, sampled at 16 kHz. Results are calculated with subjective and objective measures for fine- and broad-level quality assessment. SNR, SSNR, PSNR, NRMSE, and PESQ are used as performance measures, on which the proposed method compares favorably with conventional methods. The hybrid threshold method yields better results, with significant improvement in speech quality and intelligibility.
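
For readers unfamiliar with the two rules the abstract builds on, the sketch below shows hard and soft thresholding of a single wavelet coefficient. The abstract does not define its hybrid rule, so `blended_threshold` is only a hypothetical illustration of how the two can be mixed with a weight `alpha`; it should not be read as the paper's method.

```python
def hard_threshold(w, t):
    """Keep the coefficient unchanged if its magnitude exceeds t, else zero it."""
    return w if abs(w) > t else 0.0


def soft_threshold(w, t):
    """Zero small coefficients and shrink the rest toward zero by t."""
    if abs(w) <= t:
        return 0.0
    sign = 1.0 if w > 0 else -1.0
    return sign * (abs(w) - t)


def blended_threshold(w, t, alpha):
    """Hypothetical hybrid: alpha = 1 gives soft, alpha = 0 gives hard thresholding."""
    return alpha * soft_threshold(w, t) + (1.0 - alpha) * hard_threshold(w, t)
```

Hard thresholding preserves the amplitude of large coefficients but is discontinuous at the threshold, while soft thresholding is continuous but biases large coefficients downward; mixing the two trades one effect against the other.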

15.
Impulse noise reduction in corrupted images plays an important role in image processing. Impulse noise also affects image segmentation, object detection, edge detection, compression, etc. Median filters or other nonlinear filters have generally been used for noise reduction, but these methods destroy the natural texture and important information in the image, such as edges. In this paper, to eliminate impulse noise from noisy images, we use a two-step hybrid method based on cellular automata (CA) and fuzzy logic, called Fuzzy Cellular Automata (FCA). In the first step, noisy pixels are detected by CA based on statistical information; then, using this information, the noisy pixels are modified by FCA. Typically, CA is used for systems with simple components, where the behavior of each component is defined and updated based on its neighbors. The proposed hybrid method is simple, robust, and parallel, and it preserves the important details of the image effectively. The proposed approach was tested on well-known grayscale test images and, compared with other conventional and well-known algorithms, is more effective.

16.
We consider the feature recombination technique in a multiband approach to speaker identification and verification. To overcome the ineffectiveness of conventional feature recombination in broadband noisy environments, we propose a new subband feature recombination which uses subband likelihoods and a subband reliable-feature selection technique with an adaptive noise model. In the decision step of speaker recognition, a few very low likelihood scores from unreliable features can cause a speaker recognition system to make an incorrect decision. To overcome this problem, reliable-feature selection adjusts the likelihood scores of an unreliable feature by comparison with those of an adaptive noise model, which is estimated by the maximum a posteriori adaptation technique using noise features obtained directly from noisy test speech. To evaluate the effectiveness of the proposed methods in noisy environments, we use the TIMIT database and the NTIMIT database, which is the corresponding telephone version of the TIMIT database. The proposed subband feature recombination with subband reliable-feature selection achieves better performance than the conventional feature recombination system with reliable-feature selection.

17.
The authors propose a degradation model which represents the spectral changes of speech signals caused by the Lombard effect and noise contamination in noisy environments. According to this model, spectral magnitude normalisation and cepstral coefficient transforms are used to restore the cepstrum of clean speech from noisy-Lombard speech.

18.
A particularly effective distortion measure that takes into account the norm shrinkage bias in the noisy cepstrum is considered. A first-order equalization mechanism, specifically aimed at avoiding the norm shrinkage problem, is incorporated in a hidden Markov model (HMM) framework to model the speech cepstral sequence. Such a modeling technique requires special care, as the formulation inevitably involves parameter estimation from a set of data with singular dispersion. Solutions to this HMM stochastic modeling problem are provided, and algorithms for estimating the necessary model parameters are given. It is experimentally shown that incorporation of the first-order mean equalization model makes the HMM-based speech recognizer robust to noise. With respect to a conventional HMM recognizer, this leads to an improvement in recognition performance which is equivalent to a gain of about 15-20 dB in signal-to-noise ratio.

19.
Two key tasks in the development of cognitive radio networks for commercial and military applications are spectrum sensing and automatic modulation classification (AMC). These tasks become even more difficult when the cognitive radio receiver has no information about the channel or the modulation type. An integrated scheme which covers both aspects is proposed in this paper. Spectrum sensing is done using cumulants derived from fractional lower-order statistics. It is shown through simulations that the proposed sensing method performs better than the conventional higher-order statistics (HOS) based method, especially in low-SNR environments with Gaussian and non-Gaussian noise. The performance of the automatic modulation classifier is presented in the form of conditional probability of classification, probability of correct classification, and confusion matrix under noisy and fading conditions. Simulations in our previous work showed that the proposed method achieved better classification accuracy than the cumulant-based AMC method in noise conditions that are more impulsive than Gaussian. In this paper, simulations show a significant improvement in AMC performance in the presence of AWGN and under multipath fading, for a known frequency band of interest, when compared with the available conventional AMC methods.

20.
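To improve the accuracy of speech endpoint detection at low SNR, an endpoint detection algorithm based on cepstral features and spectral entropy is proposed. First, the cepstral features of the speech frame under test are obtained by analysis. The likelihoods of these features under Gaussian mixture models of speech and of noise, both obtained by training, are then computed, and comparing the two likelihoods gives a preliminary speech/non-speech decision. This is combined with the result of energy-entropy endpoint detection to give the final decision, and a hangover mechanism is finally applied to protect the speech as far as possible. Experimental results show that the method remedies the weakness of energy-entropy endpoint detection under babble noise and outperforms G.729 Annex B in different noise environments.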
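
The hangover step mentioned above can be sketched generically as follows. This is a minimal illustration, assuming per-frame boolean decisions and a fixed hangover length; the function name and parameter are hypothetical, and the abstract does not specify the authors' hangover rule.

```python
def apply_hangover(decisions, hangover_frames):
    """Extend each run of speech frames by up to hangover_frames extra frames.

    decisions: per-frame booleans from the frame-level detector (True = speech).
    """
    smoothed = []
    remaining = 0
    for is_speech in decisions:
        if is_speech:
            # Re-arm the hangover counter on every detected speech frame.
            remaining = hangover_frames
            smoothed.append(True)
        elif remaining > 0:
            # Still inside the hangover window after the last speech frame.
            remaining -= 1
            smoothed.append(True)
        else:
            smoothed.append(False)
    return smoothed
```

Holding the speech label for a few frames after the last detection protects weak word endings that a frame-level detector tends to clip, at the cost of passing a little extra noise.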
