Similar literature
1.
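This paper proposes a dual-microphone speech enhancement algorithm for speech recognition. The algorithm exploits the spatial and temporal correlation of the speech signals in the two channels to perform space-time filtering and remove noise. For input signal-to-noise ratios (SNRs) between 0 and 20 dB, it achieves a large SNR processing gain. The method uses only two microphones and has a simple structure. Compared with Wiener post-filtering, it lifts the constraint that the noise received by the two microphones be uncorrelated, and it can remove point-source noise that is not strongly coherent. Compared with conventional beamforming algorithms, it can remove weakly correlated noise arriving from the direction of the desired source.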

2.
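To address the problems that existing speech enhancement algorithms need many microphones and are sensitive to estimation errors, an improved sound source localization and beamforming method is proposed. Building on existing localization algorithms that use time delay, an energy attenuation parameter is introduced so that sound source localization can be performed with two microphones. A loading coefficient is introduced into the beamforming algorithm so that it still focuses on the desired direction when the covariance matrix statistics are mismatched, which improves the robustness of the beamformer. Simulation results show that the improved algorithm is more robust than the traditional algorithm.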
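The abstract does not say how the loading coefficient enters the beamformer. One common reading is diagonal loading of the covariance matrix in an MVDR beamformer, and the sketch below assumes exactly that for two microphones. The function names, the example covariance matrix, the steering phase of 0.4 rad and the loading value 0.1 are all illustrative assumptions, not values from the paper.

```python
import cmath


def mvdr_weights_2mic(R, d, loading):
    """MVDR weights w = (R + loading*I)^-1 d / (d^H (R + loading*I)^-1 d).

    R is a 2x2 Hermitian covariance matrix (nested lists), d is the
    steering vector, and `loading` is the assumed diagonal loading term.
    """
    # Diagonally loaded covariance matrix entries.
    a = R[0][0] + loading
    b = R[0][1]
    c = R[1][0]
    e = R[1][1] + loading
    det = a * e - b * c
    # Closed-form inverse of a 2x2 matrix.
    inv = [[e / det, -b / det], [-c / det, a / det]]
    # u = inv @ d
    u = [inv[0][0] * d[0] + inv[0][1] * d[1],
         inv[1][0] * d[0] + inv[1][1] * d[1]]
    # Normalise so the response toward d is exactly 1 (distortionless).
    denom = d[0].conjugate() * u[0] + d[1].conjugate() * u[1]
    return [u[0] / denom, u[1] / denom]


def response(w, d):
    """Beamformer response w^H d toward steering vector d."""
    return w[0].conjugate() * d[0] + w[1].conjugate() * d[1]


# Hypothetical steering vector: inter-microphone phase of 0.4 rad.
d = [1 + 0j, cmath.exp(-0.4j)]
# Hypothetical (possibly mismatched) sample covariance matrix.
R = [[1.0 + 0j, 0.3 + 0.2j], [0.3 - 0.2j, 1.2 + 0j]]
w = mvdr_weights_2mic(R, d, loading=0.1)
```

Whatever the loading value, the normalisation keeps the response toward the assumed look direction at 1. A larger loading moves the weights toward a plain delay-and-sum beamformer, which is the usual reason diagonal loading is less sensitive to covariance estimation errors.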

3.
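Chen Binjie (陈斌杰), Lu Zhihua (陆志华), Zhou Yu (周宇), Ye Qingwei (叶庆卫). 《计算机应用》 (Journal of Computer Applications), 2018, 38(12): 3643-3648
To explore whether two microphones are enough for multi-source separation and localization in a two-dimensional plane, an indoor speech separation and sound source localization system based on two microphones is proposed. From the signals collected by the microphones, the system builds a dual-microphone delay-attenuation model, estimates the model's delay and attenuation parameters with the DUET algorithm, and plots a histogram of the parameters. In the speech separation stage, a binary time-frequency mask (BTFM) is built, and the mixed speech is separated by binary masking guided by the parameter histogram. In the localization stage, the relationship between the model's attenuation parameter and the signal energy ratio is derived, giving a system of equations that determines the source position. The indoor acoustic environment is simulated with the Roomsimove toolbox; through Matlab simulation and geometric coordinate calculation, multiple sound sources are separated and localized in the two-dimensional plane at the same time. Experimental results show that the localization error for every source signal is below 2%, which is useful for the research and development of small systems.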
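The delay-attenuation model used by DUET can be illustrated at a single time-frequency point. The sketch below assumes the standard anechoic two-microphone model, in which the second microphone observes the first scaled by an attenuation `a` and shifted by a delay; the function name and the test values are illustrative and not taken from the paper.

```python
import cmath


def delay_attenuation(x1, x2, omega):
    """Estimate (attenuation, symmetric attenuation, delay) at one
    time-frequency point.

    Assumes x2 = a * exp(-1j * omega * delay) * x1 with a single active
    source, and |omega * delay| < pi so that the phase is not wrapped.
    """
    ratio = x2 / x1
    a = abs(ratio)                         # relative attenuation
    symmetric = a - 1.0 / a                # DUET symmetric attenuation
    delay = -cmath.phase(ratio) / omega    # relative delay
    return a, symmetric, delay


# Hypothetical point: one source with a = 0.8 and delay = 0.5 at omega = 2.0.
x1 = 0.7 - 0.2j
x2 = 0.8 * cmath.exp(-1j * 2.0 * 0.5) * x1
a_hat, alpha_hat, delay_hat = delay_attenuation(x1, x2, omega=2.0)
```

In DUET these per-point estimates are accumulated into a two-dimensional histogram, and its peaks give one (attenuation, delay) pair per source.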

4.
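A voice activity detection algorithm based on the likelihood ratio test   Total citations: 3, self-citations: 3, citations by others: 0
Li Yancheng (李燕诚), Cui Huijuan (崔慧娟), Tang Kun (唐昆). 《计算机工程》 (Computer Engineering), 2009, 35(10): 214-216
To address the degraded performance of voice activity detection under low SNR and changing noise, a new parameter update and value selection algorithm is proposed. The algorithm models the probability distribution of the noisy speech spectrum with a Laplacian-Gaussian mixture model. The model parameters are estimated from the noisy speech, and the noise power parameter is smoothed by tracking the gaps between syllables. Experimental results show that at an SNR of -5 dB the algorithm achieves a detection rate above 95% and has superior tracking performance.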
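As a rough illustration of a likelihood ratio test for voice activity detection, the sketch below uses the simpler complex Gaussian model, not the Laplacian-Gaussian mixture model of the paper. The function names and the default threshold are hypothetical, and the a priori SNR values are assumed to be supplied by a separate estimator.

```python
import math


def frame_log_likelihood_ratio(power, noise_power, prior_snr):
    """Mean per-bin log likelihood ratio under a complex Gaussian model.

    power:       noisy power spectrum |X_k|^2 of the frame
    noise_power: estimated noise power per bin
    prior_snr:   a priori SNR per bin (assumed given)
    """
    total = 0.0
    for p, n, xi in zip(power, noise_power, prior_snr):
        gamma = p / n  # a posteriori SNR
        total += gamma * xi / (1.0 + xi) - math.log(1.0 + xi)
    return total / len(power)


def is_speech(power, noise_power, prior_snr, threshold=0.5):
    """Declare speech when the mean log likelihood ratio exceeds the threshold."""
    return frame_log_likelihood_ratio(power, noise_power, prior_snr) > threshold
```

A frame whose power equals the noise power gives a negative ratio and is classified as noise, while a frame ten times stronger is classified as speech. In practice the noise power has to be tracked over time, which is the part the abstract addresses.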

5.
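A speech enhancement method consistent with human auditory perception is proposed so that cochlear implants can obtain accurate speech information in noisy environments. Speech processing in the cochlear implant is carried out with the Bark wavelet transform, and enhancement exploits the characteristics of the human auditory system. An adaptive subtraction parameter derived from the auditory masking effect is used. Experimental results show that at low SNR the algorithm raises the SNR by about 30 dB, suppresses residual and background noise better, and produces synthesized speech with good clarity and intelligibility.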

6.
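Studies show that, compared with clean speech, enhanced speech contains two types of distortion: amplification distortion and attenuation distortion, of which amplification distortion has the larger effect on intelligibility. Most traditional speech enhancement algorithms cannot effectively improve the intelligibility of the enhanced speech, because they constrain both distortions only through minimum mean-square error in order to suppress noise and improve quality, ignoring that the two distortion types affect intelligibility differently. A subspace-based speech enhancement algorithm that improves intelligibility is proposed; it uses the a priori SNR and the gain matrix to determine the type of speech distortion. Estimating the a priori SNR also involves two kinds of error, overestimation and underestimation, and overestimation produces amplification distortion, which strongly affects intelligibility. The algorithm first corrects the gain matrix where the a priori SNR (below -10 dB) is overestimated, and then applies different constraints to speech whose magnitude-spectrum distortion exceeds 0 dB and 6.02 dB. Experiments show that the proposed algorithm effectively improves speech intelligibility.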

7.
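To extract clean target speech in complex environments with strong noise and interfering sounds, and to raise the output signal-to-noise and signal-to-interference ratios, this paper proposes a speech extraction scheme based on an ICA algorithm with multiple reference signals. The method uses the results of sound source localization, beamforming and wavelet decomposition as reference signals and applies the negentropy-based FastICA algorithm to estimate the target speech. Simulation experiments on speech signals recorded with a microphone array show that the proposed algorithm effectively suppresses background noise and interfering sounds and recovers the waveform and spectrogram of the target speech. Compared with conventional beamforming and ICA algorithms, the method performs better and yields higher output signal-to-noise and signal-to-interference ratios.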

8.
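To address the speech distortion and musical noise caused by noise estimation and a priori SNR estimation in speech enhancement under strong noise, an estimate of the noise magnitude is obtained from the symmetry of the statistical models of speech and noise and used as a reference. On this basis a noise estimation algorithm is proposed and the a priori SNR estimation algorithm is improved, giving a new enhancement algorithm suited to strong-noise environments. The objective scores from simulation experiments show that at 0 dB and even -5 dB the proposed SNR estimation algorithm effectively reduces signal distortion and leaves essentially no residual musical noise.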

9.
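A binary time-frequency masking method based on source time-delay estimation is proposed. It performs underdetermined blind separation of multiple speech source signals from three received signals. Using the W-disjoint orthogonality of speech signals, the relative delay sequence with which each source signal reaches the receiving array is estimated in the time-frequency domain. Based on these delay estimates, a maximum likelihood algorithm partitions the time-frequency domain into as many non-overlapping sets of time-frequency points as there are sources, each set containing (approximately) all the time-frequency components of a single source. Binary time-frequency masking then recovers the source signal corresponding to each set in turn. The performance of the method was verified by subjective listening, and its segmental SNR gain is at least 13 dB. Compared with the degenerate unmixing estimation technique (DUET), the dissimilarity between the separated signals and the actual source signals is reduced by about 3 dB.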
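The partition and masking steps can be sketched as follows. The sketch assigns each time-frequency point to the source with the nearest delay, which is a simplified stand-in for the maximum likelihood partition described in the abstract. The function names and the example delays are hypothetical.

```python
def binary_masks(point_delays, source_delays):
    """Build one 0/1 mask per source.

    point_delays:  estimated relative delay at each time-frequency point
    source_delays: estimated relative delay of each source
    Each point is assigned to exactly one source (nearest delay), so the
    resulting sets of points do not overlap.
    """
    masks = [[0] * len(point_delays) for _ in source_delays]
    for i, tau in enumerate(point_delays):
        nearest = min(range(len(source_delays)),
                      key=lambda k: abs(tau - source_delays[k]))
        masks[nearest][i] = 1
    return masks


def apply_mask(mask, mixture):
    """Keep only the time-frequency components selected by the mask."""
    return [m * x for m, x in zip(mask, mixture)]


# Hypothetical example: four time-frequency points, three sources.
masks = binary_masks([0.1, 1.9, -1.1, 0.0], [0.0, 2.0, -1.0])
first_source = apply_mask(masks[0], [1 + 1j, 2 + 0j, 0 + 3j, 4 - 1j])
```

Because the masks are disjoint and cover every point, summing the masked signals over all sources reproduces the mixture. This only recovers the sources well when they rarely overlap in the time-frequency domain, which is what W-disjoint orthogonality assumes.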

10.
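Noise-robust speech recognition based on compensation of speech enhancement distortion   Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes a noise-robust speech recognition algorithm based on compensating for the distortion introduced by speech enhancement. At the front end, speech enhancement effectively suppresses background noise. The spectral distortion and residual noise that enhancement introduces are harmful to recognition, and their effect is compensated either by parallel model combination in the recognition stage or by cepstral mean normalization in the feature extraction stage. Experimental results show that the algorithm significantly improves recognition accuracy in noisy environments over a very wide SNR range, and the improvement is especially clear at low SNR: for white noise at -5 dB, the algorithm reduces the error rate by 67.4% relative to the baseline recognizer.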
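Of the two compensation options named in the abstract, cepstral mean normalization is the simpler to illustrate. The sketch below subtracts the per-coefficient mean over an utterance. The function name and the toy feature values are illustrative, and the paper may apply the normalization differently, for example over a sliding window.

```python
def cepstral_mean_normalize(frames):
    """Subtract the utterance-level mean from every cepstral coefficient.

    frames: list of feature vectors (one list of cepstral coefficients
    per frame), all of the same length.
    """
    n = len(frames)
    dims = len(frames[0])
    # Mean of each coefficient across all frames of the utterance.
    means = [sum(f[k] for f in frames) / n for k in range(dims)]
    return [[f[k] - means[k] for k in range(dims)] for f in frames]


# Toy utterance with two frames and two coefficients.
normalized = cepstral_mean_normalize([[1.0, 2.0], [3.0, 6.0]])
```

A stationary convolutional distortion adds a constant offset to every frame in the cepstral domain, so removing the mean cancels that offset.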

11.
《Advanced Robotics》2013,27(15):2093-2111
People usually talk face to face when they communicate with their partner. Therefore, in robot audition, the recognition of the front talker is critical for smooth interactions. This paper presents an enhanced speech detection method for a humanoid robot that can separate and recognize speech signals originating from the front even in noisy home environments. The robot audition system consists of a new type of voice activity detection (VAD) based on the complex spectrum circle centroid (CSCC) method and a maximum signal-to-noise ratio (SNR) beamformer. This VAD based on CSCC can classify speech signals that are retrieved at the frontal region of two microphones embedded on the robot. The system works in real time, without needing filter coefficients trained in advance, even in a noisy environment (SNR > 0 dB). It can cope with speech noise generated from televisions and audio devices that does not originate from the center. Experiments using a humanoid robot, SIG2, with two microphones showed that our system enhanced extracted target speech signals by more than 12 dB (SNR), and the success rate of automatic speech recognition for Japanese words increased by about 17 points.

12.
A Spectral Conversion Approach to Single-Channel Speech Enhancement   Total citations: 1, self-citations: 0, citations by others: 1
In this paper, a novel method for single-channel speech enhancement is proposed, which is based on a spectral conversion feature denoising approach. Spectral conversion has been applied previously in the context of voice conversion, and has been shown to successfully transform spectral features with particular statistical properties into spectral features that best fit (with the constraint of a piecewise linear transformation) different target statistics. This spectral transformation is applied as an initialization step to two well-known single-channel enhancement methods, namely the iterative Wiener filter (IWF) and a particular iterative implementation of the Kalman filter. In both cases, spectral conversion is shown here to provide a significant improvement over initializations that use the spectral features directly from the noisy speech. In essence, the proposed approach allows for applying these two algorithms in a user-centric manner, when "clean" speech training data are available from a particular speaker. The extra step of spectral conversion is shown to offer significant advantages regarding output signal-to-noise ratio (SNR) improvement over the conventional initializations, which can reach 2 dB for the IWF and 6 dB for the Kalman filtering algorithm, for low input SNRs and for white and colored noise, respectively.

13.
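Improving the quality of separated speech at low SNR is a key focus of speech separation research, yet most speech separation methods still train only on the features of the target speaker's speech at low SNR. To address this shortcoming, a mixed-speech separation method based on a jointly trained generative adversarial network (GAN) is proposed. To avoid complex acoustic feature extraction, the generator uses a fully convolutional neural network to extract high-dimensional features directly from the time-domain waveform of the mixed speech, and the discriminator, by constructing a binary-classification convolutional neural...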

14.
This paper addresses the problem of single-channel speech enhancement of noisy Arabic speech signals at low (negative) SNR. For this aim, a binary mask thresholding function based on the coiflet5 mother wavelet transform is proposed for Arabic speech enhancement. The effectiveness of this method is compared with the Wiener method, spectral subtraction, log-MMSE, test-PSC and p-mmse in the presence of babble, pink, white, f-16 and Volvo car interior noise. The noisy input speech signals are processed at various levels of input SNR ranging from -5 to -25 dB. Performance of the proposed method is evaluated with the help of PESQ, SNR and the cepstral distance measure. The results obtained by the proposed method are very encouraging and show that it is more helpful in Arabic speech enhancement than other existing methods.

15.
Generative adversarial networks (GANs) have received increasing attention for end-to-end speech enhancement in recent years. Various GAN-based enhancement methods have been presented to improve the quality of reconstructed speech. However, the performance of these GAN-based methods is worse than that of masking-based methods. To tackle this problem, we propose a speech enhancement method with a residual dense generative adversarial network (RDGAN) that maps the log-power spectrum (LPS) of degraded speech to that of clean speech. In detail, a residual dense block (RDB) architecture is designed to better estimate the LPS of clean speech; it can extract rich local features of the LPS through densely connected convolution layers. Meanwhile, sequential RDB connections are incorporated on various scales of the LPS. This significantly increases the flexibility and robustness of feature learning in the time-frequency domain. Simulations show that the proposed method achieves attractive speech enhancement performance in various acoustic environments. Specifically, in the untrained acoustic test with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and unmatched noise category, RDGAN can still outperform the existing GAN-based methods and the masking-based method in PESQ and other evaluation indexes. This indicates that our method generalizes better to untrained conditions.

16.
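For noisy whispered speech, this paper proposes a whisper enhancement algorithm that uses an ADALINE neural network to remove background noise. Conventional spectral subtraction is first applied to obtain a good spectral envelope; on this basis, the ADALINE linear neural network performs adaptive prediction to improve the quality of the whispered speech. The results show that even at low SNR the SNR can be raised by about 20 dB, with good perceptual quality.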
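The adaptive prediction step can be illustrated with a minimal ADALINE: a single linear unit trained with the LMS (Widrow-Hoff) rule to predict the next sample from the previous ones. The abstract does not give the predictor order, step size or input features, so the values below and the sinusoidal test signal are assumptions chosen only to show convergence.

```python
import math


def adaline_predict(signal, order=2, mu=0.1):
    """One-step linear prediction with LMS weight updates.

    Returns the final weights and the prediction error at every step.
    `order` (number of past samples) and `mu` (step size) are assumed
    values, not parameters reported in the paper.
    """
    weights = [0.0] * order
    errors = []
    for n in range(order, len(signal)):
        # Most recent sample first: signal[n-1], signal[n-2], ...
        past = [signal[n - 1 - k] for k in range(order)]
        prediction = sum(w * x for w, x in zip(weights, past))
        error = signal[n] - prediction
        # Widrow-Hoff update: move the weights along error * input.
        weights = [w + mu * error * x for w, x in zip(weights, past)]
        errors.append(error)
    return weights, errors


# A noise-free sinusoid is exactly predictable from two past samples.
signal = [math.sin(1.0 * n) for n in range(2000)]
weights, errors = adaline_predict(signal)
```

For this signal the weights approach `2*cos(1.0)` and `-1`, the exact two-tap predictor of a sinusoid, and the prediction error shrinks toward zero. On real speech the error does not vanish, because only the predictable part of the signal is captured by the linear unit.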

17.
This paper proposes a phase-based dual-microphone speech enhancement technique that utilizes a prior speech model. Recently, it has been shown that phase-based dual-microphone filters can result in significant noise reduction in low signal-to-noise ratio (SNR) conditions (less than 10 dB) and negligible distortion at high SNRs (greater than 10 dB), as long as a correct filter parameter is chosen at each SNR. While prior work utilizes a constant parameter for all SNRs, we present an SNR-adaptive filter parameter estimation algorithm that maximizes the likelihood of the enhanced speech features based on a prior speech model. Experimental results using the CARVUI database show significant speech recognition accuracy rate improvement over alternative techniques in low SNR situations (e.g., an improvement of 11% in word error rate (WER) over postfiltering and 23% over delay-and-sum beamforming at 0 dB) and negligible distortion at high SNRs. The proposed adaptive approach also significantly outperforms the original phase-based filter with a constant parameter. Furthermore, it improves the filter's robustness when there are errors in time delay estimation.

18.
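Spectral subtraction speech enhancement based on beamforming   Total citations: 1, self-citations: 0, citations by others: 1
To reduce the musical noise produced by spectral subtraction and to improve speech enhancement, this paper builds on a detailed study of beamforming and spectral subtraction and proposes an enhancement method that cascades spectral subtraction after a beamformer; the feasibility of this structure is analyzed and verified. Results on speech and noise data recorded in real environments show that the cascaded speech enhancement system significantly reduces background noise, introduces little speech distortion, is easy to implement in real time, and achieves an SNR gain of 16.5 dB.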
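The cascade can be sketched for one frame as a two-channel delay-and-sum beamformer followed by power spectral subtraction. The abstract does not specify the beamformer type or the subtraction rule, so the sketch assumes the two channel spectra are already time-aligned, and the over-subtraction factor `alpha` and spectral floor `beta` are hypothetical parameters.

```python
import math


def delay_and_sum(ch1, ch2):
    """Average two already time-aligned channel spectra (complex bins)."""
    return [(a + b) / 2 for a, b in zip(ch1, ch2)]


def spectral_subtract(spectrum, noise_power, alpha=1.0, beta=0.01):
    """Power spectral subtraction with a spectral floor.

    alpha: over-subtraction factor (assumed value)
    beta:  floor as a fraction of the noisy power (assumed value)
    The noisy phase is kept; only the magnitude is reduced.
    """
    out = []
    for y, n in zip(spectrum, noise_power):
        power = abs(y) ** 2
        clean_power = max(power - alpha * n, beta * power)
        gain = math.sqrt(clean_power / power) if power > 0 else 0.0
        out.append(gain * y)
    return out


# Toy frame with two bins: one dominated by speech, one by noise.
beamformed = delay_and_sum([2 + 0j, 0.1j], [2 + 0j, 0.1j])
enhanced = spectral_subtract(beamformed, noise_power=[1.0, 1.0])
```

The floor keeps bins from being set to zero. Isolated bins that survive hard subtraction are a known source of musical noise, and reducing that noise is the stated aim of the cascade.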
