Found 18 similar articles (search time: 234 ms)
1.
2.
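In complex acoustic environments, the noise-field characteristics and the reverberation level are usually unknown, which places high demands on microphone-array speech enhancement algorithms. This paper proposes a speech enhancement method based on the phase difference of noisy speech signals and post-filtering. First, the signals received by the microphone array are split into frames. For each frame, the phase difference between the noisy speech signals at two adjacent microphones is used to form scaling coefficients that modify the magnitude spectrum at each frequency bin; applying these coefficients masks and enhances the frame, yielding a preprocessed signal. The preprocessed signal is then processed further with fixed beamforming, independent component analysis, and post-filtering, which suppresses the noise effectively. Computer simulations show that the method suppresses noise well in a variety of noise fields with moderate reverberation.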
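The phase-difference masking step can be illustrated with a minimal sketch. The abstract does not give the form of the scaling coefficient, so the gain below, cos(Δφ/2) raised to a power, is purely an assumption, as are the function names, the `sharpness` parameter, and the premise that the target sits at broadside (so its inter-microphone phase difference is close to zero). A naive DFT keeps the sketch dependency-free; a real system would use an FFT library.

```python
import cmath
import math

def dft(frame):
    """Naive O(N^2) DFT; adequate for one short illustrative frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def phase_difference_gains(frame_a, frame_b, sharpness=2.0):
    """Per-bin gains in [0, 1] from the phase difference of two microphones.

    Assumption (not from the abstract): the target is at broadside, so bins
    dominated by the target have a phase difference near 0 and keep gain ~1,
    while bins with a large phase difference are attenuated.
    """
    gains = []
    for xa, xb in zip(dft(frame_a), dft(frame_b)):
        dphi = cmath.phase(xa * xb.conjugate())  # wrapped phase difference
        gains.append(max(0.0, math.cos(dphi / 2.0)) ** sharpness)
    return gains

def apply_gains(frame, gains):
    """Scale the frame's spectrum bin by bin (the masking step)."""
    return [g * x for g, x in zip(gains, dft(frame))]
```

Multiplying each bin of one microphone's spectrum by its gain leaves the phase untouched and changes only the magnitude, which matches the description of coefficients that modify the magnitude spectrum at each frequency bin.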
3.
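Hearing-aid users have difficulty listening in noisy and reverberant environments, and a digital hearing aid built on a microphone array can substantially improve the speech signal-to-noise ratio in such environments. This paper studies microphone-array speech enhancement for digital hearing aids and proposes an improved particle filter algorithm based on particle swarm optimization. The algorithm recasts speech enhancement as the estimation of clean speech from noisy speech and uses particle swarm optimization to generate the proposal distribution, so that the denoised output is closer to the clean speech and the enhancement improves.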
4.
5.
6.
7.
8.
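Microphone-array speech enhancement based on near-field beamforming  Total citations: 1, self-citations: 0, citations by others: 1
When a microphone array is used for hands-free speech pickup in an enclosed space, one unavoidable issue is that the sound source lies in the near field of the array. Building on subband adaptive beamforming, this paper introduces a microphone-array speech enhancement method based on near-field beamforming. The method exploits the wavefront curvature of near-field spherical waves and effectively attenuates the effect of reverberation and noise on the desired signal. Simulations show that, under small-room reverberation, the method achieves good noise suppression.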
9.
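With the further development of multimedia technology, speech acquisition and acoustic signal processing have attracted growing attention and found increasingly wide application; sound source localization and source enhancement are the prerequisite and basis for speech enhancement and speech recognition. Microphone-array-based sound source localization has drawn the attention of many researchers because of its broad application prospects. A single microphone captures too little information for source localization, whereas a microphone array overcomes this shortcoming by exploiting the correlation between the signals at the individual microphones: correlation analysis of the data yields the source position. The paper explains the principle of microphone-array sound source localization and derives formulas for the target's azimuth, elevation, and range. It also describes the hardware system and the role of each part, and tests the system experimentally; analysis of the test data shows that the microphone-array localization system can locate a sound source quickly.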
10.
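Immersive voice communication and intelligent voice interaction both face the problem of high-fidelity, far-field sound pickup in complex acoustic environments. An effective way to address it is to use a microphone array or multichannel pickup system made up of multiple microphone sensors. The core of such a system is signal processing: the spatially sampled sound field is processed jointly in the time, space, and frequency domains to perform source direction finding/localization, signal enhancement, noise suppression, dereverberation, source separation, sound-field parameter estimation, and similar functions. Of the many approaches to microphone-array signal processing, beamforming is the most studied and the most widely used. This paper briefly reviews the principles, progress, and commonly used methods of microphone-array beamforming, covering delay-and-sum, superdirective, differential, orthogonal series expansion, Kronecker, and adaptive beamforming. The paper focuses on the principles, mechanisms, and architectures of the methods; readers interested in implementation details can consult the corresponding literature.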
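Of the beamformers listed in this survey abstract, delay-and-sum is the simplest to illustrate. The sketch below is not taken from the paper: it assumes the target's arrival delay at each microphone is already known and is a whole number of samples, and the function name and argument layout are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Time-align each microphone channel and average the results.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of the target at each microphone, in samples
              (assumed known and integer-valued in this sketch).
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, delay in zip(channels, delays):
        for t in range(n):
            src = t + delay  # read ahead to undo this channel's delay
            if 0 <= src < n:
                out[t] += samples[src]
    # Averaging keeps the aligned target at unit gain, while components
    # that do not line up across channels add incoherently.
    return [v / len(channels) for v in out]
```

In practice the delays are fractional and are usually applied as phase shifts in the frequency domain; integer delays are used here only to keep the example short.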
11.
Tracking an unknown time-varying number of speakers using TDOA measurements: a random finite set approach  Total citations: 1, self-citations: 0, citations by others: 1
Wing-Kin Ma, Ba-Ngu Vo, Singh S.S., Baddeley A. 《Signal Processing, IEEE Transactions on》 2006, 54(9): 3291-3304
Speaker location estimation techniques based on time-difference-of-arrival measurements have attracted much attention recently. Many existing localization ideas assume that only one speaker is active at a time. In this paper, we focus on a more realistic assumption that the number of active speakers is unknown and time-varying. Such an assumption results in a more complex localization problem, and we employ the random finite set (RFS) theory to deal with that problem. The RFS concepts provide us with an effective, solid foundation where the multispeaker locations and the number of speakers are integrated to form a single set-valued variable. By applying a sequential Monte Carlo implementation, we develop a Bayesian RFS filter that simultaneously tracks the time-varying speaker locations and number of speakers. The tracking capability of the proposed filter is demonstrated in simulated reverberant environments.
12.
To improve tracking performance in noisy and reverberant environments, an acoustic source tracking algorithm based on track-before-detect is proposed. The algorithm uses a modified steered response power as its localization function; because the modified function takes a rectangular region into account, it gives more robust source location estimates than the standard steered response power function. Track-before-detect is applied to avoid repeated computation over the same rectangular region, which reduces the computational burden without reducing accuracy. Simulation results verify that the proposed algorithm tracks more accurately than the traditional tracking algorithm in a noisy and reverberant environment.
13.
Ali Dehghan Firoozabadi, Hamid Reza Abutalebi 《Circuits, Systems, and Signal Processing》 2016, 35(2): 573-601
This paper addresses the topic of simultaneous speaker localization. The work is related to the generalized cross-correlation (GCC)-based methods for estimating the direction of multiple speakers. Considering the defects of GCC-based direction of arrival (DOA) estimation methods, we have applied several modifications to improve our previous subband processing-based system for the localization of simultaneous speakers. Three modifications have been presented in this paper. In the first step, the DOA estimation method is equipped with a front-end block that determines the number of speakers based on K-means clustering and silhouette criterion. This block provides the true number of speakers for the DOA estimator. Secondly, in order to eliminate the spatial aliasing, we propose a novel nested circular microphone array. In the proposed array design, each microphone pair is only used in appropriate subband according to its inter-microphone distance. In the third step, to overcome the weakness of GCC-phase transform (GCC-PHAT) in noisy and noisy-reverberant conditions, we propose a SNR estimation block. So, we can separate noisy and reverberant conditions and use PHAT filter for reverberant conditions and maximum likelihood filter for noisy situations. The proposed method has been evaluated on both simulated and real multi-speaker speech data in various environmental conditions and different number of speakers. Our evaluations in terms of DOA accuracy demonstrate the superiority of the proposed method compared to the fullband and baseline subband methods.
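The baseline GCC-PHAT delay estimator that this abstract builds on can be sketched as follows. This is the textbook fullband form, not the subband system or the SNR-switched filter the authors propose. It uses a naive DFT and a circular delay model with no zero padding; both are simplifying assumptions, and the function name and `max_lag` parameter are hypothetical.

```python
import cmath
import math

def gcc_phat_delay(x, y, max_lag):
    """Estimate by how many samples y lags x using GCC-PHAT.

    Sketch assumptions: equal-length real signals, circular delay model,
    naive O(N^2) DFT instead of an FFT.
    """
    n = len(x)

    def dft(sig):
        return [sum(sig[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]

    # Cross-spectrum with PHAT weighting: keep only the phase of each bin.
    weighted = []
    for xk, yk in zip(dft(x), dft(y)):
        cross = yk * xk.conjugate()
        mag = abs(cross)
        weighted.append(cross / mag if mag > 1e-12 else 0j)

    # Evaluate the inverse transform only at the candidate lags.
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        val = sum(w * cmath.exp(2j * math.pi * k * lag / n)
                  for k, w in enumerate(weighted)).real / n
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag
```

Given a microphone spacing and the speed of sound, the estimated lag can be converted to a direction of arrival. PHAT weighting discards magnitude information, which sharpens the correlation peak under reverberation but gives noisy bins the same weight as clean ones; this is the weakness the abstract refers to.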
14.
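This paper proposes a multiple-model particle filter method based on sampling interaction that effectively tracks a freely moving speaker. Following the characteristics of the speaker tracking problem, the method describes the speaker's dynamics with a Markov jump system and estimates the speaker's position with particle filtering. During tracking, the input interaction of the interacting multiple model method is carried out by adjusting the sampling regions of the filter particles. This allows the number of particles in each sub-filter to be set arbitrarily, avoids performance degradation during model switching, and removes the Gaussian assumption on the models' posterior probability density functions, which makes the speaker tracking system more robust. Computer simulations verify the effectiveness of the method.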
15.
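To improve the performance of sound source localization algorithms in complex environments, a new time delay estimation (TDE) method is proposed: a statistical-model method based on the acoustic transfer function ratio (ATFR-SM). The method uses a statistical model to remove the effect of noise on the acoustic transfer function (ATF), and smooths and "whitens" the power spectral density (PSD) when computing the transfer function to remove the effect of reverberation. Voice activity detection (VAD) is also introduced to discard noise-only segments that are useless for computing the transfer function, which improves the accuracy of the delay estimate. In addition, the proposed TDE method is combined with a linear localization method to form a complete sound source localization method. Experimental results show that, in complex environments, the TDE method has a lower percentage of anomalous points (PAP) and a lower root mean square error (RMSE) and clearly outperforms the conventional reference algorithms, while the localization method achieves higher localization accuracy.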
16.
Time–frequency masking has evolved as a powerful tool for tackling blind source separation problems. In previous work, mask estimation was performed with the help of well-known standard cluster algorithms. Spatial observation vectors, extracted from a set of microphones, were grouped into separate clusters, each representing a particular source. However, most off-the-shelf clustering methods are not very robust to outliers or noise in the data. This lack of robustness often leads to incorrect localization and partitioning results, particularly for reverberant speech mixtures. To address this issue, we investigate the use of observation weights and context information as means to improve the clustering performance under reverberant conditions. While the observation weights improve the localization accuracy by ignoring noisy observations, context information smoothes the cluster membership levels by exploiting the highly structured nature of speech signals in the time–frequency domain. In a number of experiments, we demonstrate the superiority of the proposed method over conventional fuzzy clustering, both in terms of localization accuracy as well as speech separation performance.
17.
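Research on sound source localization with microphone arrays  Total citations: 2, self-citations: 0, citations by others: 2
The paper gives an overview of microphone arrays, examines the problems facing microphone-array sound source localization, and analyzes and compares the main classes of localization methods. It presents an effective time delay estimation algorithm for time-difference-of-arrival-based localization in reverberant environments and verifies the algorithm experimentally.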
18.
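Speaker tracking method based on the iterated central difference Kalman filter  Total citations: 3, self-citations: 0, citations by others: 3
When a state-space approach is used to track a speaker by voice, the nonlinearity of the observation equation degrades the accuracy of the speaker position estimate. This paper combines iterated filtering theory with the central difference Kalman filter to propose an iterated central difference Kalman filter, and applies it to a speaker tracking system. Simulations show that the proposed method reduces the system's linearization error, makes the filter more robust, and improves speaker tracking accuracy.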