Found 20 similar articles; search took 15 ms
1.
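余力 (Yu Li) 《计算机工程与应用》 (Computer Engineering and Applications) 2011,47(16):147-150
In acoustic echo cancellation, the presence of near-end speech causes the adaptive filter that models the echo path to diverge, so a mature acoustic echo canceller should include a double-talk detection algorithm. To address this problem, a low-complexity double-talk detection algorithm based on the cross-correlation between the microphone signal and the error signal is proposed. The algorithm is also combined with a filter-coefficient buffering mechanism to further improve the robustness of the system. Experimental results show that the algorithm detects double talk well, responds quickly to its onset and end, and significantly improves echo cancellation under double-talk conditions.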
2.
E. P. Jayakumar, P. V. Muhammed Shifas, P. S. Sathidevi 《International Journal of Speech Technology》2016,19(3):611-621
The quality of speech transmission in mobile communication systems deteriorates due to the presence of background noise and acoustic echo. Background noise consists of disturbances from the surroundings, and acoustic echo is induced by the reverberation of the loudspeaker signal in the near-end environment. In a conventional acoustic echo suppression setup, the echo path effect is modelled either in the time or in the frequency domain, and to cancel the echo, a replica of the echo is created by estimating the echo path response adaptively in the corresponding domain. Recently, modulation domain analysis, which captures human perceptual properties, has been widely used in speech processing. The modulation domain conveys the temporal variation of the acoustic magnitude spectra, which acts as an information-bearing signal. In this work, a novel integrated system for acoustic echo and noise suppression in the modulation domain is developed. So far, no work of this kind in the modulation domain appears to have been reported. An efficient method for modelling the echo path and estimating the echo in the modulation domain is introduced and implemented. The effects of echo and noise are suppressed using modulation spectral manipulation, and the performance of the proposed system is found to be better than that of other conventional integrated systems.
3.
Klaus Reindl, Yuanhang Zheng, Andreas Schwarz, Stefan Meier, Roland Maas, Armin Sehr, Walter Kellermann 《Computer Speech and Language》2013,27(3):726-745
In this contribution, a novel two-channel acoustic front-end for robust automatic speech recognition in adverse acoustic environments with nonstationary interference and reverberation is proposed. From a MISO system perspective, a statistically optimum source signal extraction scheme based on the multichannel Wiener filter (MWF) is discussed for application in noisy and underdetermined scenarios. For free-field and diffuse noise conditions, this optimum scheme reduces to a Delay & Sum beamformer followed by a single-channel Wiener postfilter. Scenarios with multiple simultaneously interfering sources and background noise are usually modeled by a diffuse noise field. However, in reality, the free-field assumption is very weak because of the reverberant nature of acoustic environments. Therefore, we propose to estimate this simplified MWF solution in each frequency bin separately to cope with reverberation. We show that this approach can very efficiently be realized by the combination of a blocking matrix based on semi-blind source separation (‘directional BSS’), which provides a continuously updated reference of all undesired noise and interference components separated from the desired source and its reflections, and a single-channel Wiener postfilter. Moreover, it is shown how the obtained reference signal of all undesired components can efficiently be used to realize the Wiener postfilter and, at the same time, generalizes well-known postfilter realizations. The proposed front-end and its integration into an automatic speech recognition (ASR) system are analyzed and evaluated in noisy living-room-like environments according to the PASCAL CHiME challenge. A comparison to a simplified front-end based on a free-field assumption shows that the introduced system substantially improves the speech quality and the recognition performance under the considered adverse conditions.
4.
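To address the heavy computational load of the double-talk detector in echo cancellers, and the degradation in echo cancellation performance caused by its detection delay, an acoustic echo cancellation algorithm based on the time-varying power spectrum of the signals and requiring no double-talk detection is proposed. From the mutual independence of the near-end and far-end speech during double talk, it is derived that the time-varying power of the far-end signal passed through the impulse response of the echo path equals the time-varying cross-power with the microphone input signal. The impulse response of the echo path is then estimated by minimum-mean-square-error adaptive filtering, and the echo is cancelled. Unlike existing echo cancellation algorithms, this algorithm needs no double-talk detection while cancelling echo. Normalizing the step size of the algorithm makes the adaptive filter coefficients converge faster and more stably. Simulations show that the algorithm cancels echo effectively in all talk situations.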
5.
6.
Gordy J.D., Goubran R.A. 《IEEE transactions on audio, speech, and language processing》2006,14(1):33-42
In this paper, standard echo canceller performance measures are evaluated in terms of psychoacoustic aspects of human hearing. The focus is on wideband speech communications systems with long round-trip delays of 200 ms and up present in the transmission path. The results of a simple acoustic echo cancellation experiment are analyzed with a standard psychoacoustic model, revealing that steady-state echo return loss enhancement and mean square error cannot be used to determine whether residual echo is perceivable in the presence of background noise. In addition, a simple modification to the normalized least mean square (NLMS) algorithm is introduced by adding a perceptual preemphasis filter. Simulation results and listening tests show that it is possible to improve the perceived performance of an echo canceller during convergence by placing greater emphasis on frequencies at which the human auditory system is most sensitive.
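For readers unfamiliar with the NLMS algorithm named in this abstract, the following is a minimal sketch of a plain NLMS echo canceller. It does not include the perceptual preemphasis filter the paper introduces, and the function name `nlms`, the parameter values, and the toy echo path are assumptions made only for illustration.

```python
import random

def nlms(far_end, mic, num_taps=8, mu=0.5, eps=1e-8):
    """Plain NLMS adaptive filter (illustrative sketch, not the paper's variant).

    far_end: loudspeaker (reference) samples
    mic:     microphone samples containing the echo
    Returns the error (echo-cancelled) signal and the final filter weights.
    """
    w = [0.0] * num_taps        # adaptive estimate of the echo path
    x_buf = [0.0] * num_taps    # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x_buf))   # echo estimate
        e = d - y                                      # residual after cancellation
        norm = sum(xi * xi for xi in x_buf) + eps      # input power normalization
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x_buf)]
        errors.append(e)
    return errors, w

# Toy usage: white-noise far-end signal and a hypothetical 3-tap echo path.
random.seed(0)
echo_path = [0.6, -0.3, 0.1]
far = [random.gauss(0, 1) for _ in range(2000)]
mic = [sum(h * far[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far))]
errors, w = nlms(far, mic)
print([round(v, 3) for v in w])
```

In this noise-free toy setup the first three weights approach the assumed echo path and the remaining weights approach zero; with near-end speech or background noise present, convergence would be slower and less exact.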
7.
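Because smart speakers mostly use microphone arrays for sound pickup, and single-channel adaptive filtering introduces distortion and complexity when used for acoustic echo cancellation, a fast echo cancellation algorithm for microphone arrays is proposed. The algorithm first estimates the echo in the first channel with adaptive filtering, then estimates the relative echo transfer functions between the array channels, and multiplies the two to obtain the echo in the other channels. Next, the estimated echo and noise are used as the noise reference signal in the lower branch of a generalized sidelobe canceller (GSC) beamformer, and the GSC beamforming algorithm removes the echo and noise. Simulation results show that under moderate reverberation, at long distance, at a low echo-to-noise ratio, and with music as the echo signal, the algorithm cancels echo and suppresses noise well; it has a low computational cost and gives the target speech signal a high source-to-distortion ratio and high intelligibility.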
8.
9.
Grancharov V., Plasberg J.H., Samuelsson J., Bastiaan Kleijn W. 《IEEE transactions on audio, speech, and language processing》2008,16(1):57-64
Postfilters are commonly used in speech coding for the attenuation of quantization noise. In the presence of acoustic background noise or distortion due to tandeming operations, the postfilter parameters are not adjusted and the performance is, therefore, not optimal. We propose a modification that consists of replacing the nonadaptive postfilter parameters with parameters that adapt to variations in spectral flatness, obtained from the noisy speech. This generalization of the postfiltering concept can handle a larger range of noise conditions, but has the same computational complexity and memory requirements as the conventional postfilter. Test results indicate that the presented algorithm improves on the standard postfilter, as well as on the combination of a noise attenuation preprocessor and the conventional postfilter.
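The abstract does not state how spectral flatness is computed. As a hedged illustration, the sketch below uses the common definition (geometric mean of the power spectrum divided by its arithmetic mean); the function name `spectral_flatness` and the `floor` parameter are assumptions, and the paper's own measure may differ.

```python
import math

def spectral_flatness(power_spectrum, floor=1e-12):
    """Ratio of geometric to arithmetic mean of a power spectrum.

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate a peaky, tonal spectrum. `floor` avoids taking log(0).
    """
    p = [max(v, floor) for v in power_spectrum]
    geometric = math.exp(sum(math.log(v) for v in p) / len(p))
    arithmetic = sum(p) / len(p)
    return geometric / arithmetic

flat = spectral_flatness([2.0] * 8)              # flat spectrum
peaky = spectral_flatness([1.0] + [1e-6] * 7)    # one dominant bin
print(flat, peaky)
```

A postfilter that adapts to this quantity would presumably map the flatness value to its parameters frame by frame; that mapping is not described in the abstract and is not sketched here.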
10.
This paper proposes a speech enhancement approach to suppress the interference of car noise. A linear microphone array is adopted for far-talking speech acquisition and delay-and-sum beamforming noise reduction. We present an effective time delay estimator using the coherence function between the reference microphone and the beamformed speech. To further enhance the beamformed speech, we exploit an improved Wiener filter where the resulting noise correlation in the microphone array is relatively small so that the performance of optimal Wiener filtering can be achieved. Also, due to the serious degradation of low-frequency car speech, we develop a spectral weighting function to compensate for the low-frequency filtering. These two processing units serve as the post filters to attain the desired enhancement performance. In experiments on microphone array speech in the presence of real and simulated car noise, we find that the proposed algorithm performs well. Performance is measured in terms of the signal-to-noise ratio and the word error rate. The combined delay-and-sum beamformer and two post filters obtain the best results compared to other methods.
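The delay-and-sum step mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes the per-channel delays are already known and are whole numbers of samples; the paper instead estimates delays with a coherence-based estimator, which is not reproduced here. The function name `delay_and_sum` and the toy signals are hypothetical.

```python
import math

def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of equal-rate sample lists, one per microphone
    delays:   arrival delay of the source at each microphone, in samples
    """
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(n)
    ]

# Toy usage: the same source reaches three microphones 0, 2 and 5 samples late.
source = [math.sin(0.1 * i) for i in range(50)]
delays = [0, 2, 5]
channels = [[0.0] * d + source for d in delays]
aligned = delay_and_sum(channels, delays)
```

With noise that is uncorrelated across microphones added to each channel, the averaging would leave the aligned speech unchanged while reducing the noise power, which is the effect the beamformer relies on.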
11.
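For double-talk situations in the acoustic echo of teleconferencing, this paper proposes an NLMS-type algorithm that needs no double-talk detector yet still protects the cancellation performance of the adaptive filter during double talk. Because changes in the correlation coefficient between the far-end signal and the near-end mixed received signal indicate whether double talk has occurred or the echo path has changed, the improved algorithm inserts this coefficient directly into the iteration formula for the filter weights, thereby controlling how fast the filter coefficients are updated. Simulation results show that, compared with similar algorithms and at a lower computational cost, the algorithm protects the filter well during double talk and also tracks quickly when the echo path changes.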
12.
Delcroix M., Hikichi T., Miyoshi M. 《IEEE transactions on audio, speech, and language processing》2007,15(6):1791-1801
Reverberation in a room severely degrades the characteristics and auditory quality of speech captured by distant microphones, thus posing a severe problem for many speech applications. Several dereverberation techniques have been proposed with a view to solving this problem. There are, however, few reports of dereverberation methods working under noisy conditions. In this paper, we propose an extension of a dereverberation algorithm based on multichannel linear prediction that achieves both the dereverberation and noise reduction of speech in an acoustic environment with a colored noise source. The method consists of two steps. First, the speech residual is estimated from the observed signals by employing multichannel linear prediction. When we use a microphone array, and assume, roughly speaking, that one of the microphones is closer to the speaker than the noise source, the speech residual is unaffected by the room reverberation or the noise. However, the residual is degraded because linear prediction removes an average of the speech characteristics. In a second step, the average of the speech characteristics is estimated and used to recover the speech. Simulations were conducted for a reverberation time of 0.5 s and an input signal-to-noise ratio of 0 dB. With the proposed method, the reverberation was suppressed by more than 20 dB and the noise level reduced to -18 dB.
13.
Potamitis I., Kokkinakis G. 《IEEE transactions on systems, man, and cybernetics. Part A, Systems and humans: a publication of the IEEE Systems, Man, and Cybernetics Society》2007,37(1):72-81
The general problem addressed in this paper is that of separating the voices of active moving speakers in the presence of background noise and a moderate reverberation level in the acoustic field using a single microphone array. We adapt multisensor multitarget tracking theory to the context of microphone arrays in order to form receptive beams that lock on each moving speaker on an extended time basis and, therefore, achieve voice separation. Our approach: 1) incorporates kinematical information of the speakers' movement by using an interacting multiple model (IMM) estimator per speaker in order to constrain the evolution of direction of arrival (DOA) measurements, which characterize various motions of the speakers, and 2) can directly account for measurement origin uncertainty, i.e., which measurement comes from which speaker, by using the probabilistic-data-association technique in conjunction with the IMM estimator. The effectiveness of the approach is illustrated by an extensive simulation study on tracking the DOAs of two speakers with crossing trajectories and three static speakers having a conversation with partially overlapping speech and long pauses.
14.
Tomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi 《IEEE transactions on audio, speech, and language processing》2007,15(1):80-95
The distant acquisition of acoustic signals in an enclosed space often produces reverberant artifacts due to the room impulse response. Speech dereverberation is desirable in situations where the distant acquisition of acoustic signals is involved. These situations include hands-free speech recognition, teleconferencing, and meeting recording, to name a few. This paper proposes a processing method, named Harmonicity-based dEReverBeration (HERB), to reduce the amount of reverberation in the signal picked up by a single microphone. The method makes extensive use of harmonicity, a unique characteristic of speech, in the design of a dereverberation filter. In particular, harmonicity enhancement is proposed and demonstrated as an effective way of estimating a filter that approximates an inverse filter corresponding to the room impulse response. Two specific harmonicity enhancement techniques are presented and compared; one based on an average transfer function and the other on the minimization of a mean squared error function. Prototype HERB systems are implemented by introducing several techniques to improve the accuracy of dereverberation filter estimation, including time warping analysis. Experimental results show that the proposed methods can achieve high-quality speech dereverberation, when the reverberation time is between 0.1 and 1.0 s, in terms of reverberation energy decay curves and automatic speech recognition accuracy.
15.
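Speech processing with microphone arrays can raise the signal-to-noise ratio and address the degradation in speech recognition performance caused by environmental noise, echo, and reverberation. This paper introduces the basic principles of microphone-array speech enhancement based on the delay-and-sum method (conventional beamforming), adaptive beamforming, and post adaptive filtering structures, and summarizes the characteristics of each algorithm.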
16.
《IEEE transactions on audio, speech, and language processing》2009,17(4):534-545
17.
《IEEE transactions on audio, speech, and language processing》2008,16(8):1466-1478
18.
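This paper analyzes the characteristics of echo in IP telephony systems and, given that such echo has long path delay and large delay jitter, proposes a new design concept and related algorithms for an adaptive echo canceller based on delay analysis. It introduces the basic structure of the adaptive echo canceller and its functional modules, such as the variable-order adaptive filter, the speech state detector, the delay analyzer, and the NLMS algorithm controller. It also describes in detail an implementation on the TMS320C6201 DSP hardware platform of an adaptive echo canceller usable in IP telephony gateways, together with how it is connected within the IP telephony system. Finally, performance test results for the adaptive echo canceller are listed and its advantages are analyzed, showing that it can solve the echo cancellation problem in IP telephony systems well.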
19.
20.
Electronic hearing protection devices are increasingly used in noisy environments. These devices feature a miniaturized external microphone and internal loudspeaker in addition to an analog or digital electronic circuit. They can transmit useful audio signals such as speech and warning signals to the protected ear and can reduce the sound pressure level using dynamic range compression. In the case of a digital electronic circuit, the transmission of audio signals may be noticeably delayed because of the latency introduced by the digital signal processor and by the analog-to-digital and digital-to-analog converters. These delayed audio signals will hence interfere with the audio signals perceived naturally through the passive acoustical path of the device. The proposed study presents an original procedure to evaluate, for two representative passive earplugs, the shortest delay at which human listeners start to perceive two sounds composed of the signal transmitted through the electronic circuit and the passively transmitted signal. This shortest delay is called the echo threshold and marks the delay separating the perception of one fused sound from the perception of two separate sounds. In this study, a transient signal, a clean speech signal, a speech signal corrupted by factory noise, and a speech signal corrupted by babble noise are used to determine the echo thresholds of the two earplugs. Twenty untrained listeners participated in this study and were asked to determine the echo thresholds using test software in which attenuated signals are delayed from the original signals in real time. The findings show that when using hearing devices, the echo threshold depends on four parameters: (a) the attenuation function of the device, (b) the duration of the signal, (c) the level of the background noise and (d) the type of background noise. Defined here as the shortest time delay at which at least 20% of the participants noticed an echo, the echo threshold was found to be 8 ms for a bell signal, 16 ms for clean speech and 22 ms for speech corrupted by babble noise when using a shallow earplug fit. When using a deep fit, the echo threshold was found to be 18 ms for a bell signal, 26 ms for clean speech and 68 ms for speech in factory noise. No echo threshold could be clearly determined for the speech signal in babble noise with a deep earplug fit.
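The threshold definition quoted in this abstract (the shortest delay at which at least 20% of participants noticed an echo) can be sketched as a small computation. The function name `echo_threshold`, the data layout, and the response data below are invented for illustration only and are not taken from the study.

```python
def echo_threshold(detections_by_delay, fraction=0.2):
    """Return the shortest delay (ms) at which at least `fraction` of
    participants reported an echo, or None if no delay qualifies.

    detections_by_delay: dict mapping delay in ms to a list of booleans,
    one per participant (True means the participant noticed an echo).
    """
    for delay in sorted(detections_by_delay):
        votes = detections_by_delay[delay]
        if votes and sum(votes) / len(votes) >= fraction:
            return delay
    return None

# Hypothetical responses from 20 listeners at four delays (made-up data).
responses = {
    4:  [False] * 20,
    8:  [True] * 2 + [False] * 18,    # 10% noticed
    12: [True] * 5 + [False] * 15,    # 25% noticed
    16: [True] * 12 + [False] * 8,    # 60% noticed
}
threshold_ms = echo_threshold(responses)
```

For this made-up data the function returns 12, the first delay at which the detection rate reaches the 20% criterion.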