
1.

余力《计算机工程与应用》2011,47(16):147-150

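In acoustic echo cancellation, the presence of near-end speech causes the adaptive filter that models the echo path to diverge, so a mature acoustic echo canceller should include a double-talk detection algorithm. To address this problem, a low-complexity double-talk detection algorithm based on the cross-correlation between the microphone signal and the error signal is proposed. The algorithm is also combined with a filter-coefficient buffering mechanism to further improve system robustness. Experimental results show that the algorithm has good detection performance, responds quickly to the onset and end of double-talk, and significantly improves echo cancellation in double-talk conditions.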
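As a rough illustration of the kind of statistic a cross-correlation detector might use, the sketch below computes a normalized cross-correlation between one frame of the microphone signal and the matching frame of the error signal, then compares it with a threshold. The function names, the normalization by microphone energy, and the threshold of 0.3 are assumptions made for illustration; the abstract does not give the paper's exact formula.

```python
def dtd_statistic(mic, err):
    """Cross-correlation of microphone and error frames, normalized by microphone energy."""
    energy = sum(d * d for d in mic)
    if energy == 0.0:
        return 0.0  # silent frame: nothing to detect
    return sum(d * e for d, e in zip(mic, err)) / energy


def is_double_talk(mic, err, threshold=0.3):
    """Flag double-talk when the statistic exceeds a (hypothetical) threshold."""
    return dtd_statistic(mic, err) > threshold
```

The intuition behind this particular choice: once the filter has converged and only echo is present, the error is close to zero and so is the statistic; when near-end speech appears, it shows up in both the microphone and the error signal, and the statistic rises.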

2.

Integrated acoustic echo and noise suppression in modulation domain

E. P. Jayakumar P. V. Muhammed Shifas P. S. Sathidevi 《International Journal of Speech Technology》2016,19(3):611-621

The quality of speech transmission in mobile communication systems deteriorates due to the presence of background noise and acoustic echo. The background noises are the disturbances from the surroundings and acoustic echo is induced due to the reverberation of loudspeaker signal in the near end environment. In conventional acoustic echo suppression setup, the echo path effect is modelled either in time or in frequency domain, and to cancel the echo, a replica of the echo is created by estimating the echo path response adaptively in the corresponding domain. Recently, modulation domain analysis, which captures human perceptual properties, has been widely used in speech processing. The modulation domain conveys the temporal variation of the acoustic magnitude spectra, which acts as an information-bearing signal. In this work, a novel integrated system for acoustic echo and noise suppression in the modulation domain is developed. So far, no work in this context in the modulation domain has been reported. An efficient method for modelling the echo path and estimating the echo in the modulation domain is introduced and implemented. The effects of echo and noise are suppressed using the modulation spectral manipulation and the performance of the proposed system is found to be better than other conventional integrated systems.

3.

A stereophonic acoustic signal extraction scheme for noisy and reverberant environments

Klaus Reindl Yuanhang Zheng Andreas Schwarz Stefan Meier Roland Maas Armin Sehr Walter Kellermann 《Computer Speech and Language》2013,27(3):726-745

In this contribution, a novel two-channel acoustic front-end for robust automatic speech recognition in adverse acoustic environments with nonstationary interference and reverberation is proposed. From a MISO system perspective, a statistically optimum source signal extraction scheme based on the multichannel Wiener filter (MWF) is discussed for application in noisy and underdetermined scenarios. For free-field and diffuse noise conditions, this optimum scheme reduces to a Delay & Sum beamformer followed by a single-channel Wiener postfilter. Scenarios with multiple simultaneously interfering sources and background noise are usually modeled by a diffuse noise field. However, in reality, the free-field assumption is very weak because of the reverberant nature of acoustic environments. Therefore, we propose to estimate this simplified MWF solution in each frequency bin separately to cope with reverberation. We show that this approach can very efficiently be realized by the combination of a blocking matrix based on semi-blind source separation (‘directional BSS’), which provides a continuously updated reference of all undesired noise and interference components separated from the desired source and its reflections, and a single-channel Wiener postfilter. Moreover, it is shown how the obtained reference signal of all undesired components can efficiently be used to realize the Wiener postfilter and, at the same time, generalizes well-known postfilter realizations. The proposed front-end and its integration into an automatic speech recognition (ASR) system are analyzed and evaluated in noisy living-room-like environments according to the PASCAL CHiME challenge. A comparison to a simplified front-end based on a free-field assumption shows that the introduced system substantially improves the speech quality and the recognition performance under the considered adverse conditions.

4.

An echo cancellation algorithm without double-talk detection

袁佳能于凤芹《计算机工程与应用》2008,44(15):33-35

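To address the high computational cost of the double-talk detector in echo cancellers, and the degradation of echo cancellation performance caused by its detection delay, an acoustic echo cancellation algorithm based on the time-varying power spectrum of the signals, requiring no double-talk detection, is proposed. From the mutual independence of near-end and far-end speech during double-talk, it is derived that the time-varying power of the far-end signal passed through the echo path impulse response equals its time-varying cross-power with the microphone input signal. The echo path impulse response is then estimated by minimum mean-square-error adaptive filtering, and the echo is cancelled. Unlike existing echo cancellation algorithms, this algorithm needs no double-talk detection while cancelling echo. Normalizing the step size makes the adaptive filter coefficients converge faster and improves stability. Simulations show that the algorithm cancels echo effectively under all talk conditions.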
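For readers unfamiliar with step-size normalization in adaptive echo cancellation, here is a minimal sketch of a plain time-domain NLMS echo canceller. It is not the power-spectrum method of the abstract above; it only illustrates the normalized update. The function name, the default `mu` and `eps`, and the two-tap echo path in the usage lines are all assumptions chosen for illustration.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps, mu=0.5, eps=1e-8):
    """Estimate the echo path with NLMS; return final weights and the error signal."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # residual after subtracting the estimated echo
        norm = eps + sum(h * h for h in history)  # input energy normalizes the step
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        errors.append(e)
    return weights, errors


# Usage: identify a hypothetical two-tap echo path from noiseless data.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.5, -0.3]
mic = [
    echo_path[0] * far_end[n] + (echo_path[1] * far_end[n - 1] if n > 0 else 0.0)
    for n in range(len(far_end))
]
weights, errors = nlms_echo_canceller(far_end, mic, num_taps=2)
```

In this noiseless single-talk setting the weights approach the echo path. With near-end speech in `mic`, a plain NLMS update like this one can diverge, which is the problem the abstract addresses.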

5.

A speech enhancement algorithm based on beamforming and multi-reference noise cancellation

魏序赵平谭晶晶《计算机与现代化》2011,(12):45-47,52

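Speech is captured by a microphone array using beamforming, while reference microphones capture the background noise. On this basis, the paper proposes a speech enhancement algorithm that combines beamforming with adaptive multi-reference noise cancellation. The algorithm does not depend on any signal model and needs no prior assumptions about the statistics of the noise; it adapts to sudden changes in background noise and offers good real-time performance and robustness. It can be widely applied to target speech recognition in complex noise environments, and simulation results demonstrate its effectiveness.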

6.

On the perceptual performance limitations of echo cancellers in wideband telephony

Gordy J.D. Goubran R.A. 《IEEE transactions on audio, speech, and language processing》2006,14(1):33-42

In this paper, standard echo canceller performance measures are evaluated in terms of psychoacoustic aspects of human hearing. The focus is on wideband speech communications systems with long round-trip delays of 200 ms and up present in the transmission path. The results of a simple acoustic echo cancellation experiment are analyzed with a standard psychoacoustic model, revealing that steady-state echo return loss enhancement and mean square error cannot be used to determine whether residual echo is perceivable in the presence of background noise. In addition, a simple modification to the normalized least mean square (NLMS) algorithm is introduced by adding a perceptual preemphasis filter. Simulation results and listening tests show that it is possible to improve the perceived performance of an echo canceller during convergence by placing greater emphasis on frequencies at which the human auditory system is most sensitive.

7.

A fast echo cancellation algorithm for smart speakers

张伟王冬霞于玲《计算机应用》2020,40(4):1191-1195

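Smart speakers mostly use microphone arrays for sound pickup, and single-channel adaptive filtering for acoustic echo cancellation suffers from distortion and complexity; a fast echo cancellation algorithm for microphone arrays is therefore proposed. The algorithm first estimates the echo in the first channel by adaptive filtering, then estimates the relative echo transfer functions between array channels, and multiplies the two to obtain the echo in the other channels. Next, the estimated echo and noise are used as the noise reference signal for the lower branch of a generalized sidelobe canceller (GSC) beamformer, and the GSC beamforming algorithm removes the echo and noise. Simulation results show that, under moderate reverberation, at long distance, at a low echo-to-noise ratio, and with music as the echo, the algorithm achieves good echo cancellation and noise suppression; it has a low computational load and yields a target speech signal with a high source-to-distortion ratio and high intelligibility.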
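The step that obtains the echo in the other channels can be pictured as a per-frequency-bin product. The sketch below assumes the first-channel echo estimate and the relative echo transfer function are both already available as lists of complex spectral values for one frame; the function and parameter names are hypothetical, and how the paper estimates either quantity is not shown.

```python
def echo_for_other_channel(first_channel_echo, relative_tf):
    """Multiply the first-channel echo spectrum by a relative echo transfer function, bin by bin."""
    if len(first_channel_echo) != len(relative_tf):
        raise ValueError("spectra must have the same number of frequency bins")
    return [e * h for e, h in zip(first_channel_echo, relative_tf)]
```

The appeal of this arrangement is that only one channel needs a full adaptive filter; the remaining channels reuse its output through a much cheaper multiplication.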

8.

A multichannel post-filtering speech enhancement algorithm based on TF-GSC

马子骥倪忠余旭《传感器与微系统》2018,(5):105-107,111

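Conventional speech enhancement algorithms suffer a sharp drop in noise suppression under non-stationary noise, especially when the noise is itself speech. To address this, an improved multichannel post-filtering speech enhancement algorithm is proposed, based on the transfer-function generalized sidelobe canceller (TF-GSC) and the optimally modified log-spectral amplitude (OM-LSA) estimator. During post-filtering, the algorithm uses the relationship between the TF-GSC output signal and the reference noise to compute the speech presence probability and to update the noise power spectrum estimate. Experimental results show that the algorithm effectively suppresses non-stationary noise and improves the robustness of speech enhancement when the noise is speech.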

9.

Generalized Postfilter for Speech Quality Enhancement

Grancharov V. Plasberg J.H. Samuelsson J. Bastiaan Kleijn W. 《IEEE transactions on audio, speech, and language processing》2008,16(1):57-64

Postfilters are commonly used in speech coding for the attenuation of quantization noise. In the presence of acoustic background noise or distortion due to tandeming operations, the postfilter parameters are not adjusted and the performance is, therefore, not optimal. We propose a modification that consists of replacing the nonadaptive postfilter parameters with parameters that adapt to variations in spectral flatness, obtained from the noisy speech. This generalization of the postfiltering concept can handle a larger range of noise conditions, but has the same computational complexity and memory requirements as the conventional postfilter. Test results indicate that the presented algorithm improves on the standard postfilter, as well as on the combination of a noise attenuation preprocessor and the conventional postfilter.

10.

Car Speech Enhancement Using a Microphone Array

Jen-Tzung Chien Po-Yin Lai 《International Journal of Speech Technology》2005,8(1):79-91

This paper proposes a speech enhancement approach to suppress the interference of car noise. A linear microphone array is adopted for far-talking speech acquisition and delay-and-sum beamforming noise reduction. We present an effective time delay estimator using the coherence function between the reference microphone and the beamformed speech. To further enhance the beamformed speech, we exploit an improved Wiener filter where the resulting noise correlation in the microphone array is relatively small so that the performance of optimal Wiener filtering could be achieved. Also, due to the serious degradation in low frequency car speech, we develop a spectral weighting function to compensate the low frequency filtering. These two processing units serve as the post filters to attain the desirable enhancement performance. In the experiments on microphone array speech in the presence of real and simulated car noises, we find that the proposed algorithm performs well. Performance is measured in terms of the signal-to-noise ratio and the word error rate. The combined delay-and-sum beamformer and two post filters obtain the best results compared to other methods.
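The delay-and-sum stage mentioned in this abstract can be sketched in a few lines. The version below assumes the per-channel delays are already known and are whole numbers of samples; the paper instead estimates them with a coherence-based estimator, which is not reproduced here. The function name and the argument layout are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Advance each channel by its integer sample delay, then average across channels."""
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    # Only keep samples for which every delayed channel still has data.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(usable)
    ]
```

Aligned speech adds coherently while uncorrelated noise partly averages out, which is why the beamformer output is a reasonable input for the post filters the abstract describes.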

11.

A new adaptive echo cancellation algorithm with double-talk protection

王杰谢胜利《控制理论与应用》2005,22(5):753-757

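For the double-talk situation that arises with acoustic echo in teleconferencing, this paper proposes an NLMS-type algorithm that needs no double-talk detector yet still protects the cancellation performance of the adaptive filter during double-talk. Changes in the correlation coefficient between the far-end signal and the near-end mixed received signal indicate either the onset of double-talk or a change in the echo path, so the improved algorithm substitutes this coefficient directly into the iteration formula for the filter weights, thereby controlling how quickly the coefficients are updated. Simulation results show that, compared with similar algorithms and at a lower computational cost, the algorithm provides good protection during double-talk and tracks quickly when the echo path changes.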

12.

Dereverberation and Denoising Using Multichannel Linear Prediction

Delcroix M. Hikichi T. Miyoshi M. 《IEEE transactions on audio, speech, and language processing》2007,15(6):1791-1801

Reverberation in a room severely degrades the characteristics and auditory quality of speech captured by distant microphones, thus posing a severe problem for many speech applications. Several dereverberation techniques have been proposed with a view to solving this problem. There are, however, few reports of dereverberation methods working under noisy conditions. In this paper, we propose an extension of a dereverberation algorithm based on multichannel linear prediction that achieves both the dereverberation and noise reduction of speech in an acoustic environment with a colored noise source. The method consists of two steps. First, the speech residual is estimated from the observed signals by employing multichannel linear prediction. When we use a microphone array, and assume, roughly speaking, that one of the microphones is closer to the speaker than the noise source, the speech residual is unaffected by the room reverberation or the noise. However, the residual is degraded because linear prediction removes an average of the speech characteristics. In a second step, the average of the speech characteristics is estimated and used to recover the speech. Simulations were conducted for a reverberation time of 0.5 s and an input signal-to-noise ratio of 0 dB. With the proposed method, the reverberation was suppressed by more than 20 dB and the noise level reduced to -18 dB.

13.

Speech Separation of Multiple Moving Speakers Using Multisensor Multitarget Techniques

Potamitis I. Kokkinakis G. 《IEEE transactions on systems, man, and cybernetics. Part A, Systems and humans : a publication of the IEEE Systems, Man, and Cybernetics Society》2007,37(1):72-81

The general problem addressed in this paper is that of separating the voices of active moving speakers in the presence of background noise and moderate reverberation level in the acoustic field using a single microphone array. We adapt the multisensor multitarget tracking theory to the context of microphone arrays in order to form receptive beams that lock on each moving speaker on an extended time basis and therefore, achieve voice separation. Our approach: 1) incorporates kinematical information of speakers' movement by using an interacting multiple model (IMM) estimator per speaker in order to constrain the evolution of direction of arrival (DOA) measurements, which characterize various motions of the speakers, and 2) can directly account for measurement origin uncertainty, i.e., which measurement comes from which speaker, by using the probabilistic-data-association technique in conjunction with the IMM estimator. The effectiveness of the approach is illustrated by an extensive simulation study on tracking the DOAs of two speakers with crossing trajectories and three static speakers having a conversation with partially overlapping speech and long pauses.

14.

Harmonicity-Based Blind Dereverberation for Single-Channel Speech Signals

Tomohiro Nakatani Keisuke Kinoshita Masato Miyoshi 《IEEE transactions on audio, speech, and language processing》2007,15(1):80-95

The distant acquisition of acoustic signals in an enclosed space often produces reverberant artifacts due to the room impulse response. Speech dereverberation is desirable in situations where the distant acquisition of acoustic signals is involved. These situations include hands-free speech recognition, teleconferencing, and meeting recording, to name a few. This paper proposes a processing method, named Harmonicity-based dEReverBeration (HERB), to reduce the amount of reverberation in the signal picked up by a single microphone. The method makes extensive use of harmonicity, a unique characteristic of speech, in the design of a dereverberation filter. In particular, harmonicity enhancement is proposed and demonstrated as an effective way of estimating a filter that approximates an inverse filter corresponding to the room impulse response. Two specific harmonicity enhancement techniques are presented and compared; one based on an average transfer function and the other on the minimization of a mean squared error function. Prototype HERB systems are implemented by introducing several techniques to improve the accuracy of dereverberation filter estimation, including time warping analysis. Experimental results show that the proposed methods can achieve high-quality speech dereverberation, when the reverberation time is between 0.1 and 1.0 s, in terms of reverberation energy decay curves and automatic speech recognition accuracy.

15.

Microphone-array-based speech enhancement techniques and applications

杜军桑胜举《计算机应用与软件》2009,26(10):75-77

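Speech processing with microphone arrays can improve the signal-to-noise ratio and mitigate the degradation of speech recognition performance caused by ambient noise, echo, and reverberation. The paper introduces the basic principles of microphone-array speech enhancement based on the delay-and-sum method (conventional beamforming), adaptive beamforming, and adaptive post-filtering structures, and summarizes the characteristics of each algorithm.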

16.

Suppression of Late Reverberation Effect on Speech Signal Using Long-Term Multiple-step Linear Prediction

《IEEE transactions on audio, speech, and language processing》2009,17(4):534-545

A speech signal captured by a distant microphone is generally smeared by reverberation, which severely degrades automatic speech recognition (ASR) performance. One way to solve this problem is to dereverberate the observed signal prior to ASR. In this paper, a room impulse response is assumed to consist of three parts: a direct-path response, early reflections and late reverberations. Since late reverberations are known to be a major cause of ASR performance degradation, this paper focuses on dealing with the effect of late reverberations. The proposed method first estimates the late reverberations using long-term multi-step linear prediction, and then reduces the late reverberation effect by employing spectral subtraction. The algorithm provided good dereverberation with training data corresponding to the duration of one speech utterance, in our case, less than 6 s. This paper describes the proposed framework for both single-channel and multichannel scenarios. Experimental results showed substantial improvements in ASR performance with real recordings under severe reverberant conditions.
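The second stage named in this abstract, spectral subtraction, can be sketched for a single frame as follows. The sketch assumes the late-reverberation power in each frequency bin has already been estimated (in the paper, by long-term multi-step linear prediction, which is not shown). The function name, the use of power rather than amplitude spectra, and the spectral floor of 0.1 are assumptions for illustration only.

```python
def subtract_late_reverb(observed_power, late_reverb_power, floor=0.1):
    """Subtract estimated late-reverberation power per bin, keeping a spectral floor."""
    return [
        max(p - r, floor * p)  # the floor avoids negative or vanishing power
        for p, r in zip(observed_power, late_reverb_power)
    ]
```

The floor is a common safeguard in spectral subtraction: when the reverberation estimate exceeds the observed power, the bin is attenuated rather than set to a negative value.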

17.

A Variable Step-Size Affine Projection Algorithm Designed for Acoustic Echo Cancellation

《IEEE transactions on audio, speech, and language processing》2008,16(8):1466-1478

The adaptive algorithms used for acoustic echo cancellation (AEC) have to provide 1) high convergence rates and good tracking capabilities, since the acoustic environments imply very long and time-variant echo paths, and 2) low misadjustment and robustness against background noise variations and double-talk. In this context, the affine projection algorithm (APA) and different versions of it are very attractive choices for AEC. However, an APA with a constant step-size parameter has to compromise between the performance criteria 1) and 2). Therefore, a variable step-size APA (VSS-APA) represents a more reliable solution. In this paper, we propose a VSS-APA derived in the context of AEC. Most of the APAs aim to cancel $p$ (i.e., projection order) previous a posteriori errors at every step of the algorithm. The proposed VSS-APA aims to recover the near-end signal within the error signal of the adaptive filter. Consequently, it is robust against near-end signal variations (including double-talk). This algorithm does not require any a priori information about the acoustic environment, so that it is easy to control in practice. The simulation results indicate the good performance of the proposed algorithm as compared to other members of the APA family.

18.

Echo cancellation in IP telephony systems

黄永峰周可张江陵《数据采集与处理》2000,15(4):467-470

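The characteristics of echo in IP telephony systems are analyzed. Because this echo exhibits path delay and large delay jitter, a new design concept and associated algorithms are proposed for an adaptive echo canceller based on delay analysis. The basic structure and functional modules of the adaptive echo canceller are introduced, for example the variable-order adaptive filter, the speech state detector, the delay analyzer, and the NLMS algorithm controller. The paper also describes in detail how an adaptive echo canceller for IP telephony gateways is implemented on the TMS320C6201 DSP hardware platform and how it is connected within the IP telephony system. Finally, performance test results are presented and the advantages of the adaptive echo canceller are analyzed, showing that it handles the echo cancellation problem in IP telephony systems well.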

19.

A double-talk detection algorithm based on fuzzy logic inference

王树恩宋彦汪萌戴礼荣孙蓉《数据采集与处理》2007,22(3):267-272

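To improve double-talk detection performance and reduce speech clipping, this paper proposes a double-talk detection algorithm based on fuzzy logic inference over two statistics, and adds a comfort noise generation component to the echo cancellation system. Conventional double-talk detection algorithms are analyzed, and it is shown that, because they use fixed thresholds, they cannot handle the inherent fuzziness of the detection statistics, which leads to a high miss probability at a given false-alarm probability. Algorithm analysis and experimental results show that the proposed algorithm greatly improves double-talk detection performance and effectively resolves the clipping problem that arises once nonlinear processing is added.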

20.

Echo threshold between passive and electro-acoustic transmission paths in digital hearing protection devices

《International Journal of Industrial Ergonomics》2016

Electronic hearing protection devices are increasingly used in noisy environments. These devices feature a miniaturized external microphone and internal loudspeaker in addition to an analog or digital electronic circuit. They can transmit useful audio signals such as speech and warning signals to the protected ear and can reduce the sound pressure level using dynamic range compression. In the case of a digital electronic circuit, the transmission of audio signals may be noticeably delayed because of the latency introduced by the digital signal processor and by the analog-to-digital and digital-to-analog converters. These delayed audio signals will hence interfere with the audio signals perceived naturally through the passive acoustical path of the device. The proposed study presents an original procedure to evaluate, for two representative passive earplugs, the shortest delay at which human listeners start to perceive two sounds composed of the signal transmitted through the electronic circuit and the passively transmitted signal. This shortest delay is called the echo threshold and marks the boundary between perceiving one fused sound and perceiving two separate sounds. In this study, a transient signal, a clean speech signal, a speech signal corrupted by factory noise, and a speech signal corrupted by babble noise are used to determine the echo thresholds of the two earplugs. Twenty untrained listeners participated in this study, and were asked to determine the echo thresholds using test software in which attenuated signals are delayed from the original signals in real-time. The findings show that when using hearing devices, the echo threshold depends on four parameters: (a) the attenuation function of the device, (b) the duration of the signal, (c) the level of the background noise and (d) the type of background noise.
Defined here as the shortest time delay at which at least 20% of the participants noticed an echo, the echo threshold was found to be 8 ms for a bell signal, 16 ms for clean speech and 22 ms for speech corrupted by babble noise when using a shallow earplug fit. When using a deep fit, the echo threshold was found to be 18 ms for a bell signal, 26 ms for clean speech and 68 ms for speech in factory noise. No echo threshold could be clearly determined for the speech signal in babble noise with a deep earplug fit.
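The 20% criterion used in this definition is easy to state as a small computation. The sketch below takes, for each tested delay, the number of listeners who reported an echo and returns the shortest delay at which the fraction reaches the criterion. The function name and data layout are hypothetical, and any counts passed to it in examples are invented; the study's raw responses are not given in the abstract.

```python
def echo_threshold(noticed_by_delay_ms, num_participants, fraction=0.2):
    """Return the shortest delay (ms) at which at least `fraction` of listeners noticed an echo."""
    for delay in sorted(noticed_by_delay_ms):
        if noticed_by_delay_ms[delay] / num_participants >= fraction:
            return delay
    return None  # no tested delay reached the criterion
```

Returning `None` mirrors the case reported above where no threshold could be clearly determined for a given signal and earplug fit.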