Similar literature
 20 similar documents found (search time: 31 ms)
1.
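Wang Jing, Kuang Jingming, Zhao Shenghui. 《信号处理》 (Signal Processing), 2007, 23(5): 755-758
This paper introduces adaptive postfiltering into a 3 kbps characteristic waveform interpolation speech coding algorithm, cascading four modules at the decoder: a short-term postfilter, spectral tilt compensation, a long-term postfilter, and automatic gain control. The filter coefficients are set through theoretical analysis and subjective listening tests so that they adapt to the characteristics of each speech frame. After postfiltering, the spectrum of the output speech is strengthened at the formants and harmonics while noise in the spectral valleys is attenuated; the signal energy stays essentially unchanged before and after filtering, and no spectral tilt is introduced. Experimental results show that, after adaptive postfiltering, the synthesized speech of the 3 kbps waveform interpolation coder has markedly less quantization noise and improved speech quality.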

2.
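In Wiener-system algorithms based on nonlinear blind source separation, a fixed step size creates a trade-off between convergence speed and steady-state error that directly affects separation performance. To resolve this, a variable step-size blind source separation method for Wiener systems based on a nonlinear function is proposed. The method introduces the updated step size into the separation algorithm as a nonlinear function, so that the step size of the parameter update is as small as possible in steady state to avoid oscillation. The step size is adjusted automatically at every update during separation, which increased convergence speed by 53% and reduced the error by 45%. Simulations show that, compared with the original algorithm, the proposed method separates the source signals better, with smaller error and faster convergence.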
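The idea of tying the step size to a nonlinear function of the error can be illustrated on a much simpler problem than the one in the abstract. The sketch below is not the paper's Wiener-system separation algorithm: it is a plain LMS filter identifying a short FIR system, with an assumed step-size map `mu = beta * (1 - exp(-alpha * e^2))`. The function names and the parameters `alpha` and `beta` are illustrative choices, not taken from the source.

```python
import math
import random


def step_size(error, alpha=50.0, beta=0.5):
    """Nonlinear step-size map: close to beta for large errors, near zero for small ones."""
    return beta * (1.0 - math.exp(-alpha * error * error))


def vss_lms(x, d, num_taps, alpha=50.0, beta=0.5):
    """Estimate FIR weights from input x and desired output d with a variable step size."""
    w = [0.0] * num_taps
    for n in range(num_taps - 1, len(x)):
        window = [x[n - k] for k in range(num_taps)]    # x[n], x[n-1], ...
        y = sum(wk * xk for wk, xk in zip(w, window))   # current filter output
        e = d[n] - y                                    # instantaneous error
        mu = step_size(e, alpha, beta)                  # step shrinks as the error shrinks
        w = [wk + mu * e * xk for wk, xk in zip(w, window)]
    return w


if __name__ == "__main__":
    random.seed(0)
    true_w = [0.5, -0.3, 0.2]                           # hypothetical system to identify
    x = [random.uniform(-1, 1) for _ in range(5000)]
    d = [sum(true_w[k] * x[n - k] for k in range(3) if n - k >= 0)
         for n in range(len(x))]
    print(vss_lms(x, d, 3))
```

With this map the update is large while the error is large and vanishes as the filter converges, which is the behaviour the abstract describes qualitatively; the 53% and 45% figures belong to the paper's own experiments and are not reproduced by this toy.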

3.
Most existing algorithms for the underdetermined blind source separation (UBSS) problem are two-stage algorithms, i.e., mixing parameter estimation followed by source estimation. In mixing parameter estimation, the previously proposed traditional clustering algorithms are sensitive to the initialization of the mixing parameters. To reduce this sensitivity, we propose a new algorithm for the UBSS problem based on anechoic speech mixtures by employing visual information, i.e., the interaural time difference (ITD) and the interaural level difference (ILD), as the initialization of the mixing parameters. In our algorithm, the video signals are used to estimate the distances between microphones and sources, from which estimates of the ITD and ILD are obtained. Under the sparsity assumption in the time-frequency domain, the Gaussian potential function algorithm is used to estimate the mixing parameters, with the ITDs and ILDs as their initialization. Time-frequency masking is then used to recover the sources by evaluating the various ITDs and ILDs. Experimental results demonstrate the competitive performance of the proposed algorithm compared with the baseline algorithms.

4.
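Research on a speech recognition system based on a private branch exchange   Total citations: 3, self-citations: 0, citations by others: 3
This paper develops a voice-controlled spoken-command switching system for a private branch exchange (PBX). The system performs speaker-independent automatic recognition of continuous spoken commands with a small-to-medium vocabulary. The study compiled statistics on command sentences and generated the corresponding recognition grammar network. The recognizer is trained with reinforced training on composite models built from subword models, and recognition uses an improved token-passing Viterbi algorithm to raise recognition performance. The paper compares the effect of different speech feature parameters, and of the number of hidden Markov model states, on telephone-speech recognition accuracy. The study also developed a rejection subsystem for the recognizer; in the no-rejection case …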

5.
In this paper, a new fast method for solving the permutation problem in convolutive BSS is presented. Typically, by transferring signals to the frequency domain, the convolutive BSS problem is converted to an instantaneous BSS problem, and deconvolution takes place in each frequency bin. However, another major problem arises, namely permutation ambiguity in the frequency domain. Solving the permutation ambiguity for N sources in the frequency domain needs N! comparisons between adjacent frequency bins, which drastically increases the overall computational complexity of convolutive BSS. In our new approach, the complex-valued signals are decomposed into real and imaginary parts in each frequency bin. We show that the ideal mixing matrix has to possess a simple and symmetric structure, which can be exploited to solve the permutation ambiguity in the frequency domain. Although separation in each subband is accomplished by the FastICA algorithm, the proposed method requires modification of the separation algorithm, and a new structure is imposed on the mixing matrix. After the signals are separated by FastICA, permutation correction takes only N comparisons, decreasing the computational complexity. Compared with five competitive methods, we experimentally demonstrate that permutation ambiguity is resolved accurately by this very fast approach while substantially decreasing the order of calculations. In terms of separation performance and signal quality, the proposed method is superior to four of the compared methods and almost similar to the best of them.
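To make the N! cost concrete, the sketch below shows the exhaustive baseline that the abstract argues against, not the proposed fast method. It assumes one common alignment criterion, correlation of amplitude envelopes between adjacent frequency bins; the abstract does not say which criterion the compared methods use, and all function names here are hypothetical.

```python
from itertools import permutations


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def best_permutation(reference, candidate):
    """Try all N! orderings of `candidate` and keep the one best matching `reference`."""
    n = len(reference)
    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        score = sum(correlation(reference[i], candidate[perm[i]]) for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm


def align_bins(envelopes):
    """envelopes[f][i] is the amplitude envelope over time of output i in bin f."""
    aligned = [envelopes[0]]
    for f in range(1, len(envelopes)):
        perm = best_permutation(aligned[-1], envelopes[f])
        aligned.append([envelopes[f][p] for p in perm])
    return aligned
```

Because `best_permutation` enumerates every ordering, its cost per bin grows factorially with the number of sources, which is the bottleneck the abstract describes.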

6.
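This paper describes in detail a method for separating mixed speech signals with two microphones. In weakly reverberant environments the method separates multiple speech signals well while preserving intelligibility. Distortion after separation and countermeasures for it are discussed, and its effect on an automatic recognition system is analyzed experimentally.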

7.
This paper describes an algorithm to suppress composite noise in a two‐microphone speech enhancement system for robust hands‐free speech communication. The proposed algorithm has four stages. The first stage estimates the power spectral density of the residual stationary noise, which is based on the detection of nonstationary signal‐dominant time‐frequency bins (TFBs) at the generalized sidelobe canceller output. Second, speech‐dominant TFBs are identified among the previously detected nonstationary signal‐dominant TFBs, and power spectral densities of speech and residual nonstationary noise are estimated. In the final stage, the bin‐wise output signal‐to‐noise ratio is obtained with these power estimates and a Wiener post‐filter is constructed to attenuate the residual noise. Compared to the conventional beamforming and post‐filter algorithms, the proposed speech enhancement algorithm shows significant performance improvement in terms of perceptual evaluation of speech quality.
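The final stage described above, turning a bin-wise SNR into a Wiener post-filter, can be sketched with the textbook Wiener gain `G = SNR / (1 + SNR)`. This is a minimal illustration that assumes the speech and noise power spectral densities have already been estimated per bin; the `gain_floor` parameter and the function names are assumptions for the example, not details from the paper.

```python
def wiener_gain(speech_psd, noise_psd, gain_floor=0.05):
    """Per-bin Wiener gain from estimated speech and residual-noise PSDs."""
    gains = []
    for ps, pn in zip(speech_psd, noise_psd):
        if pn <= 0.0:
            gains.append(1.0)                 # no noise estimated: pass the bin through
            continue
        snr = max(ps, 0.0) / pn               # bin-wise SNR estimate
        gains.append(max(snr / (1.0 + snr), gain_floor))
    return gains


def apply_gain(spectrum, gains):
    """Scale each (possibly complex) spectral bin by its gain."""
    return [g * s for g, s in zip(gains, spectrum)]
```

The floor keeps heavily attenuated bins from being driven to exactly zero, a common way to limit musical-noise artifacts.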

8.
Speech enhancement algorithms play an important role in speech signal processing, and many have been studied over the past several decades. A speech enhancement algorithm uses a noise removal method and a statistical model filter to analyze the speech signal in the frequency domain. Spectral subtraction and Wiener filters are representative algorithms. These algorithms enhance speech well, but their performance deteriorates under specific noise types or in low signal-to-noise ratio (SNR) environments. In addition, when the noise is estimated erroneously, noise remains in the voice signal so that the spectrum of the voice signal is distorted, or frames containing voice cannot be retrieved, and voice recognition performance deteriorates. This deterioration arises from the mismatch between the recognition conditions and the training model. We use a silence-feature normalization model to improve the recognition rate under this mismatch in noisy environments. Conventional silence-feature normalization has the problem that the energy of the silent part increases, which blurs the boundaries of the voiced segments and degrades recognition. In this study, we use the cepstrum feature of the noise signals in the silence-feature normalization model, setting a reference value for voiced and unvoiced classification, to improve the performance of silence-feature normalization for signals with a low SNR. The recognition rates improve compared with other methods.
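Spectral subtraction, named above as a representative algorithm, can be sketched in a few lines. The version below is the generic power-domain form with a spectral floor, assuming per-bin power estimates are already available; `over_subtraction` and `floor` are illustrative parameters, and nothing here reproduces the silence-feature normalization the abstract proposes.

```python
def spectral_subtraction(noisy_power, noise_power, over_subtraction=1.0, floor=0.01):
    """Subtract an estimated noise power spectrum from a noisy power spectrum."""
    cleaned = []
    for py, pn in zip(noisy_power, noise_power):
        estimate = py - over_subtraction * pn
        # An overestimated noise level would give a negative power;
        # clamp to a small fraction of the noisy power instead.
        cleaned.append(max(estimate, floor * py))
    return cleaned
```

The clamp shows why an erroneous noise estimate distorts the speech spectrum: bins where the noise is overestimated lose their speech energy entirely and are replaced by the floor.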

9.
Future wireless multimedia terminals will have a variety of applications that require speech recognition capabilities. We consider a robust distributed speech recognition system where representative parameters of the speech signal are extracted at the wireless terminal and transmitted to a centralized automatic speech recognition (ASR) server. We propose two unequal error protection schemes for the ASR bit stream and demonstrate the satisfactory performance of these schemes for typical wireless cellular channels. In addition, a "soft-feature" error concealment strategy is introduced at the ASR server that uses "soft-outputs" from the channel decoder to compute the marginal distribution of only the reliable features during likelihood computation at the speech recognizer. This soft-feature error concealment technique reduces the ASR error rate by more than a factor of 2.5 for certain channels. Also considered is a channel decoding technique with source information that improves ASR performance.

10.
We propose a two-step algorithm for the blind separation of convolutive mixtures. We discuss its application to automatic speech recognition in a noisy environment where the acoustic signals have been recorded by a microphone in a room whose furniture and walls produce echoes. The method yields good results.

11.
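Addressing the key techniques of source-number estimation, mixing-matrix estimation, and source recovery in the underdetermined case, an underdetermined blind source separation algorithm for an unknown number of sources is proposed. First, the S-transform is combined with clustering to estimate the number of sources and the mixing matrix. The source signals are then expressed in null-space form and recovered by maximum-likelihood estimation with respect to their posterior probability. Simulation results show that the method can simultaneously separate sources with super-Gaussian and sub-Gaussian distributions, and that it offers better estimation performance than other conventional methods.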

12.
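Underdetermined blind separation of mixed speech signals based on peak detection of the Borel measure   Total citations: 1, self-citations: 0, citations by others: 1
The characteristic function of stable distributions and its Borel measure representation are briefly introduced, and a method for estimating the Borel measure is given. The peaks of the Borel measure are used to determine the basis vectors of the mixing matrix, from which the independent components can be determined and blind separation achieved. Computer simulation and analysis show that this is an independent component analysis and blind source separation method that is robust under both Gaussian noise and fractional lower-order alpha-stable noise, and that it also performs well in separating blindly mixed speech signals.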

13.
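Yang Feiran, Wu Ming, Yang Jun. 《电声技术》 (Audio Engineering), 2014, 38(10): 50-52
A new frequency-domain algorithm based on Wiener filtering is proposed for stereophonic acoustic echo cancellation. It requires no preprocessing of the stereo signals, which preserves near-end speech quality as far as possible. It is also robust and converges and tracks quickly, which gives it practical value. A soft-decision method from speech enhancement is introduced to further improve performance. The new algorithm achieves better echo suppression while preserving near-end speech quality, and simulations confirm its good performance.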

14.
Addressing the estimation of the number of sources, the estimation of the mixing matrix, and the separation of mixed signals in the underdetermined case, this article puts forward a method of underdetermined blind source separation (UBSS) with an application to ultra-wideband (UWB) communication signals. The method is based on the sparsity of UWB communication signals in the time domain. First, the single-source areas are found by calculating the ratio of observed sampling points. Then an algorithm called the Hough-windowed method is introduced to estimate the number of sources and the mixing matrix. Finally, the mixed signals are separated using a method based on amended subspace projection. The simulation results indicate that the proposed method separates UWB communication signals successfully, estimating the mixing matrix with higher accuracy and separating the mixed signals with higher gain than other conventional algorithms. The method also shows higher stability and better noise immunity.

15.
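Research on speech separation and speaker tracking with intelligent microphone arrays   Total citations: 1, self-citations: 1, citations by others: 0
Du Jiang, Zhu Ke. 《电子学报》 (Acta Electronica Sinica), 2005, 33(2): 382-384
This paper presents a new microphone-array technique for speech separation and speaker tracking. The array forms a beam pointed at the speaker of interest to enhance the signal, steers nulls to suppress other speakers and noise, and uses an adaptive algorithm to track changes in the speaker's direction. Simulations verify the technique. Unlike conventional adaptive algorithms, it needs no training sequence, which is a significant advantage.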

16.
Underdetermined blind source separation based on sparse representation   Total citations: 14, self-citations: 0, citations by others: 14
This paper discusses underdetermined (i.e., with more sources than sensors) blind source separation (BSS) using a two-stage sparse representation approach. The first challenging task of this approach is to estimate precisely the unknown mixing matrix. In this paper, an algorithm for estimating the mixing matrix that can be viewed as an extension of the DUET and the TIFROM methods is first developed. Standard clustering algorithms (e.g., K-means method) also can be used for estimating the mixing matrix if the sources are sufficiently sparse. Compared with the DUET, the TIFROM methods, and standard clustering algorithms, with the authors' proposed method, a broader class of problems can be solved, because the required key condition on sparsity of the sources can be considerably relaxed. The second task of the two-stage approach is to estimate the source matrix using a standard linear programming algorithm. Another main contribution of the work described in this paper is the development of a recoverability analysis. After extending the results in , a necessary and sufficient condition for recoverability of a source vector is obtained. Based on this condition and various types of source sparsity, several probability inequalities and probability estimates for the recoverability issue are established. Finally, simulation results that illustrate the effectiveness of the theoretical results are presented.

17.
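An automatic speech recognition device is implemented on the TMS320C5416. The device builds its recognition feature vector set from parameters including a new r-th order cepstral linear regression coefficient of the speech signal, and uses fuzzy vector quantization to perform speaker-dependent speech recognition. Experimental results show that the system offers high recognition accuracy and fast recognition, making it an effective hardware implementation of an automatic speech recognition device.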

18.
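Improved speech enhancement algorithm for a beamformer with post-filtering   Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes an improved speech enhancement algorithm for a beamformer with a post-filter. The algorithm mainly addresses estimation of the ideal signal power spectrum for the Wiener filter: it combines auto-power-spectrum subtraction and cross-power-spectrum subtraction to compute as many power spectrum estimates as possible, so that their average is closer to the true value, and it corrects the change in the cross-power spectrum caused by source movement. In experiments the signal-to-noise ratio improved by more than 5 dB, and small-vocabulary phrase recognition based on hidden Markov models (HMMs) in a car environment reached 84%. Signal-to-noise ratio, average spectral distance, and recognition rate all indicate that the algorithm effectively removes the low-frequency noise that the original algorithm tends to leave behind and reduces speech distortion.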

19.
In speech enhancement, soft decision, in which the speech absence probability (SAP) is introduced to modify the spectral gain or update the noise power, is known to be efficient. In many previous works, a fixed a priori probability of speech absence (q) is assumed in estimating the SAP, which is not realistic since speech is quasi-stationary and may not be present in each frequency bin. To address this problem, Malah et al. devised a novel method to obtain distinct values of q for each frequency bin in many frames by comparing the a posteriori SNR to a threshold value [9]. In this regard, a novel algorithm is obtained by taking advantage of a minima-controlled recursive averaging (MCRA) technique that allows robust tracking of speech absence over time. This leads to improved tracking of speech absence in speech enhancement and better results in objective and subjective evaluation tests.

20.
Recent advances in automatic speech recognition are accomplished by designing a plug-in maximum a posteriori decision rule such that the forms of the acoustic and language model distributions are specified and the parameters of the assumed distributions are estimated from a collection of speech and language training corpora. Maximum-likelihood point estimation is by far the most prevalent training method. However, due to the problems of unknown speech distributions, sparse training data, high spectral and temporal variability in speech, and possible mismatch between training and testing conditions, a dynamic training strategy is needed. To cope with changing speakers and speaking conditions in real operational conditions for high-performance speech recognition, such paradigms incorporate a small amount of speaker- and environment-specific adaptation data into the training process. Bayesian adaptive learning is an optimal way to combine prior knowledge in an existing collection of general models with a new set of condition-specific adaptation data. In this paper, the mathematical framework for Bayesian adaptation of acoustic and language model parameters is first described. Maximum a posteriori point estimation is then developed for hidden Markov models and a number of useful parameter densities commonly used in automatic speech recognition and natural language processing.
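The way MAP estimation blends prior knowledge with a small amount of adaptation data can be shown for the simplest case, the mean of a one-dimensional Gaussian with known variance and a conjugate Gaussian prior. The sketch below is a generic illustration under that assumption rather than the paper's full framework; `tau` is the prior weight, and the optional `weights` stand in for state or mixture occupancy probabilities when the Gaussian belongs to an HMM. All names are illustrative.

```python
def map_mean(prior_mean, tau, samples, weights=None):
    """MAP estimate of a Gaussian mean: a weighted blend of prior mean and data.

    tau > 0 controls how many observations the prior is "worth".
    """
    if weights is None:
        weights = [1.0] * len(samples)
    total_weight = sum(weights)
    weighted_sum = sum(w * x for w, x in zip(weights, samples))
    return (tau * prior_mean + weighted_sum) / (tau + total_weight)
```

With no adaptation data the estimate stays at the prior mean, and as the data grows it approaches the maximum-likelihood sample mean, which is why this kind of estimator suits the sparse-data adaptation setting the abstract describes.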
