1.
In this paper, we propose a new adaptation mode controller (AMC) for a generalized sidelobe canceller (GSC)-based speech enhancement system. A likelihood ratio for target speech presence is first estimated and then used to estimate both the local target speech presence probability (SPP) and the global SPP. Next, the estimated SPPs are applied to the design of an AMC that controls the parameters of the adaptive filters for the adaptive blocking matrix (ABM) and the noise canceller (NC). In particular, the combination of local and global SPPs is applied to the AMC in the ABM, whereas only the global SPP is used for the NC. Finally, a multiple-microphone speech enhancement system is constructed on the basis of a GSC with the proposed AMC. The performance of the speech enhancement system is evaluated in terms of the perceptual evaluation of speech quality (PESQ) and the cepstral distortion (CD) under car noise conditions. The evaluation shows that a speech enhancement system using the proposed AMC performs better than systems using conventional AMC methods based on power ratios between the target and non-target directional signals, the inter-channel normalized cross-correlation, or the local SPPs only.
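The abstract gives no formulas, so the following is only a minimal sketch of how a likelihood ratio could be turned into local and global SPPs and then into adaptation step sizes. The Bayes conversion is standard. Everything else is an assumption made for illustration and is not taken from the paper: the prior `prior`, the geometric-mean global ratio, the product used to combine local and global SPPs, the base step size `mu0`, and all function names.

```python
import math

def spp_from_lr(lr, prior=0.5):
    """Posterior speech presence probability from a likelihood ratio (Bayes' rule)."""
    return prior * lr / (prior * lr + (1.0 - prior))

def local_and_global_spp(lrs, prior=0.5):
    """lrs: positive per-frequency-bin likelihood ratios for one frame (hypothetical input)."""
    local = [spp_from_lr(lr, prior) for lr in lrs]
    # Assumption: the frame-level (global) ratio is the geometric mean of the bin ratios.
    log_mean = sum(math.log(lr) for lr in lrs) / len(lrs)
    global_spp = spp_from_lr(math.exp(log_mean), prior)
    return local, global_spp

def step_sizes(local_spp, global_spp, mu0=0.1):
    """Hypothetical AMC rule: the ABM adapts when speech is likely, the NC when it is not."""
    mu_abm = mu0 * local_spp * global_spp  # assumed combination of local and global SPPs
    mu_nc = mu0 * (1.0 - global_spp)       # NC uses the global SPP only, as in the abstract
    return mu_abm, mu_nc
```

With a likelihood ratio of 1 and an uninformative prior, both SPPs are 0.5, so neither filter adapts at full speed. Larger ratios push adaptation toward the ABM and away from the NC.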
2.
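To address the reliance of existing dictionary-learning-based enhancement algorithms on prior information, this paper proposes an unsupervised single-channel speech enhancement algorithm based on sparse and low-rank matrix decomposition. The algorithm first decomposes the magnitude spectrum of the noisy speech into low-rank, sparse, and noise components by sparse low-rank decomposition, then builds a noise dictionary by self-learning on the low-rank component, and finally uses this noise dictionary together with a multiplicative iterative update rule to separate the clean speech from the low-rank and sparse components. Unlike other dictionary-learning-based speech enhancement algorithms, the proposed algorithm requires no prior information about the speech or the noise, which makes it more convenient and practical. Experimental results show that the algorithm effectively suppresses noise while preserving the harmonic structure of speech, and that its enhancement performance is clearly better than that of robust principal component analysis and multi-band spectral subtraction.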
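The first step, splitting a matrix into low-rank, sparse, and noise parts, can be sketched as below. This is not the paper's solver, which the abstract does not specify. It is a generic alternating scheme that minimizes 0.5·‖M − L − S‖² + τ·‖L‖* + λ·‖S‖₁ by singular value thresholding and soft thresholding, with the residual taken as the noise part. The function names and the parameters `tau`, `lam`, and `n_iter` are hypothetical, and the dictionary-learning and multiplicative-update stages are not shown.

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink every entry of x toward zero by t (proximal operator of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def svt(x, t):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, t)) @ vt

def objective(m, low, sparse, tau, lam):
    """0.5*||M - L - S||_F^2 + tau*||L||_* + lam*||S||_1."""
    resid = m - low - sparse
    nuclear = np.linalg.svd(low, compute_uv=False).sum()
    return 0.5 * np.sum(resid ** 2) + tau * nuclear + lam * np.abs(sparse).sum()

def decompose(m, tau=1.0, lam=0.1, n_iter=50):
    """Split m into low-rank, sparse, and residual (noise) parts by alternating updates."""
    m = np.asarray(m, dtype=float)
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    for _ in range(n_iter):
        low = svt(m - sparse, tau)             # exact minimizer over L with S fixed
        sparse = soft_threshold(m - low, lam)  # exact minimizer over S with L fixed
    noise = m - low - sparse
    return low, sparse, noise
```

In the setting of the abstract, `m` would be the magnitude spectrogram of the noisy speech. Because each update is an exact block minimizer, the objective never increases from one iteration to the next.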
3.
This paper presents a novel noise-robust graph-based semi-supervised learning algorithm to deal with the challenging problem of semi-supervised learning with noisy initial labels. Inspired by the successful use of sparse coding for noise reduction, we give a new L1-norm formulation of Laplacian regularization for graph-based semi-supervised learning. Since our L1-norm Laplacian regularization is explicitly defined over the eigenvectors of the normalized Laplacian matrix, we formulate graph-based semi-supervised learning as an L1-norm linear reconstruction problem that can be solved efficiently by sparse coding. Furthermore, by working with only a small subset of eigenvectors, we develop a fast sparse coding algorithm for our L1-norm semi-supervised learning. Finally, we evaluate the proposed algorithm in noise-robust image classification. The experimental results on several benchmark datasets demonstrate the promising performance of the proposed algorithm.
4.
In this paper, we present a novel approach to voice activity detection (VAD) based on the sparse representation of noisy input speech over a learned dictionary. First, we investigate the relationship between signal detection and the sparse representation within a Bayesian framework. Second, we derive the decision rule and an adaptive threshold based on a likelihood ratio test, by modeling the non-zero elements in the sparse representation as a Gaussian distribution. The experimental results show that the proposed approach outperforms current statistical model-based methods, such as those based on Gaussian, Laplacian, and Gamma models, under white, babble, and vehicle noise conditions.
5.
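Speech detection based on harmonic energy in non-stationary noise environments (cited by: 1; self-citations: 0; citations by others: 1)
Robust speech endpoint detection is important for building practical speech recognition systems. Because harmonic components are a basic characteristic of speech signals, an endpoint detection algorithm based on harmonic energy is proposed. The direction field of the narrowband spectrogram is computed with the Sobel operator, harmonic regions are enhanced by Gabor filtering, and a binarized image is obtained by thresholding. Points whose direction exceeds 45 degrees or whose reliability is low are removed, which leaves continuous horizontal bands, namely the harmonic regions. The energy within these harmonic regions is then computed and used as the feature for the threshold decision. Experimental results show that the method achieves good speech detection at different signal-to-noise ratios and under several kinds of non-stationary noise. Its advantages are that it needs no prior knowledge of the noise, makes full use of the correlation of speech in both the frequency and time domains, and adapts to a variety of complex non-stationary noises.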
6.
7.
The paper presents a supervised discriminative dictionary learning algorithm specially designed for classifying HEp-2 cell patterns. The proposed algorithm is an extension of the popular K-SVD algorithm: in the training phase, it takes into account the discriminative power of the dictionary atoms and reduces their intra-class reconstruction error during each update. Meanwhile, their inter-class reconstruction effect is also considered. Compared to the existing extension of K-SVD, the proposed algorithm is more robust to parameters and has better discriminative power for classifying HEp-2 cell patterns. Quantitative evaluation shows that the proposed algorithm significantly outperforms general object classification algorithms on a standard HEp-2 cell pattern classification benchmark and also achieves competitive performance on a standard natural image classification benchmark.
8.
As an alternative to classical representations in machine learning algorithms, we explore coding strategies using events, as is observed for spiking neurons in the central nervous system. Focusing on visual processing, we have previously shown that we can define, with an over-complete dictionary, a sparse spike coding scheme by implementing lateral interactions that account for redundant information. Since this class of algorithms is compatible with both biological constraints and neuro-physiological observations, it can provide a possible algorithm to explain the speed of visual processing despite the relatively slow response time of single neurons. Here, I explore learning mechanisms to derive, in an unsupervised manner, an over-complete set of filters that provides a progressively sparser representation of the input. This work is based on a previous model of sparse coding from Olshausen et al. (1998) and leads to similar results, suggesting that this strategy provides a simple neural implementation of this algorithm and thus of Blind Source Separation. Moreover, this neuro-mimetic algorithm may be easily extended to realistic architectures of cortical columns in the primary visual cortex, and we show results for different strategies of representation, leading to neuro-mimetic adaptive sparse spike coding schemes.
This revised version was published online in June 2006 with corrections to the Cover Date.
9.
Differing from common 2D images, a texture map, since it is used to project onto a 3D model in 3D space, not only contains 2D texture information but also implicitly carries certain 3D geometric information. Accordingly, this paper proposes an effective 3D geometry-dependent texture map compression method with hybrid region of interest (ROI) coding. We regard the visually important area of the texture map as the ROI. To acquire the visually important areas of the texture map, we take into account information from both the 3D geometry and the 2D texture maps, depicting the saliency of the textured model, the distortion of the texture mapping, and the boundary of the texture atlas. These visually important areas are expressed as a visual importance map. Given the particular nature of the texture map, a hybrid ROI coding method that uses Max-Shift and an improved post-compression rate-distortion (PCRD) technique is presented, guided by this visual importance map. To find the exact wavelet coefficients pertaining to these ROIs before carrying out the hybrid ROI coding, this paper proposes a stochastic coefficient priority mask map computational method. Experimental results show that the visually important areas of the texture image have a better visual effect and that a good rendering result can be obtained from the texture mapping.
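For readers unfamiliar with Max-Shift, the sketch below shows only the generic idea as used in JPEG 2000 ROI coding, not the paper's hybrid method or its improved PCRD stage. ROI coefficients are scaled up by 2^s, with s chosen so that 2^s exceeds every background magnitude, which lets the decoder recognize ROI coefficients by magnitude alone, without a transmitted mask. The function names and the flat list of quantized integer coefficients are assumptions made for illustration.

```python
def maxshift_encode(coeffs, roi_mask):
    """Scale ROI coefficients above the largest background magnitude.

    coeffs: quantized integer wavelet coefficients (hypothetical flat list).
    roi_mask: booleans, True where the coefficient belongs to the ROI.
    """
    background_max = max(
        (abs(c) for c, in_roi in zip(coeffs, roi_mask) if not in_roi), default=0
    )
    shift = background_max.bit_length()  # smallest s with 2**s > background_max
    scaled = [c * (1 << shift) if in_roi else c for c, in_roi in zip(coeffs, roi_mask)]
    return scaled, shift

def maxshift_decode(scaled, shift):
    """Recover the coefficients from the shift alone; no ROI mask is needed."""
    threshold = 1 << shift
    decoded = []
    for c in scaled:
        if abs(c) >= threshold:  # only scaled ROI coefficients can be this large
            magnitude = abs(c) >> shift
            decoded.append(magnitude if c >= 0 else -magnitude)
        else:                    # background, or an ROI coefficient equal to zero
            decoded.append(c)
    return decoded
```

Because the scaled ROI coefficients occupy the higher bit planes, a bit-plane coder emits them before any background data, which is what gives the ROI its priority at low bit rates.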
10.
Online systems have come to be heavily used in education, particularly for online learning and for collecting information not otherwise readily available. Most e-learning systems, including interactive learning systems, have been designed to “push” course materials to students but rarely to “collect” or “pull” ideas from them. The interactive mechanisms in proposed instructional design models, however, prevent many potential designers from improving course quality, even though some believe that the learning experience and the comments of students are important for enhancing course materials. Moreover, students could actually contribute to instructional design. This paper presents a course material enhancement process that elicits ideas from students by encouraging them to modify course materials. The process was tested in different higher education programs, both graduate and undergraduate. The study aims to understand which programs’ students are more willing to participate in this work and whether they benefit from the process. To facilitate this research, an asynchronous interaction system, the teacher digital assistant (TDA), was designed so that teachers can receive responses, recommendations, and modified materials from students at any time. The major advantage of this process is that it can embed students’ thoughts into the course material to improve the curriculum, which can benefit future students.