Similar literature
 20 similar documents found; search took 31 ms
1.
Looking at the speaker's face can help a listener hear a speech signal in a noisy environment and extract it from competing sources before identification. This suggests that the visual signals of speech (movements of the visible articulators) could be used in speech enhancement or extraction systems. In this paper, we present a novel algorithm that plugs the audiovisual coherence of speech signals, estimated by statistical tools, into audio blind source separation (BSS) techniques. The algorithm is applied to the difficult and realistic case of convolutive mixtures. It works mainly in the frequency (transform) domain, where the convolutive mixture becomes an additive mixture for each frequency channel. Separation is performed frequency by frequency with an audio BSS algorithm. The audio and visual information is modeled by a newly proposed statistical model, which is then used to resolve the standard source permutation and scale factor ambiguities encountered at each frequency after the audio blind separation stage. The proposed method is shown to be efficient in the case of 2 × 2 convolutive mixtures and offers promising perspectives for extracting a particular speech source of interest from complex mixtures.

2.
MISEP method for postnonlinear blind source separation   Total citations: 2, self-citations: 0, citations by others: 2
Zheng CH  Huang DS  Li K  Irwin G  Sun ZL 《Neural computation》2007,19(9):2557-2578
In this letter, a standard postnonlinear blind source separation algorithm is proposed, based on the MISEP method, which is widely used in linear and nonlinear independent component analysis. To best suit a wide class of postnonlinear mixtures, we adapt the MISEP method to incorporate a priori information about the mixtures. In particular, a group of three-layered perceptrons and a linear network are used as the unmixing system to separate sources in the postnonlinear mixtures, and another group of three-layered perceptrons is used as the auxiliary network. The learning algorithm for the unmixing system is then obtained by maximizing the output entropy of the auxiliary network. The proposed method is applied to postnonlinear blind source separation of both simulated signals and real speech signals, and the experimental results demonstrate its effectiveness and efficiency in comparison with existing methods.

3.
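An identification method for nonlinear systems with colored measurement noise   Total citations: 2, self-citations: 0, citations by others: 2
黄玉龙  张勇刚  李宁  赵琳 《自动化学报》 (Acta Automatica Sinica) 2015,41(11):1877-1892
Using the maximum likelihood criterion, this paper proposes an identification method for nonlinear systems with colored measurement noise. First, the colored measurement noise is whitened by the measurement differencing method to obtain a new measurement equation, which converts the identification problem for a nonlinear system with colored measurement noise into one for a nonlinear system with white measurement noise and a one-step delayed state. Second, a new maximum-likelihood-based nonlinear system identification method is developed using the expectation-maximization (EM) algorithm, which consists of an expectation step (E-step) and a maximization step (M-step). In the E-step, the expectation of the complete-data log-likelihood function is approximated from the current parameter estimates using a Gaussian approximate filter and smoother designed for colored measurement noise. In the M-step, the approximated expected likelihood is maximized: the noise parameter estimates are obtained by analytical updates and the model parameter estimates by Newton updates. Finally, numerical simulations verify the effectiveness of the proposed algorithm.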
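The measurement-differencing step can be illustrated on a scalar toy model. This is only a sketch under assumptions that the abstract does not state: the colored noise is taken to be first-order autoregressive, v_k = psi * v_{k-1} + xi_k, with a known coefficient psi, and the measurement function `h`, the state dynamics, and all function names are hypothetical.

```python
import random


def h(x):
    # Hypothetical nonlinear measurement function (assumption for illustration).
    return x * x


def simulate(n, psi, seed=0):
    """Simulate states, white driving noise xi_k, and measurements z_k = h(x_k) + v_k."""
    rng = random.Random(seed)
    xs, xis, zs = [], [], []
    x, v = 1.0, 0.0
    for _ in range(n):
        x = 0.9 * x + rng.gauss(0.0, 0.1)  # toy state dynamics (assumption)
        xi = rng.gauss(0.0, 0.05)          # white noise driving the colored noise
        v = psi * v + xi                   # colored (AR(1)) measurement noise
        xs.append(x)
        xis.append(xi)
        zs.append(h(x) + v)
    return xs, xis, zs


def difference_measurements(zs, psi):
    """New measurements z*_k = z_k - psi * z_{k-1}, defined for k = 1..n-1."""
    return [zs[k] - psi * zs[k - 1] for k in range(1, len(zs))]


def whitened_residuals(xs, zs, psi):
    """Noise term of the new measurement equation.

    z*_k = h(x_k) - psi * h(x_{k-1}) + xi_k, so the residual is the white xi_k
    and the equation involves the one-step delayed state x_{k-1}.
    """
    zstar = difference_measurements(zs, psi)
    return [zstar[k - 1] - (h(xs[k]) - psi * h(xs[k - 1])) for k in range(1, len(xs))]
```

In this toy setting the residual of the differenced measurement equation reproduces the white driving noise, which is why the transformed problem has white measurement noise but depends on both the current and the previous state.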

4.
Learning spatial models from sensor data raises the challenging data association problem of relating model parameters to individual measurements. This paper proposes an EM-based algorithm that solves the model learning and the data association problems in parallel. The algorithm is developed in the context of the structure from motion problem, which is the problem of estimating a 3D scene model from a collection of image data. To accommodate the spatial constraints in this domain, we compute virtual measurements as sufficient statistics to be used in the M-step. We develop an efficient Markov chain Monte Carlo sampling method, called chain flipping, to calculate these statistics in the E-step. Experimental results show that we can solve hard data association problems when learning models of 3D scenes, and that we can do so efficiently. We conjecture that this approach can be applied to a broad range of model learning problems from sensor data, such as the robot mapping problem.

5.
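Convolutive blind separation based on robust joint block diagonalization   Total citations: 1, self-citations: 0, citations by others: 1
汤辉  王殊 《自动化学报》 (Acta Automatica Sinica) 2013,39(9):1502-1510
For the convolutive blind separation problem, a new joint block diagonalization (JBD) algorithm for matrices is proposed. Existing iterative non-orthogonal JBD algorithms can fail to converge; this paper exploits the special structure of the separation matrix to guarantee its invertibility, which keeps the iteration stable. Assuming the block structure of the matrices is known, the convolutive blind separation model is first written in instantaneous form and shown to satisfy the JBD structure. A JBD cost function is then proposed; because minimizing the cost function is equivalent to minimizing the norm of each block of the matrix, the iterative update of the whole separation matrix is converted into iterative updates of the individual blocks. Finally, the iterative algorithm is obtained from the minimization condition. The algorithm is derived for both the real and the complex case. Basic experiments verify the performance of the new algorithm under different conditions; in simulations, convolutive mixtures of signals that overlap in both time and frequency are blindly separated, and the results show that the new algorithm achieves better and more stable separation.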
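The step of writing the convolutive model in instantaneous form can be sketched as follows. The notation (m sensors, n sources, FIR mixing filters of order L, stacking window W) is assumed here for illustration and is not taken from the paper; it follows a common textbook formulation for real-valued signals.

```latex
% Convolutive mixture with FIR mixing filters of order L
\[
\mathbf{x}(t) = \sum_{l=0}^{L} \mathbf{H}(l)\,\mathbf{s}(t-l),
\qquad \mathbf{x}(t)\in\mathbb{R}^{m},\quad \mathbf{s}(t)\in\mathbb{R}^{n}.
\]
% Stack W delayed samples per sensor and L+W delayed samples per source
\[
\mathbf{X}(t) = \bigl[x_1(t),\dots,x_1(t-W+1),\;\dots,\;x_m(t),\dots,x_m(t-W+1)\bigr]^{\mathsf T},
\]
\[
\mathbf{S}(t) = \bigl[s_1(t),\dots,s_1(t-L-W+1),\;\dots,\;s_n(t),\dots,s_n(t-L-W+1)\bigr]^{\mathsf T}.
\]
% Instantaneous form: A consists of W x (L+W) Toeplitz blocks A_{ij} built from h_{ij}(0),...,h_{ij}(L)
\[
\mathbf{X}(t) = \mathbf{A}\,\mathbf{S}(t),
\qquad \mathbf{A}\in\mathbb{R}^{mW\times n(L+W)}.
\]
% For zero-mean, mutually uncorrelated sources the lagged covariances share a block-diagonal core
\[
\mathbf{R}_X(\tau) = E\bigl[\mathbf{X}(t)\,\mathbf{X}(t-\tau)^{\mathsf T}\bigr]
= \mathbf{A}\,\mathbf{R}_S(\tau)\,\mathbf{A}^{\mathsf T},
\qquad
\mathbf{R}_S(\tau) = \operatorname{blockdiag}\bigl(\mathbf{R}_{s_1}(\tau),\dots,\mathbf{R}_{s_n}(\tau)\bigr).
\]
```

Under these assumptions, each block of R_S(τ) has size (L+W) × (L+W), so a matrix that jointly block-diagonalizes the set {R_X(τ)} recovers the sources up to a filtering and ordering ambiguity. Full column rank of A, which requires mW ≥ n(L+W), is the usual condition for such a separation matrix to exist.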

6.
This paper derives two spatio-temporal extensions of the well-known FastICA algorithm of Hyvärinen and Oja that are applicable to the convolutive blind source separation task. Our time-domain algorithms combine multichannel spatio-temporal prewhitening via multistage least-squares linear prediction with novel adaptive procedures that impose paraunitary constraints on the multichannel separation filter. The techniques converge quickly to a separation solution without any step size selection or divergence difficulties, and unlike other methods, ours do not require special coefficient initialization procedures to obtain good separation performance. They also allow for the efficient reconstruction of individual signals as observed in the sensor measurements directly from the system parameters for single-input multiple-output blind source separation tasks. An analysis of one of the adaptive constraint procedures shows its fast convergence to a paraunitary filter bank solution. Numerical evaluations of the proposed algorithms and comparisons with several existing convolutive blind source separation techniques indicate the excellent relative performance of the proposed methods.

7.
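李炜  杨慧中 《控制与决策》 (Control and Decision) 2014,29(3):541-545

Joint diagonalization can successfully solve the blind separation problem, but solving it may yield undesired singular solutions, so the source signals cannot be fully separated. To address this, a joint diagonalization method for blind separation of linear convolutive mixtures is proposed: the convolutive mixing model is transformed into an instantaneous model, and joint diagonalization is applied to the transformed model to obtain the separation matrix. During the solution process, a constraint is introduced to restrict the range of solutions, which avoids singular solutions. Simulation results show that the proposed method successfully achieves blind separation of convolutively mixed signals.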

8.
An EM-type algorithm for estimating the parameters of the exponential Poisson distribution has recently been proposed. This algorithm uses a Newton-Raphson approach at the M-step. An alternative approach, in which the Newton-Raphson step is replaced by another EM algorithm, is described. This approach has simple closed-form expressions and makes use of a nested EM scheme in which the M-step itself is solved with an EM algorithm. An advantage of this algorithm is that it provides estimates in the admissible range when the initial values are in the admissible range, it avoids overflow problems that may occur during Newton-Raphson iterations, and, in certain cases, it can be faster. Some simulation evidence on the speed of the new approach compared with the one that uses Newton-Raphson is provided.

9.
Various techniques have previously been proposed for the separation of convolutive mixtures. These techniques can be classified as stochastic, adaptive, and deterministic. Stochastic methods are computationally expensive since they require an iterative process for the calculation of the demixing filters based on a separation criterion that usually assumes that the source signals are statistically independent. Adaptive methods, such as adaptive beamformers, also exploit signal properties in order to optimize a multichannel filter structure. However, these algorithms need initialization and time to converge. Deterministic methods, on the other hand, provide a closed-form solution based on the deterministic aspects of the problem, such as the channel characteristics and the source directions. This paper presents a technique that exploits the intensity vector statistics to achieve a nearly closed-form solution for the separation of convolutive mixtures as recorded with a coincident microphone array. No assumptions are made on the signals, but it is assumed that the source directions are known a priori. Directivity functions based on von Mises functions are designed for beamforming depending on the circular statistics of the calculated intensity vectors. Numerical evaluation results are presented for various speech and instrument sounds and source positions in two reverberant rooms.

10.
We consider inference in a general data-driven object-based model of multichannel audio data, assumed generated as a possibly underdetermined convolutive mixture of source signals. We work in the short-time Fourier transform (STFT) domain, where convolution is routinely approximated as linear instantaneous mixing in each frequency band. Each source STFT is given a model inspired by nonnegative matrix factorization (NMF) with the Itakura–Saito divergence, which underlies a statistical model of superimposed Gaussian components. We address estimation of the mixing and source parameters using two methods. The first consists of maximizing the exact joint likelihood of the multichannel data using an expectation-maximization (EM) algorithm. The second consists of maximizing the sum of the individual likelihoods of all channels using a multiplicative update algorithm inspired by NMF methodology. Our decomposition algorithms are applied to stereo audio source separation in various settings, covering blind and supervised separation, music and speech sources, synthetic instantaneous and convolutive mixtures, as well as professionally produced music recordings. Our EM method produces results competitive with the state of the art, as illustrated on two tasks from the international Signal Separation Evaluation Campaign (SiSEC 2008).
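To make the NMF building block concrete, the following is a minimal single-channel sketch of NMF under the Itakura–Saito divergence. It is not the multichannel algorithm of the abstract: it factorizes one nonnegative matrix V (for example a power spectrogram) as W·H, and it uses the majorization-minimization form of the multiplicative update (ratio raised to the power 1/2), which is known to be non-increasing for this divergence. All function names, the initialization range, and the iteration count are assumptions made for illustration.

```python
import math
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def is_divergence(V, Vhat):
    """Itakura-Saito divergence: sum over entries of v/vh - log(v/vh) - 1."""
    return sum(v / vh - math.log(v / vh) - 1.0
               for rv, rvh in zip(V, Vhat) for v, vh in zip(rv, rvh))


def _sqrt_ratio(num, den):
    """Elementwise sqrt(num / den), i.e. exponent 1/2 of the MM update."""
    return [[math.sqrt(n / d) for n, d in zip(rn, rd)] for rn, rd in zip(num, den)]


def _grad_parts(V, Vhat):
    """Return V / Vhat^2 and 1 / Vhat (negative and positive gradient parts)."""
    P = [[v / (vh * vh) for v, vh in zip(rv, rvh)] for rv, rvh in zip(V, Vhat)]
    Q = [[1.0 / vh for vh in rvh] for rvh in Vhat]
    return P, Q


def is_nmf_step(V, W, H):
    """One multiplicative update of W followed by one of H."""
    P, Q = _grad_parts(V, matmul(W, H))
    Ht = transpose(H)
    R = _sqrt_ratio(matmul(P, Ht), matmul(Q, Ht))
    W = [[w * r for w, r in zip(rw, rr)] for rw, rr in zip(W, R)]
    P, Q = _grad_parts(V, matmul(W, H))
    Wt = transpose(W)
    R = _sqrt_ratio(matmul(Wt, P), matmul(Wt, Q))
    H = [[h * r for h, r in zip(rh, rr)] for rh, rr in zip(H, R)]
    return W, H


def is_nmf(V, rank, n_iter=50, seed=0):
    """Factorize strictly positive V (F x N) as W (F x rank) times H (rank x N)."""
    rng = random.Random(seed)
    F, N = len(V), len(V[0])
    W = [[rng.uniform(0.5, 1.5) for _ in range(rank)] for _ in range(F)]
    H = [[rng.uniform(0.5, 1.5) for _ in range(N)] for _ in range(rank)]
    history = [is_divergence(V, matmul(W, H))]
    for _ in range(n_iter):
        W, H = is_nmf_step(V, W, H)
        history.append(is_divergence(V, matmul(W, H)))
    return W, H, history
```

The widely used heuristic update omits the square root; the exponent 1/2 is chosen here only because its monotone decrease is guaranteed, which makes the sketch easy to check. The input must be strictly positive, since the divergence involves v/vh and its logarithm.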

11.
Two-microphone separation of speech mixtures.   Total citations: 1, self-citations: 0, citations by others: 1
Separation of speech mixtures, often referred to as the cocktail party problem, has been studied for decades. In many source separation tasks, the separation method is limited by the assumption of at least as many sensors as sources. Further, many methods require that the number of signals within the recorded mixtures be known in advance. In many real-world applications, these limitations are too restrictive. We propose a novel method for underdetermined blind source separation using an instantaneous mixing model which assumes closely spaced microphones. Two source separation techniques are combined: independent component analysis (ICA) and binary time-frequency (T-F) masking. By estimating binary masks from the outputs of an ICA algorithm, it is possible to iteratively extract basis speech signals from a convolutive mixture. The basis signals are afterwards improved by grouping similar signals. Using two microphones, we can separate, in principle, an arbitrary number of mixed speech signals. We show separation results for mixtures with as many as seven speech signals under instantaneous conditions. We also show that the proposed method is applicable to segregating speech signals under reverberant conditions, and we compare our proposed method to another state-of-the-art algorithm. The number of source signals is not assumed to be known in advance, and it is possible to maintain the extracted signals as stereo signals.
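The idea of estimating binary masks from two ICA outputs can be sketched as below. This is an illustrative simplification, not the paper's procedure: the inputs are assumed to be the time-frequency representations of the two ICA outputs stored as nested lists, the threshold `tau` and the function names are hypothetical, and the iterative extraction and grouping stages are omitted.

```python
def binary_masks(Y1, Y2, tau=1.0):
    """Estimate two binary time-frequency masks from two ICA outputs.

    A bin is assigned to output 1 when |Y1| > tau * |Y2|, and to output 2 when
    |Y2| > tau * |Y1|. With tau > 1 some bins may be assigned to neither mask.
    """
    M1 = [[1 if abs(a) > tau * abs(b) else 0 for a, b in zip(r1, r2)]
          for r1, r2 in zip(Y1, Y2)]
    M2 = [[1 if abs(b) > tau * abs(a) else 0 for a, b in zip(r1, r2)]
          for r1, r2 in zip(Y1, Y2)]
    return M1, M2


def apply_mask(M, X):
    """Keep only the time-frequency bins of X selected by the binary mask M."""
    return [[m * x for m, x in zip(rm, rx)] for rm, rx in zip(M, X)]
```

Applying the same mask to both microphone channels of the mixture is one way the stereo character of an extracted signal could be retained, since the inter-channel differences inside the selected bins are left untouched.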

12.
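For the underdetermined convolutive mixing model of speech signals, a blind speech separation method based on fast independent component analysis and adaptive nonlinear binary time-frequency masking is proposed. Fast independent component analysis is applied to the input mixed speech signals, and adaptive nonlinear binary time-frequency masking is applied to the result; these two steps are repeated until all speech source signals have been separated. Merging the separated source signals through binary time-frequency masking improves the output quality, and the separated speech signals still retain the two-channel stereo effect. Experiments show that the method substantially outperforms the DUET and BLUES methods, with a large improvement in signal-to-noise ratio gain.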

13.
14.
This paper presents a novel method for the enhancement of independent components of mixed speech signals segregated by the frequency domain independent component analysis (FDICA) algorithm. The enhancement algorithm proposed here is based on maximum a posteriori (MAP) estimation of the speech spectral components, using the generalized Gaussian distribution (GGD) function as the statistical model for the time–frequency series of speech (TFSS) signal. The proposed MAP estimator has been used and evaluated as the post-processing stage for the separation of convolutive mixtures of speech signals by the fixed-point FDICA algorithm. It has been found that combining the separation algorithm with the proposed enhancement algorithm provides better separation performance under both reverberant and non-reverberant conditions.

15.
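To address the limitation that the separation performance of the independent vector analysis (IVA) algorithm depends strongly on the initial value of the separation matrix, a blind separation algorithm for convolutively mixed speech based on backtracking search optimization is proposed. The sum of the complex kurtosis of the IVA-separated signals over all frequency bins is used as the objective function, and the backtracking search optimization algorithm (BSA) is used to optimize the initial separation matrix, which achieves better blind separation of speech signals. During separation, the complex Givens rotation is used to convert the problem of solving for the separation matrix into that of solving for rotation angles, which effectively reduces the dimension of the BSA parameter encoding and the difficulty of the optimization. Separation experiments on convolutive mixtures of speech signals show that the algorithm separates well and that its separation performance is significantly better than that of the basic IVA algorithm.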
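The angle parameterization can be sketched for the two-source case at a single frequency bin. The particular form of the complex Givens matrix, the kurtosis definition (one common convention for zero-mean complex signals), and all function names are assumptions made for illustration; the abstract does not specify them, and the BSA search itself is not shown.

```python
import cmath
import math


def givens(theta, phi):
    """2 x 2 complex Givens rotation described by two real angles."""
    c, s = math.cos(theta), math.sin(theta)
    e = cmath.exp(1j * phi)
    return [[c, e * s], [-e.conjugate() * s, c]]


def rotate(G, y1, y2):
    """Apply G to the pair of (whitened) signals y1, y2 at one frequency bin."""
    z1 = [G[0][0] * a + G[0][1] * b for a, b in zip(y1, y2)]
    z2 = [G[1][0] * a + G[1][1] * b for a, b in zip(y1, y2)]
    return z1, z2


def complex_kurtosis(z):
    """E|z|^4 - 2 (E|z|^2)^2 - |E[z^2]|^2 for a zero-mean complex sequence."""
    n = len(z)
    m2 = sum(abs(v) ** 2 for v in z) / n
    m4 = sum(abs(v) ** 4 for v in z) / n
    p2 = sum(v * v for v in z) / n
    return m4 - 2.0 * m2 ** 2 - abs(p2) ** 2


def objective(theta, phi, y1, y2):
    """Sum of the complex kurtosis of both rotated outputs at this bin."""
    z1, z2 = rotate(givens(theta, phi), y1, y2)
    return complex_kurtosis(z1) + complex_kurtosis(z2)
```

Because the matrix is unitary for any pair of angles, a search over (theta, phi) explores separation matrices with two real parameters per bin instead of four complex entries, which is the dimensionality reduction the abstract refers to.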

16.
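To address the permutation ambiguity of frequency-domain methods for convolutive blind source separation of speech, a multi-band energy ordering algorithm is proposed. First, the short-time Fourier transform (STFT) of the mixed signals is used to build an instantaneous mixing model at each frequency bin, on which independent component analysis is performed. Energy-correlation ordering and direction-of-arrival (DOA) ordering are then combined to resolve the permutation ambiguity, and the split-spectrum method is used to resolve the amplitude ambiguity, which yields correctly separated sub-signals at each frequency bin. Finally, the inverse short-time Fourier transform (ISTFT) is used to obtain the separated source signals. Simulation results show that, compared with Murata's ordering algorithm, the improved algorithm performs better in signal-to-distortion ratio, signal-to-interference ratio, and signal-to-artifact ratio.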
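The energy-correlation part of the ordering can be sketched for two sources as follows. This is a simplified stand-in rather than the proposed multi-band algorithm: it covers only envelope correlation (no DOA cue and no band grouping), it takes the first bin as the reference, and the function names and input layout are assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def align_permutations(envs):
    """Decide, bin by bin, whether the two separated outputs must be swapped.

    envs[f] is a pair (env_a, env_b) of amplitude envelopes over time frames for
    the two outputs at frequency bin f. Returns a list of booleans, True meaning
    that the outputs of bin f should be exchanged.
    """
    ref = [list(envs[0][0]), list(envs[0][1])]  # first bin defines the order
    swaps = [False]
    for f in range(1, len(envs)):
        a, b = envs[f]
        keep = pearson(a, ref[0]) + pearson(b, ref[1])
        flip = pearson(a, ref[1]) + pearson(b, ref[0])
        swap = flip > keep
        if swap:
            a, b = b, a
        swaps.append(swap)
        # Accumulate the aligned envelopes so later bins see a smoother reference.
        ref[0] = [r + x for r, x in zip(ref[0], a)]
        ref[1] = [r + x for r, x in zip(ref[1], b)]
    return swaps
```

The sketch relies on the observation that the envelopes of one speech source tend to rise and fall together across frequency bins, while envelopes of different sources are only weakly correlated. Correlation is scale-invariant, so the decision is unaffected by the per-bin amplitude ambiguity.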

17.
In this paper we compare two iterative approaches to the problem of pixel-level image restoration when the model contains unknown parameters. Pairwise interaction models are assumed to represent the local associations in the true scene. The first approach is a variation on the EM algorithm in which mean-field approximations are used in the E-step and a variational approximation is used in the M-step. In the second approach, each iteration involves first restoring the image using the Iterated Conditional Modes (ICM) algorithm and then updating the parameter estimates by maximising the so-called pseudolikelihood. In addition, refinements are made to the mean-field approximation, and these are also used for restoration. The methods are compared empirically using both artificial and real noise-corrupted binary scenes. Within the comparisons, the effects of using different convergence criteria for deciding when to stop the algorithms are also investigated.

18.
Clustering high-dimensional data has become a challenge in data mining due to the curse of dimensionality. To solve this problem, subspace clustering has been defined as an extension of traditional clustering that seeks to find clusters in subspaces spanned by different combinations of dimensions within a dataset. This paper presents a new subspace clustering algorithm that calculates the local feature weights automatically in an EM-based clustering process. In the algorithm, the features are locally weighted by a new unsupervised weighting method, as a means to minimize a proposed clustering criterion that takes into account both the average intra-cluster compactness and the average inter-cluster separation for subspace clustering. To capture accurate subspace information, an additional outlier detection process is presented to identify the possible local outliers of subspace clusters, and is embedded between the E-step and M-step of the algorithm. The method has been evaluated in clustering real-world gene expression data and high-dimensional artificial data with outliers, and the experimental results have shown its effectiveness.

19.
We propose a new method to incorporate priors on the solution of nonnegative matrix factorization (NMF). The NMF solution is guided to follow the minimum mean square error (MMSE) estimates of the weight combinations under a Gaussian mixture model (GMM) prior. The proposed algorithm can be used for denoising or single-channel source separation (SCSS) applications. NMF is used in SCSS in two main stages, the training stage and the separation stage. In the training stage, NMF is used to decompose the training data spectrogram for each source into the product of a trained basis matrix and a gains matrix. In the separation stage, the mixed signal spectrogram is decomposed as a weighted linear combination of the trained basis matrices for the source signals. In this work, to improve the separation performance of NMF, the trained gains matrices are used to guide the solution of the NMF weights during the separation stage. The trained gains matrix is used to train a prior GMM that captures the statistics of the valid weight combinations that the columns of the basis matrix can receive for a given source signal. In the separation stage, the prior GMMs are used to guide the NMF solution of the gains/weights matrices using MMSE estimation. The NMF decomposition weights matrix is treated as an image distorted by a distortion operator, which is learned directly from the observed signals. The MMSE estimate of the weights matrix under the trained GMM prior and a log-normal distribution for the distortion is then found to improve the NMF decomposition results. The MMSE estimate is embedded within the optimization objective to form a novel regularized NMF cost function. The corresponding update rules for the new objectives are derived in this paper. The proposed MMSE-estimate-based regularization avoids the problem of computing the hyper-parameters and the regularization parameters. MMSE also provides a better estimate for the valid gains matrix. Experimental results show that the proposed regularized NMF algorithm improves the source separation performance compared with using NMF without a prior or with other prior models.

20.
In this work, we formulate the interaction between image segmentation and object recognition in the framework of the Expectation-Maximization (EM) algorithm. We consider segmentation as the assignment of image observations to object hypotheses and phrase it as the E-step, while the M-step amounts to fitting the object models to the observations. These two tasks are performed iteratively, thereby simultaneously segmenting an image and reconstructing it in terms of objects. We model objects using Active Appearance Models (AAMs) as they capture both shape and appearance variation. During the E-step, the fidelity of the AAM predictions to the image is used to decide about assigning observations to the object. For this, we propose two top-down segmentation algorithms. The first starts with an oversegmentation of the image and then softly assigns image segments to objects, as in the common setting of EM. The second uses curve evolution to minimize a criterion derived from the variational interpretation of EM and introduces AAMs as shape priors. For the M-step, we derive AAM fitting equations that accommodate segmentation information, thereby allowing for the automated treatment of occlusions. Apart from top-down segmentation results, we provide systematic experiments on object detection that validate the merits of our joint segmentation and recognition approach.
