Similar Documents
 20 similar documents found (search time: 31 ms)
1.
The curse of dimensionality hinders the effectiveness of density estimation in high dimensional spaces. Many techniques have been proposed to discover embedded, locally linear manifolds of lower dimensionality, including the mixture of principal component analyzers, the mixture of probabilistic principal component analyzers, and the mixture of factor analyzers. In this paper, we propose a novel mixture model for reducing dimensionality based on a linear transformation that is neither restricted to be orthogonal nor aligned along the principal directions. For experimental validation, we used the proposed model to classify five “hard” data sets and compared its accuracy with that of other popular classifiers. The proposed method outperformed the mixture of probabilistic principal component analyzers on four of the five data sets, with improvements ranging from 0.5% to 3.2%. Moreover, on all data sets, it outperformed the Gaussian mixture model, with improvements ranging from 0.2% to 3.4%.
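The classification protocol described above (fit a density model per class, then assign a test point to the class whose model gives it the highest likelihood) can be illustrated with plain full-covariance Gaussians standing in for the paper's mixture of linear-transformation analyzers. A minimal numpy sketch on synthetic data; all names and the data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated synthetic classes in 2-D.
X0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2))
X1 = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(200, 2))

def fit_gaussian(X):
    """ML estimate of a full-covariance Gaussian (with a small ridge)."""
    mu = X.mean(axis=0)
    C = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return mu, C

def log_density(x, mu, C):
    d = x - mu
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (d @ np.linalg.solve(C, d) + logdet + len(x) * np.log(2 * np.pi))

params = [fit_gaussian(X0), fit_gaussian(X1)]

def classify(x):
    # Maximum-likelihood class assignment.
    return int(np.argmax([log_density(x, mu, C) for mu, C in params]))
```

Replacing each class's single Gaussian with a mixture of constrained linear models is what the paper's approach amounts to.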

2.
As the demand for colorization increases, so does the need for an automated technique. A solution to the color-picking task involves principal component analysis-based learning techniques such as a mixture model of probabilistic principal component analyzers and regressive PCA. Experimental results confirm the method's feasibility.

3.
There has been growing interest in subspace data modeling over the past few years. Methods such as principal component analysis, factor analysis, and independent component analysis have gained in popularity and have found many applications in image modeling, signal processing, and data compression, to name just a few. As applications and computing power grow, more and more sophisticated analyses and meaningful representations are sought. Mixture modeling methods have been proposed for principal and factor analyzers that exploit local gaussian features in the subspace manifolds. Meaningful representations may be lost, however, if these local features are nongaussian or discontinuous. In this article, we propose extending the gaussian analyzers mixture model to an independent component analyzers mixture model. We employ recent developments in variational Bayesian inference and structure determination to construct a novel approach for modeling nongaussian, discontinuous manifolds. We automatically determine the local dimensionality of each manifold and use variational inference to calculate the optimum number of ICA components needed in our mixture model. We demonstrate our framework on complex synthetic data and illustrate its application to real data by decomposing functional magnetic resonance images into meaningful (and medically useful) features.

4.
Mixtures of probabilistic principal component analyzers (MPPCA) have proven effective for modeling high-dimensional data sets living on non-linear manifolds. Briefly stated, they conduct mixture model estimation and dimensionality reduction through a single process. This paper makes two contributions: first, we present a Bayesian technique for estimating such mixture models. Then, assuming several MPPCA models are available, we address the problem of aggregating them into a single MPPCA model, which should be as parsimonious as possible. We detail how this can be achieved in a cost-effective way, without sampling or access to the data, requiring only the mixture parameters. The proposed approach is based on a novel variational-Bayes scheme operating over model parameters. Numerous experimental results and discussion are provided.

5.
Mixtures of local principal component analysis (PCA) models have attracted attention due to a number of benefits over global PCA. The performance of a mixture model usually depends on the data partition and the local linear fitting. In this paper, we propose a mixture model with the properties of optimal data partition and robust local fitting. Data partition is realized by a soft competition algorithm called neural 'gas', and robust local linear fitting is achieved by a nonlinear extension of the PCA learning algorithm. Based on this mixture model, we describe a modular classification scheme for handwritten digit recognition, in which each module or network models the manifold of one of the ten digit classes. Experiments demonstrate a very high recognition rate.
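The soft competition in neural gas ranks all code vectors by their distance to the current sample and moves each one with an exponentially decaying step in its rank. A rough numpy sketch of the standard neural-gas update only (the paper's full model adds robust local PCA fitting on top); cluster positions and schedules are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
# Three well-separated 2-D clusters.
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
X = np.vstack([rng.normal(m, 0.3, size=(150, 2)) for m in means])

W = X[rng.choice(len(X), size=3, replace=False)].copy()  # code vectors
eps, lam = 0.5, 1.0
for _ in range(2000):
    x = X[rng.integers(len(X))]
    # Soft competition: every code vector moves, weighted by its distance rank.
    ranks = np.argsort(np.argsort(np.linalg.norm(W - x, axis=1)))
    W += (eps * np.exp(-ranks / lam))[:, None] * (x - W)
    eps *= 0.999   # anneal the learning rate ...
    lam *= 0.999   # ... and the neighborhood range
```

After training, each code vector should sit near one cluster, giving the data partition over which local models are then fitted.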

6.
Recently, two-dimensional principal component analysis (2DPCA) has proved to be an efficient eigenvector-based technique for image feature extraction and representation. In this paper, by supposing a parametric Gaussian distribution over the image space (spanned by the row vectors of 2D image matrices) and a spherical Gaussian noise model for the image, we endow 2DPCA with a probabilistic framework called probabilistic 2DPCA (P2DPCA), which is robust to noise. Further, using the probabilistic perspective of P2DPCA, we extend it to a mixture of local P2DPCA models (MP2DPCA). MP2DPCA offers a way to model faces in unconstrained (complex) environments. The model parameters can be fitted by maximum likelihood (ML) estimation via the expectation maximization (EM) algorithm. Experimental recognition results on the UMIST and AR face databases, and on the face recognition (FR) data collected at the University of Essex, confirm the effectiveness of the proposed methods.
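The probabilistic machinery underlying P2DPCA rests on the classical PPCA result (Tipping and Bishop) that the ML solution is available in closed form from the eigendecomposition of the sample covariance: the noise variance is the mean of the discarded eigenvalues, and the loadings are the leading eigenvectors scaled accordingly. A sketch for ordinary (vector) PPCA, not the 2D variant, on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(2)
# 5-D observations generated from a 2-D latent subspace plus isotropic noise.
W_true = rng.normal(size=(5, 2))
Z = rng.normal(size=(500, 2))
X = Z @ W_true.T + 0.1 * rng.normal(size=(500, 5))

q = 2                                     # latent dimension
mu = X.mean(axis=0)
S = np.cov(X - mu, rowvar=False)
evals, evecs = np.linalg.eigh(S)          # ascending order
evals, evecs = evals[::-1], evecs[:, ::-1]
sigma2 = evals[q:].mean()                 # ML noise variance: mean of discarded eigenvalues
W = evecs[:, :q] * np.sqrt(evals[:q] - sigma2)  # ML loadings (up to rotation)
C = W @ W.T + sigma2 * np.eye(5)          # model covariance
```

The model covariance `C` matches the sample covariance in the retained subspace and replaces the discarded eigenvalues with their mean, which is what makes the density well defined even when the noise floor is small.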

7.
Handling of incomplete data sets using ICA and SOM in data mining
Based on independent component analysis (ICA) and self-organizing maps (SOM), this paper proposes the ISOM-DH model for handling incomplete data in data mining. When the data are dependent and non-Gaussian, this model can make full use of the information in the given data to estimate the missing values, and it can visualize the processed high-dimensional data. Compared with the mixture of principal component analyzers (MPCA), the mean method, and the standard SOM-based fuzzy map model, the ISOM-DH model can be applied in more cases, demonstrating its superiority. The correctness and reasonableness of the ISOM-DH model are also validated by the experiments carried out in this paper.
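The core imputation idea, filling the missing coordinates of a record from the best-matching unit found using only its observed coordinates, can be caricatured without a trained SOM by letting the nearest complete record play the role of the BMU. A rough stand-in under that simplification, not the ISOM-DH algorithm itself; the data and the record are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
# Two tight clusters: observed coordinates strongly predict the missing one.
X = np.vstack([rng.normal([0, 0, 0], 0.1, (100, 3)),
               rng.normal([5, 5, 5], 0.1, (100, 3))])
x_missing = np.array([5.0, 5.0, np.nan])   # third coordinate unknown

obs = ~np.isnan(x_missing)
# "BMU" lookup: nearest complete record, measured only in the observed dimensions.
d = np.linalg.norm(X[:, obs] - x_missing[obs], axis=1)
bmu = X[np.argmin(d)]
x_filled = np.where(obs, x_missing, bmu)   # copy missing entries from the BMU
```

A trained SOM replaces the raw records with learned prototype vectors, which smooths the lookup and enables the visualization the abstract mentions.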

8.
In this article, a two-layer mixture Bayesian probabilistic principal component analyser model is developed and proposed for fault detection. It is suitable for data-driven process monitoring applications where data with non-Gaussian distributions and temporal correlations are encountered. Model development involves modifying the original observation matrix to make it suitable for building dynamic models, followed by two stages of estimation. In the first stage, the data are divided into a manageable number of clusters, and in the second stage a mixture model is built over each cluster. This strategy provides a scalable mixture model that can have multiple local models. It has the potential to provide a parsimonious model and to be less susceptible to local optima than existing approaches that build mixture models in a single stage. Dimension reduction during estimation is automated using the Bayesian regularization approach. The proposed model essentially provides a probability density function for the training data. It is deployed for fault detection, and its performance is demonstrated on two real datasets: one from the oil sands industry and one publicly available experimental dataset.

9.
We develop a new biologically motivated algorithm for representing natural images using successive projections into complementary subspaces. An image is first projected into an edge subspace spanned by an ICA basis adapted to natural images, which captures the sharp features of an image such as edges and curves. The residual image obtained after extraction of the sharp features is approximated using a mixture of probabilistic principal component analyzers (MPPCA) model. The model is consistent with cellular, functional, information theoretic, and learning paradigms in visual pathway modeling. We demonstrate the efficiency of our model for representing different attributes of natural images, such as color and luminance. We compare the quality of representation against commonly used bases, such as the discrete cosine transform (DCT), independent component analysis (ICA), and principal component analysis (PCA), based on their entropies. Chrominance and luminance components of images are represented using codes having lower entropy than DCT, ICA, or PCA for similar visual quality. The model attains considerable simplification for learning from images by using a sparse independent code for representing edges and explicitly evaluating probabilities in the residual subspace.

10.
Probabilistic PCA Self-Organizing Maps
In this paper, we present a probabilistic neural model that extends Kohonen's self-organizing map (SOM) by performing a probabilistic principal component analysis (PPCA) at each neuron. Several SOMs have been proposed in the literature to capture the local principal subspaces, but our approach offers a probabilistic model with low complexity in the dimensionality of the input space. This allows very high-dimensional data to be processed to obtain reliable estimates of the probability densities, based on the PPCA framework. Experimental results show the map formation capabilities of the proposal with high-dimensional data and its potential in image and video compression applications.

11.
Monitoring methods based on mixture probabilistic principal component analysis (MPPCA) have several drawbacks: each sub-model must use the same number of principal components, the monitoring indices are inconsistent, and too many monitoring charts are produced. To address this, the MPPCA algorithm is improved and the model is built in two steps: first, a Gaussian mixture model (GMM) is estimated; then, probabilistic principal component analysis (PPCA) is used to build a principal component model for each sub-model. In the improved method, the number of principal components in each sub-model is chosen by considering both the explained variance ratio and its trend, and a PPCA-based monitoring scheme is introduced, which keeps the monitoring indices consistent and reduces the number of process monitoring charts.

12.
Starting from the nonlinear manifold structure of the acoustic feature space of speech signals, a new acoustic model for speech recognition is constructed using the principle of compressed sensing on manifolds. The feature space is partitioned into multiple local regions, each approximated by a low-dimensional factor analysis model, yielding a mixture of factor analyzers. The observation vectors of context-dependent states are constrained to lie on this nonlinear low-dimensional manifold, from which their observation probability model is derived. Each state is ultimately determined by a weight vector subject to a sparsity constraint and several low-dimensional local factor vectors following the standard normal distribution. The paper gives a criterion for determining the latent dimensionality of each local region and an iterative algorithm for estimating the model parameters. Continuous speech recognition experiments on the RM corpus show that, compared with the conventional Gaussian mixture model (GMM) and the subspace Gaussian mixture model (SGMM), the new acoustic model reduces the average word error rate (WER) on the test set by 33.1% and 9.2% relative, respectively.

13.
Fuzzy c-means (FCM)-type fuzzy clustering approaches are closely related to Gaussian mixture models (GMMs), and EM-like algorithms have been used in FCM clustering with regularized objective functions. In particular, FCM with regularization by Kullback–Leibler information (KLFCM) is a fuzzy counterpart of GMMs. In this paper, we propose to apply probabilistic principal component analysis (PCA) mixture models to linear clustering, following a discussion of the relationship between local PCA and linear fuzzy clustering. Although the proposed method is a constrained model of KLFCM, the algorithm includes the fuzzy c-varieties (FCV) algorithm as a special case, and it can be regarded as a modified FCV algorithm with regularization by K–L information. Numerical experiments demonstrate that the proposed clustering algorithm is more flexible than the maximum likelihood approaches and is useful for capturing local substructures properly.
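For reference, the plain FCM iteration that KLFCM and the proposed method regularize alternates a membership-weighted center update with the inverse-distance membership update. A minimal numpy sketch with fuzzifier m = 2 on two synthetic clusters (the data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(4, 0.3, (100, 2))])
c, m = 2, 2.0                          # number of clusters, fuzzifier
U = rng.random((len(X), c))
U /= U.sum(axis=1, keepdims=True)      # random fuzzy partition

for _ in range(100):
    Um = U ** m
    centers = (Um.T @ X) / Um.sum(axis=0)[:, None]           # weighted means
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
    # u_ik = d_ik^{-2/(m-1)} / sum_j d_jk^{-2/(m-1)}
    U = 1.0 / (d ** (2 / (m - 1)) * np.sum(d ** (-2 / (m - 1)),
                                           axis=1, keepdims=True))
```

KLFCM replaces the m-power fuzzification with a K–L entropy term, which is what makes the update coincide with the E-step of a GMM.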

14.
The nonlinear and multimodal characteristics in many manufacturing processes have posed some difficulties to regular multivariate statistical process control (MSPC) (e.g., principal component analysis (PCA)-based monitoring method) because a fundamental assumption is that the process data follow unimodal and Gaussian distribution. To explicitly address these important data distribution characteristics in some complicated processes, a novel manifold learning algorithm, joint local intrinsic and global/local variance preserving projection (JLGLPP) is proposed for information extraction from process data. Based on the features extracted by JLGLPP, local/nonlocal manifold regularization-based Gaussian mixture model (LNGMM) is proposed to estimate process data distributions with nonlinear and multimodal characteristics. A probabilistic indicator for quantifying process states is further developed, which effectively combines local and global information extracted from a baseline GMM. Thus, the JLGLPP and LNGMM-based monitoring model can be used effectively for online process monitoring under complicated working conditions. The experimental results illustrate that the proposed method effectively captures meaningful information hidden in the process signals and shows superior process monitoring performance compared to regular monitoring methods.

15.
Mixtures of factor analyzers enable model-based density estimation to be undertaken for high-dimensional data, where the number of observations n is small relative to their dimension p. However, this approach is sensitive to outliers as it is based on a mixture model in which the multivariate normal family of distributions is assumed for the component error and factor distributions. An extension to mixtures of t-factor analyzers is considered, whereby the multivariate t-family is adopted for the component error and factor distributions. An EM-based algorithm is developed for the fitting of mixtures of t-factor analyzers. Its application is demonstrated in the clustering of some microarray gene-expression data.
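The building block being mixed is the factor analyzer, fitted by EM: the E-step computes the posterior moments of the latent factor, and the M-step re-estimates the loadings and the diagonal noise. A sketch for a single Gaussian factor analyzer (the t-distributed extension additionally weights each observation by posterior degrees-of-freedom terms, omitted here); the dimensions and true parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
p, k, n = 4, 1, 2000
lam_true = np.array([[2.0], [1.5], [1.0], [0.5]])
Z = rng.normal(size=(n, k))
X = Z @ lam_true.T + rng.normal(scale=0.3, size=(n, p))

mu = X.mean(axis=0)
Xc = X - mu
S = Xc.T @ Xc / n

Lam = rng.normal(size=(p, k))   # loadings
Psi = np.ones(p)                # diagonal noise variances
for _ in range(200):
    # E-step: posterior covariance and mean of the latent factor.
    G = np.linalg.inv(Lam.T @ np.diag(1 / Psi) @ Lam + np.eye(k))
    beta = G @ Lam.T @ np.diag(1 / Psi)
    Ez = Xc @ beta.T
    Ezz = n * G + Ez.T @ Ez
    # M-step: re-estimate loadings and noise.
    Lam = (Xc.T @ Ez) @ np.linalg.inv(Ezz)
    Psi = np.diag(S - Lam @ (Ez.T @ Xc) / n)
```

A mixture version runs these updates per component, with responsibilities weighting each sample's contribution.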

16.
Unstructured road tracking based on principal component neural networks
Li Qing, Zheng Nanning, Ma Lin, Cheng Hong. Robot (《机器人》), 2005, 27(3): 247-251
Within a probabilistic framework, a new Monte Carlo road-tracking technique based on a principal component neural network is proposed for the navigation of autonomous land vehicles on unstructured roads. Road edges are represented by a straight-line road model, and their state is predicted with a second-order autoregressive model. In HSV color space, color information is combined with local spatial features, and the principal components are extracted by the principal component neural network. Based on the statistical properties of the road-edge windows, a particle filter estimates the road state. Experimental results show that the method tracks unstructured roads robustly.
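The particle-filter loop (predict with the state model, weight by the observation likelihood, resample) can be sketched for a single scalar road parameter. The noise levels and the Gaussian observation model below are illustrative stand-ins, not the paper's edge-window statistics or second-order AR model:

```python
import numpy as np

rng = np.random.default_rng(6)
T, N = 50, 1000
true_state = 2.0
obs = true_state + 0.5 * rng.normal(size=T)   # noisy observations of a fixed road parameter

particles = rng.normal(0.0, 5.0, size=N)      # broad prior over the state
for y in obs:
    particles += 0.05 * rng.normal(size=N)             # predict: random-walk process noise
    w = np.exp(-0.5 * ((y - particles) / 0.5) ** 2)    # weight: Gaussian likelihood
    w /= w.sum()
    particles = particles[rng.choice(N, size=N, p=w)]  # resample

estimate = particles.mean()
```

In the paper, each particle would carry the line-model parameters of a road edge and the likelihood would come from image statistics in the edge windows.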

17.
The Grassmann average subspace corresponds to the principal components of Gaussian data and solves the scalability problem of PCA, but the algorithm assumes that a sample's contribution is determined by its length, which may leave it strongly affected by outliers. To address this, sample weights are built from the local characteristics of the data in unsupervised learning, or from the class labels of the samples in supervised learning, yielding a sample-weighted Grassmann average algorithm. Experimental results on UCI data sets and the ORL face database show that the new algorithm is robust and improves the recognition rate by 1% to 2% over existing methods.
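The underlying iteration is the Grassmann average: repeatedly sign-align the unit-normalized samples with the current estimate and take their weighted mean. Below, the weights are the default sample lengths that the abstract criticizes; the proposed method would replace `w` with locality- or label-based weights. A numpy sketch on synthetic data with one dominant direction:

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 500, 3
direction = np.array([1.0, 0.0, 0.0])
X = rng.normal(size=(n, 1)) * direction + 0.1 * rng.normal(size=(n, d))

norms = np.linalg.norm(X, axis=1)
U = X / norms[:, None]      # unit-normalized samples
w = norms                   # default weights = sample lengths (outlier-sensitive)

q = rng.normal(size=d)
q /= np.linalg.norm(q)
for _ in range(50):
    s = np.sign(U @ q)      # align each sample's sign with the current estimate
    s[s == 0] = 1.0
    v = (w * s) @ U         # weighted mean of the aligned samples
    q = v / np.linalg.norm(v)
```

The estimate `q` converges (up to sign) to the dominant direction; down-weighting outlying samples changes only the `w` vector, leaving the iteration intact.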

18.
Independent factor analysis
We introduce the independent factor analysis (IFA) method for recovering independent hidden sources from their observed mixtures. IFA generalizes and unifies ordinary factor analysis (FA), principal component analysis (PCA), and independent component analysis (ICA), and can handle not only square noiseless mixing but also the general case where the number of mixtures differs from the number of sources and the data are noisy. IFA is a two-step procedure. In the first step, the source densities, mixing matrix, and noise covariance are estimated from the observed data by maximum likelihood. For this purpose we present an expectation-maximization (EM) algorithm, which performs unsupervised learning of an associated probabilistic model of the mixing situation. Each source in our model is described by a mixture of gaussians; thus, all the probabilistic calculations can be performed analytically. In the second step, the sources are reconstructed from the observed data by an optimal nonlinear estimator. A variational approximation of this algorithm is derived for cases with a large number of sources, where the exact algorithm becomes intractable. Our IFA algorithm reduces to the one for ordinary FA when the sources become gaussian, and to an EM algorithm for PCA in the zero-noise limit. We derive an additional EM algorithm specifically for noiseless IFA. This algorithm is shown to be superior to ICA since it can learn arbitrary source densities from the data. Beyond blind separation, IFA can be used for modeling multidimensional data by a highly constrained mixture of gaussians and as a tool for nonlinear signal encoding.
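The per-source density model inside IFA is a one-dimensional mixture of gaussians fitted by EM, which is what keeps all the probabilistic calculations analytic. A minimal sketch of that component on its own, with illustrative data:

```python
import numpy as np

rng = np.random.default_rng(8)
# A bimodal source density, as modeled per source inside IFA.
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 0.5, 300)])

pi, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for _ in range(100):
    # E-step: responsibilities of each component for each point.
    dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, variances.
    nk = r.sum(axis=0)
    pi = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
```

Full IFA couples one such model per source through the mixing matrix and noise covariance in a joint EM.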

19.
This paper extends the finite mixture of factor analyzers model to include common factors and categorical variables, based on the multivariate generalized linear model and the principle of maximum random utility from probabilistic choice theory. The EM algorithm and the Newton-Raphson algorithm are used to estimate the model parameters, and the approach is illustrated with a simulation study and a real example.

20.
Probabilistic models, including probabilistic principal component analysis (PPCA) and PPCA mixture models, have been successfully applied to statistical process monitoring. This paper reviews these two models and discusses some implementation issues that provide an alternative perspective on their application to process monitoring. A probabilistic contribution analysis method, based on the concept of missing variables, is then proposed to facilitate the diagnosis of the source behind detected process faults. The contribution analysis technique is demonstrated through its application to both PPCA and PPCA mixture models for the monitoring of two industrial processes. The results suggest that the proposed method, in conjunction with the PPCA model, can reduce the ambiguity in identifying the process variables that contribute to process faults. More importantly, it provides a fault identification approach for the PPCA mixture model, where conventional contribution analysis is not applicable.
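A common way such monitoring is operationalized, sketched here with plain PCA residual statistics rather than the paper's missing-variable method, is to flag samples whose squared prediction error (SPE) exceeds a quantile of the training SPEs and then read per-variable contributions off the residual. All data and thresholds below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(9)
# Training data: 4 variables driven by one latent factor (normal operation).
t = rng.normal(size=(500, 1))
X = t @ np.array([[1.0, 0.8, 0.6, 0.4]]) + 0.05 * rng.normal(size=(500, 4))

mu = X.mean(axis=0)
S = np.cov(X - mu, rowvar=False)
evals, evecs = np.linalg.eigh(S)
P = evecs[:, -1:]                    # retain one principal direction

def spe(x):
    """Squared prediction error: energy outside the retained subspace."""
    r = (x - mu) - P @ (P.T @ (x - mu))
    return r @ r

limit = np.quantile([spe(row) for row in X], 0.99)   # control limit from training data

# A normal sample vs. one with a sensor fault on variable 2.
x_ok = mu + 0.5 * P[:, 0]
x_fault = x_ok.copy()
x_fault[2] += 1.0

# Per-variable contributions to the fault's SPE point at the faulty sensor.
r = (x_fault - mu) - P @ (P.T @ (x_fault - mu))
contrib = r ** 2
```

The paper's missing-variable approach replaces these squared-residual contributions with a probabilistic reconstruction, which is what extends contribution analysis to the mixture case.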


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号