Found 20 similar articles (search time: 31 ms)
1.
Ahmed Fawzi Otoom Hatice Gunes Oscar Perez Concha Massimo Piccardi 《Pattern Analysis & Applications》2011,14(2):193-205
The curse of dimensionality hinders the effectiveness of density estimation in high dimensional spaces. Many techniques have
been proposed in the past to discover embedded, locally linear manifolds of lower dimensionality, including the mixture of
principal component analyzers, the mixture of probabilistic principal component analyzers and the mixture of factor analyzers.
In this paper, we propose a novel mixture model for reducing dimensionality based on a linear transformation which is neither
restricted to be orthogonal nor aligned along the principal directions. For experimental validation, we have used the proposed
model for classification of five “hard” data sets and compared its accuracy with that of other popular classifiers. The proposed
method outperformed the mixture of probabilistic principal component analyzers on four out of
the five compared data sets, with improvements ranging from 0.5 to 3.2%. Moreover, on all data sets, the accuracy achieved
by the proposed method exceeded that of the Gaussian mixture model, with improvements ranging from 0.2 to 3.4%.
2.
As the demand for colorization increases, so does the need for an automated technique. A solution to the color-picking task involves principal component analysis-based learning techniques such as a mixture model of probabilistic principal component analyzers and regressive PCA. Experimental results confirm the method's feasibility.
3.
There has been growing interest in subspace data modeling over the past few years. Methods such as principal component analysis, factor analysis, and independent component analysis have gained in popularity and have found many applications in image modeling, signal processing, and data compression, to name just a few. As applications and computing power grow, more and more sophisticated analyses and meaningful representations are sought. Mixture modeling methods have been proposed for principal and factor analyzers that exploit local gaussian features in the subspace manifolds. Meaningful representations may be lost, however, if these local features are nongaussian or discontinuous. In this article, we propose extending the gaussian analyzers mixture model to an independent component analyzers mixture model. We employ recent developments in variational Bayesian inference and structure determination to construct a novel approach for modeling nongaussian, discontinuous manifolds. We automatically determine the local dimensionality of each manifold and use variational inference to calculate the optimum number of ICA components needed in our mixture model. We demonstrate our framework on complex synthetic data and illustrate its application to real data by decomposing functional magnetic resonance images into features that are meaningful and medically useful.
4.
Mixtures of probabilistic principal component analyzers (MPPCA) have proven effective for modeling high-dimensional data sets living on non-linear manifolds. Briefly stated, they conduct mixture model estimation and dimensionality reduction through a single process. This paper makes two contributions. First, we disclose a Bayesian technique for estimating such mixture models. Second, assuming several MPPCA models are available, we address the problem of aggregating them into a single MPPCA model, which should be as parsimonious as possible. We disclose in detail how this can be achieved in a cost-effective way, without sampling or access to data, requiring only the mixture parameters. The proposed approach is based on a novel variational-Bayes scheme operating over model parameters. Numerous experimental results and discussion are provided.
5.
Mixture of local principal component analysis (PCA) has attracted attention due to a number of benefits over global PCA. The performance of a mixture model usually depends on the data partition and local linear fitting. In this paper, we propose a mixture model which has the properties of optimal data partition and robust local fitting. Data partition is realized by a soft competition algorithm called neural 'gas' and robust local linear fitting is approached by a nonlinear extension of PCA learning algorithm. Based on this mixture model, we describe a modular classification scheme for handwritten digit recognition, in which each module or network models the manifold of one of ten digit classes. Experiments demonstrate a very high recognition rate.
6.
Probabilistic two-dimensional principal component analysis and its mixture model for face recognition (total citations: 1, self-citations: 1, citations by others: 0)
Recently, two-dimensional principal component analysis (2DPCA) as a novel eigenvector-based method has proved to be an efficient technique for image feature extraction and representation. In this paper, by supposing a parametric Gaussian distribution over the image space (spanned by the row vectors of 2D image matrices) and a spherical Gaussian noise model for the image, we endow the 2DPCA with a probabilistic framework called probabilistic 2DPCA (P2DPCA), which is robust to noise. Further, by using the probabilistic perspective of P2DPCA, we extend the P2DPCA to a mixture of local P2DPCA models (MP2DPCA). The MP2DPCA offers a method for modeling faces in unconstrained (complex) environments. The model parameters can be fitted by maximum likelihood (ML) estimation via the expectation maximization (EM) algorithm. The experimental recognition results on the UMIST and AR face databases and on the face recognition (FR) data collected at the University of Essex confirm the effectiveness of the proposed methods.
7.
Based on independent component analysis (ICA) and self-organizing maps (SOM), this paper proposes an ISOM-DH model for
handling incomplete data in data mining. When the data remain dependent and non-Gaussian, this model
can make full use of the information in the given data to estimate the missing data and can visualize the handled high-dimensional
data. Compared with the mixture of principal component analyzers (MPCA), the mean method, and the standard SOM-based fuzzy map model, the ISOM-DH
model can be applied to more cases, demonstrating its superiority. The correctness and reasonableness of the ISOM-DH
model are also validated by the experiment carried out in this paper.
8.
In this article, a two-layer mixture Bayesian probabilistic principal component analyser model is developed and proposed for fault detection. It is suitable for data-driven process monitoring applications where data with non-Gaussian distributions and temporal correlations are encountered. Model development involves modifying the original observation matrix to make it suitable for building dynamic models, followed by two stages of estimation. In the first stage, the data are divided into a manageable number of clusters, and in the second stage, a mixture model is built over each cluster. This strategy provides a scalable mixture model that can have multiple local models. It has the potential to provide a parsimonious model and be less susceptible to local optima compared to the existing approaches that build mixture models in a single stage. Dimension reduction during the estimation is automated using the Bayesian regularization approach. The proposed model essentially provides a probability density function for the training data. It is deployed for fault detection, and its performance is demonstrated on two real datasets: one from the oil sands industry and the other a publicly available experimental dataset.
9.
Balakrishnan N Hariharakrishnan K Schonfeld D 《IEEE transactions on pattern analysis and machine intelligence》2005,27(9):1367-1378
We develop a new biologically motivated algorithm for representing natural images using successive projections into complementary subspaces. An image is first projected into an edge subspace spanned using an ICA basis adapted to natural images which captures the sharp features of an image like edges and curves. The residual image obtained after extraction of the sharp image features is approximated using a mixture of probabilistic principal component analyzers (MPPCA) model. The model is consistent with cellular, functional, information theoretic, and learning paradigms in visual pathway modeling. We demonstrate the efficiency of our model for representing different attributes of natural images like color and luminance. We compare the performance of our model in terms of quality of representation against commonly used bases, like the discrete cosine transform (DCT), independent component analysis (ICA), and principal components analysis (PCA), based on their entropies. Chrominance and luminance components of images are represented using codes having lower entropy than DCT, ICA, or PCA for similar visual quality. The model attains considerable simplification for learning from images by using a sparse independent code for representing edges and explicitly evaluating probabilities in the residual subspace.
10.
Probabilistic PCA Self-Organizing Maps (total citations: 1, self-citations: 0, citations by others: 1)
Lopez-Rubio E. Ortiz-de-Lazcano-Lobato J.M. Lopez-Rodriguez D. 《Neural Networks, IEEE Transactions on》2009,20(9):1474-1489
In this paper, we present a probabilistic neural model, which extends Kohonen's self-organizing map (SOM) by performing a probabilistic principal component analysis (PPCA) at each neuron. Several SOMs have been proposed in the literature to capture the local principal subspaces, but our approach offers a probabilistic model while having low complexity with respect to the dimensionality of the input space. This makes it possible to process very high-dimensional data and obtain reliable estimates of the probability densities, which are based on the PPCA framework. Experimental results are presented, which show the map formation capabilities of the proposal with high-dimensional data, and its potential in image and video compression applications.
11.
12.
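Starting from the nonlinear manifold structure of the acoustic feature space of speech signals, a new acoustic model for speech recognition is constructed using the principle of compressed sensing on manifolds. The feature space is divided into multiple local regions, and each local region is approximated by a low-dimensional factor analysis model, yielding a mixture of factor analyzers. The observation vectors of context-dependent states are constrained to lie on this nonlinear low-dimensional manifold structure, and the corresponding observation probability model is derived. Ultimately, each state is determined by a weight vector subject to a sparsity constraint and several low-dimensional local factor vectors that follow the standard normal distribution. A criterion for determining the latent dimensionality of the local regions and an iterative algorithm for estimating the model parameters are given. Continuous speech recognition experiments on the RM corpus show that, compared with the conventional Gaussian mixture model (GMM) and the subspace Gaussian mixture model (SGMM), the new acoustic model achieves relative reductions in average word error rate (WER) on the test set of 33.1% and 9.2%, respectively.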
13.
《Fuzzy Systems, IEEE Transactions on》2005,13(4):508-516
Fuzzy $c$-means (FCM)-type fuzzy clustering approaches are closely related to Gaussian mixture models (GMMs) and EM-like algorithms have been used in FCM clustering with regularized objective functions. Especially, FCM with regularization by Kullback–Leibler information (KLFCM) is a fuzzy counterpart of GMMs. In this paper, we propose to apply probabilistic principal component analysis (PCA) mixture models to linear clustering following a discussion on the relationship between local PCA and linear fuzzy clustering. Although the proposed method is a kind of the constrained model of KLFCM, the algorithm includes the fuzzy $c$-varieties (FCV) algorithm as a special case, and the algorithm can be regarded as a modified FCV algorithm with regularization by K–L information. Numerical experiments demonstrate that the proposed clustering algorithm is more flexible than the maximum likelihood approaches and is useful for capturing local substructures properly.
14.
The nonlinear and multimodal characteristics of many manufacturing processes pose difficulties for regular multivariate statistical process control (MSPC) (e.g., principal component analysis (PCA)-based monitoring methods), because a fundamental assumption of these methods is that the process data follow a unimodal Gaussian distribution. To explicitly address these important data distribution characteristics in some complicated processes, a novel manifold learning algorithm, joint local intrinsic and global/local variance preserving projection (JLGLPP), is proposed for information extraction from process data. Based on the features extracted by JLGLPP, a local/nonlocal manifold regularization-based Gaussian mixture model (LNGMM) is proposed to estimate process data distributions with nonlinear and multimodal characteristics. A probabilistic indicator for quantifying process states is further developed, which effectively combines local and global information extracted from a baseline GMM. Thus, the JLGLPP and LNGMM-based monitoring model can be used effectively for online process monitoring under complicated working conditions. The experimental results illustrate that the proposed method effectively captures meaningful information hidden in the process signals and shows superior process monitoring performance compared to regular monitoring methods.
15.
G.J. McLachlan R.W. Bean L. Ben-Tovim Jones 《Computational statistics & data analysis》2007,51(11):5327-5338
Mixtures of factor analyzers enable model-based density estimation to be undertaken for high-dimensional data, where the number of observations n is small relative to their dimension p. However, this approach is sensitive to outliers as it is based on a mixture model in which the multivariate normal family of distributions is assumed for the component error and factor distributions. An extension to mixtures of t-factor analyzers is considered, whereby the multivariate t-family is adopted for the component error and factor distributions. An EM-based algorithm is developed for the fitting of mixtures of t-factor analyzers. Its application is demonstrated in the clustering of some microarray gene-expression data.
16.
17.
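The Grassmann average subspace corresponds to the principal components of Gaussian data and addresses the scalability problem of PCA. However, the algorithm assumes that a sample's contribution depends on its length, which may leave the algorithm strongly affected by outliers. To address this, sample weights are built from the local characteristics of the data in unsupervised learning, or from the class information of the samples in supervised learning, yielding a sample-weighted Grassmann average algorithm. Experimental results on UCI data sets and the ORL face database show that the new algorithm is robust and that its recognition rate is 1% to 2% higher than that of existing methods.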
18.
Independent factor analysis (total citations: 19, self-citations: 0, citations by others: 19)
Attias H 《Neural computation》1999,11(4):803-851
We introduce the independent factor analysis (IFA) method for recovering independent hidden sources from their observed mixtures. IFA generalizes and unifies ordinary factor analysis (FA), principal component analysis (PCA), and independent component analysis (ICA), and can handle not only square noiseless mixing but also the general case where the number of mixtures differs from the number of sources and the data are noisy. IFA is a two-step procedure. In the first step, the source densities, mixing matrix, and noise covariance are estimated from the observed data by maximum likelihood. For this purpose we present an expectation-maximization (EM) algorithm, which performs unsupervised learning of an associated probabilistic model of the mixing situation. Each source in our model is described by a mixture of gaussians; thus, all the probabilistic calculations can be performed analytically. In the second step, the sources are reconstructed from the observed data by an optimal nonlinear estimator. A variational approximation of this algorithm is derived for cases with a large number of sources, where the exact algorithm becomes intractable. Our IFA algorithm reduces to the one for ordinary FA when the sources become gaussian, and to an EM algorithm for PCA in the zero-noise limit. We derive an additional EM algorithm specifically for noiseless IFA. This algorithm is shown to be superior to ICA since it can learn arbitrary source densities from the data. Beyond blind separation, IFA can be used for modeling multidimensional data by a highly constrained mixture of gaussians and as a tool for nonlinear signal encoding.
19.
This paper is devoted to extending common factors and categorical variables in the model of a finite mixture of factor analyzers based on the multivariate generalized linear model and the principle of maximum random utility in the probabilistic choice theory. The EM algorithm and Newton-Raphson algorithm are used to estimate model parameters, and then the algorithm is illustrated with a simulation study and a real example.
20.
《Control Engineering Practice》2009,17(4):469-477
Probabilistic models, including probabilistic principal component analysis (PPCA) and PPCA mixture models, have been successfully applied to statistical process monitoring. This paper reviews these two models and discusses some implementation issues that provide an alternative perspective on their application to process monitoring. Then a probabilistic contribution analysis method, based on the concept of missing variables, is proposed to facilitate the diagnosis of the source behind the detected process faults. The contribution analysis technique is demonstrated through its application to both PPCA and PPCA mixture models for the monitoring of two industrial processes. The results suggest that the proposed method, in conjunction with the PPCA model, can reduce the ambiguity in identifying the process variables that contribute to process faults. More importantly, it provides a fault identification approach for the PPCA mixture model, where conventional contribution analysis is not applicable.
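
For orientation, the PPCA and PPCA mixture models that recur throughout the abstracts above can be written in their standard textbook form. This is only a sketch: the notation ($d$, $q$, $\mathbf{W}$, $\sigma^2$, $\pi_k$, $K$, and so on) is an assumption chosen for illustration and is not taken from any of the listed papers, whose own formulations may differ.

```latex
% PPCA: a d-dimensional observation x is generated from a q-dimensional latent z (q < d)
\mathbf{x} = \mathbf{W}\mathbf{z} + \boldsymbol{\mu} + \boldsymbol{\epsilon},
\qquad \mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_q),
\qquad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I}_d)

% Marginal density of x after integrating out z
p(\mathbf{x}) = \mathcal{N}\!\left(\mathbf{x} \mid \boldsymbol{\mu}, \mathbf{C}\right),
\qquad \mathbf{C} = \mathbf{W}\mathbf{W}^{\top} + \sigma^2 \mathbf{I}_d

% Closed-form maximum-likelihood estimates, given the eigenvalues
% \lambda_1 \ge \dots \ge \lambda_d of the sample covariance, with U_q the
% top-q eigenvectors, \Lambda_q the top-q eigenvalues, R an arbitrary rotation
\sigma^2_{\mathrm{ML}} = \frac{1}{d-q} \sum_{j=q+1}^{d} \lambda_j,
\qquad \mathbf{W}_{\mathrm{ML}} = \mathbf{U}_q \left(\boldsymbol{\Lambda}_q - \sigma^2_{\mathrm{ML}} \mathbf{I}_q\right)^{1/2} \mathbf{R}

% Mixture of K PPCA components (MPPCA), typically fitted with EM
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \,
\mathcal{N}\!\left(\mathbf{x} \mid \boldsymbol{\mu}_k,\; \mathbf{W}_k \mathbf{W}_k^{\top} + \sigma_k^2 \mathbf{I}_d\right),
\qquad \sum_{k=1}^{K} \pi_k = 1, \quad \pi_k \ge 0
```

Each component covariance here needs on the order of $dq$ parameters rather than the $d(d+1)/2$ of a full covariance matrix, which is one reason such models are attractive for high-dimensional data.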