20 similar documents found.
1.
2.
Nikolaos Nasios, Adrian G. Bors. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2006, 36(4): 849–862
This paper proposes a joint maximum likelihood and Bayesian methodology for estimating Gaussian mixture models. In Bayesian inference, the distributions of the parameters are themselves modeled and characterized by hyperparameters. In the case of Gaussian mixtures, the parameter distributions are taken to be Gaussian for the mean, Wishart for the covariance, and Dirichlet for the mixing probability. The learning task consists of estimating the hyperparameters characterizing these distributions. The integration over the parameter space is decoupled using an unsupervised variational methodology called variational expectation-maximization (VEM). This paper introduces a hyperparameter initialization procedure for the training algorithm. In the first stage, distributions of the parameters resulting from successive runs of the expectation-maximization algorithm are formed. Afterward, maximum-likelihood estimators are applied to find appropriate initial values for the hyperparameters. The proposed initialization provides faster convergence, more accurate hyperparameter estimates, and better generalization for the VEM training algorithm. The proposed methodology is applied to blind signal detection and to color image segmentation.
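As a rough illustration of this initialization idea (not the authors' exact procedure), the sketch below pools the component means from several EM runs and fits a Gaussian hyperprior to them by maximum likelihood; sklearn's GaussianMixture is assumed as the EM implementation, and all names are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def init_mean_hyperparams(X, n_components, n_runs=10):
    """Run EM several times, then use the spread of the estimated
    component means to initialize a Gaussian hyperprior on the means."""
    all_means = []
    for seed in range(n_runs):
        gmm = GaussianMixture(n_components=n_components,
                              random_state=seed).fit(X)
        all_means.append(gmm.means_)
    all_means = np.vstack(all_means)        # (n_runs * n_components, dim)
    m0 = all_means.mean(axis=0)             # ML estimate of the hyper-mean
    S0 = np.cov(all_means, rowvar=False)    # ML estimate of the hyper-covariance
    return m0, S0
```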
3.
In this paper we introduce and illustrate non-trivial upper and lower bounds on the learning curves for one-dimensional Gaussian processes. The analysis is carried out emphasising the effects induced on the bounds by the smoothness of the random process, as described by the Modified Bessel and the Squared Exponential covariance functions. We explain the early, linearly decreasing behaviour of the learning curves and the bounds, and study the asymptotic behaviour of the curves. The effects of the noise level and the lengthscale on the tightness of the bounds are also discussed.
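For reference, standard one-dimensional forms of the two covariance functions named above are shown below (hyperparameter notation assumed; the modified Bessel covariance is given in its Matérn-type form, with $K_\nu$ the modified Bessel function of the second kind):

```latex
k_{\mathrm{SE}}(x, x') = \sigma^2 \exp\!\left(-\frac{(x - x')^2}{2\ell^2}\right),
\qquad
k_{\nu}(x, x') \propto \left(\frac{|x - x'|}{\ell}\right)^{\!\nu}
                       K_{\nu}\!\left(\frac{|x - x'|}{\ell}\right).
```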
4.
Yu-Ren Lai, Kuo-Liang Chung, Guei-Yin Lin, Chyou-Hwa Chen. Expert Systems with Applications, 2012, 39(8): 6720–6728
The current major theme in contrast enhancement is to partition the input histogram into multiple sub-histograms before final equalization of each sub-histogram is performed. This paper presents a novel contrast enhancement method based on Gaussian mixture modeling of image histograms, which provides a sound theoretical underpinning of the partitioning process. Our method comprises five major steps. First, the number of Gaussian functions to be used in the model is determined using a cost function of input histogram partitioning. Then the parameters of a Gaussian mixture model are estimated to find the best fit to the input histogram under a threshold. A binary search strategy is then applied to find the intersection points between the Gaussian functions. The intersection points thus found are used to partition the input histogram into a new set of sub-histograms, on which the classical histogram equalization (HE) is performed. Finally, a brightness preservation operation is performed to adjust the histogram produced in the previous step into a final one. Based on three representative test images, the experimental results demonstrate the contrast enhancement advantage of the proposed method when compared to twelve state-of-the-art methods in the literature.
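A minimal sketch of the middle steps (GMM fit, binary search for intersections) is given below; it assumes a 1-D numpy array of intensities and sklearn/scipy, and further assumes that the weighted densities of adjacent components cross exactly once between their means. All names are illustrative.

```python
import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

def partition_points(pixels, k):
    """Fit a 1-D GMM to intensities and locate the intersection point
    between each pair of adjacent components by binary search."""
    gmm = GaussianMixture(n_components=k, random_state=0).fit(pixels.reshape(-1, 1))
    order = np.argsort(gmm.means_.ravel())
    mu = gmm.means_.ravel()[order]
    sd = np.sqrt(gmm.covariances_.ravel()[order])
    w = gmm.weights_[order]
    cuts = []
    for i in range(k - 1):
        # the weighted-density difference changes sign at the intersection
        def diff(x, i=i):
            return (w[i] * norm.pdf(x, mu[i], sd[i])
                    - w[i + 1] * norm.pdf(x, mu[i + 1], sd[i + 1]))
        lo, hi = mu[i], mu[i + 1]
        for _ in range(60):                  # bisection
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if diff(mid) > 0 else (lo, mid)
        cuts.append(0.5 * (lo + hi))
    return cuts
```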
5.
International Journal of Computer Mathematics, 2012, 89(4): 584–588
A k-dominating set for a graph G(V, E) is a set of vertices D ⊆ V such that every vertex v ∈ V \ D is adjacent to at least k vertices in D. The k-domination number of G, denoted by γ_k(G), is the cardinality of a smallest k-dominating set of G. Here we establish lower and upper bounds on γ_k(C_m × C_n) for k = 2. In some cases these bounds agree, so that the exact 2-domination number is obtained.
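To make the definitions concrete, here is a brute-force sketch (names illustrative) that verifies 2-domination and computes γ_2 exactly for very small tori C_m × C_n; the exhaustive search is only feasible for tiny graphs.

```python
from itertools import combinations

def torus(m, n):
    """Adjacency of C_m x C_n: each vertex (i, j) has four cyclic neighbours."""
    return {(i, j): {((i - 1) % m, j), ((i + 1) % m, j),
                     (i, (j - 1) % n), (i, (j + 1) % n)}
            for i in range(m) for j in range(n)}

def is_k_dominating(adj, D, k=2):
    """Every vertex outside D must have at least k neighbours in D."""
    return all(len(adj[v] & D) >= k for v in adj if v not in D)

def k_domination_number(adj, k=2):
    """Cardinality of a smallest k-dominating set, by exhaustive search."""
    verts = list(adj)
    for size in range(1, len(verts) + 1):
        if any(is_k_dominating(adj, set(c), k)
               for c in combinations(verts, size)):
            return size

print(k_domination_number(torus(3, 3)))   # exact gamma_2 for a small case
```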
6.
The brain must deal with a massive flow of sensory information without receiving any prior information. Therefore, when creating cognitive models, it is important to acquire as much information as possible from the data itself. Moreover, the brain has to deal with an unknown number of components (concepts) contained in a dataset without any prior knowledge. Most of the algorithms in use today cannot effectively replicate this strategy. We propose a novel approach based on neural modelling fields theory (NMF) to overcome this problem. The algorithm combines NMF and greedy Gaussian mixture models; the novelty lies in combining an information criterion with the merging algorithm. The performance of the algorithm was compared with other well-known algorithms and tested on both artificial and real-world datasets.
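The paper's combination of an information criterion with a merging step is more elaborate than space allows here, but a minimal sketch of the underlying idea, choosing the number of mixture components by an information criterion (BIC via sklearn; names illustrative), might look like:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def select_components_bic(X, k_max=10):
    """Fit GMMs with 1..k_max components and keep the BIC-minimizing one."""
    best, best_bic = None, np.inf
    for k in range(1, k_max + 1):
        gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
        bic = gmm.bic(X)
        if bic < best_bic:
            best, best_bic = gmm, bic
    return best
```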
7.
In many practical applications, the performance of a learning algorithm is rarely determined by a single factor such as the complexity of the hypothesis space, the stability of the algorithm, or the quality of the data. This paper addresses the performance of the regularization algorithm associated with Gaussian kernels. The main purpose is to provide a framework for evaluating the generalization performance of the algorithm jointly in terms of hypothesis space complexity, algorithmic stability, and data quality. New bounds on the generalization error of the algorithm, measured by the regularization error and the sample error, are established. It is shown that the regularization error decays polynomially under certain conditions, and that the new bounds draw simultaneously on the uniform stability of the algorithm, the covering number of the hypothesis space, and the information in the data. As an application, the obtained results are applied to several special regularization algorithms, and some new results for these algorithms are deduced.
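Schematically, bounds of this kind rest on the standard error decomposition below (notation assumed; the paper's constants and exponents differ):

```latex
\mathcal{E}(f_{\mathbf{z}}) - \mathcal{E}(f_{\rho})
  \;\le\; \underbrace{D(\lambda)}_{\text{regularization error}}
        + \underbrace{S(\mathbf{z}, \lambda)}_{\text{sample error}},
\qquad
D(\lambda) = O(\lambda^{\beta}) \ \text{for some } \beta > 0.
```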
8.
Bounds on the number of samples needed for neural learning
The relationship between the number of hidden nodes in a neural network, the complexity of a multiclass discrimination problem, and the number of samples needed for effective learning is discussed. Bounds on the number of samples needed for effective learning are given. It is shown that Ω(min(d, n)·M) boundary samples are required for successful classification of M clusters of samples using a two-hidden-layer neural network with d-dimensional inputs and n nodes in the first hidden layer.
9.
Genetic-based EM algorithm for learning Gaussian mixture models
F. Pernkopf, D. Bouchaffra. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(8): 1344–1348
10.
The generalized Gaussian mixture model (GGMM) is a flexible and suitable tool for many computer vision and pattern recognition problems. However, the generalized Gaussian distribution is unbounded, while in many applications the observed data are digitized and have bounded support. This paper presents a new bounded generalized Gaussian mixture model (BGGMM), which includes the Gaussian mixture model (GMM), the Laplace mixture model (LMM), and the GGMM as special cases. The proposed extension of the generalized Gaussian distribution is flexible enough to fit different shapes of observed data, such as non-Gaussian and bounded-support data. To estimate the model parameters, we propose an alternate approach that minimizes an upper bound on the negative log-likelihood of the data. We quantify the performance of the BGGMM with simulations and real data.
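For reference, the unbounded generalized Gaussian density and a bounded variant obtained by truncation and renormalization take the standard forms below (the paper's exact parametrization may differ); β = 2 recovers the Gaussian case and β = 1 the Laplace case:

```latex
f(x \mid \mu, \alpha, \beta)
  = \frac{\beta}{2\alpha\,\Gamma(1/\beta)}
    \exp\!\left(-\left|\frac{x - \mu}{\alpha}\right|^{\beta}\right),
\qquad
f_{b}(x) = \frac{f(x)\,\mathbf{1}_{[a,b]}(x)}{\int_{a}^{b} f(t)\,dt}.
```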
11.
Nicola Greggio, Alexandre Bernardino, Cecilia Laschi, Paolo Dario, José Santos-Victor. Machine Vision and Applications, 2012, 23(4): 773–789
The expectation-maximization algorithm has classically been used to find maximum likelihood estimates of the parameters in probabilistic models with unobserved data, for instance mixture models. A key issue in such problems is the choice of model complexity: the more components in the mixture, the higher the data likelihood, but also the higher the computational burden and the risk of overfitting. In this work, we propose a clustering method based on the expectation-maximization algorithm that adapts the number of components of a finite Gaussian mixture model online from multivariate data. Our method estimates the number of components and their means and covariances sequentially, without requiring any careful initialization. The methodology starts from a single mixture component covering the whole data set and splits it incrementally during the expectation-maximization steps. The coarse-to-fine nature of the algorithm reduces the overall number of computations needed to reach a solution, which makes the method particularly suited to image segmentation applications whenever computational time is an issue. We show the effectiveness of the method in a series of experiments and compare it with a state-of-the-art alternative technique on both synthetic data and real images, including experiments with images acquired from the iCub humanoid robot.
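A common way to realize such a split is to displace two children along the component's first principal axis, as sketched below (a widely used heuristic, not necessarily the authors' exact rule; names illustrative):

```python
import numpy as np

def split_component(mean, cov, eps=0.5):
    """Split one Gaussian into two children displaced along its
    principal axis; eps controls the displacement magnitude."""
    vals, vecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    d = eps * np.sqrt(vals[-1]) * vecs[:, -1]  # principal direction
    return mean + d, mean - d
```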
12.
Bayesian inference and prediction are performed for a generalized autoregressive conditional heteroskedastic (GARCH) model whose innovations are assumed to follow a mixture of two Gaussian distributions. The mixture GARCH model can capture the patterns usually exhibited by many financial time series, such as volatility clustering, large kurtosis, and extreme observations. A Griddy-Gibbs sampler implementation is proposed for parameter estimation and volatility prediction. Bayesian prediction of the Value at Risk is also addressed, providing point estimates and predictive intervals. The method is illustrated using the Swiss Market Index.
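In its standard form (notation assumed), such a model combines a GARCH(1,1) variance recursion with two-component Gaussian mixture innovations:

```latex
y_t = \sigma_t \epsilon_t,
\qquad
\sigma_t^2 = \omega + \alpha\, y_{t-1}^2 + \beta\, \sigma_{t-1}^2,
\qquad
\epsilon_t \sim \rho\, \mathcal{N}(0, \lambda_1^2)
              + (1 - \rho)\, \mathcal{N}(0, \lambda_2^2).
```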
13.
Clustering is a useful tool for finding structure in a data set. The mixture likelihood approach is a popular clustering method, with the EM algorithm its most widely used implementation. However, the EM algorithm for Gaussian mixture models is quite sensitive to initial values, and the number of components must be given a priori. To resolve these drawbacks, we develop a robust EM clustering algorithm for Gaussian mixture models, first creating a new way to solve the initialization problems and then constructing a scheme to automatically obtain an optimal number of clusters. The proposed algorithm is therefore robust both to initialization and to differing cluster volumes, while determining the number of clusters automatically. Some experimental examples are used to compare our robust EM algorithm with existing clustering methods. The results demonstrate the superiority and usefulness of the proposed method.
14.
Semi-supervised Gaussian mixture model (SGMM) has been successfully applied to a wide range of engineering and scientific fields, including text classification, image retrieval, and biometric identification. Recently, many studies have shown that naturally occurring data may reside on or near manifold structures in ambient space. In this paper, we study the use of SGMM for data sets containing multiple separated or intersecting manifold structures. We propose a new multi-manifold regularized, semi-supervised Gaussian mixture model (M2SGMM) for classifying multiple manifolds. Specifically, we model the data manifold using a similarity graph with local and geometrical consistency properties. The geometrical similarity is measured by a novel application of local tangent space. We regularize the model parameters of the SGMM by incorporating the enhanced Laplacian of the graph. Experiments demonstrate the effectiveness of the proposed approach.
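In the standard manifold-regularization pattern (the paper's exact penalty built from the enhanced graph Laplacian may differ), the SGMM likelihood is augmented with a graph term that keeps the posteriors of nearby points similar:

```latex
\max_{\Theta}\;\; \mathcal{L}(\Theta)
  \;-\; \gamma \sum_{i, j} W_{ij}\,
        \bigl\| p(\cdot \mid x_i; \Theta) - p(\cdot \mid x_j; \Theta) \bigr\|^2 .
```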
15.
Hong Zeng. Pattern Recognition, 2009, 42(2): 243–250
With the wide applications of Gaussian mixture clustering, e.g., in semantic video classification [H. Luo, J. Fan, J. Xiao, X. Zhu, Semantic principal video shot classification via mixture Gaussian, in: Proceedings of the 2003 International Conference on Multimedia and Expo, vol. 2, 2003, pp. 189-192], it is a nontrivial task to select the useful features in Gaussian mixture clustering without class labels. This paper therefore proposes a new feature selection method, through which not only are the most relevant features identified, but the redundant features are also eliminated, so that the smallest relevant feature subset can be found. We integrate this method with our recently proposed Gaussian mixture clustering approach, namely the rival penalized expectation-maximization (RPEM) algorithm [Y.M. Cheung, A rival penalized EM algorithm towards maximizing weighted likelihood for density mixture clustering with automatic model selection, in: Proceedings of the 17th International Conference on Pattern Recognition, 2004, pp. 633-636; Y.M. Cheung, Maximum weighted likelihood via rival penalized EM for density mixture clustering with automatic model selection, IEEE Trans. Knowl. Data Eng. 17(6) (2005) 750-761], which is able to determine the number of components (i.e., the model order) in a Gaussian mixture automatically. The data clustering, model selection, and feature selection are thus all performed in a single learning process. Experimental results have shown the efficacy of the proposed approach.
16.
Bayesian feature and model selection for Gaussian mixture models
C. Constantinopoulos, M. K. Titsias, A. Likas. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(6): 1013–1018
We present a Bayesian method for mixture model training that simultaneously treats the feature selection and the model selection problem. The method is based on the integration of a mixture model formulation that takes into account the saliency of the features and a Bayesian approach to mixture learning that can be used to estimate the number of mixture components. The proposed learning algorithm follows the variational framework and can simultaneously optimize over the number of components, the saliency of the features, and the parameters of the mixture model. Experimental results using high-dimensional artificial and real data illustrate the effectiveness of the method.
17.
Gaussian mixture models (GMMs) have been broadly applied for fitting probability density functions. However, due to the intrinsic linearity of GMMs, many components are usually needed to fit the data distribution appropriately when the data cloud contains curve manifolds.
To solve this problem and better represent data with curve manifolds, in this paper we propose a new nonlinear probability model, called the active curve axis Gaussian model. Intuitively, this model can be imagined as a Gaussian model bent along its first principal axis. The EM algorithm is employed to estimate the parameters of mixtures of this model.
Experiments on synthetic data and Chinese characters show that the proposed nonlinear mixture models can approximate distributions of data clouds with curve manifolds in a more concise and compact way than GMMs do, and their performance is promising.
18.
Imputation through finite Gaussian mixture models
Marco Di Zio, Ugo Guarnera, Orietta Luzi. Computational Statistics & Data Analysis, 2007, 51(11): 5305–5316
Imputation is a widely used method for handling missing data: it consists of replacing missing values with plausible ones. Both parametric and nonparametric techniques are generally adopted for modelling incomplete data, and each has advantages and drawbacks: parametric techniques are parsimonious but depend on the assumed model, while nonparametric techniques are more flexible but require a large number of observations. The use of finite mixtures of multivariate Gaussian distributions for handling missing data is proposed, the main reason being that they allow one to control the trade-off between parsimony and flexibility. An experimental comparison with the widely used nearest-neighbour donor imputation is illustrated.
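For a single fitted Gaussian component (the mixture case applies the same formula per component and averages with posterior weights), imputation reduces to the conditional mean of the missing block given the observed one. A minimal sketch, with illustrative names:

```python
import numpy as np

def conditional_mean_impute(x, mu, Sigma):
    """Replace NaN entries of x with the conditional mean
    E[x_m | x_o] = mu_m + S_mo S_oo^{-1} (x_o - mu_o)."""
    miss = np.isnan(x)
    obs = ~miss
    S_oo = Sigma[np.ix_(obs, obs)]
    S_mo = Sigma[np.ix_(miss, obs)]
    out = x.copy()
    out[miss] = mu[miss] + S_mo @ np.linalg.solve(S_oo, x[obs] - mu[obs])
    return out
```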
19.
This paper presents a new extension of Gaussian mixture models (GMMs) based on type-2 fuzzy sets (T2 FSs), referred to as T2 FGMMs. The estimated parameters of a GMM may not accurately reflect the underlying distributions of the observations because of insufficient and noisy data in real-world problems. Through the three-dimensional membership functions of T2 FSs, T2 FGMMs use the footprint of uncertainty (FOU) as well as interval secondary membership functions to handle the GMM's uncertain mean vector or uncertain covariance matrix, so that the GMM's parameters can vary anywhere in an interval with uniform possibilities. As a result, the likelihood of the T2 FGMM becomes an interval rather than a precise real number, accounting for the GMM's uncertainty. These interval likelihoods are then processed by a generalized linear model (GLM) for classification decision-making. In this paper we focus on the role of the FOU in pattern classification. Multi-category classification on different data sets from the UCI repository shows that T2 FGMMs are consistently as good as or better than GMMs when training data are insufficient, and are also insensitive to different areas of the FOU. Based on T2 FGMMs, we extend hidden Markov models (HMMs) to type-2 fuzzy HMMs (T2 FHMMs). Phoneme classification in babble noise shows that T2 FHMMs outperform classical HMMs in terms of robustness and classification rate. We also find that a larger FOU area in T2 FHMMs with uncertain mean vectors performs better in classification when the signal-to-noise ratio is lower.
20.
Context in time series is one of the most useful and interesting characteristics for machine learning; in some cases the dynamic characteristic may be the only basis on which classification is possible. A novel neural network, named the recurrent log-linearized Gaussian mixture network (R-LLGMN), is proposed in this paper for the classification of time series. The structure of this network is based on a hidden Markov model (HMM), which has been well developed in the area of speech recognition. R-LLGMN can also be interpreted as an extension of a probabilistic neural network using a log-linearized Gaussian mixture model, in which recurrent connections have been incorporated to make use of temporal information. Simulation experiments are carried out to compare R-LLGMN with the traditional HMM estimator as a classifier, and pattern classification experiments on EEG signals are conducted. These experiments indicate that R-LLGMN can successfully classify not only artificial data but also real biological data such as EEG signals.