Similar Articles
20 similar articles found.
1.
GMM Parameter Estimation Based on a Split EM Algorithm
The expectation maximization (EM) algorithm is an iterative method for maximum likelihood parameter estimation and is commonly used to estimate the parameters of mixture density models. Its main drawbacks are that parameter initialization depends on prior knowledge and that the iterations easily converge to a local maximum. This paper proposes a new GMM parameter estimation algorithm based on a split EM algorithm: starting from a single, deterministically initialized Gaussian, the method gradually splits components and estimates the mixture parameters during EM optimization, which alleviates the problem of convergence to local maxima. Extensive experiments show that, compared with existing parameter estimation algorithms, the proposed algorithm achieves good computational efficiency and estimation accuracy.
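The split strategy described in this abstract can be sketched in one dimension: fit a single Gaussian to all the data, repeatedly split the broadest component, and refine with EM after each split. This is a minimal illustration rather than the paper's actual splitting criterion; the helper names `em_gmm` and `split_em` are hypothetical.

```python
import numpy as np

def em_gmm(x, means, variances, weights, iters=50):
    """Plain EM for a 1-D Gaussian mixture (illustrative helper)."""
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        dens = np.array([w * np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)
                         for m, v, w in zip(means, variances, weights)])
        resp = dens / dens.sum(axis=0, keepdims=True)
        # M-step: re-estimate parameters from responsibilities
        nk = resp.sum(axis=1)
        means = (resp @ x) / nk
        variances = (resp * (x - means[:, None]) ** 2).sum(axis=1) / nk
        weights = nk / len(x)
    return means, variances, weights

def split_em(x, k):
    """Start from one Gaussian and split the broadest component until k."""
    means = np.array([x.mean()])
    variances = np.array([x.var()])
    weights = np.array([1.0])
    while len(means) < k:
        j = np.argmax(variances)          # split the widest component
        d = np.sqrt(variances[j])
        means = np.append(np.delete(means, j), [means[j] - d, means[j] + d])
        variances = np.append(np.delete(variances, j), [variances[j], variances[j]])
        w = weights[j] / 2
        weights = np.append(np.delete(weights, j), [w, w])
        means, variances, weights = em_gmm(x, means, variances, weights)
    return means, variances, weights
```

Because every run starts from the same single Gaussian, the procedure is deterministic given the data, which is one appeal of split-style initialization over random restarts.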

2.
This paper formulates a novel expectation maximization (EM) algorithm for the mixture of multivariate t-distributions. By introducing a new kind of “missing” data, we show that the empirically improved iterative algorithm in the literature for the mixture of multivariate t-distributions is in fact a type of EM algorithm; a theoretical analysis is thus established, which guarantees that the empirical algorithm converges to the maximum likelihood estimates of the mixture parameters. Simulated and real experiments on classification and image segmentation confirm the effectiveness of the improved EM algorithm.

3.
This paper focuses on the Bayesian posterior mean estimates (or Bayes’ estimates) of the parameter set of Poisson hidden Markov models, in which the observation sequence is generated by a Poisson distribution whose parameter depends on the underlying discrete-time, time-homogeneous Markov chain. Although the most commonly used procedures for obtaining parameter estimates for hidden Markov models are versions of the expectation maximization and Markov chain Monte Carlo approaches, this paper exhibits an algorithm for calculating the exact posterior mean estimates which, although still cumbersome, has polynomial rather than exponential complexity and is a feasible alternative for small-scale models and data sets. The paper also presents simulation results comparing the posterior mean estimates obtained by this algorithm with the maximum likelihood estimates obtained by the expectation maximization approach.

4.
We apply the idea of averaging ensembles of estimators to probability density estimation. In particular, we use Gaussian mixture models, which are important components in many neural-network applications. We investigate the performance of averaging using three data sets. For comparison, we employ two traditional regularization approaches: a maximum penalized likelihood approach and a Bayesian approach. In the maximum penalized likelihood approach we use penalty functions derived from conjugate Bayesian priors such that an expectation maximization (EM) algorithm can be used for training. In all experiments, the maximum penalized likelihood approach and averaging improved performance considerably compared with a maximum likelihood approach. In two of the experiments, the maximum penalized likelihood approach outperformed averaging. In one experiment averaging was clearly superior. Our conclusion is that maximum penalized likelihood gives good results if the penalty term in the cost function is appropriate for the particular problem. If this is not the case, averaging is superior since it shows greater robustness by not relying on any particular prior assumption. The Bayesian approach worked very well on a low-dimensional toy problem but failed to give good performance in higher-dimensional problems.

5.
Multi-level nonlinear mixed effects (ML-NLME) models have received a great deal of attention in recent years because of the flexibility they offer in handling the repeated-measures data arising from various disciplines. In this study, we propose both maximum likelihood and restricted maximum likelihood estimation of ML-NLME models with two-level random effects, using first order conditional expansion (FOCE) and the expectation–maximization (EM) algorithm. The FOCE–EM algorithm was compared with the most popular Lindstrom and Bates (LB) method in terms of computational and statistical properties. Basal area growth series data measured from Chinese fir (Cunninghamia lanceolata) experimental stands and simulated data were used for evaluation. The FOCE–EM and LB algorithms gave the same parameter estimates and fit statistics for models that converged under both. However, FOCE–EM converged for all the models, while LB did not, especially for models in which two-level random effects are simultaneously considered in several base parameters to account for between-group variation. We recommend the use of FOCE–EM in ML-NLME models, particularly when convergence is a concern in model selection.

6.
For multimode processes, the Gaussian mixture model (GMM) has been applied in the last few years to estimate the probability density function of process data under normal operating conditions. However, learning a GMM with the expectation maximization (EM) algorithm from process data can be difficult or even infeasible for high-dimensional and collinear process variables. To address this issue, a novel multimode process monitoring approach based on a PCA mixture model is proposed. First, the PCA technique is applied directly to the covariance matrix of each Gaussian component to reduce the dimension of the process variables and to obtain nonsingular covariance matrices. Then the Bayesian Ying-Yang incremental EM algorithm is adopted to automatically optimize the number of mixture components. With the obtained PCA mixture model, a novel process monitoring scheme is derived for fault detection of multimode processes. Three case studies are provided to evaluate the monitoring performance of the proposed method.

7.
Gaussian mixture models (GMM), commonly used in pattern recognition and machine learning, provide a flexible probabilistic model for data. The conventional expectation–maximization (EM) algorithm for maximum likelihood estimation of GMM parameters is very sensitive to initialization and easily gets trapped in local maxima. Stochastic search algorithms have been popular alternatives for global optimization, but their use for GMM estimation has been limited to constrained models with identity or diagonal covariance matrices. Our major contributions in this paper are twofold. First, we present a novel parametrization for arbitrary covariance matrices that allows independent updating of individual parameters while retaining the validity of the resulting matrices. Second, we propose an effective parameter matching technique to mitigate the issues arising from the existence of multiple candidate solutions that are equivalent under permutations of the GMM components. Experiments on synthetic and real data sets show that the proposed framework has robust performance and achieves significantly higher likelihood values than the EM algorithm.
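One standard way to parametrize an arbitrary covariance matrix with independently adjustable, unconstrained parameters is through its Cholesky factor with a log-transformed diagonal; any real-valued parameter vector then maps to a valid symmetric positive definite matrix. This is a common scheme offered only as an illustration; the parametrization in the paper above may differ.

```python
import numpy as np

def params_to_cov(theta, d):
    """Map an unconstrained vector of length d*(d+1)//2 to a valid
    covariance matrix via its Cholesky factor. Each entry of theta can be
    perturbed independently without breaking positive definiteness."""
    L = np.zeros((d, d))
    L[np.tril_indices(d)] = theta          # fill the lower triangle
    # exponentiate the diagonal so it is strictly positive
    L[np.diag_indices(d)] = np.exp(np.diag(L))
    return L @ L.T                         # symmetric positive definite
```

A stochastic search can then mutate `theta` freely, mapping back to a covariance matrix only when the likelihood needs evaluating.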

8.
9.
Unsupervised data clustering can be addressed by the estimation of mixture models, where the mixture components are associated with clusters in data space. In this paper we present a novel unsupervised classification algorithm based on the simultaneous estimation of the mixture’s parameters and the number of components (complexity). Its distinguishing aspect is the way the data space is searched. Our algorithm starts from a single component covering all the input space and iteratively splits components according to a breadth-first search on a binary tree structure, which provides an efficient exploration of the possible solutions. The proposed scheme achieves important computational savings with respect to other state-of-the-art algorithms, making it particularly suited to scenarios where running time is an issue, such as computer and robot vision applications. The initialization procedure is unique, allowing a deterministic evolution of the algorithm, while the parameter estimation is performed with a modification of the expectation maximization algorithm. To compare models of different complexity we use the minimum message length criterion, which implements the trade-off between the number of components and the data log-likelihood. We validate the new approach with experiments on synthetic data, and we test its computational efficiency against related approaches in data-intensive image segmentation applications.

10.
Wang Xu, Ju Ying. 《数字社区&智能家居》 2014, (4): 2363–2366, 2377
Tuberculosis is a disease that seriously endangers human health. Automatic detection and counting of tubercle bacilli by means of computer image processing can greatly improve diagnostic efficiency. The Gaussian mixture model extends the single Gaussian distribution by fitting the given data samples with a weighted combination of several Gaussians; once the fitting parameters are determined, the class-membership probability of each sample follows. This paper first preprocesses the image with a vector quantization algorithm to reduce the amount of data to be processed, then extracts feature components from the HSV, CIE L*a*b*, and YCbCr color spaces and feeds them into a Gaussian mixture model for training. According to the experimental results, the Gaussian mixture model is more accurate than other unsupervised classification algorithms (such as K-means) and, compared with supervised classification algorithms (such as naive Bayes), simplifies the preparation of training samples, giving it certain advantages.

11.
EM-Based Histogram Approximation and Its Applications
Since a histogram summarizes the statistics of image gray levels or other components, histogram analysis is a practical tool in image processing. Histogram approximation, one such analysis method, typically fits the histogram with a number of Gaussian distribution functions. The difficulty lies in obtaining the parameters of each Gaussian; one way to solve this is to recast histogram approximation as a mixture-model parameter estimation problem in statistics. This paper first solves the problem with the EM (expectation maximization) method, then describes applications of EM-based histogram approximation to optimal thresholding and histogram component analysis.
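The histogram-approximation idea can be sketched with a count-weighted EM that fits a two-component Gaussian mixture directly to the histogram bins; an optimal threshold can then be placed between the two fitted means. A minimal one-dimensional sketch, not the paper's exact procedure (the function name `histogram_em` is hypothetical):

```python
import numpy as np

def histogram_em(centers, counts, iters=100):
    """Fit a 2-component Gaussian mixture to a histogram given by bin
    centers and counts, weighting each bin by its relative frequency."""
    w = counts / counts.sum()
    mu = np.array([centers[0], centers[-1]], dtype=float)
    var = np.array([centers.var()] * 2)
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step over bins instead of raw pixels
        dens = np.array([p * np.exp(-(centers - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)
                         for m, v, p in zip(mu, var, pi)])
        resp = dens / dens.sum(axis=0, keepdims=True)
        # M-step, with bin frequencies as weights
        nk = (resp * w).sum(axis=1)
        mu = (resp * w * centers).sum(axis=1) / nk
        var = (resp * w * (centers - mu[:, None]) ** 2).sum(axis=1) / nk
        pi = nk
    return mu, var, pi
```

Working on bins rather than raw pixels makes each EM iteration cost proportional to the number of bins, which is why histogram fitting is attractive for thresholding.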

12.
Mixture model based clustering (also simply called model-based clustering hereinafter) consists of fitting a mixture model to data and identifying each cluster with one of its components. This paper tackles the model selection and parameter estimation problems in model-based clustering so as to improve clustering performance on data sets whose true kernel distribution functions are not in the assumed family, as well as on data with inherently overlapping clusters. Tailored to clustering applications, an effective model selection criterion is first proposed. Unlike most criteria, which only measure the model's goodness-of-fit to the observed data, the proposed one also evaluates whether the candidate model provides a reasonable partition of the observed data, which enforces a model with well-separated components. Accordingly, an improved method for the estimation of mixture parameters is derived, which aims to suppress the spurious estimates produced by the standard expectation maximization (EM) algorithm and to enforce well-supported components in the mixture model. Finally, the parameter estimation and model selection are integrated in a single algorithm that favors a compact mixture model with both well-supported and well-separated components. Extensive experiments on synthetic and real-world data sets show the effectiveness of the proposed approach to mixture model based clustering.

13.
This paper presents a new approach to estimating mixture models based on a recent inference principle we have proposed: the latent maximum entropy principle (LME). LME is different from Jaynes' maximum entropy principle, standard maximum likelihood, and maximum a posteriori probability estimation. We demonstrate the LME principle by deriving new algorithms for mixture model estimation, and show how robust new variants of the expectation maximization (EM) algorithm can be developed. We show that a regularized version of LME (RLME) is effective at estimating mixture models. It generally yields better results than plain LME, which in turn is often better than maximum likelihood and maximum a posteriori estimation, particularly when inferring latent variable models from small amounts of data.

14.
This paper presents a novel method for intensity normalization of DaTSCAN SPECT brain images. The proposed methodology is based on Gaussian mixture models (GMMs) and considers not only the intensity levels, but also the coordinates of voxels inside the so-defined spatial Gaussian functions. The model parameters are obtained according to a maximum likelihood criterion employing the expectation maximization (EM) algorithm. First, an averaged control subject image is computed to obtain a threshold-based mask that selects only the voxels inside the skull. Then, the GMM is obtained for the DaTSCAN-SPECT database, performing space quantization by populating it with Gaussian kernels whose linear combination approximates the image intensity. According to a probability threshold that measures the weight of each kernel or “cluster” in the striatum area, the voxels in the non-specific region are intensity-normalized by removing clusters whose likelihood is negligible.

15.
We deal with the parameter estimation problem for probability density models with latent variables. Traditionally the expectation maximization (EM) algorithm has been widely used for this problem. However, it suffers from bad local maxima, and the quality of the estimator is sensitive to the initial model choice. Recently, an alternative density estimator has been proposed that is based on matching the sample-averaged and model-averaged moments. This moment matching estimator is typically used as the initial iterate for the EM algorithm for further refinement. However, there is no guarantee that the EM-refined estimator still yields moments close enough to the sample-averaged ones. Motivated by this issue, in this paper we propose a novel estimator that combines the merits of both worlds: we perform likelihood maximization, but the moment discrepancy score is used as a regularizer that prevents the model-averaged moments from straying away from those estimated from data. On some crowd-sourcing label prediction problems, we demonstrate that the proposed approach yields more accurate density estimates than the existing estimators.

16.
Efficient greedy learning of gaussian mixture models
This article concerns the greedy learning of gaussian mixtures. In the greedy approach, mixture components are inserted into the mixture one after the other. We propose a heuristic for searching for the optimal component to insert. In a randomized manner, a set of candidate new components is generated. For each of these candidates, we find the locally optimal new component and insert it into the existing mixture. The resulting algorithm resolves the sensitivity to initialization of state-of-the-art methods, like expectation maximization, and has running time linear in the number of data points and quadratic in the (final) number of mixture components. Due to its greedy nature, the algorithm can be particularly useful when the optimal number of mixture components is unknown. Experimental results comparing the proposed algorithm to other methods on density estimation and texture segmentation are provided.
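The greedy insertion step can be sketched in one dimension: generate random candidate components, tentatively mix each into the current model with a small weight, and keep the candidate giving the highest log-likelihood. This is a simplified sketch under assumed heuristics for the candidate scale and insertion weight; the article additionally optimizes each candidate locally with partial EM steps before insertion.

```python
import numpy as np

def loglik(x, means, sds, weights):
    """Log-likelihood of data x under a 1-D Gaussian mixture."""
    dens = sum(w * np.exp(-(x - m) ** 2 / (2 * s ** 2)) / (s * np.sqrt(2 * np.pi))
               for m, s, w in zip(means, sds, weights))
    return np.log(dens).sum()

def greedy_insert(x, means, sds, weights, n_candidates=20, rng=None):
    """Try several random candidate components and insert the one that
    maximizes the mixture likelihood (simplified; no partial EM here)."""
    rng = rng if rng is not None else np.random.default_rng()
    best, best_ll = None, -np.inf
    for _ in range(n_candidates):
        m = rng.choice(x)                     # candidate mean drawn from the data
        s = x.std() / len(means) ** 0.5       # heuristic candidate scale (assumption)
        a = 0.5 / (len(means) + 1)            # heuristic insertion weight (assumption)
        cand = (np.append(means, m), np.append(sds, s),
                np.append(weights * (1 - a), a))
        ll = loglik(x, *cand)
        if ll > best_ll:
            best, best_ll = cand, ll
    return best
```

Repeating this insertion until a stopping criterion is met yields the component-by-component growth that makes the method useful when the final number of components is unknown.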

17.
The current computational power and some recently developed algorithms allow a new automatic spectral analysis method for randomly missing data. Accurate spectra and autocorrelation functions are computed from the estimated parameters of time series models, without user interaction. If only a few data are missing, the accuracy is almost the same as when all observations were available. For larger missing fractions, low-order time series models can still be estimated with a good accuracy if the total observation time is long enough. Autoregressive models are best estimated with the maximum likelihood method if data are missing. Maximum likelihood estimates of moving average and of autoregressive moving average models are not very useful with missing data. Those models are found most accurately if they are derived from the estimated parameters of an intermediate autoregressive model. With statistical criteria for the selection of model order and model type, a completely automatic and numerically reliable algorithm is developed that estimates the spectrum and the autocorrelation function in randomly missing data problems. The accuracy was better than what can be obtained with other methods, including the famous expectation–maximization (EM) algorithm.

18.
The analysis of incomplete longitudinal data requires joint modeling of the longitudinal outcomes (observed and unobserved) and the response indicators. When non-response does not depend on the unobserved outcomes, within a likelihood framework, the missingness is said to be ignorable, obviating the need to formally model the process that drives it. For the non-ignorable or non-random case, estimation is less straightforward, because one must work with the observed data likelihood, which involves integration over the missing values, thereby giving rise to computational complexity, especially for high-dimensional missingness. The stochastic EM (SEM) algorithm is a variation of the expectation-maximization (EM) algorithm and is particularly useful in cases where the E (expectation) step is intractable. Under the stochastic EM algorithm, the E-step is replaced by an S-step, in which the missing data are simulated from an appropriate conditional distribution. The method is appealing due to its computational simplicity. The SEM algorithm is used to fit non-random models for continuous longitudinal data with monotone or non-monotone missingness, using simulated as well as case study data. The resulting SEM estimates are compared with their direct likelihood counterparts wherever possible.
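The S-step idea can be illustrated on a toy problem: a univariate normal sample with values missing completely at random, where the S-step draws the missing values from the current fit and the M-step re-estimates the parameters from the completed data. This toy model is for illustration only; the paper above applies SEM to far richer longitudinal models with non-random missingness.

```python
import numpy as np

def stochastic_em(y, missing, iters=200, rng=None):
    """Stochastic EM for a univariate normal with data missing completely
    at random. S-step: draw missing values from the current N(mu, var).
    M-step: re-estimate (mu, var) from the completed sample."""
    rng = rng if rng is not None else np.random.default_rng(0)
    mu, var = y[~missing].mean(), y[~missing].var()
    for _ in range(iters):
        z = y.copy()
        z[missing] = rng.normal(mu, np.sqrt(var), missing.sum())  # S-step
        mu, var = z.mean(), z.var()                               # M-step
    return mu, var
```

Unlike deterministic EM, the SEM iterates fluctuate around the solution rather than converging to a point, so in practice one averages over late iterations or takes the final iterate as an approximate estimate.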

19.
An incremental finite mixture model is proposed to extract multi-target states in the sequential Monte Carlo implementation of the probability hypothesis density (PHD) filter. The model is built incrementally, with mixture components inserted one at a time, and the multi-target states are estimated under the maximum likelihood criterion. For a mixture with a given number of components, the expectation maximization algorithm is applied to obtain the maximum likelihood parameter estimates. When a new component is inserted, the parameters of the existing mixture are kept fixed, and the inserted component is again selected from a candidate set by the maximum likelihood criterion. The component-insertion step and the EM parameter-fitting step alternate until the number of components reaches the PHD filter's estimate of the number of targets. A k-d tree is used to generate the candidate set of new components to insert into the mixture. The incremental finite mixture model unifies the growth in the number of components with the growth of the particle-set likelihood, helping the search for the maximum likelihood solution of the mixture step by step. Simulation results show that PHD filter state extraction based on the incremental finite mixture model outperforms existing state extraction algorithms in multi-target tracking applications.

20.