Similar Literature
20 similar records found.
1.
A mixture vector autoregressive model has recently been introduced to the literature. Although this model is a promising candidate for nonlinear multiple time series modeling, the high dimensionality of its parameters and the lack of a method for computing the standard errors of estimates limit its application to real data. The contribution of this paper is threefold. First, a form of parameter constraints is introduced, together with an efficient EM algorithm for estimation. Second, an accurate method for computing standard errors is presented for the model with and without parameter constraints. Lastly, a hypothesis-testing approach based on likelihood ratio tests is proposed, which aids in identifying unnecessary parameters and leads to greater efficiency in estimation. A case study employing U.S. Treasury constant maturity rates illustrates the applicability of the mixture vector autoregressive model with parameter constraints, and the importance of using a reliable method to compute standard errors.
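To make the likelihood structure concrete, the following minimal sketch (illustrative only: a two-regime first-order mixture VAR with made-up parameter names, not the authors' code) evaluates the conditional log-likelihood that the EM algorithm maximizes:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvar_loglik(y, alphas, intercepts, coefs, covs):
    """Conditional log-likelihood of a K-component mixture VAR(1).

    y          : (T, d) array of observations
    alphas     : (K,) mixing weights, summing to one
    intercepts : list of K (d,) intercept vectors
    coefs      : list of K (d, d) autoregressive matrices
    covs       : list of K (d, d) innovation covariance matrices
    """
    T, d = y.shape
    K = len(alphas)
    ll = 0.0
    for t in range(1, T):
        # Mixture of Gaussian conditional densities given y[t-1]
        dens = sum(
            alphas[k] * multivariate_normal.pdf(
                y[t], mean=intercepts[k] + coefs[k] @ y[t - 1], cov=covs[k])
            for k in range(K))
        ll += np.log(dens)
    return ll
```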

2.
Unsupervised learning of finite mixture models
This paper proposes an unsupervised algorithm for learning a finite mixture model from multivariate data. The adjective "unsupervised" is justified by two properties of the algorithm: 1) it is capable of selecting the number of components and 2) unlike the standard expectation-maximization (EM) algorithm, it does not require careful initialization. The proposed method also avoids another drawback of EM for mixture fitting: the possibility of convergence toward a singular estimate at the boundary of the parameter space. The novelty of our approach is that we do not use a model selection criterion to choose one among a set of pre-estimated candidate models; instead, we seamlessly integrate estimation and model selection in a single algorithm. Our technique can be applied to any type of parametric mixture model for which it is possible to write an EM algorithm; in this paper, we illustrate it with experiments involving Gaussian mixtures. These experiments testify to the good performance of our approach.
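The flavour of integrating estimation and selection can be sketched with an EM-style loop whose weight update charges each component half its parameter count, annihilating components whose support vanishes. This is a simplified, MML-flavoured sketch under that assumption, not the published algorithm:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_with_annihilation(X, means, covs, weights, n_iter=100):
    """EM for a Gaussian mixture that prunes weak components on the fly.

    The weight update subtracts half the per-component parameter count
    (an MML-style penalty); components whose support drops to zero are
    removed, so model selection happens during estimation.
    """
    n, d = X.shape
    n_params = d + d * (d + 1) / 2          # mean + covariance per component
    for _ in range(n_iter):
        # E-step: responsibilities
        dens = np.column_stack([
            w * multivariate_normal.pdf(X, m, c)
            for w, m, c in zip(weights, means, covs)])
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step with annihilation of unsupported components
        support = np.maximum(0.0, resp.sum(axis=0) - n_params / 2)
        weights = support / support.sum()
        keep = weights > 0
        means = [resp[:, k] @ X / resp[:, k].sum()
                 for k in range(len(weights)) if keep[k]]
        covs = [np.cov(X.T, aweights=resp[:, k]) + 1e-6 * np.eye(d)
                for k in range(len(weights)) if keep[k]]
        weights = weights[keep]
    return weights, means, covs
```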

3.
Mixture model based clustering (also simply called model-based clustering hereinafter) consists of fitting a mixture model to data and identifying each cluster with one of its components. This paper tackles the model selection and parameter estimation problems in model-based clustering so as to improve clustering performance on data sets whose true kernel distribution functions are not in the assumed family, as well as on data sets with inherently overlapping clusters. Being tailored to clustering applications, an effective model selection criterion is first proposed. Unlike most criteria, which measure only how well the model fits the observed data, the proposed one also evaluates whether the candidate model provides a reasonable partition of the data, which enforces a model with well-separated components. Accordingly, an improved method for the estimation of mixture parameters is derived, which aims to suppress the spurious estimates produced by the standard expectation maximization (EM) algorithm and to enforce well-supported components in the mixture model. Finally, parameter estimation and model selection are integrated in a single algorithm that favors a compact mixture model with both well-supported and well-separated components. Extensive experiments on synthetic and real-world data sets show the effectiveness of the proposed approach to mixture model based clustering.

4.
Finite mixtures are widely used in information processing and data analysis. However, model selection, i.e., choosing the number of components in the mixture for a given sample data set, remains a rather difficult task. Recently, Bayesian Ying-Yang (BYY) harmony learning has provided a new approach to Gaussian mixture modeling with the attractive feature that model selection can be made automatically during parameter learning. In this paper, based on the same BYY harmony learning framework for finite mixtures, we propose an adaptive gradient BYY learning algorithm for Poisson mixtures with automated model selection. Simulation experiments demonstrate that this adaptive gradient BYY learning algorithm can automatically determine the number of actual Poisson components for a sample data set, with a good estimation of the parameters of the original or true mixture, provided the components are separated to a certain degree. Moreover, the adaptive gradient BYY learning algorithm is successfully applied to texture classification.
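The BYY gradient algorithm itself is not reproduced here; for orientation, a minimal plain-EM fit of a Poisson mixture with a fixed component count (the baseline that BYY improves on by selecting the count automatically) might look like this:

```python
import numpy as np
from scipy.stats import poisson

def poisson_mixture_em(x, k=3, n_iter=200, seed=0):
    """Plain EM for a k-component Poisson mixture on count data x."""
    x = np.asarray(x)
    rng = np.random.default_rng(seed)
    lam = rng.choice(x[x > 0], size=k).astype(float)    # initial rates
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, shape (n, k)
        dens = w * poisson.pmf(x[:, None], lam)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted component updates
        nk = resp.sum(axis=0)
        w = nk / len(x)
        lam = (resp * x[:, None]).sum(axis=0) / nk
    return w, lam
```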

5.
Unsupervised data clustering can be addressed by the estimation of mixture models, where the mixture components are associated with clusters in data space. In this paper we present a novel unsupervised classification algorithm based on the simultaneous estimation of the mixture's parameters and the number of components (complexity). Its distinguishing aspect is the way the data space is searched. Our algorithm starts from a single component covering all of the input space and iteratively splits components according to a breadth-first search on a binary tree structure that provides an efficient exploration of the possible solutions. The proposed scheme demonstrates important computational savings with respect to other state-of-the-art algorithms, making it particularly suited to scenarios where running time is an issue, such as computer and robot vision applications. The initialization procedure is unique, allowing a deterministic evolution of the algorithm, while parameter estimation is performed with a modification of the Expectation Maximization algorithm. To compare models of different complexity we use the Minimum Message Length criterion, which implements the trade-off between the number of components and the data-fit log-likelihood. We validate our new approach with experiments on synthetic data, and we test its computational efficiency in data-intensive image segmentation applications, comparing it with related approaches.
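The message-length trade-off can be illustrated with a simplified two-part score in place of the paper's exact MML criterion (component counts and data below are illustrative):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def mml_score(gmm, X):
    """Simplified two-part message length: model cost plus data cost.

    Each Gaussian component in d dimensions carries d + d(d+1)/2
    parameters plus a mixing weight; each parameter is charged
    0.5 * log(n) nats, as in BIC-style approximations to MML.
    """
    n, d = X.shape
    k = gmm.n_components
    n_params = k * (d + d * (d + 1) / 2) + (k - 1)
    return -gmm.score(X) * n + 0.5 * n_params * np.log(n)

X = np.random.default_rng(0).normal(size=(500, 2))
scores = {k: mml_score(GaussianMixture(k, random_state=0).fit(X), X)
          for k in range(1, 6)}
best_k = min(scores, key=scores.get)    # lowest message length wins
```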

6.
An incremental finite mixture model is proposed to extract multi-target states in the sequential Monte Carlo implementation of the probability hypothesis density (PHD) filter. The model is built incrementally, with mixture components inserted one at a time, and the maximum likelihood criterion is used to estimate the multi-target states. For a mixture model with a given number of components, the expectation-maximization (EM) algorithm is applied to obtain the maximum likelihood solution of the parameters. When a new component is inserted into the mixture, the parameters of the existing mixture are kept fixed, and the maximum likelihood criterion is again used to select the inserted component from a candidate set. The component-insertion step and the EM parameter-fitting step alternate until the number of mixture components reaches the PHD filter's estimate of the target count. A k-d tree is used to generate the candidate set of new components. The incremental finite mixture model unifies the evolution of the number of components with the evolution of the particle set's likelihood, which helps to search for the maximum likelihood solution of the mixture step by step. Simulation results show that the PHD filter state extraction algorithm based on the incremental finite mixture model outperforms existing state extraction algorithms in multi-target tracking applications.

7.
A new multivariate volatility model is proposed in which the conditional distribution of a vector time series is given by a mixture of multivariate normal distributions. Each of these distributions is allowed to have a time-varying covariance matrix. The process can be globally covariance stationary even though some components are not covariance stationary. Some theoretical properties of the model, such as the unconditional covariance matrix and the autocorrelations of squared returns, are derived. The complexity of the model requires a powerful estimation algorithm. A simulation study compares estimation by maximum likelihood with the EM algorithm. Finally, the model is applied to daily US stock returns.

8.
The expectation maximization (EM) algorithm used in medical image segmentation has limitations when solving for the parameters of a mixture model. To address this, a fuzzy-constrained mixture model image segmentation algorithm is proposed. Under the assumption of pixel independence, the algorithm introduces pixel spatial information through fuzzy set theory while solving the model parameters with EM. Experimental results show that the algorithm introduces no new model parameters, preserves the simplicity of the independent mixture model, has automatic model selection capability, and achieves satisfactory segmentation results.
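The core mechanism, injecting neighbourhood information into the E-step without adding model parameters, can be sketched as follows; plain spatial averaging of the posterior maps stands in here for the paper's fuzzy-set construction:

```python
import numpy as np
from scipy.ndimage import uniform_filter
from scipy.stats import norm

def spatial_em_segment(img, k=3, n_iter=30):
    """EM for a 1-D Gaussian intensity mixture with spatially
    smoothed responsibilities (a stand-in for fuzzy constraints)."""
    x = img.astype(float)
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))
    sig = np.full(k, x.std())
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: per-pixel posteriors, then neighbourhood averaging
        post = np.stack([w[j] * norm.pdf(x, mu[j], sig[j]) for j in range(k)])
        post /= post.sum(axis=0, keepdims=True)
        post = np.stack([uniform_filter(p, size=5) for p in post])
        post /= post.sum(axis=0, keepdims=True)
        # M-step: usual weighted updates
        nk = post.reshape(k, -1).sum(axis=1)
        w = nk / x.size
        mu = np.array([(post[j] * x).sum() / nk[j] for j in range(k)])
        sig = np.array([np.sqrt((post[j] * (x - mu[j]) ** 2).sum() / nk[j])
                        for j in range(k)])
    return post.argmax(axis=0)          # label map
```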

9.
The expectation maximization algorithm has classically been used to find maximum likelihood estimates of parameters in probabilistic models with unobserved data, for instance, mixture models. A key issue in such problems is the choice of model complexity. The higher the number of components in the mixture, the higher the data likelihood, but also the higher the computational burden and the risk of overfitting. In this work, we propose a clustering method based on the expectation maximization algorithm that adapts the number of components of a finite Gaussian mixture model online from multivariate data. Our method estimates the number of components and their means and covariances sequentially, without requiring any careful initialization. Our methodology starts from a single mixture component covering the whole data set and splits it incrementally during the expectation maximization steps. The coarse-to-fine nature of the algorithm reduces the overall number of computations needed to reach a solution, which makes the method particularly suited to image segmentation applications where computational time is an issue. We show the effectiveness of the method in a series of experiments and compare it with a state-of-the-art alternative technique on both synthetic data and real images, including experiments with images acquired from the iCub humanoid robot.
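The split-from-one-component strategy can be illustrated with a minimal loop that splits the heaviest component along its principal axis and keeps the split only while an information criterion improves; BIC stands in here for the paper's own split criterion:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def split_and_refit(X, max_k=8):
    """Coarse-to-fine GMM: start from one component and repeatedly
    split the heaviest component along its principal axis, keeping
    the split only while BIC improves."""
    means = X.mean(axis=0, keepdims=True)
    best = GaussianMixture(1, means_init=means, random_state=0).fit(X)
    while best.n_components < max_k:
        k = best.n_components
        j = np.argmax(best.weights_)                 # component to split
        vals, vecs = np.linalg.eigh(best.covariances_[j])
        offset = vecs[:, -1] * np.sqrt(vals[-1])     # principal direction
        means = np.vstack([np.delete(best.means_, j, axis=0),
                           best.means_[j] + offset,
                           best.means_[j] - offset])
        cand = GaussianMixture(k + 1, means_init=means,
                               random_state=0).fit(X)
        if cand.bic(X) >= best.bic(X):
            break                                    # no further gain
        best = cand
    return best
```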

10.
Over the last decades, the α-stable distribution has proved to be a very efficient model for impulsive data. In this paper, we propose an extension of stable distributions, namely mixtures of α-stable distributions, to model multimodal, skewed, and impulsive data. A fully Bayesian framework is presented for the estimation of the stable density parameters and the mixture parameters. As opposed to most previous work on mixture models, the model order is assumed unknown and is estimated using reversible jump Markov chain Monte Carlo. It is important to note that the Gaussian mixture model is a special case of the presented model, which provides additional flexibility to model skewed and impulsive phenomena. The algorithm is tested using synthetic and real data, accurately estimating α-stable parameters, mixture coefficients, and the number of components in the mixture.
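Because α-stable densities lack closed forms in general, even simulating such a mixture is worth showing; a minimal data-generation sketch with illustrative parameter values (the RJMCMC sampler itself is beyond a short example):

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(1)
n = 1000
weights = np.array([0.6, 0.4])                              # mixture coefficients
params = [dict(alpha=1.5, beta=0.0, loc=-2.0, scale=1.0),   # symmetric component
          dict(alpha=1.2, beta=0.8, loc=3.0, scale=0.5)]    # skewed component

# Draw a component label per observation, then sample from that component
labels = rng.choice(len(weights), size=n, p=weights)
x = np.array([levy_stable.rvs(**params[j], random_state=rng) for j in labels])
```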

11.
The unsupervised learning of multivariate mixture models from on-line data streams has attracted the attention of researchers for its usefulness in real-time intelligent learning systems. The EM algorithm is an ideal choice for iteratively obtaining maximum likelihood estimates of the parameters of presumed finite mixtures, compared with some popular numerical methods. However, the original EM is a batch algorithm that works only on fixed datasets. To endow the EM algorithm with the capability to process streaming data, two on-line variants are studied: Titterington's method and a sufficient statistics-based method. We first prove that the two on-line EM variants are theoretically feasible for training the multivariate normal mixture model by showing that the model belongs to the exponential family. Afterward, the two on-line learning schemes for multivariate normal mixtures are applied to the problems of background learning and moving foreground detection. Experiments show that the two on-line EM variants can efficiently update the parameters of the mixture model and are capable of generating reliable backgrounds for moving foreground detection.
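The sufficient-statistics variant is compact to sketch: since the normal mixture lies in the exponential family, the on-line E-step reduces to blending running statistics with a decaying step size. A minimal univariate version (not the paper's multivariate implementation):

```python
import numpy as np
from scipy.stats import norm

def online_em_gmm(stream, k=2, kappa=0.6):
    """Stepwise on-line EM for a univariate Gaussian mixture.

    Keeps exponential-family sufficient statistics (s0, s1, s2) and
    blends in each new point with step size t**(-kappa), 0.5 < kappa <= 1.
    """
    rng = np.random.default_rng(0)
    mu, var, w = rng.normal(size=k), np.ones(k), np.full(k, 1.0 / k)
    s0, s1, s2 = w.copy(), w * mu, w * (var + mu ** 2)
    for t, x in enumerate(stream, start=1):
        resp = w * norm.pdf(x, mu, np.sqrt(var))
        resp /= resp.sum()
        eta = t ** (-kappa)                       # decaying step size
        s0 = (1 - eta) * s0 + eta * resp
        s1 = (1 - eta) * s1 + eta * resp * x
        s2 = (1 - eta) * s2 + eta * resp * x ** 2
        w, mu = s0, s1 / s0
        var = np.maximum(s2 / s0 - mu ** 2, 1e-6)
    return w, mu, var
```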

12.
We study indices for choosing the correct number of components in a mixture of normal distributions. Previous studies have been confined to indices based wholly on probabilistic models. Viewing mixture decomposition as probabilistic clustering (where the emphasis is on partitioning for geometric substructure), as opposed to parametric estimation, enables us to introduce both fuzzy and crisp measures of cluster validity for this problem. We presume the underlying samples to be unlabeled, and use the expectation-maximization (EM) algorithm to find clusters in the data. We test 16 probabilistic, 3 fuzzy, and 4 crisp indices on 12 data sets that are samples from bivariate normal mixtures having either 3 or 6 components. Over three-run averages based on different initializations of EM, 10 of the 23 indices tested were correct in at least 9 of the 12 trials. Among these were the fuzzy index of Xie-Beni, the crisp Davies-Bouldin index, and two crisp indices that are recent generalizations of Dunn's index.
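Comparisons of this kind are easy to reproduce with off-the-shelf tools; for instance, scoring EM partitions with the crisp Davies-Bouldin index (a sketch using scikit-learn, not the authors' code, on planted two-cluster data):

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.metrics import davies_bouldin_score

X = np.random.default_rng(0).normal(size=(600, 2))
X[:300] += 4.0                                   # two planted clusters

for k in range(2, 7):
    labels = GaussianMixture(k, n_init=3, random_state=0).fit_predict(X)
    print(k, davies_bouldin_score(X, labels))    # lower is better
```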

13.
Bounded data with excess observations at the boundary are common in many areas of application. Various individual cases of inflated mixture models have been studied in the literature for bound-inflated data, yet the computational methods have been developed separately for each type of model. In this article we use a common framework for computing these models, and expand the range of models for both discrete and semi-continuous data with point inflation at the lower boundary. The quasi-Newton and EM algorithms are adapted and compared for estimation of model parameters. The numerical Hessian and generalized Louis method are investigated as means for computing standard errors after optimization. Correlated data are included in this framework via generalized estimating equations. The estimation of parameters and effectiveness of standard errors are demonstrated through simulation and in the analysis of data from an ultrasound bioeffect study. The unified approach enables reliable computation for a wide class of inflated mixture models and comparison of competing models.
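The simplest member of this model class, a zero-inflated Poisson, shows what the shared EM framework looks like; a minimal sketch for a single inflation point at zero (the paper's framework covers a far wider family, plus correlated data):

```python
import numpy as np
from scipy.stats import poisson

def zip_em(x, n_iter=200):
    """EM for a zero-inflated Poisson: with probability pi the
    observation is a structural zero, otherwise Poisson(lam)."""
    x = np.asarray(x, dtype=float)
    pi, lam = 0.5, x[x > 0].mean()
    for _ in range(n_iter):
        # E-step: probability that each observed zero is structural
        p0 = pi / (pi + (1 - pi) * np.exp(-lam))
        z = np.where(x == 0, p0, 0.0)
        # M-step: closed-form updates
        pi = z.mean()
        lam = ((1 - z) * x).sum() / (1 - z).sum()
    return pi, lam
```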

14.
15.
An unsupervised discretization algorithm based on mixture probability models
Li Gang. Chinese Journal of Computers, 2002, 25(2): 158-164
Real-world applications often involve many continuous numeric attributes, yet many current machine learning algorithms require the attributes they process to take discrete values. Depending on whether the values of an associated class attribute are considered during the discretization of numeric attributes, discretization algorithms can be divided into supervised and unsupervised ones. Based on a mixture probability model, this paper proposes a theoretically rigorous unsupervised discretization algorithm that, without prior knowledge and without a class attribute, partitions the value range of a numeric attribute into several subintervals, and then uses the Bayesian information criterion to automatically find the best number of subintervals and the best partition.
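The two steps of the algorithm, selecting the number of subintervals by the Bayesian information criterion and cutting where the dominant component changes, map directly onto standard tools; a minimal sketch with illustrative data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def discretize(x, max_k=8):
    """Unsupervised discretization of a numeric attribute: fit 1-D
    Gaussian mixtures, pick the component count by BIC, and place cut
    points where the most probable component changes."""
    x = np.sort(np.asarray(x, dtype=float)).reshape(-1, 1)
    fits = [GaussianMixture(k, random_state=0).fit(x)
            for k in range(1, max_k + 1)]
    best = min(fits, key=lambda g: g.bic(x))
    labels = best.predict(x)
    # Cut midway between neighbouring points whose labels differ
    return [(x[i, 0] + x[i + 1, 0]) / 2
            for i in range(len(x) - 1) if labels[i] != labels[i + 1]]

x = np.concatenate([np.random.default_rng(0).normal(0, 1, 300),
                    np.random.default_rng(1).normal(6, 1, 300)])
print(discretize(x))                # a single cut near 3 is expected
```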

16.
A regression mixture model is proposed where each mixture component is a multi-kernel version of the Relevance Vector Machine (RVM). This mixture model exploits the enhanced modeling capability of RVMs due to their embedded sparsity-enforcing properties. In order to deal with the problem of selecting kernel parameters, a weighted multi-kernel scheme is employed, where the weights are estimated during training. The mixture model is trained using the maximum a posteriori approach, with the Expectation Maximization (EM) algorithm providing closed-form update equations for the model parameters. Moreover, an incremental learning methodology is presented that tackles the parameter initialization problem of the EM algorithm, along with a BIC-based model selection methodology to estimate the proper number of mixture components. We provide comparative experimental results using various artificial and real benchmark datasets that empirically illustrate the efficiency of the proposed mixture model.

17.
For non-randomly missing data in clustering problems, this paper analyzes the effectiveness of the expectation-maximization algorithm for censored data based on a Gaussian mixture clustering model, and reveals the mechanism by which the likelihood function of the censored data acts on the model and algorithm. The proposed method is compared with the standard expectation-maximization algorithm using indicators such as the Akaike information criterion and information divergence. Through the partitioning of censored data and indicator variables, the posterior probabilities of the clustering model parameters and the likelihood function are derived, and the truncated normal function of the parameters is adjusted...
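The ingredient that censoring adds to EM is easiest to see in one dimension: the E-step replaces each censored value with truncated-normal conditional moments. A minimal single-component sketch (the paper works with a full Gaussian mixture):

```python
import numpy as np
from scipy.stats import norm

def censored_normal_em(obs, censored, c, n_iter=100):
    """ML estimation of (mu, sigma) from right-censored data.

    obs      : observed (uncensored) values
    censored : number of observations censored at threshold c
    The E-step imputes each censored value with the conditional mean
    and second moment of a normal truncated below at c.
    """
    obs = np.asarray(obs, dtype=float)
    n = len(obs) + censored
    mu, sig = obs.mean(), obs.std()
    for _ in range(n_iter):
        a = (c - mu) / sig
        lam = norm.pdf(a) / norm.sf(a)                 # inverse Mills ratio
        m1 = mu + sig * lam                            # E[X | X > c]
        m2 = mu**2 + sig**2 + sig * (c + mu) * lam     # E[X^2 | X > c]
        mu = (obs.sum() + censored * m1) / n
        sig = np.sqrt((np.sum(obs**2) + censored * m2) / n - mu**2)
    return mu, sig
```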

18.
Yin Peng, Tan Derong. Automation Information, 2009, (8): 41-42, 47
Based on the uncertainty theory of multivariate functions, an uncertainty analysis method is established for the speed calculation models used in road traffic accident reconstruction. The uncertainty components of the parameters required for accident reconstruction are computed, and the standard uncertainty components of each uncertain parameter are combined using variance synthesis theory, finally yielding the combined standard uncertainty and the expanded uncertainty of the speed estimate, and thus a more reasonable speed estimation result. Taking a car-bicycle collision accident as an example, the correctness of the method is verified.
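The variance-synthesis step is standard uncertainty propagation: for a speed model v = f(x1, ..., xn), the combined standard uncertainty is u_c(v) = sqrt(Σ_i (∂f/∂x_i)² u(x_i)²), and the expanded uncertainty is U = k·u_c. A numeric sketch using the common skid-to-stop model v = √(2µgs); the model choice and the numbers are illustrative, not taken from the paper:

```python
import numpy as np

# Skid-to-stop model: v = sqrt(2 * mu * g * s)
g = 9.81                   # m/s^2
mu, u_mu = 0.7, 0.05       # friction coefficient and its standard uncertainty
s, u_s = 18.0, 0.5         # skid length (m) and its standard uncertainty

v = np.sqrt(2 * mu * g * s)
# Sensitivity coefficients (partial derivatives of v)
dv_dmu = v / (2 * mu)
dv_ds = v / (2 * s)
# Combined standard uncertainty and expanded uncertainty (k = 2)
u_c = np.hypot(dv_dmu * u_mu, dv_ds * u_s)
print(f"v = {v:.2f} m/s, u_c = {u_c:.2f} m/s, U = {2 * u_c:.2f} m/s")
```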

19.
Bayesian feature and model selection for Gaussian mixture models
We present a Bayesian method for mixture model training that simultaneously treats the feature selection and model selection problems. The method is based on the integration of a mixture model formulation that takes into account the saliency of the features and a Bayesian approach to mixture learning that can be used to estimate the number of mixture components. The proposed learning algorithm follows the variational framework and can simultaneously optimize over the number of components, the saliency of the features, and the parameters of the mixture model. Experimental results using high-dimensional artificial and real data illustrate the effectiveness of the method.

20.
Factor Analysis (FA) is a well established probabilistic approach to unsupervised learning for complex systems involving correlated variables in high-dimensional spaces. FA aims principally to reduce the dimensionality of the data by projecting high-dimensional vectors on to lower-dimensional spaces. However, because of its inherent linearity, the generic FA model is essentially unable to capture data complexity when the input space is nonhomogeneous. A finite Mixture of Factor Analysers (MFA) is a globally nonlinear and therefore more flexible extension of the basic FA model that overcomes the above limitation by combining the local factor analysers of each cluster of the heterogeneous input space. The structure of the MFA model offers the potential to model the density of high-dimensional observations adequately while also allowing both clustering and local dimensionality reduction. Many aspects of the MFA model have recently come under close scrutiny, from both the likelihood-based and the Bayesian perspectives. In this paper, we adopt a Bayesian approach, and more specifically a treatment that bases estimation and inference on the stochastic simulation of the posterior distributions of interest. We first treat the case where the number of mixture components and the number of common factors are known and fixed, and we derive an efficient Markov Chain Monte Carlo (MCMC) algorithm based on Data Augmentation to perform inference and estimation. We also consider the more general setting where there is uncertainty about the dimensionalities of the latent spaces (number of mixture components and number of common factors unknown), and we estimate the complexity of the model by using the sample paths of an ergodic Markov chain obtained through the simulation of a continuous-time stochastic birth-and-death point process. The main strengths of our algorithms are that they are both efficient (our algorithms are all based on familiar and standard distributions that are easy to sample from, and many characteristics of interest are by-products of the same process) and easy to interpret. Moreover, they are straightforward to implement and offer the possibility of assessing the goodness of the results obtained. Experimental results on both artificial and real data reveal that our approach performs well, and can therefore be envisaged as an alternative to the other approaches used for this model.
