Similar literature
 20 similar documents found; search took 31 ms
1.
With the wide applications of Gaussian mixture clustering, e.g., in semantic video classification [H. Luo, J. Fan, J. Xiao, X. Zhu, Semantic principal video shot classification via mixture Gaussian, in: Proceedings of the 2003 International Conference on Multimedia and Expo, vol. 2, 2003, pp. 189-192], it is a nontrivial task to select the useful features in Gaussian mixture clustering without class labels. This paper, therefore, proposes a new feature selection method, through which not only the most relevant features are identified, but the redundant features are also eliminated so that the smallest relevant feature subset can be found. We integrate this method with our recently proposed Gaussian mixture clustering approach, namely rival penalized expectation-maximization (RPEM) algorithm [Y.M. Cheung, A rival penalized EM algorithm towards maximizing weighted likelihood for density mixture clustering with automatic model selection, in: Proceedings of the 17th International Conference on Pattern Recognition, 2004, pp. 633-636; Y.M. Cheung, Maximum weighted likelihood via rival penalized EM for density mixture clustering with automatic model selection, IEEE Trans. Knowl. Data Eng. 17(6) (2005) 750-761], which is able to determine the number of components (i.e., the model order selection) in a Gaussian mixture automatically. Subsequently, the data clustering, model selection, and the feature selection are all performed in a single learning process. Experimental results have shown the efficacy of the proposed approach.

2.
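Application of a multi-feature EM algorithm to insect image segmentation   Total citations: 4 (self-citations: 0, citations by others: 4)
An insect image segmentation method based on multi-feature EM (expectation-maximization) clustering is proposed. Unlike the usual EM algorithm, the method first selects a suitable color space and extracts combined color, texture, and spatial-position features for every pixel, forming an 8-dimensional pixel-based feature space. It then adopts a Gaussian mixture model, estimates the model parameters with the EM algorithm, and uses the similarity of pixel features to obtain a preliminary region segmentation in the feature space. Finally, the image regions are segmented further using a connectivity principle. Experimental results show that the algorithm segments insect images well.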

3.
The main difficulty with the EM algorithm for mixture models concerns the number of components, say g. This is the question of model selection, and the EM algorithm itself cannot estimate g. On the contrary, the algorithm requires g to be specified before the remaining parameters can be estimated. To solve this problem, a new algorithm, called the stepwise split-and-merge EM (SSMEM) algorithm, is proposed. The SSMEM algorithm alternately splits and merges components, estimating g and the other component parameters simultaneously. Also, two novel criteria are introduced to efficiently select the components to split or merge. Experimental results on simulated and real data demonstrate the effectiveness of the proposed algorithm.
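To illustrate the point that plain EM needs g up front, the following is a minimal sketch of standard EM for a one-dimensional Gaussian mixture. It is not the SSMEM algorithm from the abstract; the function name `em_gmm_1d`, the quantile-based initialisation, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def em_gmm_1d(data, g, iters=100):
    """Fit a g-component 1-D Gaussian mixture with plain EM.

    g must be supplied by the caller: nothing in the loop below can
    add or remove components.
    """
    n = len(data)
    srt = sorted(data)
    # Assumed initialisation: means at evenly spaced quantiles of the data.
    means = [srt[int((k + 0.5) * n / g)] for k in range(g)]
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    variances = [var_all] * g
    weights = [1.0 / g] * g

    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            s = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / s for pk in p])

        # M-step: re-estimate weights, means and variances.
        for k in range(g):
            nk = sum(r[k] for r in resp) + 1e-12
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor keeps the density finite

    return weights, means, variances


# Synthetic two-cluster data (illustrative values only).
rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
sample += [rng.gauss(10.0, 1.0) for _ in range(200)]
weights, means, variances = em_gmm_1d(sample, g=2)
```

Calling the function with a different `g` simply fits that many components, which is why a model selection step has to be layered on top.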

4.
Equipment Managers (EMs) play a major role in a Manufacturing Execution System (MES). They serve as the communication bridge between the components of an MES and the equipment. The purpose of this paper is to propose a novel methodology for developing analytical and simulation models for the EM such that the validity and performance of the EM can be evaluated. Domain knowledge and requirements are collected from a real semiconductor packaging factory. By using IDEF0 and state diagrams, a static functional model and a dynamic state model of the EM are built. Next, these two models are translated into a Petri net model. This allows qualitative and quantitative analyses of the system. The EM net model is then expanded into the MES net model, so that the performance of an EM in the MES environment can be evaluated. These evaluation results are good references for design and decision making.

5.
6.
Research on Web page classification based on ontology and the EM method   Total citations: 1 (self-citations: 1, citations by others: 1)
Extracting semantic information from substantive Web pages and using it in search engines can lead to intelligent retrieval and other individualized services. This paper focuses on the analysis of Web page classification information. With an ontology as the base, TFIDF word weights and the Rocchio algorithm are combined with EM to improve the accuracy of the classifier. The results show that this EM procedure improves accuracy by making use of unlabeled pages when the labeled samples are limited.

7.
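A method is proposed for computing feature weights from a large number of user evaluations, in order to analyze the similarity between query strings and search results in a search engine. The method relies entirely on users' "latent evaluations" of the search results. The clicks a user makes for an input query string reflect an underlying association, which the proposed method captures. A mathematical model of the problem is built, and the EM algorithm is used to compute the feature weights. Because the model function is complex and its convergence is hard to compute, simulated annealing is used as a complement to the EM algorithm to verify convergence. Experiments were run on bid-based advertisements in the Baidu search engine, with a test sample of 100 advertisements and 144,132 queries. The results show that all features converge to the global optimum; on a sampled subset of the data, the retrieval similarity precision is 93.32% and the recall is 87.43%.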

8.
Generalized linear mixed models (GLMM) form a very general class of random effects models for discrete and continuous responses in the exponential family. They are useful in a variety of applications. The traditional likelihood approach for GLMM usually involves high dimensional integrations which are computationally intensive. In this work, we investigate the case of binary outcomes analyzed under a two stage probit normal model with random effects. First, it is shown how ML estimates of the fixed effects and variance components can be computed using a stochastic approximation of the EM algorithm (SAEM). The SAEM algorithm can be applied directly, or in conjunction with a parameter expansion version of EM to speed up the convergence. A procedure is also proposed to obtain REML estimates of variance components and REML-based estimates of fixed effects. Finally, an application to a real data set involving a clinical trial is presented, in which these techniques are compared to other procedures (penalized quasi-likelihood, maximum likelihood, Bayesian inference) already available in classical software (SAS Glimmix, SAS Nlmixed, WinBUGS), as well as to a Monte Carlo EM (MCEM) algorithm.

9.
Eigenmoments     

10.
We propose a constrained EM algorithm for principal component analysis (PCA) using a coupled probability model derived from single-standard factor analysis models with isotropic noise structure. The single probabilistic PCA, especially for the case where there is no noise, can find only a vector set that is a linear superposition of principal components and requires postprocessing, such as diagonalization of symmetric matrices. By contrast, the proposed algorithm finds the actual principal components, which are sorted in descending order of eigenvalue size and require no additional calculation or postprocessing. The method is easily applied to kernel PCA. It is also shown that the new EM algorithm is derived from a generalized least-squares formulation.
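For orientation, here is a minimal sketch of the ordinary (unconstrained) zero-noise EM iteration for PCA, restricted to a single component. It is not the constrained algorithm proposed in the abstract. With one component the iteration recovers the leading principal direction directly; with several components it would only recover a basis spanning the principal subspace, which is the limitation the abstract describes. The function name `em_pca_first_component` and the synthetic data are assumptions.

```python
import math
import random


def em_pca_first_component(rows, iters=200, seed=0):
    """Estimate the leading principal direction with zero-noise EM for PCA.

    rows: list of equal-length lists of floats (one observation per row).
    Returns a unit-length direction vector (sign is arbitrary).
    """
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - mean[j] for j in range(d)] for r in rows]

    w = [rng.gauss(0.0, 1.0) for _ in range(d)]  # random starting direction
    for _ in range(iters):
        ww = sum(wj * wj for wj in w)
        # E-step: latent coordinate of every observation given w.
        x = [sum(wj * yj for wj, yj in zip(w, y)) / ww for y in centred]
        xx = sum(xi * xi for xi in x)
        # M-step: direction that best reconstructs the data from x.
        w = [sum(xi * y[j] for xi, y in zip(x, centred)) / xx for j in range(d)]

    norm = math.sqrt(sum(wj * wj for wj in w))
    return [wj / norm for wj in w]


# Synthetic 2-D data stretched along the (1, 1) diagonal (illustrative only).
rng = random.Random(2)
s = 1.0 / math.sqrt(2.0)
points = []
for _ in range(300):
    a, b = rng.gauss(0.0, 3.0), rng.gauss(0.0, 0.3)
    points.append([a * s - b * s + 5.0, a * s + b * s - 2.0])
direction = em_pca_first_component(points)
```

No covariance matrix is formed or diagonalised here, which is the usual motivation for EM-style PCA when the data dimension is large.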

11.
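Two-dimensional principal component analysis (2DPCA) is a feature extraction and dimensionality reduction method that operates directly on the matrix representation of images. A probabilistic model based on 2DPCA is proposed. First, the principal components (vectors) are obtained by maximum likelihood estimation of the parameters of this generative probabilistic model. Then, to handle missing data, the expectation-maximization algorithm is used to iteratively estimate the model parameters and the principal components. Applying a mixture of probabilistic 2DPCA models to face clustering shows that the probabilistic 2DPCA model can serve as a density estimation tool for image matrices. Face image reconstruction experiments with missing values demonstrate the effectiveness of the model and the iterative algorithm.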

12.
In recent years, methods of feature selection have been increasingly emphasized as a means of reducing cost and shortening the computation time required in data mining. This study utilizes an electromagnetism-like mechanism (EM) as a wrapper approach to feature selection. Birbil and Fang proposed EM in 2003. EM uses the attraction-repulsion mechanism of electromagnetism theory to ascertain the optimal solution. Although EM has been applied to optimization in continuous space and, in a small number of studies, to discrete problems, it has not been applied to feature selection. In this study, EM combined with 1-nearest-neighbor (1NN) was applied for feature selection and classification. This study utilized the total force exerted on a particle and evaluated this force to determine which features are to be selected. The most crucial features were selected according to the proposed method based on the minimum misclassification rate, which was attained through 1NN. An unknown datum was classified by 1NN based on the chosen reduced model. To estimate the effectiveness of the proposed method, a numerical experiment was conducted using several data sets with diverse sizes, features, separability, and classes. Experimental results indicated that the proposed method outperformed other well-known algorithms not only in balanced classification accuracy but also in efficiency of feature selection. Lastly, this study used an actual case concerning gestational diabetes mellitus to demonstrate the workability of the proposed method.
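The wrapper idea in this abstract, scoring a candidate feature subset by the 1NN misclassification rate, can be sketched as follows. This is a hedged illustration only: the search below is a plain exhaustive enumeration, not the electromagnetism-like mechanism, and the function names and toy data are assumptions.

```python
from itertools import combinations


def loo_1nn_error(X, y, features):
    """Leave-one-out 1NN misclassification rate using only `features`."""
    errors = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for j, xj in enumerate(X):
            if i == j:
                continue  # leave the query point out of its own neighbours
            dist = sum((xi[f] - xj[f]) ** 2 for f in features)
            if dist < best_dist:
                best_dist, best_label = dist, y[j]
        errors += best_label != y[i]
    return errors / len(X)


def best_subset(X, y):
    """Exhaustively search feature subsets; return (error, subset).

    Subsets are visited from small to large and only a strictly lower
    error replaces the incumbent, so ties favour the smaller subset.
    """
    n_features = len(X[0])
    best = (float("inf"), ())
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            err = loo_1nn_error(X, y, subset)
            if err < best[0]:
                best = (err, subset)
    return best


# Toy data: feature 0 separates the classes, feature 1 is misleading noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.2], [1.1, 0.9], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
error, subset = best_subset(X, y)
```

Exhaustive search grows exponentially with the number of features, which is the reason a heuristic search such as the one in the abstract would be used in practice.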

13.
A simple program, EMLCLLER, coded in FORTRAN 77, is presented in this paper for modeling the electromagnetic (EM) response of a large circular loop source over a layered earth model in quasi-static as well as non-quasi-static (general frequency) regions. The program is based on a semi-analytical numerical approach for evaluating the improper integrals occurring in the expressions of the EM field components. It takes into account both conduction and displacement current factors. The program is formulated in such a way that it is well suited for computing the EM response for any arbitrary position of the source loop in air or on the surface of the model, in contrast to earlier methods, which face convergence problems for a source position on the surface of the model. The program has wide application and is capable of computing the EM response at any arbitrary point either inside or outside the source loop. The validity and accuracy of the program are demonstrated by computing the EM response of a large loop source over homogeneous and multi-layer earth models. Response curves depict their characteristic variations. The computed results agree with the published results for the quasi-static region (f < 50 kHz) and extend their characteristic variation into the non-quasi-static region (50 kHz < f < 1000 kHz). The agreement between the computed and published results demonstrates the validity of the program.

14.
Design of microwave components is an inherently multiobjective task. Often, the objectives are at least partially conflicting and the designer has to work out a suitable compromise. In practice, generating the best possible trade-off designs requires multiobjective optimization, which is a computationally demanding task. If the structure of interest is evaluated through full-wave electromagnetic (EM) analysis, the employment of widely used population-based metaheuristic algorithms may become prohibitive in computational terms. This is a common situation for miniaturized components, where considerable cross-coupling effects make traditional representations (e.g., network equivalents) grossly inaccurate. This article presents a framework for accelerated EM-driven multiobjective design of compact microwave devices. It adopts a recently reported nested kriging methodology to identify the parameter space region containing the Pareto front and to render a fast surrogate, subsequently used to find the first approximation of the Pareto set. The final trade-off designs are produced in a separate, surrogate-assisted refinement process. Our approach is demonstrated using a three-section impedance matching transformer designed for the best matching and the minimum footprint area. The Pareto set is generated at the cost of only a few hundred high-fidelity EM simulations of the transformer circuit despite a large number of geometry parameters involved.
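The notion of a Pareto set used in this abstract can be made concrete with a small sketch that filters a list of already evaluated designs down to the non-dominated ones. It does not reproduce the nested kriging surrogate or the refinement step; the function name `pareto_front` and the objective values (a reflection level in dB and a footprint area in mm², both to be minimised) are made-up illustrations, not results from the article.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(designs):
    """Return the designs that no other design dominates, in input order."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical (reflection in dB, footprint in mm^2) pairs.
evaluated = [(-20, 50), (-15, 40), (-25, 70), (-18, 60), (-15, 45)]
front = pareto_front(evaluated)
```

Here `(-18, 60)` is dominated by `(-20, 50)` and `(-15, 45)` by `(-15, 40)`, so neither is a useful trade-off; the three remaining designs each win on one objective at the expense of the other.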

15.
Design closure of compact microwave components is a challenging problem because of significant electromagnetic (EM) cross-couplings in densely arranged layouts. A separate issue is the large number of designable parameters resulting from the replacement of conventional transmission line sections by compact microstrip resonant cells. This increases the complexity of the design optimization problem and requires expensive high-fidelity EM analysis for reliable performance evaluation of the structure at hand. Consequently, neither conventional numerical optimization algorithms nor interactive approaches (e.g., experience-driven parameter sweeps) are capable of identifying optimum designs in reasonable timeframes. Here, we discuss the application of feature-based optimization to fast design optimization of dual- and multiband compact couplers. On one hand, design of such components is difficult because of multiple objectives (achieving equal power split and good matching and port isolation for all frequency bands of interest). On the other hand, because of the well-defined shapes of the S-parameter responses for this class of components, feature-based optimization seems to be well suited to control multiple figures of interest, as demonstrated in this work. Two-level EM modeling is used for further design cost reduction. More importantly, we develop a procedure for automated determination of the low-fidelity EM model coarseness that allows us to find the fastest possible model that still ensures sufficient correlation with its high-fidelity counterpart, which is critical for robustness of the optimization process. Our approach is illustrated using two dual-band compact couplers. Experimental validation is also provided.

16.
This article is based on a session given by the authors at the ACH/ALLC conference at the University of Victoria in June 2005. It discusses the prospects for partnership between the humanities and computing from the alternative perspective afforded by Empirical Modelling (EM). Perceived dualities that separate the two cultures of science and art are identified as the primary impediment to this partnership. A vision for ‘human computing’ that promises to dissolve these dualities is outlined. The key characteristics and potential for EM for the humanities are illustrated with reference to a modelling exercise on the theme of Schubert's Erlkönig. This highlights how each of the six varieties of modelling identified by McCarty can be represented within an EM model. The implications of EM are discussed with reference to McCarty's account of the key role for modelling in the humanities, in relation to James's ‘philosophic attitude’ of Radical Empiricism and to ideas from phenomenological sources.

17.
Clustering is a useful tool for finding structure in a data set. The mixture likelihood approach to clustering is a popular clustering method, in which the EM algorithm is the most used method. However, the EM algorithm for Gaussian mixture models is quite sensitive to initial values, and the number of its components needs to be given a priori. To resolve these drawbacks of EM, we develop a robust EM clustering algorithm for Gaussian mixture models, first creating a new way to solve these initialization problems. We then construct a schema to automatically obtain an optimal number of clusters. The proposed algorithm is therefore robust to initialization and to differing cluster volumes, and it obtains an optimal number of clusters automatically. Some experimental examples are used to compare our robust EM algorithm with existing clustering methods. The results demonstrate the superiority and usefulness of our proposed method.

18.
The Mixture Modeling (MIXMOD) program fits mixture models to a given data set for the purposes of density estimation, clustering or discriminant analysis. A large variety of algorithms to estimate the mixture parameters are proposed (EM, Classification EM, Stochastic EM), and it is possible to combine these to yield different strategies for obtaining a sensible maximum for the likelihood (or complete-data likelihood) function. MIXMOD is currently intended to be used for multivariate Gaussian mixtures, and fourteen different Gaussian models can be distinguished according to different assumptions regarding the component variance matrix eigenvalue decomposition. Moreover, different information criteria for choosing a parsimonious model (the number of mixture components, for instance) are included, their suitability depending on the particular perspective (cluster analysis or discriminant analysis). Written in C++, MIXMOD is interfaced with SCILAB and MATLAB. The program, the statistical documentation and the user guide are available on the internet at the following address: http://www-math.univ-fcomte.fr/mixmod/index.php.

19.
Since Hermite–Gaussian (HG) functions provide an orthonormal basis with the most compact time–frequency supports (TFSs), they are ideally suited for time–frequency component analysis of finite energy signals. For a signal component whose TFS tightly fits into a circular region around the origin, HG function expansion provides an optimal representation by using the fewest basis functions. However, for signal components whose TFS has a non-circular shape away from the origin, straightforward expansions require an excessively large number of HGs, resulting in noise fitting. Furthermore, for closely spaced signal components with non-circular TFSs, direct application of HG expansion cannot provide reliable estimates of the individual signal components. To alleviate these problems, we propose a fully automated pre-processing technique that uses expectation maximization (EM) iterations to identify the TFSs of individual signal components and transform them to circular regions centered around the origin, so that reliable estimates of the signal components can be obtained. The HG expansion order for each signal component is determined by using a robust estimation technique. Then, the estimated components are post-processed to transform their TFSs back to their original positions. The proposed technique can be used to analyze signals with overlapping components as long as the overlapped supports of the components have an area smaller than the effective support of a Gaussian atom, which has the smallest time-bandwidth product. It is shown that if the area of the overlap region is larger than this threshold, the components cannot be uniquely identified. Results obtained on synthetic and real signals demonstrate the effectiveness of the proposed time–frequency analysis technique under severe noise.

20.
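刘保利 (Liu Baoli), 《计算机应用》 (Journal of Computer Applications), 2008, 28(4): 990-992
Based on the expectation-maximization (EM) algorithm and the genetic algorithm (GA), an effective unsupervised segmentation method for multiscale SAR images is proposed. The method first uses a mixture multiscale autoregressive (MMAR) model to describe the statistical dependence, caused by radar speckle, among pixels at different scales and within the same scale of a SAR image. GA is then combined with EM to give a parameter estimation algorithm for the MMAR model; using the minimum description length (MDL) criterion, this algorithm can select the number of model components. Finally, a Bayes classifier performs the image segmentation. The method combines the advantages of the genetic algorithm and the EM algorithm: it is less sensitive to initial values, avoids local optima, and improves segmentation accuracy. Experimental results show that the GA-EM method outperforms the EM algorithm.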
