1.
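A K-Modes Clustering Algorithm Based on a New Distance Measure Total citations: 5, self-citations: 1, citations by others: 4
The traditional K-Modes clustering algorithm uses simple 0-1 matching to compute the distance between two values of the same categorical attribute, which does not fully account for their similarity. To address this, a new distance measure based on rough set theory is proposed. When measuring the difference between two values of the same categorical attribute, it overcomes the shortcomings of simple 0-1 matching by considering both whether the values themselves differ and how well the other related categorical attributes discriminate between them. The proposed measure is applied to the traditional K-Modes clustering algorithm. Experimental comparison with K-Modes algorithms based on other distance measures shows that the new measure is more effective.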
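For orientation, the simple 0-1 matching baseline that this abstract criticizes can be sketched as follows. This is a minimal illustration of the traditional measure only, not the rough-set-based measure the paper proposes; the function names `simple_matching_distance` and `update_mode` are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def simple_matching_distance(x: Sequence[Hashable], y: Sequence[Hashable]) -> int:
    """0-1 matching: add 1 for every attribute on which x and y disagree."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    return sum(1 for a, b in zip(x, y) if a != b)


def update_mode(cluster: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """K-Modes center update: the most frequent value of each attribute."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*cluster)]
```

Under this measure, "red" versus "blue" and "red" versus "crimson" are equally far apart, which is the limitation the abstract refers to.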
2.
3.
4.
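To address the shortcomings of traditional miss-distance measurement equipment, namely its bulk, poor portability, and high test cost, a miss-distance measurement method based on binocular stereo vision with a ballistic curve model is proposed. The camera imaging model and coordinate transformations are studied, general camera calibration methods are analyzed, a camera calibration model is constructed, and expressions for the calibration parameters are derived. A projectile miss-distance measurement system using the ballistic curve model is designed on the basis of binocular stereo vision; its measurement procedure is simple, its calibration results are stable, and it enables fast detection of the moving projectile. Miss distance is measured by spatially locating the moving target and fitting its trajectory. Experimental results show that both the straight-line and the curve ballistic models can readily solve for the miss distance of a projectile in flight; the curve model follows the actual trajectory more closely and yields higher accuracy, reducing the mean absolute error (MAE) of the miss distance by roughly half relative to the straight-line model.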
5.
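A parallel whole-genome assembly algorithm based on graph partitioning is proposed. The algorithm recasts the data partitioning problem as a graph partitioning problem, which resolves the node load imbalance found in traditional data partitioning algorithms. When building the relationship graph, it also makes effective use of the length and pairing information between reads provided by WGS sequencing, so that the read relationship graph reflects the relationships in the data more accurately and thereby improves partitioning accuracy. Experimental results show that the algorithm accurately partitions a variety of simulated and real datasets, with clearly better partition quality than traditional data partitioning algorithms.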
6.
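Phylogenetic trees are an important tool for inferring the history of life. Among the algorithms for constructing phylogenetic trees, those based on evolutionary distance are a major focus of research. However, this approach depends heavily on the quality of the distance matrix. Many evolutionary models grounded in biological facts have been developed to improve the construction of the distance matrix, and they have greatly improved the accuracy of evolutionary distances. Many methods have also been proposed for assessing the quality of a distance matrix. This paper takes a model-based distance and the p-distance and combines them into a new distance for constructing the distance matrix. Using both a statistical scoring method that evaluates the distance matrix directly and the construction of phylogenetic trees, comparative experiments show that the method is practical and effective.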
7.
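Cao Li (曹立) 《Computer Engineering and Applications (计算机工程与应用)》2003,39(27):98-99,137
This paper discusses the problem of describing the pose of moving objects in image sequences. Taking the pose vector as a unified parameter characterizing the motion sequence, it establishes the relation for computing the trajectories of external feature points from the pose vector, and discusses how to compute the inverse mapping.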
8.
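Clustering is an important research direction in data mining. To address the deficiencies of the similarity measures in existing clustering algorithms, this paper proposes a new similarity measure. On this basis, the discernibility power from rough set theory is introduced into the clustering algorithm to measure attribute importance, leading to a new weighted rough clustering algorithm that can handle symbolic data. Experiments on UCI datasets show that the algorithm is insensitive to the order of data input, does not require the number of clusters to be given in advance, and improves clustering quality.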
9.
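To address the insufficient joint consideration of density peaks and distance when determining cluster centers, the proposed algorithm considers both factors together and normalizes them. Its way of determining cluster centers differs from that of other clustering algorithms. The paper presents the core idea, implementation, and testing of the algorithm, along with conclusions drawn during implementation. The implemented code is tested on four datasets, and the algorithm is compared with the classical k-means algorithm in terms of NMI. The paper concludes that the proposed algorithm has good clustering ability.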
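One plausible reading of "considering density and distance together and normalizing them" is sketched below. The abstract does not give formulas, so everything here is an assumption: the cutoff-count density, the min-max normalization, the product score, and the function name `decision_scores` are illustrative choices in the style of density-peak clustering, not the paper's definitions.

```python
import math
from typing import List, Sequence


def decision_scores(points: Sequence[Sequence[float]], cutoff: float) -> List[float]:
    """Score each point by normalized density times normalized distance."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Assumed density: number of other points closer than the cutoff.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff) for i in range(n)]

    # Distance to the nearest point of strictly higher density; points with
    # no denser neighbor get their largest distance to any point instead.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))

    def min_max(values: Sequence[float]) -> List[float]:
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

    # Assumed combination: product of the two normalized quantities.
    return [r * d for r, d in zip(min_max(rho), min_max(delta))]
```

Points with the highest scores are the candidate cluster centers: they are both locally dense and far from any denser point.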
10.
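This paper discusses the recognition of noisy and distorted patterns. Images are described by a tree grammar with positional coordinates, region boundaries are analyzed, and semantic rules corresponding to the production rules are established. The approach takes into account both the statistical features of a pattern and, through transformations guided by semantics and syntax, the structure of distorted patterns. On this basis, a distance measure incorporating both semantics and syntax is proposed, and recognition is performed with the minimum-distance criterion.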
11.
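A New Hierarchical Time Series Clustering Algorithm Based on Hidden Markov Models Total citations: 4, self-citations: 0, citations by others: 4
To address the shortcomings of traditional hidden Markov model (HMM) based clustering algorithms on time series, a new HMM-based hierarchical time series clustering algorithm, HBHCTS, is proposed. It aims to improve clustering quality while also providing a representation of each cluster. HBHCTS models the time series with HMMs, obtains an initial set of models for the sequences according to a "most similar" principle, and then merges, updates, and iterates over these initial models to produce the clustering result. The experiments mainly examine how clustering accuracy relates to sequence length and model distance, and the results show that HBHCTS is more accurate than traditional HMM-based clustering algorithms.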
12.
Model-Based Clustering by Probabilistic Self-Organizing Maps Total citations: 1, self-citations: 0, citations by others: 1
Shih-Sian Cheng Hsin-Chia Fu Hsin-Min Wang 《Neural Networks, IEEE Transactions on》2009,20(5):805-826
In this paper, we consider the learning process of a probabilistic self-organizing map (PbSOM) as a model-based data clustering procedure that preserves the topological relationships between data clusters in a neural network. Based on this concept, we develop a coupling-likelihood mixture model for the PbSOM that extends the reference vectors in Kohonen's self-organizing map (SOM) to multivariate Gaussian distributions. We also derive three expectation-maximization (EM)-type algorithms, called the SOCEM, SOEM, and SODAEM algorithms, for learning the model (PbSOM) based on the maximum-likelihood criterion. SOCEM is derived by using the classification EM (CEM) algorithm to maximize the classification likelihood; SOEM is derived by using the EM algorithm to maximize the mixture likelihood; and SODAEM is a deterministic annealing (DA) variant of SOCEM and SOEM. Moreover, by shrinking the neighborhood size, SOCEM and SOEM can be interpreted, respectively, as DA variants of the CEM and EM algorithms for Gaussian model-based clustering. The experimental results show that the proposed PbSOM learning algorithms achieve comparable data clustering performance to that of the deterministic annealing EM (DAEM) approach, while maintaining the topology-preserving property.
13.
Yuming Zhou Hareton Leung Winoto P. 《IEEE transactions on pattern analysis and machine intelligence》2007,33(12):869-890
Web site success is significantly associated with navigability, an important attribute of usability that denotes the ease with which users find desired information as they move through a Web site. Navigable Web sites allow users to form a mental model of the type and location of information in the Web site and an expectation of where and to what a particular hyperlink will lead. Existing navigability measures are based mainly on the static hyperlink structure of a Web site. Such measures, however, have two main drawbacks: 1) the effect on navigability of a hyperlink structure cannot be well characterized and 2) the effect on navigability of the navigation aids (such as the "Back" button provided by a browser) is ignored. In this paper, we abstract a dynamic Web surfing behavior as a Markov model which synthesizes typical surfing actions. Based on this model, we propose a novel navigability measure MNav. The experimental results show that MNav can be efficiently computed and it provides an effective and useful measurement of Web site navigability.
14.
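Xu Weijia (许伟佳) 《Digital Community & Smart Home (数字社区&智能家居)》2009,(25)
Document clustering plays an important role in Web text mining and is an application of cluster analysis to text processing. This article introduces the text representation method based on the vector space model, and analyzes and optimizes the evaluation function for feature term weights in the vector space model, making distance-based similarity measures more accurate. It focuses on the partition-based k-means algorithm widely used in Web document clustering. To address the weakness of k-means in selecting initial cluster centers at random, it describes in detail how the max-min distance principle, combined with sampling, can stabilize the choice of initial cluster centers and improve the clustering results.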
15.
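Xu Weijia (许伟佳) 《Digital Community & Smart Home (数字社区&智能家居)》2009,5(9):7281-7283,7286
Document clustering plays an important role in Web text mining and is an application of cluster analysis to text processing. This article introduces the text representation method based on the vector space model, and analyzes and optimizes the evaluation function for feature term weights in the vector space model, making distance-based similarity measures more accurate. It focuses on the partition-based k-means algorithm widely used in Web document clustering. To address the weakness of k-means in selecting initial cluster centers at random, it describes in detail how the max-min distance principle, combined with sampling, can stabilize the choice of initial cluster centers and improve the clustering results.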
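The max-min distance principle mentioned in this abstract can be sketched as a farthest-first selection of initial centers. This is a minimal sketch under assumptions: starting from the first point is an arbitrary choice, the sampling step the article combines it with is omitted, and `max_min_centers` is a hypothetical name rather than the article's code.

```python
import math
from typing import List, Sequence


def max_min_centers(points: Sequence[Sequence[float]], k: int) -> List[Sequence[float]]:
    """Pick k initial centers, each maximizing its minimum distance to those already chosen."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    centers = [points[0]]  # assumption: seed with the first point
    # min_dist[i] is the distance from point i to its nearest chosen center.
    min_dist = [math.dist(p, centers[0]) for p in points]

    while len(centers) < k:
        idx = max(range(len(points)), key=min_dist.__getitem__)
        centers.append(points[idx])
        min_dist = [min(d, math.dist(p, points[idx])) for p, d in zip(points, min_dist)]
    return centers
```

Because the selection is deterministic once the seed is fixed, repeated runs give the same initial centers, which is the stability the abstract contrasts with random initialization. To approximate the sampling idea, the function could be run on a random subset of the documents' feature vectors.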
16.
Sriparna Saha, Student Member, IEEE, and Sanghamitra Bandyopadhyay, Senior Member, IEEE 《Journal of Computer Science and Technology》2009,24(3):544-556
In this paper, at first a new line-symmetry-based distance is proposed. The properties of the proposed distance are then elaborately described. Kd-tree-based nearest neighbor search is used to reduce the complexity of computing the proposed line-symmetry-based distance. Thereafter an evolutionary clustering technique is developed that uses the new line-symmetry-based distance measure for assigning points to different clusters. Adaptive mutation and crossover probabilities are used to accelerate the proposed c...
17.
18.
We compare the three basic algorithms for model-based clustering on high-dimensional discrete-variable datasets. All three algorithms use the same underlying model: a naive-Bayes model with a hidden root node, also known as a multinomial-mixture model. In the first part of the paper, we perform an experimental comparison between three batch algorithms that learn the parameters of this model: the Expectation–Maximization (EM) algorithm, a winner take all version of the EM algorithm reminiscent of the K-means algorithm, and model-based agglomerative clustering. We find that the EM algorithm significantly outperforms the other methods, and proceed to investigate the effect of various initialization methods on the final solution produced by the EM algorithm. The initializations that we consider are (1) parameters sampled from an uninformative prior, (2) random perturbations of the marginal distribution of the data, and (3) the output of agglomerative clustering. Although the methods are substantially different, they lead to learned models that are similar in quality.
19.
Yutaka Matsuo Yukio Ohsawa Mitsuru Ishizuka 《Journal of Intelligent Information Systems》2003,20(1):51-62
The pages and hyperlinks of the World Wide Web may be viewed as nodes and edges in a directed graph. In this paper, we propose a new definition of the distance between two pages, called average-clicks. It is based on the probability to click a link through random surfing. We compare the average-clicks measure to the classical measure of clicks between two pages, and show the average-clicks fits better to the users' intuition of distance.
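A probability-based page distance of this kind can be sketched as a shortest-path computation. The abstract does not state the formula, so the link length used here, `-log_n(alpha / out_degree)`, is an assumption, as are the parameter names `alpha` (probability of following some link) and `n` (reference number of links per page), their default values, and the function name `average_clicks`.

```python
import heapq
import math
from typing import Dict, List


def average_clicks(graph: Dict[str, List[str]], source: str,
                   alpha: float = 0.85, n: float = 7.0) -> Dict[str, float]:
    """Shortest distances from source, where a link costs -log_n(alpha / out_degree)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, page = heapq.heappop(heap)
        if d > dist.get(page, math.inf):
            continue  # stale heap entry
        links = graph.get(page, [])
        if not links:
            continue
        # Assumed link length: a link chosen with lower probability is "longer".
        step = -math.log(alpha / len(links), n)
        for target in links:
            candidate = d + step
            if candidate < dist.get(target, math.inf):
                dist[target] = candidate
                heapq.heappush(heap, (candidate, target))
    return dist
```

Under this assumption, a link on a page with many outgoing links counts as longer than a link on a sparse page, whereas the classical click count treats every link as length 1.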