Clustering Algorithm Based on Random Walk Model and KL-divergence

Cite this article:	HE Hui-min. Clustering Algorithm Based on Random Walk Model and KL-divergence[J]. Computer Engineering, 2008, 34(16): 224-226.

Author:	HE Hui-min (何会民)

Affiliation:	Department of Computer Science, Handan College, Handan 056005

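Abstract:	Cluster analysis is widely used in data mining. This paper proposes a new approach to clustering that requires no parametric assumptions and relies only on the pairwise similarities between data points. The method assumes a random walk over the data points and constructs the transition matrix of the walk from the data similarities; once the random walk enters its convergence phase, the t-step transition matrix reveals the distribution of the data points. An iterative procedure then clusters these distributions by seeking the minimum KL-divergence. The method rests on a rigorous probabilistic foundation and avoids shortcomings of traditional algorithms such as the need for parametric assumptions and the tendency to get trapped in local optima. Experiments show that the algorithm achieves good clustering results.
Keywords:	clustering; random walk; KL-divergence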
Clustering Algorithm Based on Random Walk Model and KL-divergence

HE Hui-min. Clustering Algorithm Based on Random Walk Model and KL-divergence[J]. Computer Engineering, 2008, 34(16): 224-226.

Authors:	HE Hui-min

Affiliation:	(Computer Science Department, Handan College, Handan 056005)

Abstract:	Cluster analysis is widely applied in data mining. This paper presents a new approach to clustering that is based on pairwise similarities and assumes no parametric statistical model. The similarities are transformed into the transition probability matrix of a Markov random walk, and the dataset is assumed to follow this random walk process. As the process approaches convergence, the t-step transition matrix indicates the distribution of the dataset. An iterative algorithm then clusters the data with the goal of decreasing the KL-divergence. The method has a solid foundation in probability theory and avoids some shortcomings of traditional algorithms. Experiments show that the algorithm achieves better results than K-means and mixture models.

Keywords:	clustering; random walk; KL-divergence
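
The pipeline summarized in the abstract (similarities, then a random-walk transition matrix, then t-step distributions, then iterative KL-divergence clustering) can be illustrated with a minimal sketch. The abstract does not specify the similarity function, the choice of t, or the exact optimization procedure, so everything below is an assumption made for illustration: a Gaussian kernel with a hypothetical bandwidth `sigma`, a fixed step count `t`, and a K-means-style loop that assigns each row of the t-step matrix to the centroid distribution with the smallest KL-divergence. The function names are hypothetical, and because this loop starts from random centroids it may not share the robustness to local optima that the abstract claims for the paper's method.

```python
import numpy as np


def transition_matrix(X, sigma=1.0):
    """Row-stochastic random-walk matrix built from Gaussian similarities.

    The Gaussian kernel and `sigma` are assumptions; the abstract only says
    the matrix is constructed from pairwise similarities.
    """
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    return W / W.sum(axis=1, keepdims=True)


def kl_rows_to_centroids(P, Q, eps=1e-12):
    """Return an (n, k) array whose entry [i, j] is KL(P[i] || Q[j])."""
    P = np.clip(P, eps, None)  # avoid log(0)
    Q = np.clip(Q, eps, None)
    log_ratio = np.log(P[:, None, :]) - np.log(Q[None, :, :])
    return (P[:, None, :] * log_ratio).sum(axis=-1)


def random_walk_kl_clustering(X, k, t=10, sigma=1.0, n_iter=100, seed=0):
    """Cluster the rows of the t-step transition matrix by KL-divergence."""
    rng = np.random.default_rng(seed)
    # Row i of Pt is the distribution of a walk started at point i after t steps.
    Pt = np.linalg.matrix_power(transition_matrix(X, sigma), t)
    # Start from k randomly chosen rows (fancy indexing returns a copy).
    centroids = Pt[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        new_labels = kl_rows_to_centroids(Pt, centroids).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = Pt[labels == j]
            if len(members):
                # The mean distribution minimizes the summed KL(member || centroid).
                centroids[j] = members.mean(axis=0)
    return labels
```

Calling `random_walk_kl_clustering(X, k=2)` on an `(n, d)` array returns one integer label per row. Larger `t` lets the walk mix further within dense regions before the rows are compared, which is the role the abstract assigns to the t-step transition matrix.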
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.