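Clustering Analysis of Micro Blogs Based on Active Learning
Cite this article: Zhu Li, Lu Jianfeng. Clustering Analysis of Micro Blogs Based on Active Learning[J]. Journal of Data Acquisition & Processing (数据采集与处理), 2016, 31(3): 599-605.
Authors: Zhu Li, Lu Jianfeng
Affiliation: School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
Abstract: The K-Means clustering algorithm cannot accurately determine its initial cluster centers, which tends to lower the accuracy of the clustering results. When applied to micro blog data, this can cause the clustering to misrepresent the topics users are actually interested in. This paper designs a clustering algorithm based on active learning. It applies the Min-Max active learning strategy while determining the initial cluster centers, so that after only a small number of queries the algorithm presents data points for the user to confirm as initial centers. It also assigns weights when K-Means recomputes the cluster centers, which reduces the number of iterations and improves the accuracy of the clustering results. The algorithm is then applied to micro blog clustering analysis to identify hot micro blog topics.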

Keywords: active learning; K-Means; micro blogs

Clustering Analysis of Micro Blogs Based on Active Learning
Zhu Li, Lu Jianfeng. Clustering Analysis of Micro Blogs Based on Active Learning[J]. Journal of Data Acquisition & Processing, 2016, 31(3): 599-605.
Authors:Zhu Li  Lu Jianfeng
Affiliation: Department of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
Abstract: The K-Means clustering algorithm cannot reliably determine the initial clustering centers, which results in low accuracy and, for micro blog data, an inability to reflect the hot topics of interest. Here, a clustering algorithm based on active learning is proposed that applies the Min-Max active learning strategy to ask the user to identify the seed points. After a small number of queries, candidate points are presented for the user to confirm as initial centers, and a weight is set when the K-Means centers are recalculated, which reduces the number of iterations and improves the accuracy of the clustering results. Moreover, hot topics are obtained by applying this algorithm to micro blog clustering analysis.
Keywords: active learning; K-Means; micro blogs
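
The seed-selection idea in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: "Min-Max" is read here as farthest-first selection (pick the point whose minimum distance to the seeds chosen so far is largest), the user is simulated by a hypothetical `oracle` callback backed by toy ground-truth labels, and the function names `min_max_seeds` and `kmeans` are illustrative. The weighted center update is not reproduced because the abstract gives no formula for it, so a plain K-Means iteration is used instead.

```python
import math


def min_max_seeds(points, k, confirm, max_queries=None):
    """Pick up to k seed indices by farthest-first (min-max) selection.

    confirm(candidate, seeds) stands in for the user: it returns True when
    the candidate point should become a new initial center.
    """
    seeds = [0]  # arbitrary starting point (an assumption of this sketch)
    rejected = set()
    budget = len(points) if max_queries is None else max_queries
    queries = 0
    while len(seeds) < k and queries < budget:
        best, best_d = None, -1.0
        for i, p in enumerate(points):
            if i in seeds or i in rejected:
                continue
            # Distance from p to its nearest already-chosen seed.
            d = min(math.dist(p, points[s]) for s in seeds)
            if d > best_d:
                best, best_d = i, d
        if best is None:
            break  # every point is already a seed or was rejected
        queries += 1
        if confirm(best, seeds):
            seeds.append(best)
        else:
            rejected.add(best)
    return seeds


def kmeans(points, seed_idx, max_iter=100):
    """Plain (unweighted) K-Means started from the confirmed seeds."""
    centers = [tuple(points[i]) for i in seed_idx]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers


# Toy data: three well-separated groups standing in for three topics.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (0, 10), (1, 10), (0, 11)]
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]


def oracle(candidate, seeds):
    # Simulated user: accept the candidate only if its topic is not yet covered.
    return truth[candidate] not in {truth[s] for s in seeds}


seeds = min_max_seeds(points, k=3, confirm=oracle)
labels, centers = kmeans(points, seeds)
```

In a real setting the `oracle` would be an interactive prompt, and the points would be feature vectors extracted from micro blog text rather than 2D coordinates.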