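User clustering algorithm based on keywords
Cite this article:WANG Rong, LI Jin-hong, SONG Wei. User clustering algorithm based on keywords[J]. Computer Engineering and Design, 2012, 33(9): 3553-3557, 3568.
Authors:WANG Rong  LI Jin-hong  SONG Wei
Affiliation:College of Information Engineering, North China University of Technology, Beijing 100144, China
Funding:National Natural Science Foundation of China (51075423, 61105045); Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (PHR20100509, PHR201108057); Beijing Municipal Excellent Talents Training Fund (2009D005002000009)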
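Abstract:To obtain accurate and effective user clusters, a keyword-based user clustering algorithm is proposed. The algorithm improves on the traditional ROCK algorithm: it introduces the concepts of similarity weight and average neighbors, and defines the average number of neighbors of the user keyword transaction set as the criterion for similarity between user access patterns. Without producing outlier users, it narrows the scope of each user cluster, dividing one large user cluster more precisely into several smaller ones. A similarity threshold between users is used to filter the data, which reduces the computation needed for user clustering. Experiments verify that the algorithm effectively improves the accuracy and running efficiency of similar-user clustering.

Keywords:keyword  similarity weight  average neighbors  similarity  user clustering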

User clustering algorithm based on keywords
WANG Rong, LI Jin-hong, SONG Wei. User clustering algorithm based on keywords[J]. Computer Engineering and Design, 2012, 33(9): 3553-3557, 3568.
Authors:WANG Rong  LI Jin-hong  SONG Wei
Affiliation:(College of Information Engineering,North China University of Technology,Beijing 100144,China)
Abstract:To obtain accurate and effective user clusters, a keyword-based user clustering algorithm is presented. Building on the traditional ROCK algorithm, it introduces the concepts of similarity weight and average neighbors. The average number of neighbors of the users' keyword transaction sets is defined as the criterion for similarity between user access patterns. Without producing outlier users, the algorithm narrows the scope of each cluster, so that one large user cluster is divided more precisely into several smaller ones. A user similarity threshold is used to filter the data, which reduces the computation required for user clustering. Experiments show that the algorithm effectively improves both the accuracy and the running efficiency of similar-user clustering.
Keywords:keyword  similarity weight  average neighbors  similarity  user clustering
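
The abstract does not define similarity weight or average neighbors precisely, so the following is only a minimal sketch of the neighbor and link notions from the standard ROCK algorithm, applied to per-user keyword sets. It assumes Jaccard similarity, a hypothetical threshold `theta`, and takes "average neighbors" to mean the mean neighbor count over all users; the function names and sample data are illustrative and are not taken from the paper. As a simplification, a user is not counted as its own neighbor.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbor_sets(users: dict, theta: float) -> dict:
    """Map each user to the users whose similarity to it is at least theta.

    Pairs below theta are dropped here, which is the filtering step that
    keeps later computation small.
    """
    neighbors = {u: set() for u in users}
    for u, v in combinations(users, 2):
        if jaccard(users[u], users[v]) >= theta:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


def link_counts(neighbors: dict) -> dict:
    """ROCK-style link(u, v): number of neighbors that u and v share."""
    return {
        (u, v): len(neighbors[u] & neighbors[v])
        for u, v in combinations(neighbors, 2)
    }


def average_neighbors(neighbors: dict) -> float:
    """Mean neighbor count per user (assumed reading of 'average neighbors')."""
    if not neighbors:
        return 0.0
    return sum(len(n) for n in neighbors.values()) / len(neighbors)


# Hypothetical keyword transactions, one set per user.
users = {
    "u1": {"python", "clustering", "rock"},
    "u2": {"python", "clustering", "kmeans"},
    "u3": {"clustering", "rock", "kmeans"},
    "u4": {"football", "tennis"},
}

nbrs = neighbor_sets(users, theta=0.5)
links = link_counts(nbrs)
print(nbrs)
print(links)
print(average_neighbors(nbrs))
```

In this sample, `u1`, `u2`, and `u3` are mutual neighbors while `u4` has none; how the paper itself treats such isolated users, and how it weights similarities, is not stated in the abstract.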
This article is indexed in CNKI, Wanfang Data, and other databases.