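User Interest Modeling Approach Based on Short Text of Micro-blog
Cite this article: QIU Yun-fei, WANG Lin-ying, SHAO Liang-shan, et al. User Interest Modeling Approach Based on Short Text of Micro-blog[J]. Computer Engineering, 2014(2): 275-279.
Authors: QIU Yun-fei  WANG Lin-ying  SHAO Liang-shan
Affiliations: [1] School of Software, Liaoning Technical University, Huludao 125100, Liaoning; [2] System Engineering Institute, Liaoning Technical University, Fuxin 123000, Liaoning; [3] Fuxin Experimental High School, Fuxin 123000, Liaoning
Funding: National Natural Science Foundation of China (70971059); Liaoning Province Innovation Team Fund (2009T045); Liaoning Province Higher Education Outstanding Young Scholars Growth Plan Fund (JQ2012027)
Abstract: To address the problem of modeling micro-blog users' interests, this paper proposes a method for building a user interest model on a micro-blog short-text dataset. To alleviate the data sparsity caused by short texts, the concept of micro-blog short-text reconstruction is introduced, based on an analysis of the structure and content of micro-blog short texts. A post's content is extended using other short texts related to the post and the three kinds of special symbols contained in the text, thereby enriching the feature information of the original post. The HowNet2000 concept dictionary is used to map the feature-word set of the reconstructed text to a concept set. The text vectors, abstracted to the concept level, are then clustered to partition the user's interest sets, and a representation mechanism for the user interest model is given. Experimental results show that short-text reconstruction and concept mapping improve clustering quality. Compared with a collaborative-filtering-based method for modeling micro-blog users' interests, the F-measure increases by 29.1%, indicating that the constructed user interest model performs well.

Keywords: micro-blog  short-text reconstruction  concept mapping  short-text clustering  user interest model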

User Interest Modeling Approach Based on Short Text of Micro-blog
QIU Yun-fei,WANG Lin-ying,SHAO Liang-shan,GUO Hong-mei.User Interest Modeling Approach Based on Short Text of Micro-blog[J].Computer Engineering,2014(2):275-279.
Authors:QIU Yun-fei  WANG Lin-ying  SHAO Liang-shan  GUO Hong-mei
Affiliation: 1. School of Software, Liaoning Technical University, Huludao 125100, China; 2. System Engineering Institute, Liaoning Technical University, Fuxin 123000, China; 3. Experimental High School of Fuxin, Fuxin 123000, China
Abstract: This paper presents a method for modeling users' interests based on micro-blog short texts. To compensate for the sparse information in short texts, it analyzes the structure and content of micro-blog short texts and proposes short-text reconstruction: the content of a post is extended using other related posts and the three kinds of special symbols found in the text, which enriches the feature information of the original micro-blog. The HowNet2000 concept dictionary is then used to map the feature set of the reconstructed text to a set of concepts. The concept-level text vectors are clustered to partition the user's interests, and a representation mechanism for the user interest model is presented. Experimental results show that short-text reconstruction and concept mapping improve clustering quality. Compared with modeling based on collaborative filtering, the F-measure increases by 29.1%, which indicates that the proposed micro-blog user interest model performs well.
Keywords:micro-blog  short-text reconstruction  concept mapping  short-text clustering  user interest model
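
The pipeline described in the abstract (reconstruct the short text, map feature words to concepts, cluster the concept vectors) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the tiny `CONCEPT_DICT` merely stands in for HowNet2000, the `#topic#` marker is only one guess at what a "special symbol" might be, and the single-pass cosine clustering is a substitute, since the abstract does not name the clustering algorithm.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the HowNet2000 concept dictionary: word -> concept.
CONCEPT_DICT = {
    "football": "sport", "basketball": "sport", "match": "sport",
    "guitar": "music", "concert": "music", "song": "music",
}

# Assumed special symbol: a Weibo-style topic marker of the form #topic#.
HASHTAG = re.compile(r"#([^#]+)#")


def reconstruct(post, related_posts):
    """Extend a short post with its topic text and the text of related posts.

    Topic words end up counted twice (once in the post, once as a topic),
    which is an illustrative way of giving them extra weight.
    """
    topics = HASHTAG.findall(post)
    return " ".join([post.replace("#", " ")] + topics + list(related_posts))


def to_concepts(text):
    """Map feature words to concepts; words missing from the dictionary are dropped."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(CONCEPT_DICT[w] for w in words if w in CONCEPT_DICT)


def cosine(a, b):
    """Cosine similarity between two sparse concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.5):
    """Single-pass clustering: a vector joins the first cluster whose summed
    concept vector is at least `threshold` similar, else it starts a new one."""
    clusters = []  # each item: (summed concept vector, list of member indices)
    for i, vec in enumerate(vectors):
        for centroid, members in clusters:
            if cosine(vec, centroid) >= threshold:
                centroid.update(vec)  # Counter.update adds the counts
                members.append(i)
                break
        else:
            clusters.append((Counter(vec), [i]))
    return [members for _, members in clusters]


# Each item: (post text, list of related post texts); all data is made up.
posts = [
    ("#football# great match tonight", ["what a match"]),
    ("new guitar arrived", ["play a song"]),
    ("basketball practice", []),
]
vectors = [to_concepts(reconstruct(p, rel)) for p, rel in posts]
groups = cluster(vectors)
print(groups)
```

In this toy run the first and third posts share only the concept `sport`, not any surface word, so grouping them depends on the concept mapping step; each resulting group could then be read as one interest of the user.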
This article is indexed in CNKI, VIP, and other databases.