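Interest Classification of Social Media Users Based on Information Content and Topological Relations
Cite this article: WU Hai-tao, YING Shi. Interest Classification of Social Media Users Based on Information Content and Topological Relations[J]. 计算机科学 (Computer Science), 2015, 42(4): 185-189, 198.
Authors: WU Hai-tao, YING Shi
Affiliations: 1. State Key Laboratory of Software Engineering, Computer School, Wuhan University, Wuhan 430072; Software College, Huanghuai University, Zhumadian 463000
2. State Key Laboratory of Software Engineering, Computer School, Wuhan University, Wuhan 430072
Funding: Supported by the National Natural Science Foundation of China (61070012,61070022) and the Key Program of the National Natural Science Foundation of China (91118003,3,61272108)
Abstract: As society develops, information plays an increasingly important role, and human information-dissemination activity shows ever more pronounced audience segmentation, which makes user classification an important research topic in the study of human information activity. With this goal, we classify social media users by interest topic using methods based on information content, on topological relations, and on a combination of the two. For content-based classification, we use the LDA topic model to extract each user's topic distribution from the content the user has posted and, based on this distribution, classify users by interest topic with several models, including support vector machines, decision trees, and Bayesian classifiers. For topology-based classification, we build classification models on the finding that users who share an interest topic tend to have followers in common. We then propose a classification method that combines information content and topological relations. Finally, experiments on large-scale Twitter data show that the combined method classifies user interests markedly better than methods that use information content or follower topology alone.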

Keywords: Online social networks  Interest classification  LDA  Follower topology

Classifying Interests of Social Media Users Based on Information Content and Social Graph
WU Hai-tao and YING Shi.Classifying Interests of Social Media Users Based on Information Content and Social Graph[J].Computer Science,2015,42(4):185-189, 198.
Authors:WU Hai-tao and YING Shi
Affiliation:State Key Laboratory of Software Engineering,Computer School,Wuhan University,Wuhan 430072,China;Software College,Huanghuai University,Zhumadian 463000,China and State Key Laboratory of Software Engineering,Computer School,Wuhan University,Wuhan 430072,China
Abstract: With the development of society, human information-spreading activity shows increasingly obvious audience segmentation, and user classification has become an important research topic. This article therefore studies online social network users by classifying them according to topics of interest, using methods based on information content, on topological relations, and on both combined. For classification based on information content, we adopt LDA to extract the topic distribution from the content posted by users, and this distribution is used in support vector machine, decision tree, Bayes, and other models to classify users' interests. For classification based on topological relations, we found that users with the same interests tend to have more fans in common, and based on this finding we built classification models to classify users. We then propose methods that combine information content and topological relations to classify users. Experiments on Twitter data show that the combined method outperforms those based on information content or topological relations alone.
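The topology-based idea in the abstract, that users with the same interests tend to share fans, can be illustrated with a minimal sketch. The abstract does not say which similarity measure or classifier the authors used, so the Jaccard overlap and the nearest-neighbour vote below are assumptions chosen only for illustration, as are the function names and the toy data.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard overlap between two follower sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_by_followers(followers, labeled, k=3):
    """Vote among the k labeled users whose follower sets overlap most.

    This is an assumed stand-in for the paper's topology-based model,
    not the authors' actual method.
    """
    ranked = sorted(
        labeled,
        key=lambda item: jaccard(followers, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: (set of follower ids, interest label). All values are made up.
labeled_users = [
    ({1, 2, 3, 4}, "sports"),
    ({2, 3, 4, 5}, "sports"),
    ({10, 11, 12}, "music"),
    ({11, 12, 13}, "music"),
]

predicted = classify_by_followers({3, 4, 5, 6}, labeled_users)
print(predicted)
```

A combined method in the spirit of the abstract could append such overlap scores to each user's LDA topic distribution before training a classifier, though the abstract does not state how the authors combined the two signals.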
Keywords:Online social networks  User classification  LDA  Topological relation
This article is indexed in CNKI, Wanfang Data, and other databases.