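Tag Clustering Method of Joint Topic Model*
Cite this article: HU Xuegang, LI Huizong, PAN Jianhan, HE Wei, YANG Hengyu. Tag Clustering Method of Joint Topic Model*[J]. Pattern Recognition and Artificial Intelligence, 2017, 30(5): 403-415. DOI: 10.16451/j.cnki.issn1003-6059.201705003
Authors: HU Xuegang  LI Huizong  PAN Jianhan  HE Wei  YANG Hengyu
Affiliations: 1. School of Computer and Information, Hefei University of Technology, Hefei 230009
2. School of Economics and Management, Anhui University of Science and Technology, Huainan 232001
3. School of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221116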
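Funding: Supported by the National Natural Science Foundation of China (No. 61673152, 61672272, 61303131, 61273292), the Doctoral Program Foundation of the Ministry of Education (No. 20130111110011), the Youth Fund for Humanities and Social Sciences Research of the Ministry of Education (No. 13YJCZH077), and a tendered project of the Anhui universities key research base for humanities and social sciences "Research Center for Safety Management of Mining Enterprises, Anhui University of Science and Technology" (No. SK2015A082)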
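Abstract: Improving the quality of tag clustering is a key problem in identifying the semantics of tags. This paper proposes a resource-based tag clustering method built on a joint topic model. Using the citation relations among resources, a random walk is applied to obtain an authority score for each resource, and these scores set the weights of two binary relations, "resource-tag" and "resource-word". On this basis, a resource-weighted joint latent Dirichlet allocation (LDA) model of words and tags is constructed. The latent topics of the tags are obtained by iterative learning, and the tags are clustered according to their maximum topic membership. Experiments show that the proposed method achieves better clustering results than other resource-based tag clustering methods.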

Keywords: Social Tagging System   Tag Clustering   Topic Model   Latent Dirichlet Allocation (LDA)   Random Walk
Received: 2016-05-05
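
The abstract does not give the exact form of the random walk, so the following is only a minimal sketch of one common choice: a PageRank-style walk with restart over the resource citation graph. The function name `authority_scores`, the damping factor 0.85, the uniform restart, and the handling of resources that cite nothing are all assumptions made for illustration, not details from the paper.

```python
import numpy as np

def authority_scores(citations, n, damping=0.85, tol=1e-10, max_iter=200):
    """Stationary distribution of a random walk with restart on a citation graph.

    citations: iterable of (src, dst) pairs meaning resource src cites resource dst.
    n: number of resources, indexed 0..n-1.
    damping, tol, max_iter: illustrative defaults (assumptions, not from the paper).
    """
    # Column-stochastic transition matrix: P[dst, src] is the probability
    # of stepping from src to dst.
    P = np.zeros((n, n))
    for src, dst in citations:
        P[dst, src] += 1.0
    out_degree = P.sum(axis=0)
    dangling = out_degree == 0
    P[:, ~dangling] /= out_degree[~dangling]
    P[:, dangling] = 1.0 / n  # assumption: a resource citing nothing jumps uniformly

    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        updated = damping * (P @ scores) + (1.0 - damping) / n
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores

# Hypothetical toy graph: resources 0 and 1 both cite 2, and 2 cites 0.
scores = authority_scores([(0, 2), (1, 2), (2, 0)], n=3)
```

In a weighted model of the kind the abstract describes, such scores could then scale the contribution of each resource's tags and words, so that frequently cited resources count for more.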

Tag Clustering Method of Joint Topic Model
HU Xuegang,LI Huizong,PAN Jianhan,HE Wei,YANG Hengyu. Tag Clustering Method of Joint Topic Model[J]. Pattern Recognition and Artificial Intelligence, 2017, 30(5): 403-415. DOI: 10.16451/j.cnki.issn1003-6059.201705003
Authors:HU Xuegang  LI Huizong  PAN Jianhan  HE Wei  YANG Hengyu
Affiliation:1. School of Computer and Information, Hefei University of Technology, Hefei 230009
2. School of Economics and Management, Anhui University of Science and Technology, Huainan 232001
3. School of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221116
Abstract:Improving the clustering quality of social tags is a key problem in recognizing tag semantics. A resource-based joint topic model is proposed to cluster tags. Firstly, the reference relations among resources are used to obtain an authority score for each resource by a random walk method. Secondly, the resource authority scores are used to set the weights of two binary relations, resource-tag and resource-word. On this basis, a resource-weighted joint latent Dirichlet allocation (LDA) model of words and tags is constructed. Through iterative learning, the latent topics of the tags are acquired, and each tag is assigned to a cluster according to its maximum topic membership degree. The results show that the proposed method achieves better clustering performance than other resource-based tag clustering methods.
Keywords:Social Tagging System   Tag Clustering   Topic Model   Latent Dirichlet Allocation(LDA)   Random Walk  
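
The final step in the abstract, assigning each tag to the topic in which its membership degree is largest, can be sketched as follows. This assumes a tag-by-topic membership matrix has already been obtained from some fitted topic model. The function name `cluster_tags` and the example tags and values are hypothetical and do not come from the paper.

```python
import numpy as np

def cluster_tags(membership, tags):
    """Group tags by the topic in which each has its largest membership degree.

    membership: array-like of shape (num_tags, num_topics); row i holds the
                membership degrees of tags[i] over the topics.
    tags: list of tag names, one per row of membership.
    """
    membership = np.asarray(membership, dtype=float)
    labels = membership.argmax(axis=1)  # index of the strongest topic per tag
    clusters = {}
    for tag, label in zip(tags, labels):
        clusters.setdefault(int(label), []).append(tag)
    return clusters

# Hypothetical example: three tags, two latent topics.
example_tags = ["python", "java", "jazz"]
example_membership = [[0.8, 0.2],
                      [0.7, 0.3],
                      [0.1, 0.9]]
clusters = cluster_tags(example_membership, example_tags)
```

With this rule every tag lands in exactly one cluster, so the number of non-empty clusters is at most the number of topics.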