
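Named entity disambiguation based on heterogeneous knowledge base
Cite this article: NING Bo, ZHANG Feifei. Named entity disambiguation based on heterogeneous knowledge base[J]. Journal of Xi'an Institute of Posts and Telecommunications, 2014(4):70-76.
Authors: NING Bo  ZHANG Feifei
Affiliations: [1] State-owned Asset Management Department, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121 [2] School of Computer Science, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121
Funding: Natural science project of the Scientific Research Program of the Shaanxi Provincial Department of Education (12JK0938)
Abstract: To address Chinese named entity disambiguation in natural language processing, a hierarchical clustering method based on heterogeneous knowledge bases is proposed. A Chinese information extraction system is applied to knowledge bases such as Chinese Wikipedia to build entity information objects containing person information and entity relations, and hierarchical clustering is then carried out with distributed computation on the Hadoop platform. The study examines how the selection of person-entity features and the use of knowledge bases such as Wikipedia affect disambiguation results. The results show that adding the encyclopedia knowledge base raises the F-measure from 91.33% to 92.68%.

Keywords: person name disambiguation  Wikipedia  Chinese information extraction  hierarchical clustering  entity information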

Named entity disambiguation based on heterogeneous knowledge base
NING Bo, ZHANG Feifei. Named entity disambiguation based on heterogeneous knowledge base[J]. Journal of Xi'an Institute of Posts and Telecommunications, 2014(4):70-76.
Authors:NING Bo  ZHANG Feifei
Affiliation:1. State-owned Asset Management Department, Xi'an University of Posts and Telecommunications, Xi'an 710121, China; 2. School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
Abstract:A scalable and robust system is proposed for the named entity disambiguation problem, based on hierarchical clustering with Wikipedia as a knowledge base. Entity profiles, which are information objects containing the entity attributes and entity relations created by our IE system, are disambiguated by hierarchical clustering on the Hadoop platform. The paper mainly studies feature selection for similarity measurement and compares the results obtained with heterogeneous knowledge bases. Results show that the F-measure increases from 91.33% to 92.68% when Wikipedia is used as a knowledge base.
Keywords:entity disambiguation  Wikipedia  Chinese information extraction  hierarchical clustering  entity information
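
The abstract describes disambiguating entity profiles by hierarchical clustering. The following is a minimal single-machine sketch of that general idea, not the authors' implementation: the paper runs on Hadoop and does not specify its similarity measure or linkage here, so the Jaccard similarity, the average linkage, the 0.3 stopping threshold, the function names, and the sample profiles are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_profiles(profiles, threshold=0.3):
    """Average-linkage agglomerative clustering of entity profiles.

    profiles: dict mapping a mention id to a set of feature strings.
    Merging stops once the best pairwise cluster similarity falls
    below `threshold` (a hypothetical cut-off).
    Returns a list of sorted lists of mention ids.
    """
    clusters = [[pid] for pid in sorted(profiles)]

    def linkage(c1, c2):
        # Mean similarity over all cross-cluster pairs.
        sims = [jaccard(profiles[x], profiles[y]) for x in c1 for y in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            sim = linkage(clusters[i], clusters[j])
            if sim > best_sim:
                best_sim, best_pair = sim, (i, j)
        if best_sim < threshold:
            break
        i, j = best_pair  # i < j, so deleting j keeps index i valid
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


# Hypothetical profiles for four mentions that share one person name.
profiles = {
    "doc1": {"occupation:professor", "employer:university", "field:NLP"},
    "doc2": {"occupation:professor", "field:NLP", "colleague:Zhang"},
    "doc3": {"occupation:footballer", "team:city club"},
    "doc4": {"occupation:footballer", "team:city club", "position:striker"},
}

result = cluster_profiles(profiles)
print(result)
```

Each resulting cluster stands for one real-world person. In this sketch, features drawn from an external knowledge base would simply be added to the feature sets before clustering, which is one plausible way such a knowledge base could change the similarity scores.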
This article is indexed in databases including VIP (维普).