
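Automatic Semantic Annotation Based on an Ontology Knowledge Base*
Cite this article: QI Xin, XIAO Min, SUN Jian-peng. Automatic semantic annotation based on an ontology knowledge base*[J]. 计算机应用研究 (Application Research of Computers), 2011, 28(5): 1742-1744. DOI: 10.3969/j.issn.1001-3695.2011.05.042
Authors: QI Xin (戚欣)  XIAO Min (肖敏)  SUN Jian-peng (孙建鹏)
Affiliation: College of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063
Funding: Supported by the Fundamental Research Funds for the Central Universities
Abstract: To generate metadata for the Semantic Web, the semantic information in Web documents must be extracted. Given the massive number of Web documents, automatic semantic annotation is a feasible approach where manual and semi-automatic annotation are not. The proposed automatic semantic annotation method, based on an ontology knowledge base, aims to improve annotation quality. To recognize candidate named entities in a document, the logical structure of a semantic dictionary is designed, and a method is described for computing semantic distance from the semantic association paths between entities. The complex problem in semantic annotation is semantic disambiguation; a disambiguation method based on the shortest path and a disambiguation method based on n-grams are proposed. Documents are annotated with this method, and the annotation results are persisted as a semantic index, providing a basis for semantic information retrieval. Annotation experiments on a constructed test dataset show that the method can effectively annotate Web documents automatically on the basis of the ontology knowledge base.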

Keywords: semantic annotation; n-gram; semantic disambiguation; directed graph; knowledge base
Received: 2010-10-18
Revised: 2010-11-18

Automatic semantic annotation based on ontology and knowledge base
QI Xin,XIAO Min,SUN Jian-peng. Automatic semantic annotation based on ontology and knowledge base[J]. Application Research of Computers, 2011, 28(5): 1742-1744. DOI: 10.3969/j.issn.1001-3695.2011.05.042
Authors: QI Xin  XIAO Min  SUN Jian-peng
Affiliation: College of Computer Science & Technology, Wuhan University of Technology, Wuhan 430063, China
Abstract: To generate metadata for the Semantic Web, semantic information needs to be extracted from web documents. Given the massive scale of web documents, automatic semantic annotation is a feasible method compared with manual or semi-automatic semantic annotation. To recognize candidate named entities, a semantic dictionary is designed, and the semantic distance between entities is calculated from the semantic relevance paths that connect them. The most complex problem in semantic annotation is semantic disambiguation. Semantic disambiguation methods based on the shortest path and on n-grams are proposed. After semantic annotation with this method, the results are converted to semantic indexes for semantic information retrieval. Experiments were carried out on a news corpus. The results show that the method is effective for the task of automatic semantic annotation.
Keywords: semantic annotation; n-gram; semantic disambiguation; directed acyclic graph; knowledge base
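
The abstract describes semantic distance computed over relation paths between entities, and disambiguation by shortest path. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes the knowledge base can be viewed as an unweighted graph traversed in both directions, that distance is the hop count of the shortest path, and that an ambiguous mention is resolved to the candidate with the smallest total distance to the context entities. The names `semantic_distance`, `disambiguate`, `undirected`, and the toy entities are all hypothetical.

```python
from collections import deque


def undirected(edges):
    """Build an adjacency map that can be walked in both directions (assumption)."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def semantic_distance(graph, source, target):
    """Hop count of the shortest relation path, found by breadth-first search."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return float("inf")  # no path: the entities are unrelated in this graph


def disambiguate(graph, candidates, context_entities):
    """Pick the candidate entity closest, in total, to the context entities."""
    return min(
        candidates,
        key=lambda c: sum(semantic_distance(graph, c, e) for e in context_entities),
    )


# Hypothetical toy knowledge base, for illustration only.
TOY_GRAPH = undirected([
    ("Apple_Inc", "Technology_Company"),
    ("Technology_Company", "Microsoft"),
    ("Apple_Inc", "iPhone"),
    ("Apple_Fruit", "Fruit"),
    ("Fruit", "Banana"),
])
```

With this toy graph, a mention "Apple" that appears near "Microsoft" and "iPhone" would be resolved by calling `disambiguate(TOY_GRAPH, ["Apple_Inc", "Apple_Fruit"], ["Microsoft", "iPhone"])`. Ties are broken by candidate order, which is a simplification; the n-gram-based method named in the abstract is not covered by this sketch.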
This article is indexed in CNKI, Wanfang Data, and other databases.