
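An ontology-theme-based method for Chinese knowledge acquisition
Cite this article: CHE Haiyan, SUN Jigui, JING Tao, BAI Xi. An ontology-theme-based method for Chinese knowledge acquisition[J]. 计算机科学与探索, 2007, 1(2): 206-215.
Authors: CHE Haiyan  SUN Jigui  JING Tao  BAI Xi
Affiliations: 1. College of Computer Science and Technology, Jilin University, Changchun 130012; 2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012
Funding: National Natural Science Foundation of China; Specialized Research Fund for the Doctoral Program of Higher Education (Ministry of Education)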
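Abstract: The characteristics of the Chinese language make acquiring knowledge from Chinese natural language documents very difficult. Although Chinese named entity recognition (NER) already achieves good results, it is almost impossible to correctly extract the relationships between recognized entities without a synonym table or a WordNet-like Chinese linguistic knowledge base. This paper proposes a Chinese knowledge acquisition method based on the idea of ontology themes. The method introduces themes into a domain ontology for the first time: domain experts partition the concepts and properties of the original domain ontology by theme, establishing associations from concepts to themes and from themes to properties. When knowledge is extracted from a sentence, simple NER and direct mapping to the ontology identify some of the concepts, individuals, and properties in the sentence. This accurately identified information determines the theme to which the sentence belongs, and the theme in turn provides clues for finding relationships. Preliminary experimental results show that, compared with a method that does not use theme information, this method achieves better recall and precision.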

Keywords: knowledge acquisition  ontology  theme  Chinese

An ontology-theme-based method of acquiring knowledge from Chinese natural language documents
CHE Haiyan,SUN Jigui,JING Tao,BAI Xi.An ontology-theme-based method of acquiring knowledge from Chinese natural language documents[J].Journal of Frontier of Computer Science and Technology,2007,1(2):206-215.
Authors:CHE Haiyan  SUN Jigui  JING Tao  BAI Xi
Affiliation: 1. College of Computer Science and Technology, Jilin University, Changchun 130012, China; 2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
Abstract: Acquiring knowledge from Chinese natural language documents is very difficult because of the particular characteristics of the Chinese language. Although researchers have made great progress on Chinese named entity recognition (NER), it is hardly possible to correctly extract the binary relationships between pairs of recognized entities without the aid of synonym tables or a Chinese linguistic ontology like WordNet. This paper proposes an ontology-theme-based method to extract these relationships from Chinese natural language documents. It is the first to introduce the idea of themes into a domain ontology. The concepts and properties of the original domain ontology are partitioned according to themes, and mappings from concepts to themes and from themes to properties are established. For a sentence being processed, some concepts, individuals, and properties are first extracted by simple NER and direct string-to-ontology matching. This correctly extracted information is then used to infer the themes of the sentence, and the themes in turn provide useful clues for finding more possible relationships. Preliminary experimental results indicate that this theme-based approach achieves higher recall and precision than a method that does not incorporate themes.
Keywords:knowledge acquisition  ontology  theme  Chinese
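
The pipeline the abstract describes (concept-to-theme and theme-to-property mappings, theme inference from recognized items, and theme-guided relation search) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy ontology, the names `CONCEPT_TO_THEMES`, `THEME_TO_PROPERTIES`, `PROPERTY_SIGNATURE`, `infer_themes`, and `candidate_relations`, the majority-vote rule for choosing a theme, and the domain/range filter are all assumptions made for this example. NER itself is taken as given, so the input is a list of already recognized (mention, concept) pairs.

```python
from collections import Counter

# Hypothetical toy ontology; every name below is an illustrative assumption.
# Concept -> themes it belongs to (in the paper, assigned by domain experts).
CONCEPT_TO_THEMES = {
    "Person": {"employment", "education"},
    "Company": {"employment"},
    "University": {"education"},
}
# Theme -> properties grouped under that theme.
THEME_TO_PROPERTIES = {
    "employment": {"worksFor", "foundedBy"},
    "education": {"graduatedFrom", "teachesAt"},
}
# Property -> (domain concept, range concept); assumed for filtering only.
PROPERTY_SIGNATURE = {
    "worksFor": ("Person", "Company"),
    "foundedBy": ("Company", "Person"),
    "graduatedFrom": ("Person", "University"),
    "teachesAt": ("Person", "University"),
}


def infer_themes(concepts):
    """Vote for themes using the concepts recognized in one sentence.

    Returns the set of themes with the highest vote count (assumed rule).
    """
    votes = Counter()
    for concept in concepts:
        for theme in CONCEPT_TO_THEMES.get(concept, ()):
            votes[theme] += 1
    if not votes:
        return set()
    best = max(votes.values())
    return {theme for theme, count in votes.items() if count == best}


def candidate_relations(entities):
    """Propose (subject, property, object) triples for one sentence.

    `entities` is a list of (mention, concept) pairs, e.g. NER output.
    Only properties of the inferred themes whose domain and range match
    the two concepts are proposed.
    """
    themes = infer_themes([concept for _, concept in entities])
    properties = set()
    for theme in themes:
        properties |= THEME_TO_PROPERTIES.get(theme, set())

    triples = []
    for subj, subj_concept in entities:
        for obj, obj_concept in entities:
            if subj == obj:
                continue
            for prop in properties:
                if PROPERTY_SIGNATURE[prop] == (subj_concept, obj_concept):
                    triples.append((subj, prop, obj))
    return sorted(triples)


if __name__ == "__main__":
    sentence_entities = [("Zhang San", "Person"), ("Jilin University", "University")]
    for triple in candidate_relations(sentence_entities):
        print(triple)
```

In this sketch the theme only narrows the set of candidate properties. Choosing among the remaining candidates (here `graduatedFrom` versus `teachesAt`) would still need evidence from the sentence text, which the example does not attempt.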
This article is indexed by 万方数据 (Wanfang Data) and other databases.