
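Research and Implementation of a Concept Acquisition System Based on Web Corpora
Cite this article: YU Lei, CAO Cun-Gen. Research and Implementation of a Concept Acquisition System Based on Web Corpora [J]. Computer Science (计算机科学), 2007, 34(2): 161-165.
Authors: YU Lei  CAO Cun-Gen
Affiliations: 1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080; Graduate School of the Chinese Academy of Sciences, Beijing 100080
2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080
Funding: National Natural Science Foundation of China; National Basic Research Program of China (973 Program)
Abstract: Web pages contain a large amount of specialized knowledge. How to acquire knowledge from these resources has been an important research topic for more than ten years. Concepts and the relations between concepts are the basic building blocks of knowledge, so acquiring and verifying concepts is an important step on the path from text to knowledge. This paper proposes and implements a method for automatically acquiring concepts from a Web corpus, drawing on rules, statistics, contextual information, and other methods and sources of information. Experimental results show that the method achieves fairly good results.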

Keywords: Chinese information processing  Knowledge acquisition  Concept acquisition  Concept verification

Concept Extraction and Verification from Web Corpus
YU Lei, CAO Cun-Gen. Concept Extraction and Verification from Web Corpus [J]. Computer Science, 2007, 34(2): 161-165.
Authors:YU Lei  CAO Cun-Gen
Affiliation: 1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080; 2. Graduate School, Chinese Academy of Sciences, Beijing 100080
Abstract: Web pages contain a large amount of knowledge. How to intelligently acquire knowledge from the massive information on Web pages has become a very important task. Concepts, together with the relations between concepts and between their attributes, are the main components of knowledge. Acquiring and verifying concepts is therefore an important step in knowledge acquisition. This paper proposes a hybrid approach that automatically extracts concepts from a large Web corpus, using rules, statistics, and context information to identify and verify them. Experiments show that the method performs very well at extracting concepts.
Keywords:Chinese information processing  Knowledge acquisition  Concept acquisition  Concept verification
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.