
The Construction and Utilization of a Comprehensive Language Knowledge-base

Cite this article:	YU Shi-wen, DUAN Hui-ming, ZHU Xue-feng, ZHANG Hua-rui. The Construction and Utilization of a Comprehensive Language Knowledge-base[J]. Journal of Chinese Information Processing, 2004, 18(5): 2-11.

Authors:	YU Shi-wen, DUAN Hui-ming, ZHU Xue-feng, ZHANG Hua-rui

Affiliation:	Institute of Computational Linguistics, Peking University

Funding:	National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China

Abstract:	The scale and quality of its language knowledge-base determine the success or failure of a natural language processing system. After 18 years of work, the Institute of Computational Linguistics at Peking University has accumulated a series of large, high-quality language data resources: the Grammatical Knowledge-base of Contemporary Chinese, a large-scale corpus of contemporary Chinese with basic (POS) annotation, the Semantic Knowledge-base of Contemporary Chinese (SKCC), the Chinese Concept Dictionary (CCD), a bilingual parallel corpus aligned at different unit levels, term banks for several specialized domains, a phrase structure rule base of contemporary Chinese, and a corpus of classical Chinese poetry, among others. This research will integrate these language data resources into a single comprehensive language knowledge-base. Integrating different resources requires bridging the "gaps" between them. Besides a unified, user-friendly interface and a convenient application programming interface, the planned knowledge-base will provide software tools that support knowledge mining, so that the existing resources can develop steadily from primary products into deeply processed products. It will also provide multiple forms of knowledge dissemination and information service, so that the comprehensive language knowledge-base offers all-round, multi-level support for language information processing research, core linguistic research, and language teaching.

Keywords:	computer application; Chinese information processing; natural language processing; language knowledge-base; language data resources; electronic dictionary; corpus

Article ID:	1003-0077(2004)05-0001-10
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.