
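Parsing the Penn Chinese Treebank (CTB) with a head-driven model
Cite this article: Cao Hailong, Zhao Tiejun, Li Sheng. Parsing the Penn Chinese Treebank (CTB) with a head-driven model [J]. 高技术通讯 (High Technology Letters), 2007, 17(1): 15-20.
Authors: Cao Hailong  Zhao Tiejun  Li Sheng
Affiliation: MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001
Funding: National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program)
Abstract: This paper reports recent progress in parsing research based on the Penn Chinese Treebank. Building on the well-known head-driven model, parsing experiments were carried out on Penn Chinese Treebank 5.0 for the first time. Compared with previous work, these experiments achieved better results and greatly narrowed the gap between Chinese and English parsing. The parser was evaluated on a public test set: for sentences with correct word segmentation and part-of-speech tagging, parsing precision and recall reached 85.89% and 85.61%, respectively. The implementation of the model is described, and the roles that two key components of the model, the decision table and base noun phrases (BNP), play in the parser are analyzed further. This work offers some reference value for developing practical parsing systems.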

Keywords: head-driven model  Penn Chinese Treebank  parsing  structural pattern recognition
Received: 2005-12-07
Revised: 2005-12-07

Parsing Penn Chinese treebank (CTB) with head-driven model
Cao Hailong, Zhao Tiejun, Li Sheng. Parsing Penn Chinese treebank (CTB) with head-driven model [J]. High Technology Letters, 2007, 17(1): 15-20.
Authors: Cao Hailong  Zhao Tiejun  Li Sheng
Affiliation: MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001
Abstract: This paper reports recent progress on parsing the Penn Chinese Treebank (CTB); parsing is one of the most important technologies in Chinese information processing. The well-known head-driven model was applied to the newly available CTB 5.0, and a parsing experiment on this release was performed for the first time. Compared with previous work on CTB, the experiment achieved more promising results and greatly narrowed the performance gap between Chinese parsing and English parsing. The parser was evaluated on the standard test set with the PARSEVAL metric. On sentences with gold-standard segmentation and POS tagging, it achieved a precision of 85.89% and a recall of 85.61%. The construction of the parser is described, and the roles of two important techniques that significantly improve parsing performance are analyzed. This work can serve as a reference for developing Chinese parsers for real applications.
Keywords: head-driven model  Penn Chinese treebank  parsing  syntactic pattern recognition
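
The precision and recall figures in the abstract are PARSEVAL-style bracket scores. As a rough illustration of how such scores are commonly computed, the following minimal Python sketch compares labeled constituent spans from a gold tree and a parser output. The function name `parseval`, the `(label, start, end)` span encoding, and the toy brackets are assumptions made for this example; they are not taken from the paper, and the sketch omits conventions that standard scorers apply (such as ignoring punctuation or certain labels).

```python
from collections import Counter


def parseval(gold_brackets, test_brackets):
    """Return (precision, recall) over labeled brackets.

    Each bracket is a hypothetical (label, start, end) tuple. Brackets are
    treated as multisets so a repeated span is only matched as many times
    as it occurs in both trees.
    """
    gold = Counter(gold_brackets)
    test = Counter(test_brackets)
    # Multiset intersection counts brackets present in both analyses.
    matched = sum((gold & test).values())
    precision = matched / sum(test.values()) if test else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    return precision, recall


# Toy example (invented spans, not data from the paper).
gold = [("IP", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 2, 4)]
test = [("IP", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 3, 4)]
precision, recall = parseval(gold, test)
print(f"precision={precision:.2%} recall={recall:.2%}")
```

In this toy case three of the four proposed brackets match the gold brackets, so both scores come out the same; on real data the two numbers differ whenever the parser proposes more or fewer brackets than the gold tree contains.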
This article is indexed in CNKI, 维普 (VIP), 万方数据 (Wanfang Data), and other databases.