
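A Chinese maximal-length phrase identification method based on bidirectional labeling fusion
Cite this article: JIAN Ping, ZONG Cheng-qing. A Chinese maximal-length phrase identification method based on bidirectional labeling fusion [J]. CAAI Transactions on Intelligent Systems, 2009, 4(5): 406-413.
Authors: JIAN Ping  ZONG Cheng-qing
Affiliation: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Funding: National Natural Science Foundation of China; National Key Technology R&D Program of the 11th Five-Year Plan; National "863" Program; China-Singapore Institute of Digital Media.
Corresponding author: JIAN Ping. E-mail: pjian@nlpr.ia.ac.cn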
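Abstract: Chinese maximal-length phrases (maximal-length noun phrases and prepositional phrases) have distinctive linguistic characteristics. When a classifier-based deterministic labeling method is run in both directions, the results show that maximal-length phrase identification is complementary between the forward (left-to-right) and backward (right-to-left) directions of a Chinese sentence. On this basis, deterministic bidirectional labeling is used to identify Chinese maximal-length phrases, and a probabilistic fusion strategy based on "fork positions" is proposed to merge the two labeling results. Experiments show that this fusion algorithm effectively exploits the complementarity of the two directions and thereby achieves fairly good phrase identification performance.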

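Keywords: maximal-length noun phrase identification  prepositional phrase identification  sequence labeling  bidirectional labeling  fork position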

A new approach to identifying Chinese maximal-length phrases using bidirectional labeling
JIAN Ping, ZONG Cheng-qing. A new approach to identifying Chinese maximal-length phrases using bidirectional labeling [J]. CAAI Transactions on Intelligent Systems, 2009, 4(5): 406-413.
Authors:JIAN Ping  ZONG Cheng-qing
Affiliation:National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Abstract: Chinese maximal-length phrases (maximal-length noun phrases and prepositional phrases) possess distinctive linguistic properties. Labeling these phrases with sequential classifiers in both directions shows that the left-to-right and right-to-left results are complementary. In this paper, both left-to-right and right-to-left sequential labeling were employed to identify Chinese maximal-length noun phrases and prepositional phrases. A novel "fork position" based probabilistic algorithm was then developed to fuse the bidirectional results. Experiments were carried out on the Penn Chinese Treebank, a segmented, part-of-speech tagged, and fully bracketed corpus. The results confirmed that the proposed algorithm effectively exploits the complementary strengths of the two directions.
Keywords:maximal-length noun phrase identification  prepositional phrase identification  sequence labeling  bidirectional labeling  fork position
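
The fusion idea in the abstract can be pictured with a minimal sketch. The abstract does not give the scoring rule, so everything below is an assumption made for illustration: the function name `fuse_bidirectional`, the BIO-style labels, the treatment of a "fork" as a maximal run of tokens on which the two passes disagree, and the choice to keep whichever pass has the higher summed log-probability over that run. It is not the authors' algorithm.

```python
import math
from typing import List, Tuple

# One (label, probability) pair per token. Both passes are assumed to be
# re-aligned to left-to-right token order before fusion (hypothetical format).
Tagged = List[Tuple[str, float]]


def fuse_bidirectional(forward: Tagged, backward: Tagged) -> List[str]:
    """Merge forward and backward labelings of the same sentence.

    Where the two passes agree, the shared label is kept. Where they
    disagree (an assumed reading of "fork position"), the disagreeing run
    is taken from the pass with the higher summed log-probability.
    """
    if len(forward) != len(backward):
        raise ValueError("both passes must label the same tokens")

    fused: List[str] = []
    i, n = 0, len(forward)
    while i < n:
        if forward[i][0] == backward[i][0]:
            fused.append(forward[i][0])
            i += 1
            continue
        # Extend the fork over all consecutive disagreeing tokens.
        j = i
        while j < n and forward[j][0] != backward[j][0]:
            j += 1
        # Clamp probabilities so that log(0) cannot occur.
        f_score = sum(math.log(max(p, 1e-12)) for _, p in forward[i:j])
        b_score = sum(math.log(max(p, 1e-12)) for _, p in backward[i:j])
        winner = forward if f_score >= b_score else backward
        fused.extend(label for label, _ in winner[i:j])
        i = j
    return fused


# Illustrative input only: the passes disagree on the third token.
forward_pass = [("B", 0.9), ("I", 0.8), ("O", 0.6), ("O", 0.9)]
backward_pass = [("B", 0.95), ("I", 0.9), ("I", 0.7), ("O", 0.85)]
print(fuse_bidirectional(forward_pass, backward_pass))
```

In this toy input the backward pass is more confident at the single disagreeing token, so its label is kept there; tokens on which both passes agree are passed through unchanged.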
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.