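Chinese-Tibetan Base Noun Phrase Alignment Based on Head-Phrase Extension
Cite this article: NUO Minghua, LIU Huidan, MA Longlong, WU Jian, DING Zhiming. Chinese-Tibetan Base Noun Phrase Alignment Based on Head-Phrase Extension[J]. Journal of Chinese Information Processing, 2013, 27(4): 63-70.
Authors: NUO Minghua, LIU Huidan, MA Longlong, WU Jian, DING Zhiming
Affiliation: Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Funding: National Science and Technology Major Project; National Natural Science Foundation of China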
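Abstract: This paper proposes a framework for aligning Chinese-Tibetan base noun phrases. Starting from Chinese base noun phrases and searching for their correct Tibetan translations, we draw on methods for English-Chinese phrase alignment and, in view of the particularities of Tibetan, propose a method for identifying Tibetan base noun phrases based on head-phrase extension. Two methods are proposed for extracting the Tibetan head-phrase: one that combines a dictionary with automatic word-alignment results, and one based on sequence intersection. The head-phrase is then extended according to an extension confidence. Experimental results show that the Chinese-Tibetan base noun phrase pairs extracted by the sequence-intersection method reduce the manual correction workload and effectively support the construction of a Chinese-Tibetan base noun phrase bank.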

Keywords: Tibetan information processing; base noun phrase; head-phrase extension

Chinese-Tibetan Base Noun Phrase Alignment Based on Head-Phrase Extension
NUO Minghua, LIU Huidan, MA Longlong, WU Jian, DING Zhiming. Chinese-Tibetan Base Noun Phrase Alignment Based on Head-Phrase Extension[J]. Journal of Chinese Information Processing, 2013, 27(4): 63-70.
Authors: NUO Minghua, LIU Huidan, MA Longlong, WU Jian, DING Zhiming
Affiliation: Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Abstract: This paper presents a Chinese-Tibetan base noun phrase alignment method. It is a two-phase procedure: identifying Chinese base noun phrases and then finding their Tibetan correspondences. In accordance with the morphological characteristics of Tibetan, we propose a method for identifying Tibetan base noun phrases based on head-phrase extension. In the first phase, we use a sequence intersection operation to obtain the Tibetan head-phrase. In the second phase, a head-phrase extension confidence is defined and applied to determine the boundary of the correspondence. Experimental results indicate that sequence intersection outperforms the other methods in head-phrase extension. The Chinese-Tibetan base noun phrase pairs produced by our method effectively reduce subsequent manual checking, facilitating the construction of a phrase-level translation lexicon.
Keywords: Tibetan information processing; BaseNP; head-phrase extension