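Identification of English Functional Noun Phrases by CRFs Incorporating Semantic Information
Cite this article: MA Jianjun, PEI Jiahuan, HUANG Degen. Identification of English Functional Noun Phrases by CRFs Incorporating Semantic Information[J]. Journal of Chinese Information Processing, 2016, 30(6): 59-66.
Authors: MA Jianjun  PEI Jiahuan  HUANG Degen
Affiliations: 1. School of Foreign Languages, Dalian University of Technology, Dalian, Liaoning 116024;
2. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024
Funding: Humanities and Social Sciences Research Planning Fund of the Ministry of Education (13YJAZH062)
Abstract: Noun phrase identification plays an important role in syntactic parsing, and one of the bottlenecks of English-Chinese machine translation is the disambiguation of noun phrases. Studying the automatic identification of English functional noun phrases turns the structural disambiguation of noun phrases into a noun phrase identification problem. Noun phrase boundaries are determined by the grammatical function of each noun phrase in its clause. Using a business-domain corpus, the study combines a refined part-of-speech (POS) tagset with a conditional random fields (CRFs) model that incorporates semantic information, identifying both the boundaries and the syntactic functions of noun phrases. In preprocessing, the POS tagset is refined on the basis of the Penn Treebank tagset; the semantic features added to the CRFs model serve mainly to identify adjunct-type noun phrases. Experimental results show an F-score of 89.04% with gold-standard POS tags, and the refined POS tagset improves noun phrase identification, raising the F-score by 2.21% over the Penn Treebank tagset. Applying the functional noun phrase identification information to the NiuTrans statistical machine translation system slightly improves English-Chinese translation quality.

Keywords: functional noun phrases  noun phrase identification  conditional random fields model  semantic information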

Identification of English Functional Noun Phrases by CRFs and the Semantic Information
MA Jianjun, PEI Jiahuan, HUANG Degen. Identification of English Functional Noun Phrases by CRFs and the Semantic Information[J]. Journal of Chinese Information Processing, 2016, 30(6): 59-66.
Authors:MA Jianjun  PEI Jiahuan  HUANG Degen
Affiliation:1. School of Foreign Languages, Dalian University of Technology, Dalian, Liaoning 116024, China;
2. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
Abstract:The automatic identification of English functional noun phrases (NPs) can transform the task of resolving structural ambiguity caused by noun phrases into the task of NP chunking. Functional noun phrases are noun phrases defined by their syntactic functions in clauses. On a business-domain corpus, this study aims to identify both the scope of NP chunks and their syntactic function types by refining the part-of-speech (POS) tagset and adopting a conditional random fields (CRFs) model combined with semantic information. The Penn Treebank tagset is modified during pre-processing, and semantic features are added to the CRFs model to improve the recognition of adjunct-type noun phrases. Test results show that the system achieves an F-score of 89.04% in the open test using our gold-standard tags, and that refining the POS tagset improves NP chunking, increasing the F-score by 2.21% compared with the model using the Penn Treebank POS tags. This knowledge of English functional noun phrases is then combined with the NiuTrans SMT system, which slightly improves English-Chinese translation performance.
Keywords:functional noun phrases  noun phrase identification  CRFs  semantic information
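
The abstract reports an F-score for identifying both NP boundaries and syntactic function types, but does not state how that score is computed. The sketch below shows one common way such a score could be obtained, assuming BIO-style labels and exact matching on both span and function type. The label names (`SBJ`, `OBJ`, `ADV`) and the helper functions `extract_chunks` and `f_score` are hypothetical illustrations, not the authors' tagset or evaluation code.

```python
from typing import List, Set, Tuple

# (start, end_exclusive, function_type); the type names used below are hypothetical
Chunk = Tuple[int, int, str]


def extract_chunks(labels: List[str]) -> Set[Chunk]:
    """Collect typed chunks from a BIO label sequence such as B-SBJ, I-SBJ, O."""
    chunks: Set[Chunk] = set()
    start, ctype = None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel "O" closes a final chunk
        continues = label.startswith("I-") and ctype == label[2:]
        if start is not None and not continues:
            chunks.add((start, i, ctype))
            start, ctype = None, None
        if label.startswith("B-"):
            start, ctype = i, label[2:]
    return chunks


def f_score(gold: List[str], pred: List[str]) -> float:
    """Chunk-level F1: a predicted chunk counts only if span and type both match."""
    gold_chunks, pred_chunks = extract_chunks(gold), extract_chunks(pred)
    true_pos = len(gold_chunks & pred_chunks)
    precision = true_pos / len(pred_chunks) if pred_chunks else 0.0
    recall = true_pos / len(gold_chunks) if gold_chunks else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumed toy example: the last NP has the right boundary but the wrong function type.
gold = ["B-SBJ", "I-SBJ", "O", "B-OBJ", "I-OBJ", "O", "B-ADV", "I-ADV"]
pred = ["B-SBJ", "I-SBJ", "O", "B-OBJ", "I-OBJ", "O", "B-OBJ", "I-OBJ"]
score = f_score(gold, pred)
```

Under this matching assumption, a chunk with correct boundaries but a wrong function type counts as an error for both precision and recall, which is why adjunct-type NPs that are hard to label can lower the overall score even when their spans are found.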
        
        
        