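English-Chinese Sentence Alignment Under a Bipartite Graph Vertex Pairing Model
Cite this article: YAN Canxun. English-Chinese Sentence Alignment Under a Bipartite Graph Vertex Pairing Model [J]. 中文信息学报 (Journal of Chinese Information Processing), 2016, 30(5): 153-159.
Author: YAN Canxun (严灿勋)
Affiliation: Department of Language Engineering, PLA Foreign Languages Institute, Luoyang, Henan 471003
Funding: Scientific research project of the Collaborative Innovation Center for the Translation and International Dissemination of Central Literature (2013XT08)
Abstract: Sentence alignment of English-Chinese parallel texts can be viewed as a vertex pairing model on a bipartite graph. A bilingual sentence relatedness function based entirely on an English-Chinese dictionary is used to weight the "vertex pairs" of the graph. The vertex-pairing alignment method proposed in this paper first takes the globally maximum-weight vertex pairing of the bipartite graph as a set of temporary anchors. On that basis, using sentence order, locally maximum-weight vertex pairings, and the admissible range of the English-Chinese sentence length ratio, it corrects errors among the temporary anchors, adds valid vertex pairs that the anchor sequence does not cover, and at the same time partitions the text into sentence groups, which completes the alignment. In comparison experiments the method outperformed the Champollion sentence alignment system. Both the experimental comparison and practical use indicate that the method is feasible.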


Keywords: sentence alignment  bilingual dictionary  parallel text  bipartite graph  vertex pairing  vertex pair
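
The abstract says vertex pairs are weighted by a purely dictionary-based relatedness function but does not give its form. The following is only a minimal sketch of what such a function could look like, under assumptions that are not from the paper: `TOY_DICT` is a hypothetical four-entry dictionary, and `pair_weight` scores a pair as the fraction of English words that have a listed translation occurring as a substring of the Chinese sentence.

```python
import re

# Hypothetical toy dictionary (an assumption for illustration, not from the paper):
# English word -> candidate Chinese translations.
TOY_DICT = {
    "sentence": {"句子", "句"},
    "alignment": {"对齐"},
    "dictionary": {"词典", "字典"},
    "graph": {"图"},
}


def pair_weight(en_sentence, zh_sentence, dictionary):
    """Fraction of English words with a dictionary translation found in zh_sentence.

    This scoring rule is an assumed stand-in for the paper's evaluation function.
    """
    words = re.findall(r"[a-z]+", en_sentence.lower())
    if not words:
        return 0.0
    hits = 0
    for word in words:
        translations = dictionary.get(word, ())
        # Substring lookup avoids needing a Chinese word segmenter in this sketch.
        if any(t in zh_sentence for t in translations):
            hits += 1
    return hits / len(words)


def weight_matrix(en_sentences, zh_sentences, dictionary):
    """Weight every (English, Chinese) vertex pair of the bipartite graph."""
    return [
        [pair_weight(en, zh, dictionary) for zh in zh_sentences]
        for en in en_sentences
    ]
```

For example, `weight_matrix(["Sentence alignment"], ["句子对齐", "今天天气好"], TOY_DICT)` gives one row with a high weight for the first Chinese sentence and zero for the second.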
  

Sentence Alignment Under A Bipartite Graph Vertex Pairing Model
YAN Canxun. Sentence Alignment Under A Bipartite Graph Vertex Pairing Model[J]. Journal of Chinese Information Processing, 2016, 30(5): 153-159.
Authors:YAN Canxun
Affiliation:Language Engineering Department, PLA Foreign Languages Institute, Luoyang, Henan 471003, China
Abstract: Bilingual sentence alignment can be modeled as pairing vertices in a bipartite graph. The vertex pairs of the graph are weighted with a purely bilingual-dictionary-based evaluation function, which scores the word correspondences between an English sentence and a Chinese sentence. In our approach, the globally maximum-weighted vertex pairs are first chosen as temporary anchors. Then, using these temporary anchors, the locally maximum-weighted vertex pairs, and the admissible range of the English-to-Chinese sentence length ratio, mistakes among the temporary anchors are corrected and missing vertex pairs are added. At the same time, the sentences in the bipartite graph are grouped into minimal groups of corresponding sentences. Comparison experiments show that the vertex-pairing sentence alignment approach performs better than the Champollion sentence alignment system.
Keywords:sentence alignment  bilingual dictionary  parallel text  bipartite graph  vertex pairing  vertex pair  
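
The anchor stage described in the abstract can be illustrated roughly as follows. This is a hedged sketch, not the paper's algorithm: `greedy_anchors` approximates the globally maximum-weighted pairing with a simple greedy pass (an exact maximum-weight matching would need something like the Hungarian algorithm), `monotone_anchors` stands in for the sentence-order correction by keeping the heaviest order-preserving chain, and `length_ratio_ok` uses hypothetical bounds and an assumed length measure (English words over Chinese characters).

```python
def greedy_anchors(weights, min_weight=0.0):
    """Pick vertex pairs in descending weight order, each sentence used at most once.

    Greedy approximation (an assumption of this sketch), not an exact matching.
    """
    candidates = sorted(
        (
            (w, i, j)
            for i, row in enumerate(weights)
            for j, w in enumerate(row)
            if w > min_weight
        ),
        key=lambda t: (-t[0], t[1], t[2]),
    )
    used_en, used_zh, anchors = set(), set(), []
    for w, i, j in candidates:
        if i not in used_en and j not in used_zh:
            anchors.append((i, j, w))
            used_en.add(i)
            used_zh.add(j)
    return sorted(anchors)


def monotone_anchors(anchors):
    """Keep the maximum-total-weight subset whose (i, j) indices both increase."""
    anchors = sorted(anchors)
    if not anchors:
        return []
    best = [a[2] for a in anchors]   # best chain weight ending at each anchor
    prev = [-1] * len(anchors)       # back-pointers for chain recovery
    for k, (i, j, w) in enumerate(anchors):
        for m in range(k):
            pi, pj, _ = anchors[m]
            if pi < i and pj < j and best[m] + w > best[k]:
                best[k] = best[m] + w
                prev[k] = m
    k = max(range(len(anchors)), key=lambda x: best[x])
    chain = []
    while k != -1:
        chain.append(anchors[k])
        k = prev[k]
    return chain[::-1]


def length_ratio_ok(en_sentence, zh_sentence, low, high):
    """Check an English-words / Chinese-characters ratio against assumed bounds."""
    ratio = len(en_sentence.split()) / max(len(zh_sentence), 1)
    return low <= ratio <= high
```

With `weights = [[0.9, 0.1, 0.0], [0.2, 0.1, 0.8], [0.0, 0.7, 0.1]]`, the greedy pass selects three pairs, two of which cross each other; the order filter then keeps only the heavier of the crossing pairs, which mirrors the idea of correcting temporary anchors by sentence order.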