A unified approach for effectively integrating source-side syntactic reordering rules into phrase-based translation |
| |
Authors: | Jiajun Zhang Chengqing Zong |
| |
Affiliation: | 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
|
| |
Abstract: | Phrase-based translation models, with sequences of words (phrases) as translation units, achieve state-of-the-art translation performance. However, phrase reordering is a major challenge for this model. Recently, researchers have focused on utilizing syntax to improve phrase reordering. In adding syntactic knowledge into phrase reordering model, using handcrafted or probabilistic syntactic rules to reorder the source-language approximating the target-language word order has been successful in improving translation quality. However, it suffers from propagating the pre-ordering errors to the later translation step (e.g. decoding). In this paper, we propose a novel framework to uniformly represent the handcrafted and probabilistic syntactic rules and integrate them more effectively into phrase-based translation. In the translation phase, for a source sentence to be translated, handcrafted or probabilistic syntactic rules are first acquired from the source parse tree prior to translation, and then instead of reordering the source sentence directly, we input these rules into the decoder and design a new algorithm to apply these rules during decoding. In order to attach more importance to the syntactic rules and distinguish reordering between syntactic and non-syntactic unit reordering, we propose to design respectively a syntactic reordering model and a non-syntactic reordering model. The syntactic rules will guide phrase reordering in decoding within the syntactic reordering model. Extensive experiments on Chinese-to-English translation show that our approach, whether incorporating handcrafted or probabilistic syntactic rules, significantly outperforms the previous methods. |
| |
Keywords: | |
本文献已被 SpringerLink 等数据库收录! |
|