Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation

Authors:	Helena M. Caseli, Maria das Graças V. Nunes, Mikel L. Forcada

Affiliation:	1. NILC – ICMC, University of São Paulo, São Carlos, SP, Brazil; 2. Departament de Llenguatges i Sistemes Informàtics, Universitat d’Alacant, 03071, Alacant, Spain

Abstract:	The availability of machine-readable bilingual linguistic resources is crucial not only for rule-based machine translation but also for other applications such as cross-lingual information retrieval. However, building such resources (bilingual single-word and multi-word correspondences, translation rules) demands extensive manual work. As a consequence, bilingual resources are usually harder to find than “shallow” monolingual resources such as morphological dictionaries or part-of-speech taggers, especially when they involve a less-resourced language. This paper describes a methodology for automatically building both bilingual dictionaries and shallow-transfer rules by extracting knowledge from word-aligned parallel corpora processed with shallow monolingual resources (morphological analysers and part-of-speech taggers). We present experiments for Brazilian Portuguese–Spanish and Brazilian Portuguese–English parallel texts. The results show that the proposed methodology can enable the rapid creation of valuable computational resources (bilingual dictionaries and shallow-transfer rules) for machine translation and other natural language processing tasks.

Keywords:	Machine translation; Automatic induction; Transfer rule; Bilingual dictionary; Shallow transfer
This article is indexed in SpringerLink and other databases.
