Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation
Authors:Helena M. Caseli  Maria das Graças V. Nunes  Mikel L. Forcada
Affiliation:1. NILC – ICMC, University of São Paulo, São Carlos, SP, Brazil
2. Departament de Llenguatges i Sistemes Informàtics, Universitat d’Alacant, 03071, Alacant, Spain
Abstract:
The availability of machine-readable bilingual linguistic resources is crucial not only for rule-based machine translation but also for other applications such as cross-lingual information retrieval. However, building such resources (bilingual single-word and multi-word correspondences, translation rules) demands extensive manual work. As a consequence, bilingual resources are usually more difficult to find than “shallow” monolingual resources such as morphological dictionaries or part-of-speech taggers, especially when they involve a less-resourced language. This paper describes a methodology to automatically build both bilingual dictionaries and shallow-transfer rules by extracting knowledge from word-aligned parallel corpora processed with shallow monolingual resources (morphological analysers and part-of-speech taggers). We present experiments for Brazilian Portuguese–Spanish and Brazilian Portuguese–English parallel texts. The results show that the proposed methodology can enable the rapid creation of valuable computational resources (bilingual dictionaries and shallow-transfer rules) for machine translation and other natural language processing tasks.
Keywords:Machine translation  Automatic induction  Transfer rule  Bilingual dictionary  Shallow transfer
This article is indexed in SpringerLink and other databases.