

Machine translation evaluation versus quality estimation
Authors:Lucia Specia  Dhwaj Raj  Marco Turchi
Abstract: Most evaluation metrics for machine translation (MT) require reference translations for each sentence in order to produce a score reflecting certain aspects of its quality. The de facto standard metrics, BLEU and NIST, are known to correlate well with human evaluation at the corpus level, but not at the segment level. To overcome both of these limitations (the dependence on reference translations and the poor segment-level correlation), we address the evaluation of MT quality as a prediction task: reference-independent features are extracted from the input sentences and their translations, and a quality score is obtained from models trained on human-annotated data. We show that this approach correlates better with human evaluation than commonly used metrics, even when the models are trained on different MT systems, language pairs, and text domains.
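The prediction setup described in the abstract can be sketched as a small supervised-learning pipeline: extract reference-independent features from each source/translation pair, then fit a regression model that maps those features to a quality score. The features, sentences, and scores below are illustrative assumptions, not the paper's actual feature set or data.

```python
# Minimal sketch of quality estimation as a prediction task.
# The features and toy data are illustrative, not from the paper.
from sklearn.linear_model import LinearRegression

def extract_features(source, translation):
    """Reference-independent features from a source sentence and its MT output."""
    src_tokens = source.split()
    tgt_tokens = translation.split()
    return [
        len(src_tokens),                                            # source length
        len(tgt_tokens),                                            # translation length
        len(tgt_tokens) / max(len(src_tokens), 1),                  # length ratio
        sum(len(t) for t in tgt_tokens) / max(len(tgt_tokens), 1),  # avg. target token length
    ]

# Toy training data: (source, MT output, human quality score in [1, 5]).
train = [
    ("the cat sat on the mat", "le chat était assis sur le tapis", 4.5),
    ("he runs", "il court vite vite vite", 2.0),
    ("good morning", "bonjour", 4.0),
]

X = [extract_features(src, tgt) for src, tgt, _ in train]
y = [score for _, _, score in train]

# Train a regressor on the features; at test time, no reference is needed.
model = LinearRegression().fit(X, y)
pred = model.predict([extract_features("the dog barks", "le chien aboie")])
```

Because no feature looks at a reference translation, the trained model can score unseen MT output directly, which is what allows it to transfer (as the abstract claims) across MT systems, language pairs, and domains.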
This article is indexed in SpringerLink and other databases.