Evaluating and automating the annotation of a learner corpus
Authors: Alexandr Rosen, Jirka Hana, Barbora Štindlová, Anna Feldman
Affiliations: 1. Charles University, Prague, Czech Republic
2. Technical University, Liberec, Czech Republic
3. Montclair State University, Montclair, NJ, USA
Abstract: The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annotation scheme, consisting of three interlinked tiers, designed to handle a wide range of error types present in the input. Each tier corrects different types of errors; links between the tiers make it possible to capture errors in word order and in complex discontinuous expressions. Errors are not only corrected but also classified. The annotation scheme is tested on a data set of approximately 175,000 words, with fair inter-annotator agreement results. We also explore the possibility of applying automated linguistic annotation tools (taggers, spell checkers and grammar checkers) to the learner text to support or even substitute for manual annotation.
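To make the multi-tier design more concrete, below is a minimal sketch (not the authors' actual format or tag set) of how such an annotation might be represented: a base tier holds the original learner text, higher tiers hold successive corrections, and each correction links back to the tokens it replaces and carries an error label. All class names, field names and error tags here are hypothetical illustrations.

```python
# Hypothetical sketch of a multi-tier learner-corpus annotation with
# cross-tier links; not the scheme described in the paper, only an
# illustration of the general idea.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
    tid: str    # token id, unique within its tier
    form: str   # surface form


@dataclass
class Edit:
    source_ids: List[str]   # ids of tokens on the tier below being corrected
    target_ids: List[str]   # ids of the corrected tokens on this tier
    error_tag: str          # error classification label (illustrative value)


@dataclass
class Tier:
    level: int                                   # 0 = original text; 1, 2 = corrections
    tokens: List[Token] = field(default_factory=list)
    edits: List[Edit] = field(default_factory=list)  # links to the tier below


# Example: the learner form "bydlim" corrected to "bydlím" on tier 1,
# with an illustrative error tag for a missing diacritic / wrong inflection.
tier0 = Tier(level=0, tokens=[Token("t0-1", "bydlim")])
tier1 = Tier(
    level=1,
    tokens=[Token("t1-1", "bydlím")],
    edits=[Edit(source_ids=["t0-1"], target_ids=["t1-1"], error_tag="incorInfl")],
)
```

Because each Edit may reference several source and target tokens, a structure of this kind can also link discontinuous or reordered material across tiers, which is the role the paper assigns to inter-tier links.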
This article is indexed in SpringerLink and other databases.