A Machine Learning Approach for Identification of Thesis and Conclusion Statements in Student Essays |
| |
Authors: | Jill Burstein and Daniel Marcu |
| |
Affiliation: | (1) Educational Testing Service, Princeton, NJ 08541, USA; (2) University of Southern California/Information Sciences Institute, 4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292, USA |
| |
Abstract: | This study describes and evaluates two essay-based discourse analysis systems that identify thesis and conclusion statements from student essays written on six different essay topics. Essays used to train and evaluate the systems were annotated by two human judges, according to a discourse annotation protocol. Using a machine learning approach, a number of discourse-related features were automatically extracted from a set of annotated training data. Using these features, two discourse analysis models were built using C5.0 with boosting: a topic-dependent and a topic-independent model. Both systems outperformed a positional algorithm. While the topic-dependent system showed somewhat higher performance, the topic-independent system showed similar results, indicating that a system can generalize to unseen data, that is, essay responses on topics that the system has not seen in training. |
| |
Keywords: | discourse analysis, discourse annotation, essay evaluation, machine learning, text classification |
This article is indexed in SpringerLink and other databases. |
|