Preposition Error Correction System Based on Maximum Entropy Model

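Cite this article:	LI Yue, WU Min, WU Gui-Xing, GUO Yan. Preposition Error Correction System Based on Maximum Entropy Model[J]. Computer Systems & Applications, 2016, 25(1): 96-100

Authors:	LI Yue, WU Min, WU Gui-Xing, GUO Yan

Affiliations:	Center of Modern Educational Technology, University of Science and Technology of China, Hefei 230026; Center of Modern Educational Technology, University of Science and Technology of China, Hefei 230026; Suzhou Institute, University of Science and Technology of China, Suzhou 235123; Suzhou Institute, University of Science and Technology of China, Suzhou 235123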

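Abstract:	The English preposition error correction system automatically corrects the preposition errors commonly found in the English of language learners. First, the preposition errors in an annotated corpus were classified and counted, yielding 21 common prepositions, and a training set was obtained from an English wiki corpus with an automatic error interpolation algorithm. Then a classifier based on the maximum entropy model was trained on this training set, using features that include the context and the preposition complement. Finally, the model makes predictions for input sentences and corrects any preposition usage errors. Experiments on the NUCLE corpus report how corpus processing, model features, training corpus size, and number of iterations affect test-set performance, and compare the results with those of a Naive Bayes model. The system reaches an F-score of 27.68 on the test data, a small improvement over the best result in the CoNLL-2013 shared task.
Keywords:	preposition errors; automatic computer correction; maximum entropy model
Received:	2015-05-05
Revised:	2015-06-08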
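The abstract says the training set is produced by automatically inserting errors into well-formed wiki text, but it does not give the algorithm. The following is only a minimal sketch of one common way to do this, assuming that each preposition is swapped for another member of the confusion set with a fixed probability. The names `PREPOSITIONS`, `inject_errors`, and `error_rate` are hypothetical, and the seven-word list stands in for the paper's 21 prepositions, which the abstract does not enumerate.

```python
import random

# Hypothetical confusion set; the paper's 21 prepositions are not listed
# in the abstract, so this short list is an assumption for illustration.
PREPOSITIONS = ["in", "on", "at", "for", "to", "of", "with"]


def inject_errors(tokens, error_rate, rng):
    """Corrupt prepositions in a correct sentence.

    Returns (noisy_tokens, examples), where each example is a tuple
    (index, observed_preposition, correct_preposition).
    """
    noisy = list(tokens)
    examples = []
    for i, tok in enumerate(tokens):
        correct = tok.lower()
        if correct not in PREPOSITIONS:
            continue
        observed = correct
        if rng.random() < error_rate:
            # Swap in a different preposition from the confusion set.
            observed = rng.choice([p for p in PREPOSITIONS if p != correct])
            noisy[i] = observed
        examples.append((i, observed, correct))
    return noisy, examples


rng = random.Random(0)  # fixed seed so the sketch is reproducible
sentence = "She arrived at the station in the morning".split()
noisy, examples = inject_errors(sentence, error_rate=0.5, rng=rng)
print(" ".join(noisy))
print(examples)
```

Each `(observed, correct)` pair gives one training instance: the correct preposition is the label, and the observed one can optionally be used as a feature.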
Preposition Error Correction System Based on Maximum Entropy Model

LI Yue, WU Min, WU Gui-Xing and GUO Yan. Preposition Error Correction System Based on Maximum Entropy Model[J]. Computer Systems & Applications, 2016, 25(1): 96-100

Authors:	LI Yue, WU Min, WU Gui-Xing, GUO Yan

Affiliation:	Center of Modern Educational Technology, University of Science and Technology of China, Hefei 230026, China; Center of Modern Educational Technology, University of Science and Technology of China, Hefei 230026, China; Suzhou Institute, University of Science and Technology of China, Suzhou 235123, China; Suzhou Institute, University of Science and Technology of China, Suzhou 235123, China

Abstract:	The English preposition error correction system helps English language learners by automatically correcting common preposition mistakes. First, the preposition errors in an annotated corpus are classified and counted, which yields 21 common prepositions, and an automatic error interpolation algorithm is applied to an English wiki corpus to produce the training set. A maximum entropy classifier is then trained on this set, with features that include the context and the preposition complement. Finally, the model predicts the preposition for each input sentence and corrects any usage errors. Experiments on the NUCLE corpus show how corpus processing, model features, training set size, and the number of iterations affect test-set performance, and compare the results with a Naive Bayes model. The system achieves an F-score of 27.68 on the test data, a slight improvement over the best result in the CoNLL-2013 shared task.

Keywords:	preposition errors; automatic computer correction; maximum entropy model
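The classification step described in the abstract can be illustrated with a small maximum entropy (multinomial logistic) classifier. This is a sketch under stated assumptions, not the authors' implementation: the feature templates (previous word, plus the first non-determiner word after the preposition as a crude stand-in for the complement), the plain stochastic gradient training loop, the toy sentences, and all function names are hypothetical. The abstract names only "context" and "preposition complement" as features and does not say which training procedure was used.

```python
import math
from collections import defaultdict

DETERMINERS = {"the", "a", "an"}


def extract_features(tokens, i):
    """Indicator features for the preposition slot at position i."""
    prev_w = tokens[i - 1].lower() if i > 0 else "<s>"
    rest = [t.lower() for t in tokens[i + 1:] if t.lower() not in DETERMINERS]
    comp = rest[0] if rest else "</s>"  # rough proxy for the complement head
    return ["bias", "prev=" + prev_w, "comp=" + comp]


def predict_proba(weights, labels, feats):
    """Maximum entropy model: p(y | x) is proportional to exp(sum of weights)."""
    scores = [sum(weights[(f, y)] for f in feats) for y in labels]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(data, labels, epochs=50, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in data:
            probs = predict_proba(weights, labels, feats)
            for y, p in zip(labels, probs):
                grad = (1.0 if y == gold else 0.0) - p
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights


def predict(weights, labels, feats):
    probs = predict_proba(weights, labels, feats)
    return labels[probs.index(max(probs))]


# Toy training data: (tokens, index of the correct preposition).
train_sents = [
    ("She arrived at the station".split(), 2),
    ("He waited at the airport".split(), 2),
    ("They met in the morning".split(), 2),
    ("We left in the evening".split(), 2),
    ("He depends on the result".split(), 2),
    ("It relies on the weather".split(), 2),
]
labels = ["at", "in", "on"]
data = [(extract_features(toks, i), toks[i]) for toks, i in train_sents]
weights = train(data, labels)


def correct(tokens, i):
    """Replace the preposition at position i with the model's best guess."""
    fixed = list(tokens)
    fixed[i] = predict(weights, labels, extract_features(tokens, i))
    return fixed


print(" ".join(correct("She arrived in the station".split(), 2)))
```

The observed preposition is deliberately left out of the features here, so the model chooses purely from context. A practical system would usually only replace the learner's preposition when the predicted probability clears a threshold, which trades recall for precision in the F-score.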
