A Study of Feature Selection for HSK Automated Essay Scoring
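Cite this article: HUANG Zhi'e, XIE Jiali, XUN Endong. Study of feature selection in HSK automated essay scoring[J]. Computer Engineering and Applications, 2014(6): 118-122, 126.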
Authors: HUANG Zhi'e, XIE Jiali, XUN Endong
Affiliation: International R&D Center for Chinese Education, Beijing Language and Culture University, Beijing 100083, China
Funding: National Natural Science Foundation of China (No. 60573184, No. 60973062, No. 61170162).
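Abstract: Essay feature selection is one of the key problems in automated essay scoring for proficiency tests of Chinese as a second language. Taking essays from the Chinese Proficiency Test (HSK) as the object of study, 107 essay features are selected at several levels, including characters, words, grammar, paragraph-level expression, and formality. Correlation calculation then yields 19 features that are relatively well correlated with essay scores. Based on the selected features, multiple linear regression is used to run regression experiments and cross-experiments testing stability. The experiments show that features relating to essay length, vocabulary use, and paragraph-level expression explain essay scores well, and that multiple linear regression is reasonably stable when applied to HSK automated essay scoring.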

Keywords: Chinese Proficiency Test (HSK); automated essay scoring; feature selection; multiple linear regression

Study of feature selection in HSK automated essay scoring
Authors and affiliation: HUANG Zhi'e, XIE Jiali, XUN Endong (International R&D Center for Chinese Education, Beijing Language and Culture University, Beijing 100083, China)
Abstract: Feature selection is a key issue in automated essay scoring for Chinese as a second language. Focusing on the HSK composition test, 107 features are extracted, mainly describing Chinese character use, word use, grammatical mistakes, paragraph-level expression, and formality. Correlation calculation shows that 19 of them are relatively well correlated with composition scores. Based on the selected features, multiple linear regression is applied in regression experiments and in cross-experiments testing stability. Features relating to essay length, word use, and paragraph-level expression are found to explain essay scores well, and multiple linear regression shows good stability on the HSK composition test.
Keywords: HSK; Automated Essay Scoring (AES); feature selection; multiple linear regression
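
The two-step procedure summarized in the abstract (filter features by their correlation with the score, then fit a multiple linear regression on the survivors) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which correlation measure or cutoff was used, so Pearson's r and the threshold of 0.3 are assumptions, and the three toy features (essay length, a vocabulary count, and an unrelated alternating column) and their synthetic scores are invented here, not taken from the paper's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(rows, scores, threshold):
    """Indices of feature columns whose |r| with the score reaches the threshold."""
    return [j for j in range(len(rows[0]))
            if abs(pearson([r[j] for r in rows], scores)) >= threshold]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(rows, scores):
    """Ordinary least squares via the normal equations; returns [intercept, w1, ...]."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, scores)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    """Predicted score for one essay's (already selected) feature values."""
    return coef[0] + sum(w * v for w, v in zip(coef[1:], row))


# Synthetic toy data (hypothetical): columns are essay length, a vocabulary
# count, and an alternating column that plays no part in the score.
rows = [[100 + 5 * i, 40 + 3 * ((7 * i) % 11), 1 if i % 2 == 0 else -1]
        for i in range(40)]
scores = [2.0 + 0.1 * r[0] + 0.5 * r[1] for r in rows]

THRESHOLD = 0.3  # assumed cutoff; the abstract does not state one
selected = select_features(rows, scores, THRESHOLD)
reduced = [[r[j] for j in selected] for r in rows]
coef = fit_linear(reduced, scores)

print("selected feature indices:", selected)
print("intercept and weights:", coef)
```

On real essays the score would not be an exact linear function of the features, so the fitted weights would only approximate the relationship, and a held-out or cross-fold evaluation would be needed to judge stability.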
This article is indexed in CNKI, VIP (维普), and other databases.