
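Research on Key Technology of Automatic Essay Scoring Based on Text Semantic Dispersion
Cite this article: WANG Yaohua, LI Zhoujun, HE Yueying, CHAO Wenhan, ZHOU Jianshe. Research on Key Technology of Automatic Essay Scoring Based on Text Semantic Dispersion[J]. Journal of Chinese Information Processing, 2016, 30(6): 173-181.
Authors: WANG Yaohua  LI Zhoujun  HE Yueying  CHAO Wenhan  ZHOU Jianshe
Affiliations: 1. School of Computer Science and Engineering, Beihang University, Beijing 100191;
2. National Computer Network Emergency Response Technical Team, Beijing 100029;
3. Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing 100048
Funding: National Natural Science Foundation of China (61170189, 61370126, 61202239, U1636211); National 863 Program (2015AA016004, 2014AA015105); Beijing Advanced Innovation Center for Imaging Technology project (BAICIT-2016001)
Abstract: This paper attempts to improve automatic essay scoring from the perspective of text semantic dispersion. It proposes two representations of text semantic dispersion and gives mathematical formulas for computing them. Building on existing methods, namely the LDA model, paragraph vectors, and word vectors, it extracts four concrete instances that characterize text semantic dispersion and applies them to automatic essay scoring. The paper vectorizes text semantic dispersion from a statistical point of view and expresses it as a matrix from the perspective of decentralization, then compares three methods experimentally: multiple linear regression, a convolutional neural network, and a recurrent neural network. The results show that, on a validation set of 50 essays, adding the text semantic dispersion features reduces the root mean square error between predicted and true scores by up to 10.99% and raises the Pearson correlation coefficient by up to 2.7 times. The representation is general, has no language restriction, and can be extended to any language.

Keywords: essay scoring  semantic dispersion  neural network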
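
The abstract names two views of dispersion (a statistical vector and a "decentralized" matrix) without stating the formulas. The following is only a minimal sketch of one plausible reading, not the authors' method: it assumes each paragraph is already embedded as a vector, takes the matrix form to be the paragraph vectors with their centroid subtracted, and takes the vector form to be summary statistics of the distances to that centroid. The function names and the toy 2-D vectors are hypothetical.

```python
import math
from statistics import mean, pstdev


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def centered_matrix(vectors):
    """Assumed 'matrix' view: subtract the centroid from every vector."""
    c = centroid(vectors)
    return [[x - m for x, m in zip(v, c)] for v in vectors]


def dispersion_vector(vectors):
    """Assumed 'vector' view: statistics of distances to the centroid.

    Returns [mean, population std dev, max, min] of the Euclidean
    distances; this choice of statistics is an illustration only.
    """
    dists = [
        math.sqrt(sum(x * x for x in row))
        for row in centered_matrix(vectors)
    ]
    return [mean(dists), pstdev(dists), max(dists), min(dists)]


# Hypothetical paragraph vectors for two essays (toy 2-D values).
focused = [[1.0, 0.0], [1.0, 0.2], [1.0, -0.2]]
scattered = [[1.0, 0.0], [-1.0, 2.0], [0.0, -2.0]]

focused_features = dispersion_vector(focused)
scattered_features = dispersion_vector(scattered)
```

Under these assumptions, an essay whose paragraphs stay close to one topic yields a smaller mean distance than one whose paragraphs wander, and either the fixed-length vector or the centered matrix could be passed to a downstream scorer as an extra feature.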

Research on Key Technology of Automatic Essay Scoring Based on Text Semantic Dispersion
WANG Yaohua, LI Zhoujun, HE Yueying, CHAO Wenhan, ZHOU Jianshe. Research on Key Technology of Automatic Essay Scoring Based on Text Semantic Dispersion[J]. Journal of Chinese Information Processing, 2016, 30(6): 173-181.
Authors:WANG Yaohua  LI Zhoujun  HE Yueying  CHAO Wenhan  ZHOU Jianshe
Affiliation:1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China;
2. National Computer Network Emergency Response Technical Team, Beijing 100029, China;
3. Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing 100048, China
Abstract: Building on existing methods, including the LDA model, paragraph vectors, and word vectors, we extract four text semantic dispersion representations and apply them to automatic essay scoring. This paper gives a vector form of text semantic dispersion from a statistical point of view and a matrix form from the perspective of decentralization, and compares multiple linear regression, a convolutional neural network, and a recurrent neural network in experiments. The results show that, on test data of 50 essays, adding the text semantic dispersion features reduces the root mean square error by up to 10.99% and increases the Pearson correlation coefficient by up to 2.7 times.
Keywords:Automatic Essay Scoring  semantic dispersion  neural network
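
The two reported metrics can be illustrated with a short sketch. The formulas for root mean square error and the Pearson correlation coefficient are the standard ones; the score lists are hypothetical toy values, not data from the paper, and `relative_reduction` is simply one assumed way to express an RMSE improvement as a percentage.

```python
import math


def rmse(pred, gold):
    """Root mean square error between predicted and true scores."""
    return math.sqrt(
        sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(gold)
    )


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical scores: human grades and two sets of model predictions.
gold = [3.0, 4.0, 5.0, 2.0]
baseline = [3.5, 3.5, 4.0, 3.0]      # without dispersion features (toy)
with_feature = [3.2, 3.9, 4.6, 2.4]  # with dispersion features (toy)

base_err = rmse(baseline, gold)
new_err = rmse(with_feature, gold)

# Relative RMSE reduction, as a fraction of the baseline error.
relative_reduction = (base_err - new_err) / base_err
```

A lower RMSE means predictions sit closer to the human scores, while a higher Pearson coefficient means they rank essays more consistently with the human graders, so the two numbers capture different aspects of scoring quality.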
        
        
        