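Data Augmentation Based Automatic Answering of Reading Comprehension in College Entrance Examination
Cite this article: ZHANG Hu, ZHANG Ying, YANG Zhizhuo, QIAN Yili, LI Ru. Data Augmentation Based Automatic Answering of Reading Comprehension in College Entrance Examination[J]. Journal of Chinese Information Processing, 2021, 35(9): 132-140.
Authors: ZHANG Hu  ZHANG Ying  YANG Zhizhuo  QIAN Yili  LI Ru
Affiliations: 1. School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China;
2. Key Laboratory of Ministry of Education for Computation Intelligence and Chinese Information Processing, Shanxi University, Taiyuan, Shanxi 030006, China
Funding: National Key Basic Research and Development Program of China (2018YFB1005103-3); National Natural Science Foundation of China (61806117); Natural Science Foundation of Shanxi Province (201901D111028)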
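Abstract: Machine reading comprehension is an important research task in natural language processing, and automatic answering of college entrance examination (gaokao) reading comprehension questions has emerged in recent years as a further challenge within it. The number of real and mock questions available for the gaokao Chinese reading comprehension task is relatively small, so deep learning methods are constrained by the limited scale of the experimental data and show no clear advantage over traditional methods. To address this, this paper explores data augmentation methods for gaokao Chinese reading comprehension. Building on the traditional EDA data augmentation approach, it proposes an EDA strategy adapted to gaokao reading comprehension. To handle reading materials that are generally long, it proposes a dynamic material cropping method based on a sliding window. Because sentences in a passage differ markedly in importance, it also proposes a similarity-based method for evaluating the quality of sentences in the material. Experimental results show that all three methods improve automatic answering of gaokao reading comprehension questions, with the largest gain in answering accuracy exceeding 5 percentage points.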

Keywords: reading comprehension  gaokao questions  data augmentation  deep learning
Received: 2020-03-31

Data Augmentation Based Automatic Answering of Reading Comprehension in College Entrance Examination
ZHANG Hu,ZHANG Ying,YANG Zhizhuo,QIAN Yili,LI Ru.Data Augmentation Based Automatic Answering of Reading Comprehension in College Entrance Examination[J].Journal of Chinese Information Processing,2021,35(9):132-140.
Authors:ZHANG Hu  ZHANG Ying  YANG Zhizhuo  QIAN Yili  LI Ru
Affiliation:1.School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China; 2.Key Laboratory of Ministry of Education for Computation Intelligence and Chinese Information Processing, Shanxi University, Taiyuan, Shanxi 030006, China
Abstract:Automatic answering of reading comprehension questions in the college entrance examination is a challenge within the machine reading comprehension task. At present, the number of available question-answer pairs for Chinese reading comprehension in the college entrance examination is limited, and deep learning methods are hindered by the small scale of the experimental data. This paper proposes adapting the traditional EDA data augmentation approach to reading comprehension in the college entrance examination. To deal with the long contexts in reading materials, a dynamic material clipping method based on a sliding window is proposed. In addition, a method for evaluating the quality of sentences in the reading material is designed based on similarity calculation. The experimental results show that all three strategies improve automatic answering of college entrance examination reading comprehension questions, with the largest accuracy gain exceeding 5 percentage points.
Keywords:reading comprehension  college entrance examination questions  data augmentation  deep learning  
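
The abstract names three ingredients without giving their details: EDA-style augmentation, sliding-window cropping of long materials, and similarity-based scoring of material sentences. The following is a minimal Python sketch of the generic form of each idea, not the authors' implementation. All function names, the window `size` and `stride` parameters, and the choice of character-level Jaccard overlap as the similarity measure are assumptions made for illustration; the paper's adapted EDA strategy and its actual similarity computation are not described here.

```python
import random
from typing import List, Tuple


def random_swap(tokens: List[str], n_swaps: int, rng: random.Random) -> List[str]:
    """Generic EDA operation: swap two randomly chosen positions, n_swaps times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens: List[str], p: float, rng: random.Random) -> List[str]:
    """Generic EDA operation: drop each token with probability p."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    # Keep at least one token so the augmented text is never empty.
    return kept if kept else [rng.choice(tokens)]


def sliding_windows(sentences: List[str], size: int, stride: int) -> List[List[str]]:
    """Cut a long material (a list of sentences) into overlapping windows.

    `size` and `stride` are hypothetical parameters, counted in sentences.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if len(sentences) <= size:
        return [list(sentences)]
    windows = []
    for start in range(0, len(sentences), stride):
        windows.append(sentences[start:start + size])
        if start + size >= len(sentences):
            break  # the last window already reaches the end of the material
    return windows


def char_jaccard(a: str, b: str) -> float:
    """Character-set Jaccard overlap (assumed stand-in for the similarity measure)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def rank_sentences(sentences: List[str], query: str) -> List[Tuple[float, str]]:
    """Score each material sentence against a query (e.g. an answer option), best first."""
    scored = [(char_jaccard(s, query), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

For example, `sliding_windows(["s0", "s1", "s2", "s3", "s4"], size=3, stride=2)` yields two windows, `s0` to `s2` and `s2` to `s4`, so each cropped material stays short while adjacent windows share context. Character-level overlap is used only because it needs no Chinese word segmenter; an embedding-based or TF-IDF similarity could be substituted in `rank_sentences` without changing the rest of the sketch.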