
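Identifying Unanswerable Questions by Detecting Semantic Discrepancy
Cite this article:LIU Yong-bin,WANG Xiao-jie,YUAN Cai-xia,YI Lian.Unanswerable Questions Recognition by Semantic Discrepancy Detection[J].Journal of Beijing University of Posts and Telecommunications,2019,42(6):126-133,141.
Authors:LIU Yong-bin  WANG Xiao-jie  YUAN Cai-xia  YI Lian
Affiliation:School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876; Alibaba (Beijing) Software Services Co., Ltd., Beijing 100022
Funding:Fundamental Research Funds for the Central Universities (500419302)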
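Abstract:In machine reading comprehension, there are special cases in which the answer to a question cannot be obtained from the given document alone. To handle them, the semantic conflict detection-based machine reading comprehension network (SCDNet) identifies such cases by detecting semantic discrepancies between the question and the document content. Analysis shows that the root causes of a document failing to answer a question fall into two main categories: first, the document does not contain the semantic information the question requires; second, the semantic components of the question and the document conflict. It follows that whether a question can be answered from the document can be determined by checking whether the document's semantic information fully covers what the question requires. In addition, adding an answer-length penalty term to the loss function brings the network's optimization objective closer to the evaluation metric and improves system performance. The model jointly handles two subtasks, unanswerable question recognition and answer extraction, and is trained end to end. Experimental results show that its accuracy in predicting the unanswerable question category exceeds that of the strong baseline model SAN2.0, and that it achieves an F1 score of 72.43 and an unanswerable question recognition accuracy of 76.96 on the SQuAD2.0 dataset.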

Keywords:machine reading comprehension  question answering system  unanswerable question
Received:2019-09-28

Unanswerable Questions Recognition by Semantic Discrepancy Detection
LIU Yong-bin,WANG Xiao-jie,YUAN Cai-xia,YI Lian.Unanswerable Questions Recognition by Semantic Discrepancy Detection[J].Journal of Beijing University of Posts and Telecommunications,2019,42(6):126-133,141.
Authors:LIU Yong-bin  WANG Xiao-jie  YUAN Cai-xia  YI Lian
Affiliation:1. School of Telecommunication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China;
2. Alibaba(Beijing) Software Services Company Limited, Beijing 100022, China
Abstract:Machine reading comprehension (MRC) with unanswerable questions is a challenging task in natural language processing research. Unlike previous work, which ignores the underlying reasons why a question is answerable or unanswerable, the semantic conflict detection-based MRC network (SCDNet) detects no-answer (NA) questions by detecting semantic conflicts between the question and the passage. The basic idea is that if a given question is unanswerable, there is either missing semantic information or a semantic conflict between the question and the reference passage. SCDNet therefore predicts the NA probability by checking whether the passage covers the complete semantics of the question. In addition, to extract the exact answer from the passage, SCDNet adds an answer length penalty to the loss function, which makes the learning objective more consistent with the evaluation metrics. SCDNet combines the NA question predictor and the answer extractor in a joint model that is trained in an end-to-end manner. Experiments show that SCDNet outperforms some strong baseline models, achieving an F1 score of 72.43 and an NA accuracy of 76.96 on the SQuAD 2.0 dataset.
Keywords:machine reading comprehension  question answering  unanswerable question  
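
The abstract describes a joint objective that combines NA prediction, answer span extraction, and an answer length penalty, but this record does not give the formula. The following is only a minimal sketch of how such an objective might be assembled for a single example. The function `joint_loss`, the weight `lam`, and the specific penalty (absolute difference between an expected span length and the gold length) are all illustrative assumptions, not the authors' formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softplus(x):
    """Stable log(1 + exp(x)); equals -log(sigmoid(-x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def joint_loss(start_scores, end_scores, na_logit, gold_span, lam=0.1):
    """Hypothetical joint loss for one (question, passage) pair.

    start_scores, end_scores: per-token scores for answer start and end.
    na_logit: score whose sigmoid is the predicted NA probability.
    gold_span: (start, end) token indices, or None if unanswerable.
    lam: assumed weight of the length penalty (not from the paper).
    """
    if gold_span is None:
        # Unanswerable: only the NA classifier is penalized, -log p(NA).
        return softplus(-na_logit)

    p_start = softmax(start_scores)
    p_end = softmax(end_scores)
    gold_start, gold_end = gold_span

    # Span extraction term: negative log-likelihood of the gold boundaries.
    span_nll = -math.log(p_start[gold_start]) - math.log(p_end[gold_end])
    # NA term for an answerable question, -log(1 - p(NA)).
    na_nll = softplus(na_logit)

    # Assumed length penalty: compare the expected span length under the
    # predicted boundary distributions with the gold answer length.
    exp_start = sum(i * p for i, p in enumerate(p_start))
    exp_end = sum(i * p for i, p in enumerate(p_end))
    length_penalty = abs((exp_end - exp_start + 1) - (gold_end - gold_start + 1))

    return span_nll + na_nll + lam * length_penalty


if __name__ == "__main__":
    uniform = [0.0, 0.0, 0.0, 0.0]
    print(joint_loss(uniform, uniform, 0.0, (1, 1)))
    print(joint_loss(uniform, uniform, 0.0, (0, 3)))
    print(joint_loss(uniform, uniform, 0.0, None))
```

In this sketch the length term is zero when the expected predicted length matches the gold length, so it only adds pressure when the boundary distributions imply an answer that is too long or too short. Token-level F1 penalizes exactly that kind of mismatch, which is the motivation the abstract gives for the penalty.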
This article is indexed in Wanfang Data and other databases.