
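Research on Chinese Textual Entailment Types and Chunk Recognition Methods
Cite this article:YU Dong,JIN Tian-Hua,XIE Wan-Ying,ZHANG Yi,XUN En-Dong.Research on Chinese Textual Entailment Types and Chunk Recognition Methods[J].Journal of Software (软件学报),2020,31(12):3772-3786.
Authors:YU Dong  JIN Tian-Hua  XIE Wan-Ying  ZHANG Yi  XUN En-Dong
Affiliation:College of Information Science, Beijing Language and Culture University, Beijing 100083 (all five authors)
Funding:National Key Research and Development Program of China (2018YFB1005105)
Abstract:Recognizing textual entailment (RTE) is the task of judging whether two sentences stand in an entailment relationship. In recent years, research on English entailment recognition has made considerable progress, but it has focused mainly on type judgment; few studies precisely locate the entailment chunks in the data, and entailment type recognition offers little interpretability. From the Chinese Natural Language Inference (CNLI) data, 12 000 Chinese entailment sentence pairs are selected, the chunks that give rise to the entailment are manually annotated, and 7 specific entailment types are derived from an analysis of the chunks' linguistic features. On this basis, the Chinese entailment recognition task is recast as two tasks, 7-class entailment type recognition and entailment chunk boundary-type recognition, on which deep learning models reach accuracies of 69.19% and 62.09%, respectively. The experimental results show that the proposed method can effectively find Chinese entailment chunk boundaries and their corresponding entailment types, providing a reliable baseline method for further research.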

Keywords:recognizing textual entailment  chunk recognition  entailment type  deep learning
Received:2019/4/2
Revised:2019/6/5

Recognition Method Based on Deep Learning for Chinese Textual Entailment Chunks and Labels
YU Dong,JIN Tian-Hua,XIE Wan-Ying,ZHANG Yi,XUN En-Dong.Recognition Method Based on Deep Learning for Chinese Textual Entailment Chunks and Labels[J].Journal of Software,2020,31(12):3772-3786.
Authors:YU Dong  JIN Tian-Hua  XIE Wan-Ying  ZHANG Yi  XUN En-Dong
Affiliation:College of Information Science, Beijing Language and Culture University, Beijing 100083, China
Abstract:Recognizing textual entailment (RTE) is the task of determining whether two sentences stand in an entailment relationship. In recent years, RTE for English has made great progress. However, current research is mainly based on type judgment and pays little attention to locating the language chunks that lead to the entailment relationship, which results in low interpretability of RTE models. This study selects 12 000 Chinese entailment sentence pairs from the Chinese Natural Language Inference (CNLI) data and labels the chunks that lead to their entailment relationship. Seven entailment types are then summarized based on the linguistic features of these chunks. On this basis, two tasks are proposed: one is to recognize which of the seven entailment types each entailment sentence pair belongs to, and the other is to recognize the boundaries of the entailment chunks in it. The proposed deep-learning-based method reaches accuracies of 69.19% and 62.09% on the two tasks, respectively. The experimental results show that the proposed approach can effectively identify different types of entailment in Chinese and find the boundaries of the entailment chunks, which demonstrates that it provides a reliable benchmark for further research.
Keywords:recognizing textual entailment  chunk labeling  deep learning
This article is indexed in Wanfang Data (万方数据) and other databases.