基于高困惑样本对比学习的隐式篇章关系识别

Cite this article:	李晓, 洪宇, 窦祖俊, 徐旻涵, 陆煜翔, 周国栋. 基于高困惑样本对比学习的隐式篇章关系识别[J]. 中文信息学报, 2022, 36(11): 38-49

Authors:	李晓, 洪宇, 窦祖俊, 徐旻涵, 陆煜翔, 周国栋

Affiliation:	School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China

Funding:	Major Special Project of the Ministry of Science and Technology (2020YFB1313601); National Natural Science Foundation of China (61773276, 62076174)

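Abstract:	Implicit discourse relation recognition is a natural language processing task that automatically determines the semantic relation between arguments. The task raises two key scientific problems: first, representing argument semantics accurately; second, classifying the relation type between arguments effectively on the basis of those semantic representations. This paper focuses on the first. Accurate and reliable semantic encoding helps relation classification because reliable representations make positive and negative samples easier to tell apart (a positive sample is a pair of arguments that holds the target relation class; a negative sample is a pair that holds a non-target relation class). Recent studies show that semantic encoding methods that integrate a contrastive learning mechanism improve a model's ability to discriminate between positive and negative samples. Accordingly, this paper introduces contrastive learning into the representation learning of argument semantics and uses a contrastive loss to drive the dissimilarity between positive and negative samples, that is, the ability to pull positive samples of the same class together in the semantic space and to push negative samples of other classes apart. In particular, the paper proposes a method, based on a conditional autoencoder, for generating highly confusing negative samples, and uses these negatives to make the contrastive learning data more confusing and thereby make the argument semantic encoder more robust. Experiments on PDTB, a public corpus for discourse relation analysis, show that, compared with a baseline model that does not use contrastive learning, the method improves F₁ by 4.68%, 4.63%, 3.14%, and 12.77% in the binary classification settings for the four PDTB relation types Comparison, Contingency, Expansion, and Temporal, respectively.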
Keywords:	implicit discourse relation recognition; contrastive learning; conditional variational encoding
Received:	2022-01-13
Contrastive Learning with Confused Samples for Implicit Discourse Relation Recognition

LI Xiao, HONG Yu, DOU Zujun, XU Minhan, LU Yuxiang, ZHOU Guodong. Contrastive Learning with Confused Samples for Implicit Discourse Relation Recognition[J]. Journal of Chinese Information Processing, 2022, 36(11): 38-49

Authors:	LI Xiao, HONG Yu, DOU Zujun, XU Minhan, LU Yuxiang, ZHOU Guodong

Affiliation:	School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China

Abstract:	Implicit discourse relation recognition automatically identifies the semantic relation between arguments. The task involves two key issues: representing the semantics of the arguments, and recognizing the relation between them. Focusing on better argument representation, this paper introduces contrastive learning into the process of argument representation learning. We further propose a method that generates confusing samples with conditional auto-encoders, so as to make the data used in contrastive learning harder to distinguish. Experiments on the Penn Discourse Treebank (PDTB) corpus show that our method increases the F₁ score by 4.68%, 4.63%, 3.14%, and 12.77% on the four top-level relations (Comparison, Contingency, Expansion, and Temporal), respectively.

Keywords:	implicit discourse relation recognition; contrastive learning; conditional variational auto-encoder
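
The abstract describes a contrastive loss that pulls same-class samples together and pushes other-class samples apart, but it does not state the loss formula. The sketch below illustrates the general idea with an InfoNCE-style loss, which is one common choice and an assumption here, not the paper's confirmed objective. The function names (`cosine`, `info_nce_loss`), the temperature value, and the two-dimensional toy vectors are all hypothetical; in the paper the confusing negatives come from a conditional auto-encoder, whereas here they are written by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (assumed form, not the paper's).

    Returns -log of the softmax probability assigned to the positive
    among the positive and all negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]

# Toy embeddings of argument pairs (made-up numbers, not PDTB data).
anchor = [1.0, 0.0]
positive = [0.9, 0.1]            # same relation class as the anchor
easy_negative = [-1.0, 0.0]      # far from the anchor
confusing_negative = [0.8, 0.6]  # other class, but close to the anchor

loss_easy = info_nce_loss(anchor, positive, [easy_negative])
loss_hard = info_nce_loss(anchor, positive, [confusing_negative])
print(f"loss with easy negative:      {loss_easy:.4f}")
print(f"loss with confusing negative: {loss_hard:.4f}")
```

Under this assumed loss, a negative that lies close to the anchor yields a larger loss than a distant one, so it contributes a stronger training signal. That is the intuition behind training with highly confusing negatives.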
