
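A General Model for Entity Relationship and Event Extraction
Cite this article: YANG Hongju, JIN Xinyu. A General Model for Entity Relationship and Event Extraction[J]. Computer Engineering, 2023, 49(2): 143-149. DOI: 10.19678/j.issn.1000-3428.0063662
Authors: YANG Hongju  JIN Xinyu
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006; School of Computer and Information Technology, Shanxi University, Taiyuan 030006
Funding: National Natural Science Foundation of China (61976128); Scientific and Technological Innovation Program of Higher Education Institutions in Shanxi Province (2019L0103); Shanxi Province 1331 Project.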
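Abstract: Information extraction aims to find specific information in natural-language documents. Existing work on the entity relationship extraction and event extraction tasks addresses only overlapping event arguments and overlapping entity relationships; it does not consider the role overlap problem common to both tasks, which lowers the accuracy of the extraction results. This paper proposes a two-stage general model that handles both the entity relationship extraction and event extraction subtasks. Building on a shared feature representation from the pre-trained language model RoBERTa, the model predicts entity relationship/event types and entity relationship/event arguments separately. The traditional trigger-word extraction task is recast as multi-label event type extraction, and a multi-scale neural network extracts further text features. On this basis, the model extracts the event arguments of the types relevant to the text and reweights the loss function according to the importance of each argument role, addressing data imbalance and the argument role overlap shared by entity relationship extraction and event extraction. Experiments on the test sets of the event extraction and relation extraction tasks of the Qianyan (Luge) dataset verify the effectiveness of the model, which achieves F1 scores of 83.1% and 75.3%, respectively.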

Keywords: event extraction  entity relationship extraction  role overlap  RoBERTa model  multi-label classification
Received: 2021-12-30
Revised: 2022-03-16

A General Model for Entity Relationship and Event Extraction
YANG Hongju, JIN Xinyu. A General Model for Entity Relationship and Event Extraction[J]. Computer Engineering, 2023, 49(2): 143-149. DOI: 10.19678/j.issn.1000-3428.0063662
Authors: YANG Hongju  JIN Xinyu
Affiliation: 1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; 2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
Abstract: The purpose of information extraction is to find specific information in natural-language documents. Existing research on the entity relationship extraction and event extraction tasks has addressed only the overlap of event arguments and the overlap of entity relationships; it has not considered the role overlap problem shared by the two tasks, which reduces the accuracy of the extraction results. A general two-stage model is proposed to complete the subtasks of entity relationship extraction and event extraction. Based on the shared feature representation of the pre-trained language model RoBERTa, the entity relationship/event types and the entity relationship/event arguments are predicted separately. The traditional trigger-word extraction task is transformed into a multi-label event type extraction task, and text features are further extracted using a multi-scale neural network. On this basis, the event arguments of the types relevant to the text are extracted, and the loss function is reweighted according to the importance of the argument roles, to address data imbalance and the argument role overlap common to entity relationship extraction and event extraction. Experiments on the event extraction and relation extraction test sets of the Luge dataset verify the effectiveness of the proposed model. The results show that the F1 values of the proposed model on these two test sets are 83.1% and 75.3%, respectively.
Keywords: event extraction  entity relationship extraction  role overlap  RoBERTa model  multi-label classification
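
The abstract states that the loss function is reweighted by argument-role importance and that overlapping roles are handled, but it gives no formula. The following is a minimal sketch of one way this could look, assuming an independent sigmoid output per role (so a single token may carry several roles at once) and a weighted binary cross-entropy normalized by the total weight. The function names, role labels, and weights are illustrative assumptions and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def weighted_multilabel_bce(logits, targets, role_weights):
    """Role-weighted binary cross-entropy (hypothetical formulation).

    logits[t][r]    : raw score that token t plays role r
    targets[t][r]   : 1 if token t has role r, else 0 (several 1s per row allowed)
    role_weights[r] : assumed importance weight of role r
    The sum is divided by the total weight so the scale stays comparable.
    """
    total, weight_sum = 0.0, 0.0
    for token_logits, token_targets in zip(logits, targets):
        for z, y, w in zip(token_logits, token_targets, role_weights):
            # Clamp the probability to keep log() finite.
            p = min(max(sigmoid(z), 1e-12), 1.0 - 1e-12)
            total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
            weight_sum += w
    return total / weight_sum


def decode_roles(logits, role_names, threshold=0.5):
    """Return every role whose probability reaches the threshold, per token."""
    return [
        [name for z, name in zip(row, role_names) if sigmoid(z) >= threshold]
        for row in logits
    ]


if __name__ == "__main__":
    roles = ["subject", "object", "time"]   # illustrative role names
    weights = [3.0, 1.0, 1.0]               # illustrative importance weights
    logits = [[2.0, 3.0, -4.0]]             # one token, three role scores
    gold = [[1, 1, 0]]                      # the token holds two roles at once
    print(weighted_multilabel_bce(logits, gold, weights))
    print(decode_roles(logits, roles))
```

Scoring each role independently, instead of picking one role per token with a softmax, is what lets a single span be returned under more than one role; the per-role weights then let errors on roles deemed more important, or rarer, contribute more to the loss.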
This article is indexed in Wanfang Data and other databases.