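Encoding-decoding relationship extraction model based on criminal Electra
Cite this article: WANG Xiaopeng, SUN Yuanyuan, LIN Hongfei. Encoding-decoding relationship extraction model based on criminal Electra[J]. Journal of Computer Applications, 2022, 42(1): 87-93.
Authors: WANG Xiaopeng  SUN Yuanyuan  LIN Hongfei
Affiliation: School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
Funding: National Key Research and Development Program of China (2018YFC0830603).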
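Abstract: To address the problems that models for relation extraction in the judicial field do not fully understand sentence context and are weak at recognizing overlapping relations, an encoding-decoding relation extraction model based on criminal Electra (CriElectra) is proposed. First, following the training method of Chinese Electra, CriElectra was trained on a criminal dataset of 1,000,000 documents; then, on top of the Bidirectional Long Short-Term Memory (BiLSTM) model, Cri...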

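Keywords: judicial field  relation extraction  pre-trained language model  bidirectional long short-term memory network  capsule network
Received: 2021-02-21
Revised: 2021-06-27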

Encoding-decoding relationship extraction model based on criminal Electra
WANG Xiaopeng, SUN Yuanyuan, LIN Hongfei. Encoding-decoding relationship extraction model based on criminal Electra[J]. Journal of Computer Applications, 2022, 42(1): 87-93.
Authors:WANG Xiaopeng  SUN Yuanyuan  LIN Hongfei
Affiliation:School of Computer Science and Technology,Dalian University of Technology,Dalian Liaoning 116024,China
Abstract: To address the problems that models for relation extraction in the judicial field do not fully understand sentence context and are weak at recognizing overlapping relations, an encoding-decoding relation extraction model based on Criminal-Efficiently learning an encoder that classifies token replacements accurately (CriElectra) was proposed. Firstly, following the training method of Chinese Electra, CriElectra was trained on a criminal dataset of one million documents. Then, the word vectors of CriElectra were added to the Bidirectional Long Short-Term Memory (BiLSTM) model for feature extraction from judicial texts. Finally, vector clustering was performed on the features through a Capsule Network (CapsNet) to extract the relationships between entities. Experimental results on the self-built relation dataset for the crime of intentional injury show that, compared with the pre-trained language model based on Chinese Electra, CriElectra's additional training on judicial texts makes the learned word vectors contain richer domain information, increasing the F1-score by 1.93 percentage points. Compared with the model based on pooling clustering, CapsNet uses vector operations to effectively prevent the loss of spatial information and improves the recognition of overlapping relationships, increasing the F1-score by 3.53 percentage points.
Keywords: judicial field  relation extraction  pretrained language model  Bidirectional Long Short-Term Memory (BiLSTM)  Capsule Network (CapsNet)
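
The abstract says that CapsNet clusters the BiLSTM features by vector operations, but it does not give the routing details. As a rough illustration only, the sketch below uses the widely known dynamic routing-by-agreement procedure for capsule networks as a stand-in, written in plain Python. The names `squash`, `dynamic_routing`, `capsule_lengths`, and `u_hat`, the toy numbers, and the choice of three routing iterations are all assumptions made for this example and are not taken from the paper.

```python
import math


def squash(v):
    """Shrink a vector to a length in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0 for _ in v]
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=3):
    """Route input features to output capsules by agreement.

    u_hat[i][j] is the (hypothetical) prediction vector that input
    feature i makes for output capsule j, e.g. one capsule per relation.
    Returns one squashed output vector per capsule.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits
    v = []
    for _ in range(iterations):
        # Coupling coefficients: each input spreads its vote over capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        # Raise the logit where a prediction agrees with the capsule output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v


def capsule_lengths(v):
    """Vector length of each capsule, read as a per-relation score."""
    return [math.sqrt(sum(x * x for x in cap)) for cap in v]


# Toy input: three features, two candidate relations, 2-d capsules.
# All features agree on relation 0; they conflict on relation 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 0.1]],
]
lengths = capsule_lengths(dynamic_routing(u_hat))
```

In this toy case the capsule for relation 0 ends up longer than the one for relation 1, because the votes for relation 1 cancel out. Each capsule is scored separately by its own length, so more than one relation can score highly for the same sentence, which is one way to read the abstract's claim about overlapping relations. Max pooling, by contrast, keeps only the largest value in each dimension and discards the direction of the individual feature vectors.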
This article is indexed in databases including VIP and Wanfang Data.