Entity relation extraction method for guidelines of cardiovascular disease based on bidirectional encoder representation from transformers
Cite this article: WU Xiaoping, ZHANG Qiang, ZHAO Fang, JIAO Lin. Entity relation extraction method for guidelines of cardiovascular disease based on bidirectional encoder representation from transformers[J]. Journal of Computer Applications, 2021, 41(1): 145-149.
Authors: WU Xiaoping, ZHANG Qiang, ZHAO Fang, JIAO Lin
Affiliations: 1. School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China; 2. Department of Cardiovascular Disease, Zhongnan Hospital of Wuhan University, Wuhan, Hubei 430070, China
Funding: General Program of the Health Commission of Hubei Province
Received: 2020-05-31
Revised: 2020-09-09

Abstract: Entity relation extraction is a basic step in medical question answering, knowledge graph construction and information extraction. Because no public dataset was available for building a knowledge graph dedicated to cardiovascular disease, medical guidelines for cardiovascular disease were collected and professionally annotated with entity and relation categories, yielding a specialized dataset for entity relation extraction. On this dataset, a Bidirectional Encoder Representation from Transformers and Convolutional Neural Network (BERT-CNN) model was first proposed for relation extraction from Chinese text. Then, because the word rather than the character is the basic semantic unit in Chinese, an improved model based on whole word masking, BERT(wwm)-CNN, was proposed to improve relation extraction performance on Chinese text. Experimental results show that the improved BERT(wwm)-CNN model achieves an accuracy of 0.85, a recall of 0.80 and an F1 value of 0.83 on the constructed relation extraction dataset, outperforming the comparison models, Bidirectional Encoder Representation from Transformers and Long Short-Term Memory (BERT-LSTM) and BERT-CNN, which verifies the advantage of the improved model.
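
The whole word masking idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the sentence has already been segmented into words; the segmentation below is hand-written for illustration, and `mask_chars`, `mask_whole_words` and `partially_masked` are hypothetical helper names, not the authors' code or any library's API. Character-level masking may hide only part of a word, whereas whole word masking hides every character of a selected word together. Details of real pretraining (masking rate, replacement strategy) are deliberately left out.

```python
import random

MASK = "[MASK]"

def mask_chars(words, rate, rng):
    """Character-level masking: each character is masked independently."""
    chars = [ch for w in words for ch in w]
    return [MASK if rng.random() < rate else ch for ch in chars]

def mask_whole_words(words, rate, rng):
    """Whole word masking: a selected word has all of its characters masked."""
    out = []
    for w in words:
        if rng.random() < rate:
            out.extend([MASK] * len(w))
        else:
            out.extend(list(w))
    return out

def partially_masked(words, tokens):
    """Return the words in which some, but not all, characters are masked."""
    result, i = [], 0
    for w in words:
        piece = tokens[i:i + len(w)]
        n_masked = sum(t == MASK for t in piece)
        if 0 < n_masked < len(w):
            result.append(w)
        i += len(w)
    return result

# Hand-segmented example sentence (an assumption for illustration only).
words = ["高血压", "患者", "应", "服用", "降压药"]
rng = random.Random(0)

wwm_tokens = mask_whole_words(words, 0.3, rng)
char_tokens = mask_chars(words, 0.3, rng)

print(wwm_tokens)
print(char_tokens)
print(partially_masked(words, wwm_tokens))  # always [] for whole word masking
```

Calling `partially_masked(words, char_tokens)` shows which words the character-level scheme happened to split for a given random seed; with whole word masking that list is empty by construction.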
Keywords:entity relation extraction  cardiovascular disease  Bidirectional Encoder Representation from Transformers(BERT)network  Convolutional Neural Network(CNN)  knowledge graph
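
As a quick consistency check on the figures in the abstract, F1 is the harmonic mean of precision and recall. The sketch below assumes the reported 0.85 plays the role of precision (the abstract calls it accuracy, so this is an assumption), and `f1_score` is an illustrative helper rather than a call into any particular library.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values taken from the abstract; treating 0.85 as precision is an assumption.
f1 = f1_score(0.85, 0.80)
print(round(f1, 3))  # 0.824
```

The rounded inputs give about 0.824, close to the reported 0.83. The small gap is plausibly explained by rounding of the published values, or by the reported F1 being averaged over relation classes rather than computed from the aggregate figures.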
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).