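MRC-PBM: A nested named entity recognition method for Chinese electronic medical records
Cite this article: Zhou Jialun, Li Linyu, Ma Hongbin, Jiang Yanjing. MRC-PBM: A nested named entity recognition method for Chinese electronic medical records[J]. Foreign Electronic Measurement Technology (国外电子测量技术), 2024, 43(1): 159-165.
Authors: Zhou Jialun  Li Linyu  Ma Hongbin  Jiang Yanjing
Affiliation: College of Computer and Information, China Three Gorges University
Funding: Supported by the National Key Research and Development Program of China (2016YFC0802500)
Abstract: Entities in Chinese electronic medical records contain a large amount of medical-domain vocabulary and are clearly nested. Nested entity recognition often locates the target entity incompletely or inaccurately. To address this problem, a nested named entity recognition model for Chinese electronic medical records based on machine reading comprehension is proposed: MRC-PBM (machine reading comprehension-position information biaffine and MLP). The model recasts named entity recognition (NER) as a machine reading comprehension task. The Chinese electronic medical record text and a predefined query statement are concatenated as input, and the medical-domain pre-trained model MC_BERT produces the word vectors. A bidirectional long short-term memory network (BiLSTM) and a multi-granularity dilated convolution model then capture bidirectional feature information and inter-word information, respectively, yielding the corresponding feature vectors. Finally, the Hybrid-PBM predictor predicts the entities. Experiments are conducted on nested and flat NER datasets. They show that the model outperforms other mainstream neural network models on a diabetes corpus and public medical datasets, with F1 scores 1.21% to 5.80% higher than the baseline models.

Keywords: Chinese electronic medical record; named entity recognition; machine reading comprehension; nested entity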
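
The abstract describes concatenating the record text with a predefined query so that NER becomes a reading comprehension task. The following is a minimal sketch of that input construction only, not the authors' implementation. The function names `build_mrc_input` and `shift_spans`, the character-level tokenization, the `[CLS]`/`[SEP]` layout, and the sample query and sentence are all assumptions made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) character indices


def build_mrc_input(query: str, context: str) -> Tuple[List[str], List[int], int]:
    """Build a BERT-style pair input: [CLS] query [SEP] context [SEP].

    Tokenization is per character, a common simplification for Chinese text.
    Returns the tokens, the segment ids (0 = query side, 1 = context side),
    and the index of the first context token.
    """
    q_tokens = list(query)
    c_tokens = list(context)
    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + c_tokens + ["[SEP]"]
    segment_ids = [0] * (len(q_tokens) + 2) + [1] * (len(c_tokens) + 1)
    context_offset = len(q_tokens) + 2
    return tokens, segment_ids, context_offset


def shift_spans(spans: List[Span], context_offset: int) -> List[Span]:
    """Map entity spans from context coordinates to input coordinates."""
    return [(start + context_offset, end + context_offset) for start, end in spans]


if __name__ == "__main__":
    # Hypothetical query and record sentence, invented for this sketch.
    query = "找出文本中的疾病"
    context = "患者2型糖尿病病史10年"
    tokens, segment_ids, offset = build_mrc_input(query, context)
    # Two nested gold spans in context coordinates: the outer span covers
    # the full disease mention and the inner span covers its last three characters.
    gold = shift_spans([(2, 6), (4, 6)], offset)
    for start, end in gold:
        print("".join(tokens[start:end + 1]))
```

Because each entity type gets its own query, overlapping spans of different types are predicted in separate passes, and spans of the same type can still overlap if the predictor scores start and end positions jointly.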

MRC-PBM: A Chinese electronic medical record nested named entity recognition method
Zhou Jialun, Li Linyu, Ma Hongbin, Jiang Yanjing. MRC-PBM: A Chinese electronic medical record nested named entity recognition method[J]. Foreign Electronic Measurement Technology, 2024, 43(1): 159-165.
Authors: Zhou Jialun  Li Linyu  Ma Hongbin  Jiang Yanjing
Affiliation: College of Computer and Information, China Three Gorges University
Abstract: Entities in Chinese electronic medical records (EMRs) contain a large amount of medical-domain vocabulary and are frequently nested. When nested entities are identified, the target entity is often located incompletely or inaccurately. To address this problem, a nested named entity recognition model for Chinese EMRs based on machine reading comprehension (MRC) is proposed, named MRC-PBM (machine reading comprehension-position information biaffine and MLP). The model transforms named entity recognition (NER) into an MRC task: the Chinese EMR text and a predefined query statement are concatenated as input, and the medical-domain pre-trained model MC_BERT produces the word vectors. A bidirectional long short-term memory network (BiLSTM) and a multi-granularity dilated convolution model then capture bidirectional feature information and inter-word information, respectively, yielding the corresponding feature vectors. Finally, the Hybrid-PBM predictor predicts the entities. Experiments are conducted on nested and flat NER datasets. The results show that the proposed model outperforms other mainstream neural network models on a diabetes corpus and public medical datasets, with F1 scores 1.21% to 5.80% higher than the baseline models.
Keywords: Chinese electronic medical record; machine reading comprehension; named entity recognition; nested entity
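
The model name refers to a biaffine component in its predictor. As a rough illustration of why span-level scoring suits nested entities, the sketch below implements a generic biaffine span scorer in plain Python. It is not the Hybrid-PBM predictor described in the paper, whose details are not given in the abstract. The names `biaffine_score`, `score_spans`, and `select_spans`, the toy vectors, and the fixed threshold are assumptions for this example.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Span = Tuple[int, int]  # inclusive (start, end) token indices


def biaffine_score(h_start: Vector, h_end: Vector, U: Matrix, w: Vector, b: float) -> float:
    """Generic biaffine score: h_start^T U h_end + w . [h_start; h_end] + b."""
    bilinear = sum(
        h_start[p] * U[p][q] * h_end[q]
        for p in range(len(h_start))
        for q in range(len(h_end))
    )
    linear = sum(wk * xk for wk, xk in zip(w, h_start + h_end))
    return bilinear + linear + b


def score_spans(H: List[Vector], U: Matrix, w: Vector, b: float) -> Dict[Span, float]:
    """Score every candidate span (i, j) with i <= j."""
    n = len(H)
    return {
        (i, j): biaffine_score(H[i], H[j], U, w, b)
        for i in range(n)
        for j in range(i, n)
    }


def select_spans(scores: Dict[Span, float], threshold: float = 0.0) -> List[Span]:
    """Keep every span scoring above the threshold; overlaps are allowed."""
    return sorted(span for span, score in scores.items() if score > threshold)


if __name__ == "__main__":
    # Toy token representations and parameters, chosen by hand for the sketch.
    H = [[1, 0], [0, 1], [1, 1]]
    U = [[1, 0], [0, 1]]
    w = [0, 0, 0, 0]
    print(select_spans(score_spans(H, U, w, b=-0.5)))
```

Each start-end pair is scored independently, so an inner span and the outer span that contains it can both pass the threshold. A sequence tagger that assigns one label per token cannot express this.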