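Named Entity Recognition in Chinese Electronic Medical Records Using Transformer-CRF
Cite this article: LI Bo, KANG Xiaodong, ZHANG Huali, WANG Yage, CHEN Yayuan, BAI Fang. Named Entity Recognition in Chinese Electronic Medical Records Using Transformer-CRF[J]. Computer Engineering and Applications, 2020, 56(5): 153-159. DOI: 10.3778/j.issn.1002-8331.1909-0211
Authors: LI Bo  KANG Xiaodong  ZHANG Huali  WANG Yage  CHEN Yayuan  BAI Fang
Affiliation: College of Medical Imaging, Tianjin Medical University, Tianjin 300203
Abstract: Named entity recognition is one of the basic tasks of natural language processing. Traditional models perform poorly on named entity recognition in Chinese electronic medical records, so a neural network model based entirely on the attention mechanism is proposed. The experiments use a self-built dataset of real Chinese electronic medical records, preprocessed by manual annotation and word segmentation. A Transformer model is trained and optimized to extract text features, and a conditional random field then classifies and recognizes the extracted features. To verify the effectiveness of the proposed method, the resulting Transformer-CRF neural network model is compared with seven other traditional models, using precision, recall, and F1 score to evaluate recognition performance. The results show that, on the same corpus, the Transformer-CRF model recognizes body-part entities well, with an F1 score of 95.02%. Its precision, recall, and F1 score are also all higher than those of the seven other traditional models, which to some extent confirms that the model has good recognition performance.

Keywords: Electronic Medical Records (EMR)  named entity recognition  Transformer  Conditional Random Fields (CRF)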
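
The last stage named in the abstract, a conditional random field applied to features from the Transformer, is commonly decoded with the Viterbi algorithm. The sketch below is illustrative only and is not taken from the paper: the function name `viterbi_decode`, the three-tag layout (O, B, I), and all scores are assumptions, and the emission scores simply stand in for whatever per-token scores an encoder would produce.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][j]   : score of tag j at position t (assumed encoder output)
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    n_tags = len(emissions[0])
    # best[j] = best score of any path that ends in tag j at the current position
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_tags):
            # choose the previous tag that maximizes the path score into tag j
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # follow the stored pointers back from the best final tag
    last = max(range(n_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Hypothetical tag indices: 0 = O, 1 = B, 2 = I.
# A large negative score makes the transition O -> I effectively forbidden.
TRANSITIONS = [
    [0.0, 0.0, -1e4],  # from O
    [0.0, 0.0, 0.0],   # from B
    [0.0, 0.0, 0.0],   # from I
]

if __name__ == "__main__":
    emissions = [[1.0, 0.9, 0.0], [0.0, 0.0, 2.0]]
    print(viterbi_decode(emissions, TRANSITIONS))
```

In this toy input, picking the best tag per token independently would give O followed by I, which the transition scores penalize heavily; decoding over the whole sequence instead selects B followed by I. This is the usual motivation for placing a CRF on top of per-token features.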

Named Entity Recognition in Chinese Electronic Medical Records Using Transformer-CRF
LI Bo, KANG Xiaodong, ZHANG Huali, WANG Yage, CHEN Yayuan, BAI Fang. Named Entity Recognition in Chinese Electronic Medical Records Using Transformer-CRF[J]. Computer Engineering and Applications, 2020, 56(5): 153-159. DOI: 10.3778/j.issn.1002-8331.1909-0211
Authors: LI Bo  KANG Xiaodong  ZHANG Huali  WANG Yage  CHEN Yayuan  BAI Fang
Affiliation: College of Medical Imaging, Tianjin Medical University, Tianjin 300203, China
Abstract: Named entity recognition is one of the basic tasks of natural language processing. To address the poor performance of traditional models for named entity recognition in Chinese electronic medical records (EMR), a neural network model based entirely on the attention mechanism is proposed. First, the experiments use a self-built dataset of real Chinese electronic medical records, preprocessed by manual annotation and word segmentation. Second, a Transformer model is trained and optimized to extract text features. Finally, a conditional random field (CRF) classifies and recognizes the extracted features. To verify the effectiveness of the proposed method, the Transformer-CRF neural network model is compared with seven other traditional models, and recognition performance is evaluated with three metrics: precision, recall, and F1 score. The experimental results show that, on the same corpus, the Transformer-CRF model recognizes body-part entities well, with an F1 score of 95.02%. Compared with the seven other traditional models, its precision, recall, and F1 score are all higher, which to some extent confirms that the model has good recognition performance.
Keywords: Electronic Medical Records (EMR)  named entity recognition  Transformer  Conditional Random Fields (CRF)
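
The three evaluation metrics named in the abstract (precision, recall, and F1) are often computed at the entity level for named entity recognition. The following is a minimal sketch under stated assumptions, not the paper's evaluation code: it assumes BIO-style tags and exact-match scoring (an entity counts as correct only if its type and both boundaries match), and the label names `BODY` and `SYM` are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (entity type, start index, end index exclusive)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and etype == tag[2:]
        if not inside:
            if etype is not None:
                spans.add((etype, start, i))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return spans


def precision_recall_f1(gold: List[List[str]],
                        pred: List[List[str]]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 over a list of sentences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # prefix each span with its sentence index so spans stay distinct
        gold_spans |= {(idx,) + s for s in extract_spans(g)}
        pred_spans |= {(idx,) + s for s in extract_spans(p)}
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = [["B-BODY", "I-BODY", "O", "B-SYM"]]
    pred = [["B-BODY", "I-BODY", "O", "O"]]
    print(precision_recall_f1(gold, pred))
```

In the toy example, the prediction finds one of the two gold entities and makes no spurious predictions, so precision is 1.0 and recall is 0.5. F1 is the harmonic mean of the two, which is why a model has to do well on both to reach a high F1 score.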
This article is indexed in VIP, Wanfang Data, and other databases.