Original title (Chinese): 基于位置特征和句法依存树的可度量数量信息抽取模型
Authors (Chinese): 聂文杰, 莫迪, 黄邦锐, 刘海, 郝天永
Funding: Natural Science Foundation of Guangdong Province (2021A1515011339)
Received: 2022-01-17
Revised: 2022-02-17

Extraction Model of Measurable Quantitative Information Based on Position Feature and Dependency Tree
NIE Wen-Jie, MO Di, HUANG Bang-Rui, LIU Hai, HAO Tian-Yong. Extraction Model of Measurable Quantitative Information Based on Position Feature and Dependency Tree[J]. Computer Systems & Applications, 2022, 31(10): 279-287.
Authors: NIE Wen-Jie, MO Di, HUANG Bang-Rui, LIU Hai, HAO Tian-Yong
Affiliation: School of Computer Science, South China Normal University, Guangzhou 510631, China; School of Artificial Intelligence, South China Normal University, Foshan 528225, China
Abstract: As medical informatization advances, electronic medical records are increasingly widely used. Their unstructured text contains a large amount of measurable quantitative information that reflects patients' clinical conditions. Because entities and quantitative information are expressed in complex ways, accurately extracting measurable quantitative information from unstructured electronic medical record documents is an important challenge. In this study, we propose the RPA-GRU model, which is based on a bi-directional gated recurrent unit and combines the relative position feature with an attention mechanism. It incorporates the relative position feature into the attention mechanism to update the output of the bi-directional gated recurrent unit and thereby identify entities and quantitative information. We also propose the GATM model, which learns graph-level representations with a graph attention network built on a reconstructed dependency tree, in order to associate entities with quantitative information. The experiments use 1 359 electronic medical records from the burn department of a Grade-A tertiary hospital. The results show that the RPA-GRU and GATM models achieve F1 values of 97.58% and 97.86% on the identification and association of measurable quantitative information, respectively, which are 2.17% and 1.74% higher than the best-performing baseline models, validating the effectiveness of the proposed models.
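The abstract does not give the formulation of RPA-GRU, so the following is only a rough sketch of the general idea of letting relative position influence attention weights. Everything in it is an assumption made for illustration: the function names, the linear distance penalty, the `decay` value, and the toy sentence are hypothetical and are not taken from the paper. The hidden vectors stand in for bi-directional GRU outputs.

```python
import math

def relative_positions(n_tokens, anchor):
    """Signed offset of every token from the anchor token index."""
    return [i - anchor for i in range(n_tokens)]

def position_aware_attention(scores, rel_pos, decay=0.5):
    """Softmax over content scores minus a distance penalty.

    `decay` is a hypothetical hyperparameter, not a value from the paper.
    """
    biased = [s - decay * abs(p) for s, p in zip(scores, rel_pos)]
    peak = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(b - peak) for b in biased]
    total = sum(exps)
    return [e / total for e in exps]

def attend(hidden_states, weights):
    """Weighted sum of per-token vectors (stand-ins for BiGRU outputs)."""
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden_states))
            for d in range(dim)]

# Made-up toy sentence; the anchor is the numeric token "30".
tokens = ["burn", "area", "about", "30", "%"]
rel = relative_positions(len(tokens), anchor=3)

# With equal content scores, the weights depend only on distance.
weights = position_aware_attention([0.0] * len(tokens), rel)

# Two-dimensional placeholder vectors, one per token.
hidden = [[float(i), 1.0] for i in range(len(tokens))]
context = attend(hidden, weights)
```

In this sketch, tokens closer to the numeric anchor receive larger weights, which is one plausible way a relative position feature could steer attention toward the entity or unit next to a quantity. The model in the paper may combine the two quite differently.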
Keywords: measurable quantitative information; electronic medical records; relative position feature; dependency tree; graph attention networks; information extraction