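CRF-combined boundary assembly method for biomedical named entity recognition
Cite this article:Hu Ying, Chen Yan Ping. CRF-combined boundary assembly method for biomedical named entity recognition[J]. Application Research of Computers, 2021, 38(7): 2025-2031.
Authors:Hu Ying  Chen Yan Ping
Affiliations:College of Computer Science and Technology, Guizhou University, Guiyang 550025; College of Computer Science and Technology, Guizhou University, Guiyang 550025; Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025
Funding:Key Program of the General Joint Fund of the National Natural Science Foundation of China (U1836205); Major Research Plan of the National Natural Science Foundation of China (91746116); National Natural Science Foundation of China (62066007, 62066008); Guizhou Provincial Science and Technology Major Special Project (Qiankehe Major Special Project [2017]3002); Key Project of the Guizhou Provincial Science and Technology Foundation (Qiankehe Jichu [2020]1Z055)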
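Abstract:Much work on biomedical named entity recognition (Bio-NER) concentrates on extracting flat entities and overlooks nested and discontinuous entities. In addition, most biomedical named entities do not follow a unified nomenclature; they exhibit many typical domain features, yet these features are used inefficiently. To address this, the paper proposes a CRF-combined boundary assembly method for named entity recognition that makes effective use of biomedical entity features. The method comprises three steps: boundary detection, boundary assembly, and entity screening. First, a neural network model and a feature-based CRF model identify the start and end boundaries of entities; next, boundary assembly produces candidate entities; finally, a multi-input convolutional neural network screens and classifies the candidates. Experiments show that the method effectively recognizes nested and discontinuous entities in biomedical literature, reaching an F-score of 81.89% on the GENIA dataset.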

Keywords:biomedical named entity recognition  deep learning  conditional random field  information extraction
Received:2020/9/12
Revised:2021/6/15

CRF-combined boundary assembly method for biomedical named entity recognition
Hu Ying and Chen Yan Ping.CRF-combined boundary assembly method for biomedical named entity recognition[J].Application Research of Computers,2021,38(7):2025-2031.
Authors:Hu Ying and Chen Yan Ping
Affiliation:Guizhou University
Abstract:Many biomedical named entity recognition (Bio-NER) studies focus on extracting flat entities and ignore nested and discontinuous entities. In addition, most biomedical named entities do not follow a unified nomenclature and have many typical domain features, but these features are used inefficiently. To this end, this paper proposes a CRF-combined boundary assembly method that makes effective use of the features of biomedical entities. The method consists of three steps: boundary detection, boundary assembly, and entity screening. First, a neural network model and a feature-based CRF model identify the start and end boundaries of entities; candidate entities are then generated by combining these boundaries. Finally, a multi-input convolutional neural network model screens and classifies the candidate entities. Experiments show that the method effectively recognizes nested and discontinuous entities in biomedical literature and achieves an F-score of 81.89% on the GENIA dataset.
Keywords:biomedical named entity recognition  deep learning  CRF  information extraction
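
The boundary assembly step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact pairing rule, so the code below assumes the simplest one: every predicted start boundary is paired with every predicted end boundary at or after it, subject to a length cap. The function name `assemble_candidates`, the `max_len` parameter, and the example tokens are hypothetical and are not taken from the paper. The sketch covers contiguous (including nested) spans only; discontinuous entities and the CNN-based screening step are out of scope here.

```python
from typing import List, Tuple


def assemble_candidates(
    starts: List[int], ends: List[int], max_len: int = 8
) -> List[Tuple[int, int]]:
    """Pair predicted start and end token indices into candidate spans.

    Assumed rule (not from the paper): keep (s, e) when s <= e and the
    span covers at most max_len tokens. Overlapping pairs are kept, so
    nested candidates arise naturally.
    """
    candidates = []
    for s in sorted(set(starts)):
        for e in sorted(set(ends)):
            if s <= e and e - s + 1 <= max_len:
                candidates.append((s, e))
    return candidates


# Hypothetical boundary predictions for a five-token phrase.
tokens = ["human", "IL-2", "receptor", "alpha", "gene"]
starts = [0, 1]  # indices a boundary detector flagged as entity starts
ends = [2, 4]    # indices a boundary detector flagged as entity ends

spans = assemble_candidates(starts, ends)
for s, e in spans:
    print((s, e), " ".join(tokens[s : e + 1]))
```

Each resulting span would then be passed to a screening classifier that either rejects it or assigns an entity type. Over-generating candidates at this stage is deliberate in such a design, since the screening step is responsible for precision.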
This article is indexed in Wanfang Data and other databases.