ArRaNER: A novel named entity recognition model for biomedical literature documents
Authors: Ramachandran R., Arutchelvan K.
Affiliation: Department of Computer and Information Science, Annamalai University, Chidambaram, Tamil Nadu, India
Abstract:

Advances in technology have produced an immense amount of digital information, and this data deluge contains hidden information that is difficult to extract. The biomedical domain is no exception: it generates voluminous textual data, and processing these data is referred to as ‘biomedical content mining’. Emerging artificial intelligence (AI) models play a major role in the automation of Pharma 4.0, and within AI, natural language processing (NLP) plays a key role in extracting knowledge from biomedical documents. Research articles published by scientists and researchers contain an enormous amount of hidden information, and most original, peer-reviewed articles are indexed in PubMed. Extracting meaningful information manually from such a large number of literature documents is very difficult. This research aims to extract the named entities from literature documents in the life science domain. A high-level architecture is proposed along with a novel named entity recognition (NER) model, ArRaNER, which is built using rule-based machine learning (ML). The proposed model produced better accuracy and was also able to identify more entities. It was tested on two different datasets: a PubMed dataset and a Wikipedia talk dataset. ArRaNER obtains an accuracy of 83.42% on the PubMed articles and 77.65% on the Wikipedia articles.
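
The abstract does not describe the rules ArRaNER uses or how its accuracy is computed, so the following is only a minimal sketch of what rule-based NER and a token-level accuracy score can look like in general. The gazetteer entries, the gene-symbol pattern, the label names, and the functions `tag_tokens` and `token_accuracy` are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical gazetteer: a tiny dictionary lookup rule (not from the paper).
GAZETTEER = {
    "aspirin": "CHEMICAL",
    "ibuprofen": "CHEMICAL",
    "diabetes": "DISEASE",
    "asthma": "DISEASE",
}

# Hypothetical pattern rule: gene-like symbols such as BRCA1 or TP53.
GENE_PATTERN = re.compile(r"^[A-Z]{2,5}[0-9]{1,2}$")

# Simple alphanumeric tokenizer, assumed for illustration only.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+")


def tag_tokens(text):
    """Return (token, label) pairs; tokens matching no rule get the label 'O'."""
    tagged = []
    for token in TOKEN_PATTERN.findall(text):
        label = GAZETTEER.get(token.lower())
        if label is None and GENE_PATTERN.match(token):
            label = "GENE"
        tagged.append((token, label if label is not None else "O"))
    return tagged


def token_accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


if __name__ == "__main__":
    sentence = "Aspirin reduces TP53 expression in asthma"
    predicted = [label for _, label in tag_tokens(sentence)]
    gold = ["CHEMICAL", "O", "GENE", "O", "O", "DISEASE"]
    print(tag_tokens(sentence))
    print(token_accuracy(predicted, gold))
```

Token-level accuracy is used here purely as an assumption; the reported figures of 83.42% and 77.65% may have been computed at a different granularity, such as per entity or per document.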

Keywords:
This article is indexed in SpringerLink and other databases.