Learning to extract adverse drug reaction events from electronic health records in Spanish
Affiliation: 1. Dep. Electricity and Electronics, Faculty of Science and Technology (UPV-EHU), Leioa, Spain; 2. Dep. Languages and Computer Systems, School of Engineering of Bilbao (UPV-EHU), Bilbao, Spain; 3. Dep. Languages and Computer Systems, Faculty of Computer Science (UPV-EHU), San Sebastian, Spain
Abstract:
Objective: To tackle the extraction of adverse drug reaction events from electronic health records. The challenge lies in inferring a robust prediction model from highly unbalanced data. According to our manually annotated corpus, only 6% of the drug-disease entity pairs trigger a positive adverse drug reaction event, and this low ratio makes machine learning difficult.
Method: We present a hybrid system that uses a self-developed morpho-syntactic and semantic analyser for medical texts in Spanish. It performs named entity recognition of drugs and diseases, followed by adverse drug reaction event extraction. The event extraction stage combines rule-based and machine learning techniques.
Results: We assess both base classifiers, namely a knowledge-based model and an inferred classifier, as well as the resulting hybrid system. Moreover, for the machine learning approach, we analyse each particular bio-cause triggering the adverse drug reaction.
Conclusions: One contribution of the machine learning based system is its ability to deal with both intra-sentence and inter-sentence events in a highly skewed classification environment. Moreover, the knowledge-based model and the inferred model are complementary in terms of precision and recall: the former provides high precision and low recall, whereas the latter provides low precision and high recall. As a result, an appropriate hybrid approach seems able to benefit from both and also to improve on them, which is the underlying motivation for selecting it. In addition, this is the first system dealing with real electronic health records in Spanish.
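The complementarity described in the conclusions can be illustrated with a small sketch. The abstract does not say how the two base classifiers are combined, so the union (logical OR) used here is only an assumption, and the labels and predictions are invented toy data rather than results from the paper. The function names `precision_recall` and `hybrid_or` are hypothetical.

```python
from typing import List, Tuple


def precision_recall(gold: List[int], pred: List[int]) -> Tuple[float, float]:
    """Precision and recall for the positive class (1 = ADR event)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def hybrid_or(rule_pred: List[int], ml_pred: List[int]) -> List[int]:
    """Assumed combination: a pair is positive if either classifier says so."""
    return [int(r or m) for r, m in zip(rule_pred, ml_pred)]


# Invented toy data: one entry per candidate drug-disease pair.
gold      = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
rule_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # precise, but misses events
ml_pred   = [0, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # finds more events, less precise

hybrid_pred = hybrid_or(rule_pred, ml_pred)

for name, pred in [("rules", rule_pred), ("ml", ml_pred), ("hybrid", hybrid_pred)]:
    p, r = precision_recall(gold, pred)
    print(f"{name:6s} precision={p:.2f} recall={r:.2f}")
```

On this toy data the union recovers every positive pair while inheriting the false positives of the less precise classifier. Other combination schemes, such as trusting the rules first and falling back to the classifier, would trade precision and recall differently.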
This article is indexed in ScienceDirect and other databases.