Evaluation of semantic similarity metrics applied to the automatic retrieval of medical documents: An UMLS approach
Affiliation: 1. Sustainable Energy Technologies Center, College of Engineering, King Saud University, P.O. Box 800, Riyadh 11421, KSA; 2. Department of Electrical Engineering, College of Engineering, King Saud University, P.O. Box 800, Riyadh 11421, KSA; 1. School of Computer Science, Fudan University, Shanghai 200433, China; 2. Engineering Research Center of Cyber Security Auditing and Monitoring, Ministry of Education, Shanghai 200433, China; 3. Business School, University of Shanghai for Science and Technology, Shanghai 200093, China; 1. Computer School of Wuhan University, No. 299, Bayi Road, Wuchang District, Wuhan, China; 2. Netease Research Hangzhou, No. 599, Wangshang Road, Binjiang District, Hangzhou, China; 3. Shenzhen Institute of Wuhan University, Yuexing Road, Nanshan District, Shenzhen, China
Abstract: One promise of current information retrieval systems is the capability to identify risk groups for certain diseases and pathologies through the automatic analysis of vast repositories of Electronic Medical Records. However, the complexity and degree of specialization of the language used by experts in this context make the task challenging. In this work, we introduce a novel experimental study to evaluate the performance of two semantic similarity metrics (Path and Intrinsic IC-Path, both widely accepted in the literature) in a real-life information retrieval situation. To achieve this goal, and given the lack of methodologies for this context in the literature, we propose a straightforward information retrieval system for the biomedical field based on the UMLS Metathesaurus and on semantic similarity metrics. In contrast with previous studies, which focus on testbeds with limited and controlled sets of concepts, we use a large amount of information (101,712 medical documents extracted from the TREC Medical Records Track 2011). Our results show that in real-life cases both metrics display similar performance: Path (F-Measure = 0.430) and Intrinsic IC-Path (F-Measure = 0.427). We therefore suggest that the use of Intrinsic IC-Path is not justified in real scenarios.
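As a rough illustration of the kind of computation the abstract refers to, the sketch below implements a path-based similarity over a tiny is-a hierarchy, together with the F-measure used to report results. It assumes the common definition of the Path measure as the reciprocal of the number of nodes on the shortest path between two concepts; the abstract does not give the formula, so this may differ from the authors' exact formulation. The concept names, the `TOY_PARENTS` hierarchy, and all function names are hypothetical and are not taken from the UMLS Metathesaurus or from the paper. Intrinsic IC-Path is not sketched because the abstract does not define it.

```python
from collections import deque

# Hypothetical toy is-a hierarchy (child -> list of parents).
# These concepts are illustrative only, not real UMLS content.
TOY_PARENTS = {
    "disease": [],
    "heart disease": ["disease"],
    "lung disease": ["disease"],
    "myocardial infarction": ["heart disease"],
    "angina": ["heart disease"],
    "pneumonia": ["lung disease"],
}


def build_undirected(parents):
    """Turn the child -> parents map into an undirected adjacency map."""
    graph = {concept: set() for concept in parents}
    for child, parent_list in parents.items():
        for parent in parent_list:
            graph[child].add(parent)
            graph[parent].add(child)
    return graph


def shortest_path_nodes(graph, a, b):
    """Number of nodes on the shortest path from a to b (BFS), or None."""
    if a not in graph or b not in graph:
        return None
    seen = {a}
    queue = deque([(a, 1)])  # a path consisting of just `a` has one node
    while queue:
        node, count = queue.popleft()
        if node == b:
            return count
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, count + 1))
    return None


def path_similarity(graph, a, b):
    """Assumed Path measure: 1 / (nodes on the shortest path); 0.0 if unconnected."""
    nodes = shortest_path_nodes(graph, a, b)
    return 0.0 if nodes is None else 1.0 / nodes


def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    graph = build_undirected(TOY_PARENTS)
    print(path_similarity(graph, "angina", "myocardial infarction"))
    print(path_similarity(graph, "angina", "pneumonia"))
```

Under this definition, sibling concepts such as the two hypothetical heart conditions score higher than concepts that meet only at the root, which is the intuition a retrieval system can exploit when matching query concepts against document concepts.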
This article is indexed in ScienceDirect and other databases.