Ontology-based information content computation |
| |
Authors: | David Sánchez, Montserrat Batet, David Isern |
| |
Affiliation: | 1. School of Computer Science, South China Normal University, Guangzhou 510631, China; 2. Division of Science and Technology, University of Education, Lahore, Pakistan; 1. State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China; 2. Biodata Mining and Discovery Section, Office of Science and Technology, Intramural Research Program, National Institute of Arthritis and Musculoskeletal and Skin Diseases, Bethesda, Maryland; 3. Laboratory of Skin Biology, Intramural Research Program, National Institute of Arthritis and Musculoskeletal and Skin Diseases, Bethesda, Maryland; 4. Laboratory of Bioinformatics, Center for Advanced Computer Studies, University of Louisiana at Lafayette, Lafayette, Louisiana |
| |
Abstract: | The information content (IC) of a concept provides an estimation of its degree of generality/concreteness, a dimension which enables a better understanding of a concept’s semantics. As a result, IC has been successfully applied to the automatic assessment of the semantic similarity between concepts. In the past, IC has been estimated from the probability of appearance of concepts in corpora. However, the applicability and scalability of this method are hampered by corpora dependency and data sparseness. More recently, some authors have proposed IC-based measures that use taxonomical features extracted from an ontology for a particular concept, obtaining promising results. In this paper, we analyse these ontology-based approaches for IC computation and propose several improvements aimed at better capturing the semantic evidence modelled in the ontology for the particular concept. Our approach has been evaluated and compared with related works (both corpus-based and ontology-based) when applied to the task of semantic similarity estimation. Results obtained for a widely used benchmark show that our method enables similarity estimations which are better correlated with human judgements than those of related works. |
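To make the two families of IC estimation mentioned in the abstract concrete, the following minimal Python sketch contrasts them on a toy taxonomy. It is an illustration under stated assumptions, not the method proposed in the paper: the taxonomy, the corpus counts and the function names are all hypothetical. The corpus-based variant uses the standard formulation IC(c) = -log p(c), with p(c) estimated from the counts of a concept and its descendants. The ontology-based variant uses a well-known hyponym-count formula, IC(c) = 1 - log(hypo(c) + 1) / log(max_nodes), chosen here only as a simple example of an intrinsic measure.

```python
import math

# Hypothetical toy taxonomy: each concept maps to its direct children.
CHILDREN = {
    "entity": ["animal", "plant"],
    "animal": ["dog", "cat"],
    "plant": [],
    "dog": [],
    "cat": [],
}

# Hypothetical corpus counts for each concept (illustrative numbers only).
COUNTS = {"entity": 0, "animal": 2, "plant": 4, "dog": 6, "cat": 4}


def descendants(concept):
    """Return the set of all strict descendants of a concept."""
    found = set()
    stack = list(CHILDREN[concept])
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(CHILDREN[node])
    return found


def corpus_ic(concept):
    """Corpus-based IC: -log p(c); an occurrence of a descendant also counts for c."""
    total = sum(COUNTS.values())
    freq = COUNTS[concept] + sum(COUNTS[d] for d in descendants(concept))
    return -math.log(freq / total)


def intrinsic_ic(concept):
    """Ontology-based IC from hyponym counts; no corpus is needed."""
    max_nodes = len(CHILDREN)
    hypo = len(descendants(concept))
    return 1.0 - math.log(hypo + 1) / math.log(max_nodes)


if __name__ == "__main__":
    for c in CHILDREN:
        print(f"{c:8s} corpus IC = {corpus_ic(c):.3f}  intrinsic IC = {intrinsic_ic(c):.3f}")
```

In both variants the root concept receives an IC of zero and the leaves receive the highest values, which matches the abstract's reading of IC as a measure of generality versus concreteness. Only the corpus-based variant depends on the `COUNTS` table, which is where corpus dependency and data sparseness come into play.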
| |
Keywords: | |
This article is indexed in ScienceDirect and other databases. |
|