Similar literature
 Found 20 similar documents (search time: 484 ms)
1.
The quantification of the semantic similarity between terms is an important research area that provides a valuable tool for text understanding. Among the different paradigms used by related works to compute semantic similarity, in recent years, information theoretic approaches have shown promising results by computing the information content (IC) of concepts from the knowledge provided by ontologies. These approaches, however, are hampered by the coverage offered by the single input ontology. In this paper, we propose extending IC-based similarity measures by considering multiple ontologies in an integrated way. Several strategies are proposed according to the ontology to which the evaluated terms belong. Our proposal has been evaluated using a widely used benchmark of medical terms, with MeSH and SNOMED CT as ontologies. Results show an improvement in the similarity assessment accuracy when multiple ontologies are considered.

2.
In the past decade, existing and new knowledge and datasets have been encoded in different ontologies for semantic web and biomedical research. Ontologies are often very large in terms of the number of concepts and relationships, which makes the analysis of ontologies and the represented knowledge graph computationally expensive and time-consuming. As the ontologies of various semantic web and biomedical applications usually show explicit hierarchical structures, it is interesting to explore the trade-offs between ontological scales and preservation/precision of results when we analyze ontologies. This paper presents the first effort to examine this idea by studying the relationship between scaling biomedical ontologies at different levels and the semantic similarity values. We evaluate the semantic similarity between three gene ontology slims (plant, yeast, and candida, among which the latter two belong to the same kingdom, fungi) using four popular measures commonly applied to biomedical ontologies (Resnik, Lin, Jiang-Conrath, and SimRel). The results of this study demonstrate that with proper selection of scaling levels and similarity measures, we can significantly reduce the size of ontologies without losing substantial detail. In particular, the performance of Jiang-Conrath and Lin is more reliable and stable than that of the other two in this experiment, as evidenced by 1) consistently showing that yeast and candida are more similar (as compared to plant) at different scales, and 2) small deviations of the similarity values after excluding a majority of nodes from several lower scales. This study provides a deeper understanding of the application of semantic similarity to biomedical ontologies, and sheds light on how to choose appropriate semantic similarity measures for biomedical engineering.
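Three of the measures named in this abstract (Resnik, Lin, and Jiang-Conrath) have standard textbook definitions in terms of information content, and a small sketch may help make them concrete. The taxonomy, the concept probabilities, and all function names below are hypothetical and chosen only for illustration; they are not taken from the paper or from the Gene Ontology slims it studies, and SimRel is left out.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not a real ontology.
PARENT = {
    "organism": None,
    "fungus": "organism",
    "plant": "organism",
    "yeast": "fungus",
    "candida": "fungus",
}

# Hypothetical concept probabilities p(c), assumed for illustration only.
PROB = {"organism": 1.0, "fungus": 0.4, "plant": 0.5, "yeast": 0.1, "candida": 0.1}


def ic(concept):
    """Information content: IC(c) = log(1 / p(c))."""
    return math.log(1.0 / PROB[concept])


def ancestors(concept):
    """Return the concept itself followed by all of its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def mica(a, b):
    """Most informative common ancestor of a and b."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(common, key=ic)


def resnik(a, b):
    """Resnik similarity: IC of the most informative common ancestor."""
    return ic(mica(a, b))


def lin(a, b):
    """Lin similarity: shared IC normalised by the IC of both concepts."""
    denom = ic(a) + ic(b)
    return 2 * resnik(a, b) / denom if denom > 0 else 1.0


def jiang_conrath_distance(a, b):
    """Jiang-Conrath distance: smaller values mean more similar concepts."""
    return ic(a) + ic(b) - 2 * resnik(a, b)
```

With these toy numbers, `lin("yeast", "candida")` is larger than `lin("yeast", "plant")`, because the two fungi share an informative ancestor while yeast and plant only share the root.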

3.
Estimation of the semantic likeness between words is of great importance in many applications dealing with textual data such as natural language processing, knowledge acquisition and information retrieval. Semantic similarity measures exploit knowledge sources as the base to perform the estimations. In recent years, ontologies have grown in interest thanks to global initiatives such as the Semantic Web, offering a structured knowledge representation. Thanks to the possibilities that ontologies enable regarding semantic interpretation of terms, many ontology-based similarity measures have been developed. Several families of measures can be identified according to the principle on which those measures base the similarity assessment and the way in which ontologies are exploited or complemented with other sources. In this paper, we survey and classify most of the ontology-based approaches developed in order to evaluate their advantages and limitations and compare their expected performance both from theoretical and practical points of view. We also present a new ontology-based measure relying on the exploitation of taxonomical features. The evaluation and comparison of our approach’s results against those reported by related works under a common framework suggest that our measure provides a high accuracy without some of the limitations observed in other works.

4.
The information content (IC) of a concept provides an estimation of its degree of generality/concreteness, a dimension which enables a better understanding of a concept’s semantics. As a result, IC has been successfully applied to the automatic assessment of the semantic similarity between concepts. In the past, IC has been estimated as the probability of appearance of concepts in corpora. However, the applicability and scalability of this method are hampered by corpora dependency and data sparseness. More recently, some authors proposed IC-based measures using taxonomical features extracted from an ontology for a particular concept, obtaining promising results. In this paper, we analyse these ontology-based approaches for IC computation and propose several improvements aimed at better capturing the semantic evidence modelled in the ontology for the particular concept. Our approach has been evaluated and compared with related works (both corpora and ontology-based ones) when applied to the task of semantic similarity estimation. Results obtained for a widely used benchmark show that our method enables similarity estimations which are better correlated with human judgements than related works.
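As an illustration of what computing IC from taxonomical features can look like, the sketch below uses one well-known intrinsic formulation (the hyponym-count formula usually attributed to Seco et al.), in which concepts with many descendants are treated as more general and therefore less informative. This is an assumption made for illustration and is not the improved method the abstract proposes; the toy taxonomy and function names are hypothetical.

```python
import math

# Hypothetical toy taxonomy (parent -> children); not a real ontology.
CHILDREN = {
    "entity": ["animal", "plant"],
    "animal": ["dog", "cat"],
    "plant": [],
    "dog": [],
    "cat": [],
}


def descendants(concept):
    """Count all strict descendants (hyponyms) of a concept."""
    total = 0
    for child in CHILDREN[concept]:
        total += 1 + descendants(child)
    return total


def intrinsic_ic(concept):
    """Hyponym-based IC: 1 - log(hypo(c) + 1) / log(N), with N the number of concepts."""
    n = len(CHILDREN)
    return 1.0 - math.log(descendants(concept) + 1) / math.log(n)
```

Under this formula the root gets an IC of 0 and every leaf gets an IC of 1, with inner concepts falling in between; no corpus is needed.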

5.
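Li Xuanru, He Jieyue. Microcomputer Development (《微机发展》), 2007, 17(2): 121-124
An ontology is a form of representing knowledge about the objective world. As Semantic Web research has deepened, researchers have built more and more ontologies, and achieving knowledge sharing and reuse among ontologies has become key to the development of the Semantic Web. This paper studies ontology mapping methods and systematically describes the definitions of ontology and ontology mapping, similarity computation in ontology mapping, and ontology mapping frameworks. Reducing manual intervention in ontology mapping and achieving semi-automatic or automatic mapping will be the direction of development in this field.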

6.
Computation of semantic similarity between concepts is a very common problem in many language related tasks and knowledge domains. In the biomedical field, several approaches have been developed to deal with this issue by exploiting the structured knowledge available in domain ontologies (such as SNOMED-CT or MeSH) and specific, closed and reliable corpora (such as clinical data). However, in recent years, the enormous growth of the Web has motivated researchers to start using it as the corpus to assist semantic analysis of language. This paper proposes and evaluates the use of the Web as background corpus for measuring the similarity of biomedical concepts. Several ontology-based similarity measures have been studied and tested, using a benchmark composed of biomedical terms, comparing the results obtained when applying them to the Web against approaches in which specific clinical data were used. Results show that the similarity values obtained from the Web for ontology-based measures are at least as reliable as, and in some cases more reliable than, those obtained from specific clinical data, showing the suitability of the Web as information corpus for the biomedical domain.

7.
In this paper, we propose an intelligent distributed query processing method considering the characteristics of a distributed ontology environment. We suggest more general models of the distributed ontology query and of the semantic mapping among distributed ontologies than those in previous work. Our approach rewrites a distributed ontology query into multiple distributed ontology queries using the semantic mapping, and we can obtain the integrated answer through the execution of these queries. Furthermore, we propose a distributed ontology query processing algorithm with several query optimization techniques: pruning rules to remove unnecessary queries, a cost model considering site load balancing and caching, and a heuristic strategy for scheduling plans to be executed at a local site. Finally, experimental results show that our optimization techniques are effective in reducing the response time.

8.
Learning to match ontologies on the Semantic Web   Total citations: 19, self-citations: 0, citations by others: 19
On the Semantic Web, data will inevitably come from many different ontologies, and information processing across ontologies is not possible without knowing the semantic mappings between them. Manually finding such mappings is tedious, error-prone, and clearly not possible on the Web scale. Hence the development of tools to assist in the ontology mapping process is crucial to the success of the Semantic Web. We describe GLUE, a system that employs machine learning techniques to find such mappings. Given two ontologies, for each concept in one ontology GLUE finds the most similar concept in the other ontology. We give well-founded probabilistic definitions to several practical similarity measures and show that GLUE can work with all of them. Another key feature of GLUE is that it uses multiple learning strategies, each of which exploits well a different type of information either in the data instances or in the taxonomic structure of the ontologies. To further improve matching accuracy, we extend GLUE to incorporate commonsense knowledge and domain constraints into the matching process. Our approach is thus distinguished in that it works with a variety of well-defined similarity notions and that it efficiently incorporates multiple types of knowledge. We describe a set of experiments on several real-world domains and show that GLUE proposes highly accurate semantic mappings. Finally, we extend GLUE to find complex mappings between ontologies and describe experiments that show the promise of the approach. Received: 16 December 2002; Accepted: 16 April 2003; Published online: 17 September 2003. Edited by B.V. Atluri, A. Joshi, and Y. Yesha

9.
Ontology reuse is recommended as a key factor to develop cost-effective and high-quality ontologies because it could reduce development costs by avoiding rebuilding existing ontologies. Selecting the desired ontology from existing ontologies is essential for ontology reuse. Until now, much research on ontology selection has focused on lexical-level support. However, in these cases, it is almost impossible to find an ontology that includes all the concepts matched by the search terms at the semantic level. Finding an ontology that meets users’ needs requires a new ontology selection and ranking mechanism based on semantic similarity matching. We propose an ontology selection and ranking model consisting of selection standards and metrics based on better semantic matching capabilities. The model we propose presents two novel features different from previous research models. First, it enhances the ontology selection and ranking method practically and effectively by enabling semantic matching of taxonomy or relational linkage between concepts. Second, it identifies what measures should be used to rank ontologies in the given context and what weight should be assigned to each selection measure.

10.
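With the development of the Semantic Web, ontologies have become the main means of expressing knowledge in many fields. Many fields have built ontologies according to their own needs to describe domain knowledge. However, many current semantic queries over ontologies can query only a single ontology. To enable one query to access multiple ontologies and return appropriate results, this paper proposes a method for querying multiple ontologies by means of ontology mapping. The mapping method is a semantics-based combination of multiple strategies. Experiments show that query time grows roughly linearly with the number of ontologies and does not increase with the degree of heterogeneity among the ontologies.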

11.
Ontologies, which are formal representations of knowledge within a domain, can be used for designing and sharing conceptual models of enterprise information for the purpose of enhancing understanding, communication and interoperability. For representing a body of knowledge, different ontologies may be designed. Recently, designing ontologies in a modular manner has emerged as a way of achieving better reasoning performance, more efficient ontology management and change handling. One of the important challenges in the employment of ontologies and modular ontologies in modeling information within enterprises is the evaluation of the suitability of an ontology for a domain and the performance of inference operations over it. In this paper, we present a set of semantic metrics for evaluating ontologies and modular ontologies. These metrics measure cohesion and coupling of ontologies, which are two important notions in the process of assessing ontologies for enterprise modeling. The proposed metrics are based on semantic-based definitions of relativeness, and dependencies between local symbols, and also between local and external symbols of ontologies. Based on these semantic definitions, not only the explicitly asserted knowledge in ontologies but also the implied knowledge, which is derived through inference, is considered for the sake of ontology assessment. We present several empirical case studies for investigating the correlation between the proposed metrics and reasoning performance, which is an important issue in the applicability of ontologies in real-world information systems.

12.
In biomedical informatics, ontologies are considered a key technology for annotating, retrieving and sharing the huge volume of publicly available data. Due to the increasing amount, complexity and variety of existing biomedical ontologies, choosing the ones to be used in a semantic annotation problem or to design a specific application is a difficult task. As a consequence, the design of approaches and tools that facilitate the selection of biomedical ontologies is becoming a priority. In this paper we present BiOSS, a novel system for the selection of biomedical ontologies. BiOSS evaluates the adequacy of an ontology for a given domain according to three different criteria: (1) the extent to which the ontology covers the domain; (2) the semantic richness of the ontology in the domain; (3) the popularity of the ontology in the biomedical community. BiOSS has been applied to 5 representative problems of ontology selection. It has also been compared to existing methods and tools. Results are promising and show the usefulness of BiOSS to solve real-world ontology selection problems. BiOSS is openly available both as a web tool and a web service.

13.
Determining semantic similarity among entity classes from different ontologies   Total citations: 20, self-citations: 0, citations by others: 20
Semantic similarity measures play an important role in information retrieval and information integration. Traditional approaches to modeling semantic similarity compute the semantic distance between definitions within a single ontology. This single ontology is either a domain-independent ontology or the result of the integration of existing ontologies. We present an approach to computing semantic similarity that relaxes the requirement of a single ontology and accounts for differences in the levels of explicitness and formalization of the different ontology specifications. A similarity function determines similar entity classes by using a matching process over synonym sets, semantic neighborhoods, and distinguishing features that are classified into parts, functions, and attributes. Experimental results with different ontologies indicate that the model gives good results when ontologies have complete and detailed representations of entity classes. While the combination of word matching and semantic neighborhood matching is adequate for detecting equivalent entity classes, feature matching allows us to discriminate among similar, but not necessarily equivalent entity classes.
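A matching process over synonym sets, semantic neighborhoods, and distinguishing features can be pictured as a weighted sum of set-overlap scores. The sketch below uses a Tversky-style ratio for each component; that choice, the `alpha` parameter, the weights, and the example entity classes are all assumptions made for illustration and should not be read as the exact function defined in the paper.

```python
def set_match(a, b, alpha=0.5):
    """Tversky-style overlap: common / (common + alpha*|a-b| + (1-alpha)*|b-a|)."""
    a, b = set(a), set(b)
    common = len(a & b)
    denom = common + alpha * len(a - b) + (1 - alpha) * len(b - a)
    return common / denom if denom else 0.0


def class_similarity(x, y, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three component matches (weights are hypothetical)."""
    components = ("synonyms", "neighborhood", "features")
    return sum(w * set_match(x[k], y[k]) for w, k in zip(weights, components))


# Hypothetical entity classes from two different ontologies.
stadium = {
    "synonyms": {"stadium", "arena"},
    "neighborhood": {"building", "sports facility"},
    "features": {"seats", "field", "roof"},
}
sports_arena = {
    "synonyms": {"arena", "bowl"},
    "neighborhood": {"building", "sports facility"},
    "features": {"seats", "field"},
}

score = class_similarity(stadium, sports_arena)
```

Keeping the three components separate makes it possible to weight them differently, which matches the abstract's observation that word and neighborhood matching suit equivalence detection while feature matching helps discriminate among merely similar classes.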

14.
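Ou Ling, Zhang Yufang, Wu Zhongfu, Zhong Jiang. Computer Science (《计算机科学》), 2006, 33(11): 188-191
Existing knowledge systems use centralized, consistent, and extensible ontology libraries, and semantic matching between different ontologies is one of the most challenging problems facing the development of the Semantic Web. To address the existence of different ontologies within a domain, this paper discusses an ontology matching method based on multi-strategy machine learning, focuses on the computation of similarity between ontology concepts, and proposes a similarity measurement algorithm.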

15.
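Research on ontology similarity   Total citations: 1, self-citations: 0, citations by others: 1
Interaction between different ontologies has become a primary task for the Semantic Web, and ontology similarity computation is the key step in ontology mapping. In earlier research, ontology similarity computation usually focused on matching schemas and their structures. Current research is moving toward further consideration of the semantic information inside ontologies. This paper describes the layers of the semantic similarity stack, classifies current ontology similarity methods according to the semantic features of each layer, and describes each method in detail. Finally, it summarizes some of the main existing methods for computing similarity between ontologies. This work is intended as a reference for those proposing new similarity methods or combined computation methods.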

16.
Biomedical entity alignment, composed of two subtasks, entity identification and entity-concept mapping, is of great research value in biomedical text mining, and these techniques are widely used for named entity standardization, information retrieval, knowledge acquisition and ontology construction. Previous works put considerable effort into feature engineering in order to employ feature-based models for entity identification and alignment. However, models that depend on subjective feature selection may suffer from error propagation and are not able to utilize hidden information. With the rapid development of health-related research, researchers need an effective method to explore the large amount of available biomedical literature. Therefore, we propose a two-stage entity alignment process, the biomedical entity exploring model, to identify biomedical entities and align them to the knowledge base interactively. The model aims to automatically obtain semantic information for extracting biomedical entities and mining semantic relations through the standard biomedical knowledge base. The experiments show that the proposed method achieves better performance on entity alignment. The proposed model improves the F1 scores of the task by about 4.5% in entity identification and 2.5% in entity-concept mapping.

17.
18.
Extending the Unified Modeling Language for ontology development   总被引:3,自引:0,他引:3  
There is rapidly growing momentum for web enabled agents that reason about and dynamically integrate the appropriate knowledge and services at run-time. The dynamic integration of knowledge and services depends on the existence of explicit declarative semantic models (ontologies). We have been building tools for ontology development based on the Unified Modeling Language (UML). This allows the many mature UML tools, models and expertise to be applied to knowledge representation systems, not only for visualizing complex ontologies but also for managing the ontology development process. UML has many features, such as profiles, global modularity and extension mechanisms that are not generally available in most ontology languages. However, ontology languages have some features that UML does not support. Our paper identifies the similarities and differences (with examples) between UML and the ontology languages RDF and DAML+OIL. To reconcile these differences, we propose a modification to the UML metamodel to address some of the most problematic differences. One of these is the ontological concept variously called a property, relation or predicate. This notion corresponds to the UML concepts of association and attribute. In ontology languages properties are first-class modeling elements, but UML associations and attributes are not first-class. Our proposal is backward-compatible with existing UML models while enhancing its viability for ontology modeling. While we have focused on RDF and DAML+OIL in our research and development activities, the same issues apply to many of the knowledge representation languages. This is especially the case for semantic network and concept graph approaches to knowledge representations. Initial submission: 16 February 2002 / Revised submission: 15 October 2002 / Published online: 2 December 2002

19.
Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic measures is limited by the coverage of user studies, which do not scale with the size, heterogeneity, and growth of the Web. Here we propose to leverage human-generated metadata, namely topical directories, to measure semantic relationships among massive numbers of pairs of Web pages or topics. The Open Directory Project classifies millions of URLs in a topical ontology, providing a rich source from which semantic relationships between Web pages can be derived. While semantic similarity measures based on taxonomies (trees) are well studied, the design of well-founded similarity measures for objects stored in the nodes of arbitrary ontologies (graphs) is an open problem. This paper defines an information-theoretic measure of semantic similarity that exploits both the hierarchical and non-hierarchical structure of an ontology. An experimental study shows that this measure improves significantly on the traditional taxonomy-based approach. This novel measure allows us to address the general question of how text and link analyses can be combined to derive measures of relevance that are in good agreement with semantic similarity. Surprisingly, the traditional use of text similarity turns out to be ineffective for relevance ranking.

20.
Semantic-oriented service matching is one of the challenges in automatic Web service discovery. Service users may search for Web services using keywords and receive the matching services in terms of their functional profiles. A number of approaches to computing the semantic similarity between words have been developed to enhance the precision of matchmaking, which can be classified into ontology-based and corpus-based approaches. The ontology-based approaches commonly use the differentiated concept information provided by a large ontology for measuring lexical similarity with word sense disambiguation. Nevertheless, most ontologies are domain-specific and limited in lexical coverage, which restricts their applicability. On the other hand, corpus-based approaches rely on the distributional statistics of context to represent each word as a vector and measure the distance between word vectors. However, polysemy may lead to low computational accuracy. In this paper, in order to augment the semantic information content in word vectors, we propose a multiple semantic fusion (MSF) model to generate a sense-specific vector per word. In this model, various semantic properties of the general-purpose ontology WordNet are integrated to fine-tune the distributed word representations learned from a corpus, in terms of vector combination strategies. The retrofitted word vectors are modeled as semantic vectors for estimating semantic similarity. The MSF model-based similarity measure is validated against other similarity measures on multiple benchmark datasets. Experimental results of word similarity evaluation indicate that our computational method can obtain a higher correlation coefficient with human judgment in most cases. Moreover, the proposed similarity measure is demonstrated to improve the performance of Web service matchmaking based on a single semantic resource. Accordingly, our findings provide a new method and perspective to understand and represent lexical semantics.
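The general idea of fine-tuning corpus-learned word vectors with ontology relations and then comparing them can be sketched as follows. This is a simplified retrofitting-style update followed by cosine similarity, written as an assumption for illustration; it is not the MSF model itself, and the vectors, the neighbor lists, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrofit(vectors, neighbors, iterations=10):
    """Pull each word vector toward the average of its ontology neighbors.

    Each update averages the original corpus vector with the mean of the
    current neighbor vectors, giving both equal weight.
    """
    new = {w: list(v) for w, v in vectors.items()}
    for _ in range(iterations):
        for word, nbrs in neighbors.items():
            nbrs = [n for n in nbrs if n in new]
            if not nbrs:
                continue
            k = len(nbrs)
            new[word] = [
                (k * orig + sum(new[n][i] for n in nbrs)) / (2 * k)
                for i, orig in enumerate(vectors[word])
            ]
    return new


# Hypothetical corpus vectors and ontology (synonym) links.
vectors = {"car": [1.0, 0.0], "automobile": [0.0, 1.0], "banana": [1.0, 1.0]}
neighbors = {"car": ["automobile"], "automobile": ["car"]}

fitted = retrofit(vectors, neighbors)
```

After the update, the vectors of the two linked words are closer under `cosine` than they were before, while words without ontology links keep their corpus vectors unchanged.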
