Similar literature
20 similar articles found (search time: 78 ms)
1.
The quantification of the semantic similarity between terms is an important research area that provides a valuable tool for text understanding. Among the different paradigms used by related works to compute semantic similarity, information theoretic approaches have shown promising results in recent years by computing the information content (IC) of concepts from the knowledge provided by ontologies. These approaches, however, are hampered by the coverage offered by the single input ontology. In this paper, we propose extending IC-based similarity measures by considering multiple ontologies in an integrated way. Several strategies are proposed according to which ontology the evaluated terms belong to. Our proposal has been evaluated using a widely used benchmark of medical terms, with MeSH and SNOMED CT as ontologies. Results show an improvement in the similarity assessment accuracy when multiple ontologies are considered.
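
As an illustration of the IC-based idea described in this abstract, the following is a minimal single-ontology sketch, not the authors' multi-ontology method. The toy taxonomy, the concept names, and the intrinsic IC formula (negative log of the share of concepts a node subsumes) are assumptions made for this example; the similarity is the Resnik-style IC of the most informative common subsumer.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not taken from MeSH or SNOMED CT.
PARENT = {
    "disease": None,
    "infection": "disease",
    "cardiac_disease": "disease",
    "pneumonia": "infection",
    "influenza": "infection",
    "angina": "cardiac_disease",
}

def ancestors(concept):
    """Return the concept together with all of its ancestors up to the root."""
    chain = set()
    while concept is not None:
        chain.add(concept)
        concept = PARENT[concept]
    return chain

def information_content(concept):
    """Intrinsic IC (assumed formula): -log of the share of concepts subsumed."""
    subsumed = sum(1 for c in PARENT if concept in ancestors(c))
    return -math.log(subsumed / len(PARENT))

def resnik_similarity(a, b):
    """IC of the most informative concept that subsumes both a and b."""
    common = ancestors(a) & ancestors(b)
    return max(information_content(c) for c in common)
```

In this sketch, two terms that share only the root score zero, while terms under a more specific common ancestor score higher. A limitation of the single-ontology setting is visible here: a term missing from `PARENT` cannot be scored at all, which is the coverage problem the abstract raises.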

2.
3.
Domain ontologies facilitate the organization, sharing and reuse of domain knowledge, and enable various vertical domain applications to operate successfully. Most methods for automatically constructing ontologies focus on taxonomic relations, such as is-kind-of and is-part-of relations. However, much of the domain-specific semantics is ignored. This work proposes a semi-unsupervised approach for extracting semantic relations from domain-specific text documents. The approach effectively utilizes text mining and existing taxonomic relations in domain ontologies to discover candidate keywords that can represent semantic relations. A preliminary experiment on the natural science domain (Taiwan K9 education) indicates that the proposed method yields valuable recommendations. This work enriches domain ontologies by adding distilled semantics.

4.
More people than ever before have access to information through the World Wide Web; information volume and the number of users both continue to expand. Traditional search methods based on keywords are not effective, resulting in large lists of documents, many of which are unrelated to users’ needs. One way to improve information retrieval is to associate meaning with users’ queries by using ontologies, knowledge bases that encode a set of concepts about one domain and their relationships. Encoding a knowledge base using one single ontology is usual, but a document collection can deal with different domains, each organized into an ontology. This work presents a novel way to represent and organize knowledge from distinct domains, using multiple ontologies that can be related. The model allows the ontologies, as well as the relationships between concepts from distinct ontologies, to be represented independently. Additionally, fuzzy set theory techniques are employed to deal with knowledge subjectivity and uncertainty. This approach to organizing knowledge and an associated query expansion method are integrated into a fuzzy model for information retrieval based on multi-related ontologies. The performance of a search engine using this model is compared with another fuzzy-based approach for information retrieval, and with the Apache Lucene search engine. Experimental results show that this model improves precision and recall measures.

5.
A high-level electrical energy ontology with weighted attributes   Cited by: 1 (self-citations: 0; by others: 1)
One of the significant application areas of domain ontologies is known to be text analysis applications like information extraction and text classification systems, and semantic portals. In this paper, we present a high-level ontology for the electrical energy domain. This domain ontology has weighted attributes to cover the inherent fuzziness in the textual representations of its concepts. Additionally, we have included in the ontology the necessary attributes to align the ontology concepts to on-line collaborative knowledge bases like Wikipedia and linked open data sources like DBpedia, other attributes to facilitate its use in multilingual applications, and concepts to hold the named entities in the domain. The ultimate ontology is aligned with the previously proposed ontologies for the energy-related subdomains after extending the latter ones with weighted attributes. We make the ultimate form of the electrical energy ontology, as well as the extended versions of the domain ontologies for the subdomains, available for research purposes. Also included in the paper are sample text analysis applications which mainly exploit the weighted attributes within the ontology.

6.
Ontologies and other schemes are useful for allowing semantic tagging of documents for many applications on the semantic web. Representing uncertainty on the semantic web is becoming increasingly common, using ontologies and other techniques. Ontology and declarative tools allow documents using concepts contained in these ontologies to be reasoned about using computer systems. Very large ontologies and vocabularies have been created; however, users may find it difficult to select the correct concept or term when there are large numbers of items that at face value appear to represent the same idea. Creating subsets of ontologies is a popular approach to solving this problem, but this may not fit well with the need to deal with complex domains. However, crowdsourcing techniques, which harness the power of large groups, may be more effective than document analysis or expert opinion. In crowdsourcing, large numbers of people collaborate by performing relatively simple tasks, usually using applications distributed via the World Wide Web. This approach is being tested in the medical domain using a very large clinical vocabulary, SNOMED CT.

7.
Knowledge-based vector space model for text clustering   Cited by: 5 (self-citations: 4; by others: 1)
This paper presents a new knowledge-based vector space model (VSM) for text clustering. In the new model, semantic relationships between terms (e.g., words or concepts) are included in representing text documents as a set of vectors. The idea is to calculate the dissimilarity between two documents more effectively so that text clustering results can be enhanced. In this paper, the semantic relationship between two terms is defined by the similarity of the two terms. Such similarity is used to re-weight term frequency in the VSM. We consider and study two different similarity measures for computing the semantic relationship between two terms based on two different approaches. The first approach is based on the existing ontologies like WordNet and MeSH. We define a new similarity measure that combines the edge-counting technique, the average distance and the position weighting method to compute the similarity of two terms from an ontology hierarchy. The second approach is to make use of text corpora to construct the relationships between terms and then calculate their semantic similarities. Three clustering algorithms, bisecting k-means, feature weighting k-means and a hierarchical clustering algorithm, have been used to cluster real-world text data represented in the new knowledge-based VSM. The experimental results show that the clustering performance based on the new model was much better than that based on the traditional term-based VSM.
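
The re-weighting step mentioned in this abstract can be pictured with a small sketch. It is one plausible reading, not the paper's exact formula: each term's new weight is assumed to be the similarity-weighted sum of all term frequencies. The vocabulary, the similarity values, and the use of cosine similarity (the abstract itself speaks of dissimilarity) are hypothetical choices for illustration.

```python
import math

# Hypothetical vocabulary and term-term similarity values (1.0 on the diagonal).
VOCAB = ["car", "automobile", "banana"]
SIM = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

def reweight(tf):
    """Spread each term's frequency onto semantically similar terms."""
    n = len(tf)
    return [sum(SIM[i][j] * tf[j] for j in range(n)) for i in range(n)]

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

doc_a = [2, 0, 0]  # mentions only "car"
doc_b = [0, 3, 0]  # mentions only "automobile"
```

With raw term frequencies, `doc_a` and `doc_b` share no terms and look completely unrelated. After `reweight`, both vectors carry weight on "car" and "automobile", so their cosine similarity becomes high, which is the effect a knowledge-based VSM aims for before clustering.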

8.
There are currently many active movements towards computerizing patient healthcare information. However, as Electronic Medical Record (EMR) systems are increasingly adopted in healthcare facilities, effectively utilizing this massive information source is a major challenge. It is very time-consuming for healthcare providers to dig into the voluminous medical records of a patient to find the few that are indeed relevant to the patient’s current problem. Due to the complex semantic relationships among medical concepts and the use of many synonyms, antonyms, and hypernyms/hyponyms, simple word-based information retrieval does not produce satisfactory results. In this paper, we propose an EMR retrieval system that leverages semantic query expansion to retrieve medical records that are relevant to the patient’s current symptom/problem. The proposed framework integrates various technologies, including information retrieval, domain ontologies, automatic semantic relationship learning, as well as a body of domain knowledge elicited from healthcare experts. Knowledge of semantic relationships among medical concepts, such as symptoms, exams and tests, diagnoses, and treatments, as well as knowledge of synonyms and hypernyms/hyponyms, is used to expand and enhance initial queries posed by a user. We have implemented a preliminary prototype and conducted a pilot test using sample nursing notes drawn from the EMR system of a community health center.
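
Semantic query expansion of the kind this abstract describes can be sketched as a one-hop lookup of related terms. The `RELATED` table, its entries, and the `expand_query` helper are hypothetical and stand in for the domain ontology and learned relationships of the proposed system, whose actual design is not given here.

```python
# Hypothetical knowledge base: each term maps to related terms by relation type.
RELATED = {
    "chest pain": {"synonym": ["thoracic pain"], "hypernym": ["pain"]},
    "pain": {"hyponym": ["chest pain", "headache"]},
}

def expand_query(terms, relations=("synonym", "hypernym", "hyponym")):
    """Return the original terms followed by one-hop related terms, without duplicates."""
    expanded = list(terms)
    for term in terms:
        for relation in relations:
            for candidate in RELATED.get(term, {}).get(relation, []):
                if candidate not in expanded:
                    expanded.append(candidate)
    return expanded
```

Restricting `relations` (for example to synonyms only) keeps the expansion conservative; adding hypernyms and hyponyms widens recall at the likely cost of precision.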

9.
An ontology is a computational model of some portion of the world. It is often captured in some form of a semantic network: a graph whose nodes are concepts or individual objects and whose arcs represent relationships or associations among the concepts. This network is augmented by properties and attributes, constraints, functions, and rules that govern the behavior of the concepts. Formally, an ontology is an agreement about a shared conceptualization, which includes frameworks for modeling domain knowledge and agreements about the representation of particular domain theories. Definitions associate the names of entities in a universe of discourse (for example, classes, relations, functions, or other objects) with human-readable text describing what the names mean, and formal axioms that constrain the interpretation and well-formed use of these names. For information systems, or for the Internet, ontologies can be used to organize keywords and database concepts by capturing the semantic relationships among the keywords or among the tables and fields in a database. The semantic relationships give users an abstract view of an information space for their domain of interest.

10.
Mahalingam, K.; Huhns, M.N. Computer, 1997, 30(6): 80-83
The physical and logical differences among information sources on the Internet complicate information retrieval. For instance, data is no longer just simple text or tuples, but now includes objects and multimedia. Data can also have varied and often arcane semantics. Sources have different policies, procedures, and conventions and are hosted by diverse platforms. Ontologies (models of concepts and their relationships) are a powerful way to organize query formulation and semantic reconciliation in large distributed information environments. They can capture both the structure and semantics of information environments, so an ontology-based search engine can handle both simple keyword-based queries and complex queries on structured data. Ontology-based interoperation is especially good at dealing with inconsistent semantics. However, ontologies are difficult to construct. The Java Ontology Editor (JOE) helps users build and browse ontologies. It also enables query formulation at several levels of abstraction. The authors discuss the use of JOE to develop a health care information system.

11.
Text document clustering plays an important role in providing better document retrieval, document browsing, and text mining. Traditionally, clustering techniques do not consider the semantic relationships between words, such as synonymy and hypernymy. To exploit semantic relationships, ontologies such as WordNet have been used to improve clustering results. However, WordNet-based clustering methods mostly rely on single-term analysis of text; they do not perform any phrase-based analysis. In addition, these methods utilize synonymy to identify concepts and only explore hypernymy to calculate concept frequencies, without considering other semantic relationships such as hyponymy. To address these issues, we combine detection of noun phrases with the use of WordNet as background knowledge to explore better ways of representing documents semantically for clustering. First, based on noun phrases as well as single-term analysis, we exploit different document representation methods to analyze the effectiveness of hypernymy, hyponymy, holonymy, and meronymy. Second, we choose the most effective method and compare it with the WordNet-based clustering method proposed by others. The experimental results show that the effectiveness of semantic relationships for clustering ranks as follows (from highest to lowest): hypernymy, hyponymy, meronymy, and holonymy. Moreover, we found that noun phrase analysis improves the WordNet-based clustering method.

12.
13.
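Research on OWL-based ontology modeling   Cited by: 1 (self-citations: 0; by others: 1)
Computer understanding of documents on the Semantic Web is built on ontologies, which define the relationships between concepts and data; ontology modeling is therefore very important. This paper introduces the relationship between ontologies and the Semantic Web and, taking OWL-based ontologies as an example, studies ontology modeling and related issues.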

14.
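Construction and presentation of Semantic Web ontologies based on formal concepts   Cited by: 4 (self-citations: 0; by others: 4)
As the foundation of the Semantic Web, an ontology is an explicit formal specification of a shared conceptual model; it provides a way for computers to exchange, search, and agree on textual information. Constructing and presenting ontologies effectively has become a key problem in applying them; however, existing ontology construction methods are each limited in different respects. After analysis and comparison, this paper uses formal concept analysis theory to build the ontology hierarchy to remedy these shortcomings, and combines it with a probabilistic model to present the ontology, expressing the relevance among concepts and between concepts and data. Ranking results by the relevance between documents and concepts helps users find the most relevant information, which effectively improves the efficiency of information lookup. An example is used to demonstrate the construction and expression of the ontology.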
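
To make the formal concept analysis step concrete, the sketch below derives all formal concepts, that is, pairs of an object set (extent) and the attribute set shared by exactly those objects (intent), from a tiny context. The context, object names, and attribute names are hypothetical, and the brute-force enumeration over attribute subsets is chosen for readability; it is not the construction procedure of the paper.

```python
from itertools import combinations

# Hypothetical formal context: object -> set of attributes it has.
CONTEXT = {
    "sparrow": {"flies", "has_feathers"},
    "penguin": {"has_feathers", "swims"},
    "trout": {"swims"},
}
ATTRIBUTES = sorted(set().union(*CONTEXT.values()))

def extent(attrs):
    """Objects that have every attribute in `attrs`."""
    return frozenset(o for o, has in CONTEXT.items() if attrs <= has)

def intent(objs):
    """Attributes shared by every object in `objs` (all attributes if empty)."""
    shared = set(ATTRIBUTES)
    for o in objs:
        shared &= CONTEXT[o]
    return frozenset(shared)

def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every attribute subset."""
    concepts = set()
    for size in range(len(ATTRIBUTES) + 1):
        for attrs in combinations(ATTRIBUTES, size):
            objs = extent(set(attrs))
            concepts.add((objs, intent(objs)))
    return concepts
```

Ordering the resulting concepts by inclusion of their extents yields the concept lattice, which is the kind of hierarchy an FCA-based approach can use as an ontology skeleton. The enumeration is exponential in the number of attributes, so it only suits small contexts.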

15.
In the context of technological expansion and development, companies feel the need to renew and optimize their information systems as they search for the best way to manage knowledge. Business ontologies within the semantic web are an excellent tool for managing knowledge within this space. The proposal in this article consists of a methodology for integrating information in companies. The application of this methodology results in the creation of a specific business ontology capable of semantic interoperability. The resulting ontology, developed from the information system of specific companies, represents the fundamental business concepts, thus making it a highly appropriate information integration tool. Its level of semantic expressivity improves on that of its own sources, and its solidity and consistency are guaranteed by means of checking by current reasoning tools. An ontology created in this way could drive the renewal processes of companies’ information systems. A comparison is also made with a number of well-known business ontologies, and similarities and differences are drawn, highlighting the difficulty in aligning general ontologies to specific ones, such as the one we present.

16.
Before undertaking new biomedical research, identifying concepts that have already been patented is essential. A traditional keyword-based search on patent databases may not be sufficient to retrieve all the relevant information, especially for the biomedical domain. This paper presents BioPatentMiner, a system that facilitates information retrieval and knowledge discovery from biomedical patents. The system first identifies biological terms and relations from the patents and then integrates the information from the patents with knowledge from biomedical ontologies to create a semantic Web. Besides keyword search and queries linking the properties specified by one or more RDF triples, the system can discover semantic associations between the Web resources. The system also determines the importance of the resources to rank the results of a search and prevent information overload while determining the semantic associations.

17.
18.
Currently, most of the information available on the Web is adapted primarily for human consumption, but there is so much information that it can no longer be processed by a person in a reasonable time, either in digital or physical formats. To solve this problem, the idea of the Semantic Web arose. The Semantic Web deals with adding machine-readable information to Web pages. Ontologies represent a very important element of this web, as they provide a valid and robust structure to represent knowledge based on concepts, relations, axioms, etc. The need to overcome the bottleneck caused by the manual construction of ontologies has generated several studies and research on semiautomatic methods to learn ontologies. In this sense, this paper proposes a new ontology learning methodology based on semantic role labeling from digital Spanish documents. The method makes it possible to represent multiple semantic relations, especially taxonomic and partonomic ones, in the standardized OWL 2.0. A set of experiments has been performed with the approach implemented in the educational domain, showing promising results.

19.
With the development of Semantic Web technology, the use of ontologies to store and retrieve information covering several domains has increased. However, very few ontologies are able to cope with the ever-growing need for frequently updated semantic information or specific user requirements in specialized domains. As a result, a critical issue is the unavailability of relational information between concepts, also termed missing background knowledge. One solution to address this issue relies on the manual enrichment of ontologies by domain experts, which is, however, a time-consuming and costly process; hence the need for dynamic ontology enrichment. In this paper we present an automatic coupled statistical/semantic framework for dynamically enriching large-scale generic ontologies from the World Wide Web. Using the massive amount of information encoded in texts on the Web as a corpus, missing background knowledge can be discovered through a combination of semantic relatedness measures and pattern acquisition techniques and subsequently exploited. The benefits of our approach are: (i) proposing the dynamic enrichment of large-scale generic ontologies with missing background knowledge, thus enabling the reuse of such knowledge, and (ii) dealing with the issue of costly manual ontology enrichment by domain experts. Experimental results in a precision-based evaluation setting demonstrate the effectiveness of the proposed techniques.

20.
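A survey of ontology-based text mining techniques   Cited by: 6 (self-citations: 0; by others: 6)
贾焰, 王永恒, 杨树强. 《计算机应用》 (Journal of Computer Applications), 2006, 26(9): 2013-2015
Text mining is an effective way to obtain potentially useful knowledge from massive amounts of text. Traditional text mining methods struggle to achieve higher accuracy because they cannot make effective use of semantic information. Ontology provides theoretical support and technical means for the reasonable representation and effective organization of semantic information. This paper introduces and analyzes commonsense ontologies and domain ontologies, as well as the text mining methods based on them.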
