Similar Documents
20 similar documents found.
1.
Against the background of rapidly emerging heterogeneous-media applications, in which online content and offline services exert an ever deeper influence on network users, this paper introduces the concepts and methods of heterogeneous media analysis, which effectively perceives the multi-source natural and social attributes of heterogeneous media and reveals the semantic diversity, complex correlations, and underlying information-diffusion mechanisms of massive heterogeneous media. The paper covers the following. First, it discusses the cross-platform, multimodal, and broadly sourced nature of heterogeneous media data and the challenges and opportunities these properties bring, how heterogeneous media analysis differs from traditional single-medium analysis, and the potential scientific and social impact of heterogeneous media research. Second, it surveys the state of the art at home and abroad in three areas: semantic analysis and understanding of heterogeneous media, correlation modeling of heterogeneous media, and community analysis in heterogeneous media. Finally, it presents the authors' and their research group's recent results on heterogeneous semantic analysis and understanding, hot-event and topic analysis in heterogeneous media, and user-behavior analysis in heterogeneous media.

2.
Schema integration aims to create a mediated schema as a unified representation of existing heterogeneous sources sharing a common application domain. These sources have increasingly been written in XML due to its versatility and expressive power. Unfortunately, they often use different elements and structures to express the same concepts and relations, causing substantial semantic and structural conflicts. This challenge impedes the creation of high-quality mediated schemas and has not been adequately addressed by existing integration methods. In this paper, we propose a novel method, named XINTOR, for automating the integration of heterogeneous schemas. Given a set of XML sources and a set of correspondences between the source schemas, our method aims to create a complete and minimal mediated schema: it completely captures all of the concepts and relations in the sources without duplication, provided that the concepts do not overlap. Our contributions are fourfold. First, we resolve structural conflicts inherent in the source schemas. Second, we introduce a new statistics-based measure, called path cohesion, for selecting the concepts and relations to include in the mediated schema; path cohesion is computed statistically from multiple path-quality dimensions such as average path length and path frequency. Third, we resolve semantic conflicts by augmenting the semantics of similar concepts with context-dependent information. Finally, we propose a novel double-layered mediated schema that retains a wider range of concepts and relations than existing mediated schemas, which are at best either complete or minimal, but not both. Experiments on both real and synthetic datasets show that XINTOR outperforms existing methods with respect to (i) mediated-schema quality, measured by precision, recall, F-measure, and schema minimality; and (ii) execution performance, measured by execution time and scale-up performance.
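The abstract names the ingredients of path cohesion but not the formula; the sketch below combines two of the stated quality dimensions (path frequency and path length) into one score. The weighting and normalization are illustrative assumptions, not XINTOR's actual measure:

```python
from collections import Counter

def path_cohesion(paths, target, alpha=0.5):
    """Score how cohesive `target` (a tuple of schema labels, e.g.
    ('book', 'author')) is across source schemas, combining two of the
    path-quality dimensions named in the abstract: path frequency and
    path length. The exact XINTOR formula is not given; this weighted
    combination is an illustrative assumption."""
    freq = Counter(paths)
    if target not in freq:
        return 0.0
    # Frequency dimension: occurrences relative to the most frequent path.
    rel_freq = freq[target] / max(freq.values())
    # Length dimension: shorter paths preferred, normalized against the
    # average path length over all observed paths.
    avg_len = sum(len(p) for p in paths) / len(paths)
    length_score = min(1.0, avg_len / len(target))
    return alpha * rel_freq + (1 - alpha) * length_score

paths = [('book', 'author'), ('book', 'author'),
         ('book', 'publisher', 'name'), ('book', 'title')]
print(path_cohesion(paths, ('book', 'author')))  # high: frequent and short
```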

3.
In this paper, we introduce an approach to task-driven ontology design based on information discovery from database schemas. We propose techniques for semi-automatically discovering the terms and relationships used in the information space, denoting concepts, their properties and links, which are applied in two stages. The first stage focuses on discovering the heterogeneity and ambiguity of data representations in different schemas: schema elements are compared according to defined comparison features, and similarity coefficients are evaluated. This stage produces a set of candidates for unification into ontology concepts. At the second stage, decisions are made on which candidates to unify into concepts and on how to relate the concepts by semantic links. Ontology concepts and links can be accessed from different perspectives, so that the ontology can serve different purposes, such as providing a search space for powerful concept-location mechanisms, setting a basis for query formulation and processing, and establishing a reference for recognizing terminological relationships between elements in different schemas.
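A minimal sketch of the first-stage comparison, assuming three comparison features (name tokens, data type, linked elements) combined into a similarity coefficient; the features and weights are illustrative, not the paper's exact choices:

```python
def name_similarity(a: str, b: str) -> float:
    """Token-overlap (Jaccard) similarity between element names."""
    ta, tb = set(a.lower().split('_')), set(b.lower().split('_'))
    return len(ta & tb) / len(ta | tb)

def element_similarity(e1: dict, e2: dict,
                       w_name=0.6, w_type=0.2, w_links=0.2) -> float:
    """Similarity coefficient over assumed comparison features
    (name, data type, linked elements); weights are illustrative."""
    s_name = name_similarity(e1['name'], e2['name'])
    s_type = 1.0 if e1['type'] == e2['type'] else 0.0
    l1, l2 = set(e1['links']), set(e2['links'])
    s_links = len(l1 & l2) / len(l1 | l2) if (l1 | l2) else 0.0
    return w_name * s_name + w_type * s_type + w_links * s_links

e1 = {'name': 'customer_name', 'type': 'string', 'links': {'order'}}
e2 = {'name': 'client_name',   'type': 'string', 'links': {'order'}}
# A high coefficient makes the pair a candidate for unification.
print(element_similarity(e1, e2))
```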

4.
This paper presents an approach to query decomposition in a multidatabase environment. The unique aspect of this approach is that it is based on performing transformations over an object algebra that can be used as the basis for a global query language. In the paper, we first present our multidatabase environment and semantic framework, where a global conceptual schema based on the Object Data Management Group standard encompasses the information from heterogeneous data sources that include relational databases as well as object-oriented databases and flat file sources. The metadata about the global schema is enhanced with information about virtual classes as well as virtual relationships and inheritance hierarchies that exist between multiple sources. The AQUA object algebra is used as the formal foundation for manipulation of the query expression over the multidatabase. AQUA is enhanced with distribution operators for dealing with data distribution issues. During query decomposition we perform an extensive analysis of traversals for path expressions that involve virtual relationships and hierarchies for access to several heterogeneous sources. The distribution operators defined in algebraic terms enhance the global algebra expression with semantic information about the structure, distribution, and localization of the data sources relevant to the solution of the query. By using an object algebra as the basis for query processing, we are able to define algebraic transformations and exploit rewriting techniques during the decomposition phase. Our use of an object algebra also provides a formal and uniform representation for dealing with an object-oriented approach to multidatabase query processing. As part of our query processing discussion, we include an overview of a global object identification approach for relating semantically equivalent objects from diverse data sources, illustrating how knowledge about global object identity is used in the decomposition and assembly processes.

5.
The distributed nature of the Web, as a decentralized system exchanging information between heterogeneous sources, has underlined the need to manage interoperability, i.e., the ability to automatically interpret information in Web documents exchanged between different sources, which is necessary for efficient information management and search applications. In this context, XML was introduced as a data representation standard that simplifies the tasks of interoperation and integration among heterogeneous data sources, allowing data to be represented in (semi-)structured documents consisting of hierarchically nested elements and atomic attributes. However, while XML has proven most effective for exchanging data, i.e., for syntactic interoperability, it is limited when it comes to handling semantics, i.e., semantic interoperability, since it specifies only the syntactic and structural properties of the data without any further semantic meaning. As a result, semantic-aware processing of XML has become a motivating challenge in Web data management, requiring dedicated semantic analysis and disambiguation methods to assign well-defined meaning to XML elements and attributes. Most existing approaches (i) ignore the problem of identifying ambiguous XML elements/nodes, (ii) only partially consider their structural relationships/context, (iii) use syntactic information in processing XML data regardless of the semantics involved, and (iv) are static in adopting fixed disambiguation constraints, thus limiting user involvement. In this paper, we provide a new XML Semantic Disambiguation Framework, titled XSDF, designed to address each of the above limitations, taking as input an XML document and producing as output a semantically augmented XML tree made of unambiguous semantic concepts extracted from a reference machine-readable semantic network. XSDF consists of four main modules for: (i) linguistic pre-processing of simple/compound XML node labels and values, (ii) selecting ambiguous XML nodes as targets for disambiguation, (iii) representing target nodes as special sphere neighborhood vectors including all XML structural relationships within a (user-chosen) range, and (iv) running context vectors through a hybrid disambiguation process combining concept-based and context-based disambiguation, allowing the user to tune disambiguation parameters to her needs. Conducted experiments demonstrate the effectiveness and efficiency of our approach in comparison with alternative methods. We also discuss some practical applications of our method, ranging over semantic-aware query rewriting, semantic document clustering and classification, Mobile and Web services search and discovery, as well as blog analysis and event detection in social networks and tweets.
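A hedged sketch of two of the four modules: collecting the sphere neighborhood of a node within a user-chosen radius, and a context-based disambiguation step that picks the sense whose gloss best overlaps that neighborhood. The concept-based half of the hybrid process and the real semantic network are omitted, and all data here are toy assumptions:

```python
def sphere_neighborhood(tree, node, radius):
    """Collect labels of all nodes within `radius` edges of `node` in an
    XML tree given as {node: [children]}; parents count as neighbors."""
    adj = {}
    for parent, children in tree.items():
        for c in children:
            adj.setdefault(parent, set()).add(c)
            adj.setdefault(c, set()).add(parent)
    frontier, seen = {node}, {node}
    for _ in range(radius):
        frontier = {n for f in frontier for n in adj.get(f, ())} - seen
        seen |= frontier
    return seen - {node}

def disambiguate(senses, context):
    """Pick the sense whose gloss shares the most words with the
    context (the context-based step of the hybrid scheme)."""
    ctx = {c.lower() for c in context}
    return max(senses,
               key=lambda s: len(set(s['gloss'].lower().split()) & ctx))

tree = {'library': ['book'], 'book': ['title', 'author']}
context = sphere_neighborhood(tree, 'book', radius=1)
senses = [{'id': 'book#1', 'gloss': 'a written work with title and author'},
          {'id': 'book#2', 'gloss': 'to reserve a seat or ticket'}]
print(disambiguate(senses, context)['id'])  # book#1
```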

6.
An evaluation of ontology matching in geo-service applications
Matching between concepts describing the meaning of services representing heterogeneous information sources is a key operation in many application domains, including web service coordination, data integration, peer-to-peer information sharing, query answering, and so on. In this paper we present an evaluation of an ontology matching approach, specifically of the structure-preserving semantic matching (SPSM) solution. In particular, we discuss the SPSM approach used to reduce the semantic heterogeneity problem among geo web services, and we evaluate the SPSM solution on real-world GIS ESRI ArcWeb services. The first experiment matched original web service method signatures against synthetically altered ones. In the second experiment we compared a manual classification of our dataset to the automatic (unsupervised) classification produced by SPSM. The evaluation results demonstrate the robustness and good performance of the SPSM approach on a large number (ca. 700,000) of matching tasks.

7.
To address the problem of redundant ontology instances in electronic-catalog integration, this paper proposes an instance-deduplication mechanism for electronic-catalog ontologies oriented toward ontology merging, and designs an instance semantic-similarity algorithm that jointly considers instance names, properties, and relations. Name similarity is computed by combining string matching with WordNet-based semantic similarity; property similarity is computed from both datatype properties and object properties; relation similarity is computed from the multiple-inheritance relationships of classes. When the semantic similarity of two instances exceeds a preset threshold, one of them is deleted to reduce the redundancy of the target ontology repository. Experimental results validate the effectiveness of the mechanism.
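The pipeline is concrete enough to sketch: a weighted combination of name, property, and relation similarity, with one instance of a pair deleted when the combined score exceeds a preset threshold. The weights are assumptions, and a plain string comparison stands in for the WordNet-based name similarity:

```python
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

def instance_similarity(i1, i2, w=(0.4, 0.3, 0.3)):
    """Combine name, property, and relation similarity as the abstract
    describes. A plain string comparison stands in for the WordNet-based
    name similarity; weights are illustrative assumptions."""
    s_name = 1.0 if i1['name'].lower() == i2['name'].lower() else 0.0
    s_prop = jaccard(set(i1['props'].items()), set(i2['props'].items()))
    s_rel = jaccard(i1['rels'], i2['rels'])
    return w[0] * s_name + w[1] * s_prop + w[2] * s_rel

def deduplicate(instances, threshold=0.8):
    """Keep an instance only if it is not too similar to one already kept."""
    kept = []
    for inst in instances:
        if all(instance_similarity(inst, k) <= threshold for k in kept):
            kept.append(inst)
    return kept

a = {'name': 'ThinkPad X1', 'props': {'ram': '16GB'}, 'rels': {'Laptop'}}
b = {'name': 'thinkpad x1', 'props': {'ram': '16GB'}, 'rels': {'Laptop'}}
print(len(deduplicate([a, b])))  # 1: the duplicate instance is removed
```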

8.
Construction of a knowledge base based on hybrid reasoning and its applications
This paper proposes a method for constructing a plane-geometry knowledge base from an OWL ontology combined with Prolog rules, so that the rich semantic information of plane geometry can be represented formally. On one hand, ontology constructs such as types, domains, ranges, taxonomies, properties, and instances express the structured knowledge, giving formal semantics to the domain's concepts and the relationships between them; on the other hand, Prolog rules handle what the ontology cannot express effectively, such as relationships and operations among properties, thereby supporting reasoning over complex relations. On this basis, an ontology-and-rule-based plane-geometry knowledge base was built with Protégé and Prolog. Experiments show that this knowledge base supports information queries at the knowledge and semantic levels as well as complex problem solving; its rich semantic descriptions and hybrid reasoning capability remedy the shortcomings of traditional knowledge bases.

9.
Harmonising the metadata format alone does not solve the issue of efficient access to relevant information in heterogeneous environments when different systems use different content, contextual and semantic concepts for certain entities. Current Research Information Systems (CRIS), which store their data primarily in local relational databases using different formats and various local concepts, are one such type of heterogeneous system. In this article, we study the possibilities and propose a new ontologically supported semantic search engine (OSSSE) which, in addition to harmonising the metadata format among local CRIS systems, also ensures that the meanings of data and/or concepts belonging to various metadata entities are harmonised. A special model of ontological infrastructure was designed, and a dedicated test ontology was created, alongside a new simplified algorithm for creating the ontology whose basis is the distinction between new and already existing classes in terms of content. Finally, we evaluated the proposed OSSSE model using a simulation of the search process over a base of 41,113 real searches within SICRIS. The results show that regardless of the search situation, the proposed OSSSE is always at least as efficient as a search without ontological support in terms of precision, while recall remains the same; the improvement is statistically significant (p < 0.005). The proposed OSSSE model can harmonise data where different heterogeneous systems use different content, contextual and semantic concepts, which is the case in many advanced expert systems. The more a search is carried out over the properties described by the supporting ontology, the more the infrastructure can help the searcher. The proposed concepts, ontological infrastructure and the designed semantic search engine may well help to improve search precision in several information retrieval systems.

10.
Object-oriented semantic metrics address software quality by assessing the underlying meaning of code. Previous metrics were based on mapping a class's semantic information onto concepts in an application-domain knowledge base; quality measurements were then made by operating on the concepts mapped onto. In this work, we consider more complex inter-concept relationships: the idea is that the level of ambiguity between two concepts is indicated by the connectivity between them within the knowledge base. A cohesion metric based on this idea is shown to perform as well as traditional metrics, and it is available much earlier in the development cycle.
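As a hedged illustration of the idea, the sketch below scores a class's cohesion by the knowledge-base connectivity between the concepts it maps onto, taking connection strength as the reciprocal of the shortest-path length; the 1/d weighting and the toy knowledge base are assumptions, not the paper's exact formulation:

```python
from collections import deque
from itertools import combinations

def shortest_path_len(graph, a, b):
    """BFS shortest-path length in a knowledge-base graph given as
    {concept: [related concepts]}; None if unreachable."""
    if a == b:
        return 0
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        for n in graph.get(node, ()):
            if n == b:
                return d + 1
            if n not in seen:
                seen.add(n)
                queue.append((n, d + 1))
    return None

def class_cohesion(graph, concepts):
    """Cohesion = mean connection strength (1 / path length) over all
    pairs of concepts the class maps onto; 1/d is an assumption."""
    pairs = list(combinations(set(concepts), 2))
    if not pairs:
        return 1.0
    scores = []
    for a, b in pairs:
        d = shortest_path_len(graph, a, b)
        scores.append(1.0 / d if d else 0.0)
    return sum(scores) / len(scores)

kb = {'account': ['balance', 'customer'], 'balance': ['account'],
      'customer': ['account'], 'weather': []}
print(class_cohesion(kb, ['account', 'balance']))  # tightly connected
print(class_cohesion(kb, ['account', 'weather']))  # unrelated -> low
```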

11.
12.
Automatically acquiring the semantic information of video helps improve the performance of content-based video retrieval systems. One of the main approaches is to infer video semantics by reasoning over a semantic network of concepts. To obtain such a video semantic network, this paper builds on the traditional three-phase dependency analysis algorithm (TTPDA) and proposes an improved three-phase dependency analysis algorithm (ITPDA) for learning the relations between semantic concepts, so that videos can be semantically annotated. Compared with TTPDA, ITPDA can learn the structure of the semantic network quickly even when no node ordering, or only a partial node ordering, is available, and it reduces the time complexity of orienting the edges of the semantic network from TTPDA's O(n^4) to O(n^2), where n is the number of nodes in the network. Experimental results show that building a semantic network with ITPDA is effective, and that semantic annotation of video over the resulting network outperforms TTPDA.
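The abstract does not spell ITPDA out; as a loose illustration of the flavor of dependency analysis involved, the sketch below implements only a drafting step that connects two concepts when their co-occurrence in annotated videos exceeds independence (pointwise mutual information above a threshold). The later phases and the improved edge-orientation step are omitted, and all names and thresholds are hypothetical:

```python
import math
from itertools import combinations

def draft_semantic_network(annotations, threshold=0.1):
    """Drafting step of a three-phase dependency analysis, sketched:
    connect concepts whose pointwise mutual information over
    co-occurrence in annotated videos exceeds a threshold."""
    n = len(annotations)
    concepts = sorted({c for video in annotations for c in video})
    edges = []
    for a, b in combinations(concepts, 2):
        p_a = sum(a in v for v in annotations) / n
        p_b = sum(b in v for v in annotations) / n
        p_ab = sum(a in v and b in v for v in annotations) / n
        if p_ab > 0:
            pmi = math.log(p_ab / (p_a * p_b))
            if pmi > threshold:
                edges.append((a, b, round(pmi, 2)))
    return edges

videos = [{'beach', 'sea'}, {'beach', 'sea', 'boat'},
          {'city', 'car'}, {'city', 'car', 'boat'}]
print(draft_semantic_network(videos))  # beach-sea and car-city edges
```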

13.
Metadata about information sources (e.g., databases and repositories) can be collected by Query Sampling (QS). Such metadata can include the topics and statistics (e.g., term frequencies) of the information sources, and provides important evidence for determining which sources in the distributed information space should be selected for a given user query. The aim of this paper is to discover the semantic relationships between information sources in order to distribute user queries to a large number of sources. To this end, we propose an evolutionary approach for automatically conducting QS using multiple crawlers and obtaining an optimized semantic network from the sources. Combining QS with evolutionary methods allows metadata about the target sources to be extracted collaboratively and integrated optimally. To evaluate the performance of contextualized QS on 122 information sources, we compared the ranking lists recommended by the proposed method with user feedback (i.e., ideal ranks), and also computed the precision of the discovered subsumptions in terms of the semantic relationships between the target sources.
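A minimal sketch of query sampling and a subsumption heuristic in this spirit: probe each source with sample queries, accumulate term statistics from the returned documents, and declare subsumption when one source's sampled vocabulary covers most of another's. The crawler coordination and evolutionary optimization from the paper are omitted; the coverage threshold is an assumption:

```python
from collections import Counter

def query_sample(source, probe_terms, k=10):
    """Build term-frequency metadata for a source by issuing probe
    queries and counting terms in the top-k returned documents.
    `source` is modeled as a callable: query -> list of documents."""
    tf = Counter()
    for term in probe_terms:
        for doc in source(term)[:k]:
            tf.update(doc.lower().split())
    return tf

def subsumes(tf_big, tf_small, coverage=0.9):
    """Heuristic: source A subsumes source B if A's sampled vocabulary
    covers most of B's; the threshold is an assumption."""
    vocab_small = set(tf_small)
    if not vocab_small:
        return False
    return len(vocab_small & set(tf_big)) / len(vocab_small) >= coverage

# Toy sources returning canned documents for any probe query.
general = lambda q: ['java python databases networks', 'python java ai']
niche = lambda q: ['java python']
tf_g = query_sample(general, ['programming'])
tf_n = query_sample(niche, ['programming'])
print(subsumes(tf_g, tf_n))  # True: the niche vocabulary is contained
```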

14.
15.
Information sources such as relational databases, spreadsheets, XML, JSON, and Web APIs contain a tremendous amount of structured data that can be leveraged to build and augment knowledge graphs. However, they rarely provide a semantic model to describe their contents. Semantic models of data sources represent the implicit meaning of the data by specifying the concepts and the relationships within the data. Such models are the key ingredients to automatically publish the data into knowledge graphs. Manually modeling the semantics of data sources requires significant effort and expertise, and although desirable, building these models automatically is a challenging problem. Most of the related work focuses on semantic annotation of the data fields (source attributes). However, constructing a semantic model that explicitly describes the relationships between the attributes, in addition to their semantic types, is critical. We present a novel approach that exploits the knowledge from a domain ontology and the semantic models of previously modeled sources to automatically learn a rich semantic model for a new source. This model represents the semantics of the new source in terms of the concepts and relationships defined by the domain ontology. Given some sample data from the new source, we leverage the knowledge in the domain ontology and the known semantic models to construct a weighted graph that represents the space of plausible semantic models for the new source. Then, we compute the top k candidate semantic models and suggest to the user a ranked list of the semantic models for the new source. The approach takes into account user corrections to learn more accurate semantic models on future data sources. Our evaluation shows that our method generates expressive semantic models for data sources and services with minimal user input. These precise models make it possible to automatically integrate the data across sources and provide rich support for source discovery and service composition. They also make it possible to automatically publish semantic data into knowledge graphs.
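A much-simplified sketch of the ranking idea: treat each candidate semantic model as a set of ontology links and score it by how often its links occur in previously modeled sources, standing in for the weighted-graph construction and top-k search described in the abstract; the data and weighting are illustrative:

```python
def rank_semantic_models(candidates, known_models, k=3):
    """Rank candidate semantic models (each a set of (subject, property,
    object) links over the domain ontology) by how often their links
    appear in previously modeled sources; a simplified stand-in for the
    weighted-graph construction in the abstract."""
    link_weight = {}
    for model in known_models:
        for link in model:
            link_weight[link] = link_weight.get(link, 0) + 1

    def score(model):
        return sum(link_weight.get(link, 0) for link in model)

    return sorted(candidates, key=score, reverse=True)[:k]

known = [
    {('Person', 'worksFor', 'Organization'), ('Person', 'name', 'Literal')},
    {('Person', 'worksFor', 'Organization')},
]
candidates = [
    {('Person', 'worksFor', 'Organization'), ('Person', 'name', 'Literal')},
    {('Organization', 'worksFor', 'Person')},  # implausible direction
]
best = rank_semantic_models(candidates, known, k=1)[0]
print(sorted(best))  # the model consistent with past sources wins
```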

16.
Shared ontologies describe concepts and relationships to resolve semantic conflicts amongst users accessing multiple autonomous and heterogeneous information sources. We contend that while ontologies are useful in semantic reconciliation, they do not guarantee correct classification of semantic conflicts, nor do they provide the capability to handle evolving semantics or a mechanism to support a dynamic reconciliation process. Their limitations are illustrated through a conceptual analysis of several prominent examples used in heterogeneous database systems and in natural language processing. We view semantic reconciliation as a nonmonotonic query-dependent process that requires flexible interpretation of query context, and as a mechanism to coordinate knowledge elicitation while constructing the query context. We propose a system that is based on these characteristics, namely the SCOPES (Semantic Coordinator Over Parallel Exploration Spaces) system. SCOPES takes advantage of ontologies to constrain exploration of a remote database during the incremental discovery and refinement of the context within which a query can be answered. It uses an Assumption-based Truth Maintenance System (ATMS) to manage the multiple plausible contexts which coexist while the semantic reconciliation process is unfolding, and the Dempster-Shafer (DS) theory of belief to model the likelihood of these plausible contexts.
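The Dempster-Shafer side of SCOPES is standard machinery; below is a small, self-contained implementation of Dempster's rule of combination over frozenset focal elements, with a toy example of weighing two plausible query contexts (the ATMS bookkeeping is omitted):

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose focal
    elements are frozensets; raises on totally conflicting evidence."""
    combined, conflict = {}, 0.0
    for b, mb in m1.items():
        for c, mc in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mb * mc
            else:
                conflict += mb * mc
    if conflict >= 1.0:
        raise ValueError('totally conflicting evidence')
    return {a: v / (1.0 - conflict) for a, v in combined.items()}

# Two pieces of evidence about which context interprets "price":
# with tax vs. without tax (hypothetical contexts).
m1 = {frozenset({'with_tax'}): 0.6, frozenset({'with_tax', 'no_tax'}): 0.4}
m2 = {frozenset({'with_tax'}): 0.5, frozenset({'with_tax', 'no_tax'}): 0.5}
for focal, mass in dempster_combine(m1, m2).items():
    print(set(focal), round(mass, 2))  # with_tax gains belief: 0.8
```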

17.
18.
In medical Internet-of-Things and mobile-health applications, the vital-sign data collected by many kinds of sensors, together with various other health and medical data, are semantically heterogeneous, which makes it difficult to fuse data from intelligent medical IoT devices. To address this problem, a semantic disambiguation method based on linked open data is studied. First, device data are modeled as ontologies, forming local ontologies; then a graph-matching algorithm aligns the concepts of the local ontologies with open linked medical data, indirectly removing the semantic heterogeneity between data from different sources. Finally, in a data-fusion experiment with a fitness wristband and a body-weight scale, heterogeneous concepts such as blood pressure and body weight were determined to be semantically related by matching them against open linked data sources. The experimental results show that linking to open data sources enables semantic extension of the local ontologies and, in turn, fusion of data from heterogeneous medical-IoT devices.
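A hedged sketch of the concept-alignment step: score each candidate open-linked-data concept by a weighted mix of label similarity and overlap of neighboring concept labels, accepting the best match above a threshold. This is a simplified stand-in for the paper's graph-matching algorithm; weights, threshold, and the toy data are assumptions:

```python
def align_concept(local, lod_concepts, w_label=0.5, w_neigh=0.5,
                  threshold=0.6):
    """Align a local device-ontology concept to an open linked-data
    concept via label similarity plus neighbor overlap; a simplified
    stand-in for the paper's graph-matching algorithm."""
    def jaccard(a, b):
        return len(a & b) / len(a | b) if (a | b) else 0.0

    def label_sim(a, b):
        return jaccard(set(a.lower().split()), set(b.lower().split()))

    best, best_score = None, 0.0
    for cand in lod_concepts:
        score = (w_label * label_sim(local['label'], cand['label'])
                 + w_neigh * jaccard(local['neighbors'], cand['neighbors']))
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None

local = {'label': 'blood pressure', 'neighbors': {'heart rate', 'pulse'}}
lod = [{'label': 'blood pressure',
        'neighbors': {'pulse', 'heart rate', 'systole'}},
       {'label': 'body weight', 'neighbors': {'BMI'}}]
match = align_concept(local, lod)
print(match['label'] if match else 'no alignment')  # blood pressure
```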

19.
Establishing semantic interoperability among heterogeneous information sources has been a critical issue in the database community for the past two decades. Despite its critical importance, current approaches to semantic interoperability of heterogeneous databases have not been sufficiently effective. We propose a common ontology called the semantic conflict resolution ontology (SCROL) that addresses the inherent difficulties of the conventional approaches, i.e., the federated schema and domain ontology approaches. SCROL provides a systematic method for automatically detecting and resolving various semantic conflicts in heterogeneous databases, together with a dynamic mechanism for comparing and manipulating the contextual knowledge of each information source, which is useful in achieving semantic interoperability among heterogeneous databases. We show how SCROL is used for detecting and resolving semantic conflicts between semantically equivalent schema and data elements. In addition, we present evaluation results showing that SCROL can be successfully used to automate the process of identifying and resolving semantic conflicts.
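To make the detect-and-resolve idea concrete, here is a minimal sketch in which each source's contextual knowledge is reduced to plain dictionaries: conflicts are detected by comparing contexts dimension by dimension, and a scale conflict is resolved by conversion. SCROL's actual ontology machinery is far richer; everything here, including the context dimensions, is illustrative:

```python
# Contexts record, per semantic concept, the convention a source uses.
CONTEXTS = {
    'src_a': {'price': {'currency': 'USD', 'scale': 1}},
    'src_b': {'price': {'currency': 'USD', 'scale': 1000}},  # thousands
}

def detect_conflicts(concept, s1, s2):
    """List the contextual dimensions on which two sources disagree for
    one concept, in the spirit of automatic conflict detection."""
    c1, c2 = CONTEXTS[s1][concept], CONTEXTS[s2][concept]
    return [dim for dim in c1 if c1[dim] != c2.get(dim)]

def resolve(concept, value, src, target):
    """Convert a value from the source context to the target context;
    only scale conflicts are handled in this sketch."""
    for dim in detect_conflicts(concept, src, target):
        if dim == 'scale':
            value = value * CONTEXTS[src][concept]['scale'] \
                    / CONTEXTS[target][concept]['scale']
        else:
            raise NotImplementedError(f'no resolver for {dim} conflicts')
    return value

print(detect_conflicts('price', 'src_a', 'src_b'))  # ['scale']
print(resolve('price', 5, 'src_b', 'src_a'))        # 5000.0
```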

20.
The availability of numerous sources of structured data on the Internet poses the problem of their integration into a unified information space, just as unstructured and weakly structured data sources are integrated in the framework of the WWW. The main requirement for such an information space is simplicity of operation for users who are not trained IT experts. We propose the architecture of a system for semantic integration of distributed and heterogeneous data sources behind a unified semantic access interface. The interface is semantic in the sense that one interacts with the system in terms of the concepts of the application domain, completely ignoring the implementation details of the systems being integrated. The proposed architecture simplifies the users' work, the integration process, and the development of user forms with rich functionality, including semantic navigation between forms. A distinctive feature of this architecture is that system integration and the development of user forms are performed declaratively, in interactive mode, without programming. The simplification of the users' work with the system is achieved through special properties of the semantically complete model (SCM) and the semantically complete query language (SCQL), which provide the basis for the system. A prototype of the system is briefly described; it is implemented as a client-server application based on SCM and SCQL.
