Similar documents
20 similar documents retrieved; search time: 294 ms
1.
Structural disambiguation is acknowledged as a very real and frequent problem for many semantic-aware applications. In this paper, we propose a unified answer to sense disambiguation on a large variety of structures at both the data and metadata level, such as relational schemas, XML data and schemas, taxonomies, and ontologies. Our knowledge-based approach achieves general applicability by converting the input structures into a common format and by allowing users to tailor the extraction of the context to the specific application needs and structure characteristics. Flexibility is ensured by supporting the combination of different disambiguation methods with information extracted from different sources of knowledge. Further, we support both assisted and completely automatic semantic annotation tasks, while several novel feedback techniques allow us to improve the initial disambiguation results without necessarily requiring user intervention. An extensive evaluation shows the effectiveness of the proposed solutions on a large variety of structure-based information and disambiguation requirements.

2.
Keyword queries are an important means of finding object information in XML documents. Most existing keyword query approaches adopt the subtrees rooted at the smallest lowest common ancestors (SLCAs) of the keyword-matching nodes as the basic result units. The structural relationships among XML nodes are thus overemphasized, while the semantic relevance among them is not fully exploited. To change this situation, we propose the concept of the entity subtree and emphasize the semantic relevance among different nodes when querying information from XML. In our approach, plain keyword queries are upgraded to a new keyword-based query language, the Grouping and Categorization Keyword Expression (GCKE), and a core query algorithm, finding entity subtrees (FEST), is proposed to return high-quality results by fully using the keyword semantics exposed by GCKE. We demonstrate the effectiveness and efficiency of our approach through extensive experiments.
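For reference, a minimal sketch of the SLCA-style result units that most existing approaches (as characterized above) return: keyword matches are identified by Dewey labels, and the smallest lowest common ancestors of one match per keyword become the result roots. The toy labels below are illustrative assumptions; the paper's own GCKE/FEST approach is not reproduced here.

```python
from functools import reduce
from itertools import product

def lca(a, b):
    """Lowest common ancestor of two Dewey labels = their longest common prefix."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]

def slca(match_lists):
    """LCAs of one match per keyword, keeping only those with no candidate below them."""
    cands = {reduce(lca, combo) for combo in product(*match_lists)}
    return [c for c in cands
            if not any(d != c and d[:len(c)] == c for d in cands)]

# Toy document: the node at Dewey label (0, 0, 0) matches "hamlet",
# the nodes at (0, 0, 1) and (0, 1, 1) match "shakespeare".
print(slca([[(0, 0, 0)], [(0, 0, 1), (0, 1, 1)]]))  # -> [(0, 0)]
```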

3.
The Semantic Web envisions a World Wide Web in which data is described with rich semantics and applications can pose complex queries. To this point, researchers have defined new languages for specifying meanings for concepts and developed techniques for reasoning about them, using RDF as the data model. To flourish, the Semantic Web needs to provide interoperability, both between sites with different terminologies and with existing data and the applications operating on them. To achieve this, we are faced with two problems. First, most of the world's data is available not in RDF but in XML; XML and the applications consuming it rely not only on the domain structure of the data, but also on its document structure. Hence, to provide interoperability between such sources, we must map between both their domain structures and their document structures. Second, data management practitioners often prefer to exchange data through local point-to-point data translations, rather than mapping to common mediated schemas or ontologies. This paper describes the Piazza system, which addresses these challenges. Piazza offers a language for mediating between data sources on the Semantic Web, and it maps both domain structure and document structure. Piazza also enables interoperation of XML data with RDF data that is accompanied by rich OWL ontologies. Mappings in Piazza are provided at a local scale between small sets of nodes, and our query answering algorithm is able to chain sets of mappings together to obtain relevant data from across the Piazza network. We also describe an implemented scenario in Piazza and the lessons we learned from it.

4.
This paper presents a protocol adapter ideally suited to enabling enterprises to gradually transition from SOAP Web Services to RESTful HTTP Web Services without impacting existing clients. The inherent advantage of such a transition is the visibility of RESTful HTTP messages to Web intermediaries such as caches; SOAP messages, in contrast, are opaque, which disables Web intermediaries. While both approaches can use the HyperText Transfer Protocol (HTTP) for message transfer, the paradigms contrast sharply: SOAP uses an interface-specific approach, whereas RESTful HTTP uses a uniform-interface approach; SOAP marks up its payload with the eXtensible Markup Language (XML), whereas in certain situations RESTful HTTP requires no XML. We present the disadvantages of the SOAP approach, outline how the RESTful HTTP approach solves these issues, and show results in which opaque SOAP messages are transformed into transparent RESTful HTTP messages. The adapter, StoRHm (SOAP to RESTful HTTP mapping), maps SOAP messages to RESTful HTTP format.
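The mapping idea can be pictured with a small sketch; the operation names, URI templates, and verb choices below are illustrative assumptions, not StoRHm's actual mapping rules.

```python
# Toy operation-to-resource table; these names and URI templates are
# illustrative assumptions, not StoRHm's actual mapping rules.
SOAP_TO_REST = {
    "getOrder":    ("GET",    "/orders/{id}"),
    "createOrder": ("POST",   "/orders"),
    "updateOrder": ("PUT",    "/orders/{id}"),
    "deleteOrder": ("DELETE", "/orders/{id}"),
}

def adapt(operation, params):
    """Map a SOAP-style operation call onto an HTTP verb, URI, and body."""
    verb, template = SOAP_TO_REST[operation]
    uri = template.format(**params)
    body = None if verb in ("GET", "DELETE") else {k: v for k, v in params.items() if k != "id"}
    return verb, uri, body

print(adapt("getOrder", {"id": "42"}))         # ('GET', '/orders/42', None)
print(adapt("createOrder", {"item": "book"}))  # ('POST', '/orders', {'item': 'book'})
```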

5.
XML plays an important role as the standard language for representing structured data for the traditional Web, and hence many Web-based knowledge management repositories store data and documents in XML. If the semantics of the data are formally represented in an ontology, it becomes possible to extract knowledge: ontology definitions and axioms are applied to XML data to automatically infer knowledge that is not explicitly represented in the repository. Ontologies also play a central role in realizing the burgeoning vision of the Semantic Web, wherein data will become more sharable because their semantics will be represented in Web-accessible ontologies. In this paper, we demonstrate how an ontology can be used to extract knowledge from an exemplar XML repository of Shakespeare's plays. We then implement an architecture for this ontology using de facto languages of the Semantic Web, including OWL and RuleML, thus preparing the ontology for use in data sharing. It has been predicted that the early adopters of the Semantic Web will develop ontologies that leverage XML, provide intra-organizational value (such as knowledge extraction capabilities that are independent of the Semantic Web), and have the potential for inter-organizational data sharing over the Semantic Web. The contribution of our proof-of-concept application, KROX, is that it serves as a blueprint for other ontology developers who believe that the growth of the Semantic Web will unfold in this manner.
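The extract-knowledge-by-applying-axioms idea can be made concrete with a minimal sketch; this is not KROX, and the element and property names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

xml_doc = """<plays>
  <play genre="tragedy"><title>Hamlet</title><persona>Hamlet</persona></play>
</plays>"""

# Lift XML facts into (subject, predicate, object) triples.
triples = set()
for play in ET.fromstring(xml_doc).findall("play"):
    title = play.findtext("title")
    triples.add((title, "rdf:type", "Play"))
    triples.add((title, "hasGenre", play.get("genre")))
    for p in play.findall("persona"):
        triples.add((title, "hasPersona", p.text))

# Ontology-style rule (in the spirit of a RuleML axiom):
# a Play whose genre is "tragedy" is a Tragedy.
for s, pred, o in list(triples):
    if pred == "hasGenre" and o == "tragedy":
        triples.add((s, "rdf:type", "Tragedy"))

print(("Hamlet", "rdf:type", "Tragedy") in triples)  # True: inferred, not asserted
```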

6.
To populate a data warehouse specifically designed for Web data, i.e. a web warehouse, it is imperative to harvest relevant documents from the Web. In this paper, we describe a query mechanism called the coupling query to glean relevant Web data in the context of our web warehousing system, the Warehouse Of Web Data (WHOWEDA). Coupling queries may be used to query both HTML and XML documents. Important features of our query mechanism include the ability to query metadata, content, and the internal and external (hyperlink) structure of Web documents based on partial knowledge; the ability to express constraints on tag attributes and on tagless segments of data; the ability to express conjunctive as well as disjunctive query conditions compactly; the ability to control the execution of a web query; and the preservation of the topological structure of hyperlinked documents in the query results. We also discuss how to formulate queries graphically and in textual form, using the coupling graph and coupling text, respectively.

7.
To address the shortcomings of current data exchange approaches in resolving the semantic heterogeneity of exchanged information, a data exchange platform based on ontologies and Web Services, built on XML technology, is proposed. A system framework for the platform is first presented and its key technologies are studied: data from the heterogeneous sources is extracted to construct XML Schema files, ontology techniques are then used to annotate them semantically, producing schema files enriched with semantic information, and finally a transformation plan is generated by matching and mapping the XML Schema files. A worked example shows that the platform can effectively resolve semantic heterogeneity and, by invoking the business systems through Web services, enables data exchange and sharing.
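As a rough illustration of the matching-and-mapping step, the sketch below pairs schema elements by name similarity only; this is a naive stand-in, not the platform's ontology-based semantic matching, and the element names and threshold are assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Crude name similarity between two schema element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match(source_elems, target_elems, threshold=0.45):
    """Pair each source element with its most similar target element by name."""
    pairs = []
    for s in source_elems:
        best = max(target_elems, key=lambda t: similarity(s, t))
        if similarity(s, best) >= threshold:
            pairs.append((s, best, round(similarity(s, best), 2)))
    return pairs

print(match(["customerName", "customerAddr", "orderDate"],
            ["clientName", "clientAddress", "dateOfOrder"]))
```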

8.
Word sense disambiguation (WSD) is traditionally considered an AI-hard problem. A breakthrough in this field would have a significant impact on many relevant Web-based applications, such as Web information retrieval, improved access to Web services, and information extraction. Early approaches to WSD, based on knowledge representation techniques, have been replaced in the past few years by more robust machine learning and statistical techniques. The results of recent comparative evaluations of WSD systems, however, show that these methods have inherent limitations. On the other hand, the increasing availability of large-scale, rich lexical knowledge resources seems to provide new challenges for knowledge-based approaches. In this paper, we present a method, called structural semantic interconnections (SSI), which creates structural specifications of the possible senses for each word in a context and selects the best hypothesis according to a grammar G describing relations between sense specifications. Sense specifications are created from several available lexical resources that we integrated partly manually and partly with the help of automatic procedures. The SSI algorithm has been applied to different semantic disambiguation problems, such as automatic ontology population, disambiguation of sentences in generic texts, and disambiguation of words in glossary definitions. Evaluation experiments have been performed on specific knowledge domains (e.g., tourism, computer networks, enterprise interoperability), as well as on standard disambiguation test sets.
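To make the sense-selection task concrete, here is a deliberately simple Lesk-style overlap baseline; it is not SSI, and the toy sense inventory and context are illustrative assumptions.

```python
# Toy sense inventory and context; purely illustrative.
senses = {
    "bank/1": "financial institution that accepts deposits and makes loans",
    "bank/2": "sloping land beside a body of water such as a river",
}
context = "he sat on the bank of the river and watched the water"

def lesk(senses, context):
    """Pick the sense whose gloss shares the most words with the context."""
    ctx = set(context.lower().split())
    return max(senses, key=lambda s: len(ctx & set(senses[s].lower().split())))

print(lesk(senses, context))  # bank/2
```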

9.
Information Systems, 2002, 27(7): 459-486
XML is spreading as a standard for semistructured documents on the Web, so querying XML documents that are linked by XML links is becoming an achievable goal. In this paper we present XML-GLrec, an extended version of XML-GL, a graphical query language for XML documents. XML-GL allows users to extract and restructure information from XML-specified WWW documents. We extend XML-GL in the following directions: (i) XML-GLrec can represent XML simple links, so that whole XML-specified WWW sites can be queried in a simple and intuitive way; (ii) XML-GLrec improves on the expressive power of XML-GL, in which only transitive closure can be expressed, by allowing generic recursion; (iii) finally, we let users specify queries more easily by allowing sequences of nested queries, in the same way as in SQL.
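The transitive-closure queries that XML-GL already covers amount to reachability over document links; a minimal sketch follows, where the link graph is an illustrative assumption rather than data from the paper.

```python
def reachable(links, start):
    """Transitive closure of `start` under the document-link relation."""
    seen, frontier = set(), [start]
    while frontier:
        doc = frontier.pop()
        if doc in seen:
            continue
        seen.add(doc)
        frontier.extend(links.get(doc, []))
    return seen

# Toy link graph between documents of a site (illustrative assumption).
site = {"index.xml": ["plays.xml", "authors.xml"],
        "plays.xml": ["hamlet.xml"],
        "hamlet.xml": ["authors.xml"]}
print(sorted(reachable(site, "index.xml")))
```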

10.
The Semantic Web is the next step of the current Web, where information will become more machine-understandable to support effective data discovery and integration. Hierarchical schemas, either in the form of tree-like structures (e.g., DTDs, XML schemas) or in the form of hierarchies on a category/subcategory basis (e.g., thematic hierarchies of portal catalogs), play an important role in this task. They are used to enrich the available information semantically. Up to now, hierarchical schemas have been treated rather as sets of individual elements, acting as semantic guides for browsing or querying data. Under that view, queries like "find the part of a portal catalog which is not present in another catalog" can be answered only in a procedural way, by specifying which nodes to select and how to get them. For this reason, we argue that hierarchical schemas should be treated as full-fledged objects so as to allow for their manipulation. This work proposes models and operators that manipulate the structural information of hierarchies, considering them as first-class citizens. First, we explore the algebraic properties of trees representing hierarchies and define a lattice algebraic structure on them. Then, turning this structure into a Boolean algebra, we present the operators S-union, S-intersection, and S-difference to support structural manipulation of hierarchies. These operators have algebraic properties that provide clear semantics and assist the transformation, simplification, and optimization of sequences of operations, using laws similar to those of set theory. We also identify the conditions under which this framework is applicable. Finally, we demonstrate an application of our framework for manipulating hierarchical schemas on tree-like hierarchies encoded as RDF/S files.
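A toy rendering of the path-set view of a hierarchy helps fix ideas: each hierarchy is encoded as the prefix-closed set of its root-to-node paths, so structural union and intersection become plain set operations. This is only an illustration under that encoding, not the paper's exact operator semantics, and the catalog contents are assumptions.

```python
def close(paths):
    """Prefix-close a set of paths so that it again encodes a tree."""
    return {p[:i] for p in paths for i in range(1, len(p) + 1)}

def s_union(a, b):        return a | b
def s_intersection(a, b): return a & b
def s_difference(a, b):   return close(a - b)   # re-add ancestors to stay a tree

catalog_a = close({("arts", "music", "jazz"), ("science", "physics")})
catalog_b = close({("arts", "music"), ("science",)})
# "the part of catalog A which is not present in catalog B"
print(sorted(s_difference(catalog_a, catalog_b)))
```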

11.
The eXtensible Markup Language (XML) has reached wide acceptance as the standard for representing and exchanging data on the Web. Unfortunately, XML covers only the syntactic level and lacks semantics, and thus cannot be directly used for the Semantic Web. Finding a way to utilize XML data for the Semantic Web is therefore a challenging research problem. Ontologies, as is well known, can formally represent shared domain knowledge and enable semantic interoperability. In this paper, we therefore investigate how to represent and reason about XML with ontologies. First, we give formalized representations of XML data sources, including Document Type Definitions (DTDs), XML Schemas, and XML documents. On this basis, we propose formal approaches for transforming these XML data sources into ontologies; we also discuss the correctness of the transformations and provide several transformation examples. Furthermore, following the proposed approaches, we implement a prototype tool that can automatically transform XML into ontologies. Finally, we apply the transformed ontologies to reasoning about XML, so that some reasoning problems over XML can be checked by existing ontology reasoners.
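A minimal sketch of the transformation idea, under illustrative assumptions only (the paper's rules cover DTDs, XML Schemas, and documents and are far richer): complex elements become classes, element nesting becomes object properties, and attributes become datatype properties.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring('<library><book isbn="123"><title>Hamlet</title></book></library>')

classes, object_props, datatype_props = set(), set(), set()

def lift(elem, parent=None):
    """Elements -> classes, nesting -> object properties, attributes -> datatype properties."""
    classes.add(elem.tag.capitalize())
    if parent is not None:
        object_props.add((parent.capitalize(), "has" + elem.tag.capitalize()))
    for attr in elem.attrib:
        datatype_props.add((elem.tag.capitalize(), attr))
    for child in elem:
        lift(child, elem.tag)

lift(doc)
print(sorted(classes))        # ['Book', 'Library', 'Title']
print(sorted(object_props))   # [('Book', 'hasTitle'), ('Library', 'hasBook')]
print(sorted(datatype_props)) # [('Book', 'isbn')]
```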

12.
As web users disseminate more of their personal information on the web, the possibility of these users becoming victims of lateral surveillance and identity theft increases. Therefore, web resources containing this personal information, which we refer to as identity web references, must be found and disambiguated to produce a unary set of web resources which refer to a given person. Such is the scale of the web that forcing web users to monitor their identity web references is not feasible; automated approaches are therefore required. However, automated approaches require background knowledge about the person whose identity web references are to be disambiguated. Within this paper we present a detailed approach to monitoring the web presence of a given individual by obtaining background knowledge from Web 2.0 platforms to support automated disambiguation processes. We present a methodology for generating this background knowledge by exporting data from multiple Web 2.0 platforms as RDF data models and combining these models for use as seed data. We present two disambiguation techniques: the first uses a semi-supervised machine learning technique known as Self-training, and the second uses a graph-based technique known as Random Walks; we explain how the semantics of the data supports the intrinsic functionality of these techniques. We compare the performance of the presented disambiguation techniques against several baseline measures, including human processing of the same data. We achieve an average precision of 0.935 for Self-training and an average F-measure of 0.705 for Random Walks, in both cases outperforming several baseline measures.
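The Self-training technique named above can be sketched generically with scikit-learn; the synthetic features, labels, and confidence threshold below are illustrative assumptions, not the paper's pipeline or data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_seed = rng.normal(size=(20, 5))        # labelled seed examples (e.g., derived from Web 2.0 data)
y_seed = (X_seed[:, 0] > 0).astype(int)  # synthetic labels, purely illustrative
X_unlab = rng.normal(size=(200, 5))      # unlabelled candidate web resources

X, y = X_seed, y_seed
for _ in range(5):                       # a few self-training rounds
    clf = LogisticRegression().fit(X, y)
    proba = clf.predict_proba(X_unlab)
    confident = proba.max(axis=1) > 0.9  # keep only high-confidence predictions
    if not confident.any():
        break
    X = np.vstack([X, X_unlab[confident]])
    y = np.concatenate([y, proba.argmax(axis=1)[confident]])
    X_unlab = X_unlab[~confident]

print("labelled examples after self-training:", len(y))
```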

13.
Extensible Markup Language (XML) is a common standard for data representation and exchange over the Web. Considering the increasing need for managing data on the Web, integration techniques are required to access heterogeneous XML sources. In this paper, we describe a unification method for heterogeneous XML schemata. The input to the unification method is a set of object-oriented canonical schemata that conceptually abstract the local Document Type Definitions of the involved sources. The unification process applies specific algorithms and rules to the concepts of the canonical schemata to generate a preliminary ontology. Further adjustments to this preliminary ontology generate a reference ontology that acts as a front end for user queries to the XML sources.

14.
Network embedding aims to encode nodes into a low-dimensional space while preserving the structure and inherent properties of the network. It is an upstream technique for network analyses such as link prediction and node clustering. Most existing efforts are devoted to homogeneous or heterogeneous plain networks. However, networks in real-world scenarios are usually heterogeneous and not plain, i.e., they contain multiple types of nodes/links and diverse node attributes. We refer to such networks, which combine heterogeneity and attributes, as attributed heterogeneous networks (AHNs). Embedding AHNs faces two challenges: (1) how to fuse heterogeneous information sources, including network structure, semantic information, and node attributes; (2) how to capture the uncertainty of node embeddings caused by diverse attributes. To tackle these challenges, we propose a unified embedding model which represents each node in an AHN with a Gaussian distribution (AHNG). AHNG fuses multi-type nodes/links and diverse attributes through a two-layer neural network and captures uncertainty by embedding nodes as Gaussian distributions. Furthermore, the incorporation of node attributes makes AHNG inductive: previously unseen or isolated nodes can be embedded without additional training. Extensive experiments on a large real-world dataset validate the effectiveness and efficiency of the proposed model.
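A toy NumPy forward pass conveys the core idea (all dimensions, weights, and activation choices are illustrative assumptions, not the paper's model): a two-layer network maps a node's attributes to the mean and diagonal variance of its Gaussian embedding, which is also what makes the model inductive for unseen nodes.

```python
import numpy as np

rng = np.random.default_rng(0)
attr_dim, hidden_dim, embed_dim = 16, 32, 8
W1      = rng.normal(scale=0.1, size=(attr_dim, hidden_dim))
W_mu    = rng.normal(scale=0.1, size=(hidden_dim, embed_dim))
W_sigma = rng.normal(scale=0.1, size=(hidden_dim, embed_dim))

def embed_node(attributes):
    """Map node attributes to the mean and diagonal variance of a Gaussian embedding."""
    h = np.maximum(attributes @ W1, 0)          # layer 1: ReLU
    mu = h @ W_mu                               # layer 2a: mean vector
    var = np.log1p(np.exp(h @ W_sigma)) + 1e-6  # layer 2b: softplus keeps variance positive
    return mu, var

mu, var = embed_node(rng.normal(size=attr_dim))  # works for a previously unseen node too
print(mu.shape, var.shape)                       # (8,) (8,)
```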

15.
Computer Networks, 1999, 31(11-16): 1155-1169
An important application of XML is the interchange of electronic data (EDI) between multiple data sources on the Web. As XML data proliferates on the Web, applications will need to integrate and aggregate data from multiple sources and to clean and transform data to facilitate exchange. Data extraction, conversion, transformation, and integration are all well-understood database problems, and their solutions rely on a query language. We present a query language for XML, called XML-QL, which we argue is suitable for performing the above tasks. XML-QL is a declarative, 'relationally complete' query language and is simple enough that it can be optimized. XML-QL can extract data from existing XML documents and construct new XML documents.
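XML-QL syntax is not reproduced here; instead, a small ElementTree sketch shows the same extract-and-construct task (select from an existing XML document, build a new one), using an illustrative bibliography fragment.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring("""<bib>
  <book year="1999"><title>XML-QL</title><publisher>W3C</publisher></book>
  <book year="1995"><title>SGML Basics</title><publisher>Acme</publisher></book>
</bib>""")

result = ET.Element("result")
for book in bib.findall("book"):
    if int(book.get("year")) >= 1999:                                  # selection condition
        ET.SubElement(result, "title").text = book.findtext("title")   # construct the new document

print(ET.tostring(result, encoding="unicode"))  # <result><title>XML-QL</title></result>
```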

16.
Machine-to-machine (M2M) communication is a crucial technology for collaborative manufacturing automation in industrial networks empowered by the Industrial Internet of Things (IIoT). The new decentralized manufacturing automation paradigm features ubiquitous communication and interoperable interactions between machines. However, peer-to-peer (P2P) interoperable communication between industrial machines at the semantic level remains a challenge. To address it, we introduce the concept of Semantic-aware Cyber-Physical Systems (SCPSs), on the basis of which manufacturing devices can establish semantic M2M communication. In this work, we propose a generic SCPS system architecture and its enabling technologies. Our architecture adds a semantic layer and a communication layer to the conventional cyber-physical system (CPS) in order to maximize compatibility with diverse CPS implementation architectures. With Semantic Web technologies as the backbone of the semantic layer, SCPSs can exchange semantic messages with maximum interoperability, following the same understanding of the manufacturing context. A pilot implementation of the presented work is illustrated with a proof-of-concept case study involving two semantic-aware cyber-physical machine tools. The semantic communication provided by the SCPS architecture makes ubiquitous M2M communication possible in a network of manufacturing devices, laying the foundation for collaborative manufacturing automation and, ultimately, smart manufacturing. Another case study, focusing on decentralized production control between machines in a workshop, also demonstrates the merits of semantic-aware M2M communication technologies.
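A toy sketch of what "the same understanding of the manufacturing context" buys: both machines bind message fields to shared ontology URIs, so each side can map them back to its own local names. The vocabulary URIs and field names are illustrative assumptions, not the paper's ontology.

```python
# Shared ontology terms both machines agree on (URIs are illustrative assumptions).
SHARED_VOCAB = {
    "jobId":        "http://example.org/mfg#jobIdentifier",
    "spindleSpeed": "http://example.org/mfg#spindleSpeedRpm",
}

def to_semantic(msg):
    """Replace local field names with shared ontology terms before sending."""
    return {SHARED_VOCAB[k]: v for k, v in msg.items()}

def from_semantic(msg, local_names):
    """Map shared terms back to whatever names the receiving machine uses locally."""
    reverse = {SHARED_VOCAB[k]: local_names[k] for k in local_names}
    return {reverse[k]: v for k, v in msg.items()}

sent = to_semantic({"jobId": "J-17", "spindleSpeed": 1200})
received = from_semantic(sent, {"jobId": "auftrag", "spindleSpeed": "drehzahl"})
print(received)  # {'auftrag': 'J-17', 'drehzahl': 1200}
```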

17.
Existing XML keyword query algorithms usually consider only the structural information among nodes and take the subtrees containing the keyword-matching nodes as query results, while the semantic relevance among nodes has not been fully exploited. This is also the main reason why the results of existing query algorithms commonly contain large amounts of semantically irrelevant, redundant information. In this paper, we first define the contextual semantics of query keywords and the semantic relevance among nodes; on this basis, we propose a new keyword query algorithm that finds semantically relevant units as the results of a keyword query. The query results obtained in this way contain no semantically irrelevant redundant information and, at the same time, better match the user's query intention. Experiments show that the proposed algorithm achieves considerable improvements in both query efficiency and precision.

18.
Word sense disambiguation (WSD) is a challenging problem in natural language processing. As an effective semi-supervised disambiguation algorithm, the genetic ant colony WSD algorithm can quickly disambiguate all words in a full text. It uses a graph model of the local context to represent semantic relations and performs disambiguation on that basis. However, global semantic information is lost during disambiguation, conflicting disambiguation results appear, and the accuracy of the algorithm drops. To solve this problem, an improved graph model based on global domain information and a short-term memory factor is proposed to represent semantics. The model introduces global domain information, strengthening the graph's ability to handle global semantic information. At the same time, following the principle of human short-term memory, a short-term memory factor is introduced into the model, strengthening the linear relations among senses and preventing conflicting results from degrading disambiguation. Extensive experimental results show that, compared with classic WSD algorithms, the proposed improved graph model increases the accuracy of word sense disambiguation.

19.
An aspect-oriented programming (AOP)-based approach is proposed to perform context-aware service composition on the fly. It realises context-aware composition by semantically weaving context into Web service composition. The context weaver algorithm is implemented and illustrated. The proposed semantic weaving allows Web services to be composed as the context changes.
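A Python decorator can stand in for the AOP advice to show the weaving idea; the names and the context source are illustrative assumptions, and the paper's context weaver operates on Web service compositions, not on Python functions.

```python
current_context = {"locale": "en", "network": "wifi"}   # stands in for the runtime context source

def weave_context(service):
    """Advice: inject the current context around every call to the composed service."""
    def wrapper(*args, **kwargs):
        kwargs.setdefault("context", dict(current_context))
        return service(*args, **kwargs)
    return wrapper

@weave_context
def book_hotel(city, context=None):
    return f"booking hotel in {city} with context {context}"

print(book_hotel("Oslo"))
current_context["network"] = "3g"        # a context change is picked up at the next call
print(book_hotel("Oslo"))
```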

20.
Path queries have been extensively used to query semistructured data, such as the Web and XML documents. In this paper we introduce weighted path queries, an extension of path queries that enables several classes of optimization problems (such as the computation of shortest paths) to be expressed easily. Weighted path queries are based on the notion of a weighted regular expression, i.e., a regular expression whose symbols are each associated with a weight. We characterize the problem of answering weighted path queries and provide an algorithm for computing their answers. We also show how weighted path queries can be effectively embedded into query languages for XML data to express several meaningful research problems in a simple and compact form.
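As an illustration of one optimization problem such queries can express, the sketch below computes a shortest-path cost with plain Dijkstra over a toy link-weighted graph; the graph and weights are assumptions, and the paper's query-answering algorithm is not reproduced here.

```python
import heapq

def shortest_path_cost(graph, source, target):
    """Plain Dijkstra over a link-weighted graph: {node: [(neighbour, weight), ...]}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue                      # stale queue entry
        for nxt, w in graph.get(node, []):
            if d + w < dist.get(nxt, float("inf")):
                dist[nxt] = d + w
                heapq.heappush(heap, (d + w, nxt))
    return float("inf")

graph = {"a": [("b", 2), ("c", 5)], "b": [("c", 1), ("d", 7)], "c": [("d", 3)]}
print(shortest_path_cost(graph, "a", "d"))  # 6, via a -> b -> c -> d
```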
