Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
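With the development of the Semantic Web, more and more RDF data is being published on the Web, and a data management system that provides storage and query capabilities is needed to manage this massive RDF data. To address this problem, a distributed storage scheme for large-scale RDF semantic data is designed and implemented. Through RDF data loading and preprocessing, the scheme manages massive RDF data effectively, and by building indexes it supports efficient queries over large-scale RDF data. The work covers the design and implementation of the underlying RDF storage scheme as well as data preprocessing and loading. A series of experiments is also designed to evaluate and compare the performance of Cassandra clusters with different numbers of nodes, using a dataset of 13 million lines of RDF obtained from DBpedia. The experimental results show that the scheme has performance advantages for storing and querying large-scale RDF semantic data.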

2.
Semantic Web applications share a large portion of development effort with database-driven Web applications. Existing approaches for development of these database-driven applications cannot be directly applied to Semantic Web data due to differences in the underlying data model. We develop a mapping approach that embeds Semantic Web data into object-oriented languages and thereby enables reuse of existing Web application frameworks. We analyse the relation between the Semantic Web and the Web, and survey the typical data access patterns in Semantic Web applications. We discuss the mismatch between object-oriented programming languages and Semantic Web data, for example in the semantics of class membership, inheritance relations, and object conformance to schemas. We present ActiveRDF, an object-oriented API for managing RDF data that offers full manipulation and querying of RDF data, does not rely on a schema and fully conforms to RDF(S) semantics. ActiveRDF can be used with different RDF data stores: adapters have been implemented for generic SPARQL endpoints, Sesame, Jena, Redland and YARS, and new adapters can be added easily. We demonstrate the usage of ActiveRDF and its integration with the popular Ruby on Rails framework, which enables rapid development of Semantic Web applications.

3.
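A survey of research on RDF data browsing   Total citations: 1, self-citations: 0, citations by others: 1
With the rapid development of the Semantic Web, Semantic Web data on the Web has reached considerable scale and has become an important source of information and knowledge. Research on RDF data browsing has therefore begun to attract wide attention. By comparing traditional Web information browsing with RDF data browsing, the paper identifies five important problems in RDF data browsing: determining the pattern of the subgraph to browse, collecting data, handling large-scale data, organizing data, and presenting data. Based on these challenges, we survey a number of systems and their different solutions. Finally, we summarize the current state of research, discuss the remaining challenges, and propose directions for future research.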

4.
RDF is the data interchange layer for the Semantic Web. In order to manage the increasing amount of RDF data, an RDF repository should provide not only the necessary scalability and efficiency, but also sufficient inference capabilities. Though existing RDF repositories have made progress towards these goals, there is still ample room for improving the overall performance. In this paper, we propose a native RDF repository, System Π, to pursue a better tradeoff among system scalability, query efficiency, and inference capabilities. System Π takes a hypergraph representation for RDF as the data model for its persistent storage, which effectively avoids the costs of data model transformation when accessing RDF data. Based on this native storage scheme, a set of efficient semantic query processing techniques is designed. First, several indices are built to accelerate RDF data access, including a value index, a labeling scheme for transitive closure computation, and three triple indices. Second, we propose a hybrid inference strategy under the pD* semantics to support inference for OWL-Lite with a relatively low computational complexity. Finally, we extend the SPARQL algebra to explicitly express inference semantics in the logical query plan by defining some new algebra operators. In addition, MD5 hash values of URIs and a schema-level cache are introduced as practical implementation techniques. The results of performance evaluation on the LUBM benchmark and a real data set show that System Π has a better combined metric value than other comparable systems.
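As a rough illustration of two ideas this abstract mentions (keying URIs by their MD5 hash and keeping several triple indices), the following minimal Python sketch may help. It is not System Π's implementation: the class `TripleStore`, the helper `uri_key`, the choice of SPO/POS/OSP orderings, and the `ex:` terms are all assumptions made here for illustration, since the abstract does not say which three indices are built.

```python
import hashlib
from collections import defaultdict


def uri_key(uri: str) -> str:
    """Return a fixed-length key for a URI: the hex MD5 digest of its UTF-8 bytes."""
    return hashlib.md5(uri.encode("utf-8")).hexdigest()


class TripleStore:
    """Toy in-memory store: one value index plus three triple indices (assumed orderings)."""

    def __init__(self):
        self.values = {}                                  # key -> original term
        self.spo = defaultdict(lambda: defaultdict(set))  # subject -> predicate -> objects
        self.pos = defaultdict(lambda: defaultdict(set))  # predicate -> object -> subjects
        self.osp = defaultdict(lambda: defaultdict(set))  # object -> subject -> predicates

    def _intern(self, term: str) -> str:
        # Record the term in the value index and return its hashed key.
        key = uri_key(term)
        self.values[key] = term
        return key

    def add(self, s: str, p: str, o: str) -> None:
        ks, kp, ko = self._intern(s), self._intern(p), self._intern(o)
        self.spo[ks][kp].add(ko)
        self.pos[kp][ko].add(ks)
        self.osp[ko][ks].add(kp)

    def _lookup(self, index, first: str, second: str):
        # .get avoids creating empty entries for terms that were never stored.
        keys = index.get(uri_key(first), {}).get(uri_key(second), set())
        return sorted(self.values[k] for k in keys)

    def objects(self, s: str, p: str):
        return self._lookup(self.spo, s, p)

    def subjects(self, p: str, o: str):
        return self._lookup(self.pos, p, o)

    def predicates(self, o: str, s: str):
        return self._lookup(self.osp, o, s)


store = TripleStore()
store.add("ex:alice", "ex:knows", "ex:bob")
store.add("ex:carol", "ex:knows", "ex:bob")
print(store.subjects("ex:knows", "ex:bob"))
```

Each index answers a triple pattern with two bound positions by direct lookup, which is the usual motivation for keeping several orderings; hashing the terms gives every key the same length regardless of how long the URI is.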

5.
The Semantic Web: the roles of XML and RDF   Total citations: 2, self-citations: 0, citations by others: 2
XML and RDF are the current standards for establishing semantic interoperability on the Web, but XML addresses only document structure. RDF better facilitates interoperation because it provides a data model that can be extended to address sophisticated ontology representation techniques. We explain the role of ontologies in the architecture of the Semantic Web. We then briefly summarize key elements of XML and RDF, showing why using XML as a tool for semantic interoperability will be ineffective in the long run. We argue that a further representation and inference layer is needed on top of the Web's current layers, and to establish such a layer, we propose a general method for encoding ontology representation languages into RDF/RDF schema. We illustrate the extension method by applying it to Ontology Interchange Language, an ontology representation and inference language.

6.
Since the beginning of the Semantic Web initiative, significant efforts have been invested in finding efficient ways to publish, store, and query metadata on the Web. RDF and SPARQL have become the standard data model and query language, respectively, to describe resources on the Web. Large amounts of RDF data are now available either as stand-alone datasets or as metadata over semi-structured (typically XML) documents. The ability to apply RDF annotations over XML data emphasizes the need to represent and query data and metadata simultaneously. We propose XR, a novel hybrid data model capturing the structural aspects of XML data and the semantics of RDF, also enabling us to reason about XML data. Our model is general enough to describe pure XML or RDF datasets, as well as RDF-annotated XML data, where any XML node can act as a resource. This data model comes with the XRQ query language that combines features of both XQuery and SPARQL. To demonstrate the feasibility of this hybrid XML-RDF data management setting, and to validate its interest, we have developed an XR platform on top of well-known data management systems for XML and RDF. In particular, the platform features several XRQ query processing algorithms, whose performance is experimentally compared.

7.
8.
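Zhang Xiang, Ge Weiyi, Qu Yuzhong. Journal of Software (软件学报), 2009, 20(10): 2834-3843
With the emergence of large amounts of RDF data on the Semantic Web, semantic search engines make it convenient for users to search RDF data. However, how to automatically discover sites that contain Semantic Web information resources, and how to collect those resources efficiently within Semantic Web sites, has remained a problem for semantic search engines. The paper first introduces a link model of Semantic Web sites, which characterizes the relationships among Semantic Web sites, Semantic Web information resources, RDF models, and Semantic Web entities. Based on this model, it discusses the problem of which site a Semantic Web entity belongs to and then defines rules for discovering Semantic Web sites. In addition, starting from the site link model, it defines the Semantic Web site dependency graph and gives an algorithm for ranking Semantic Web sites. The algorithms were preliminarily tested in a real semantic search engine. The experimental results show that the proposed methods can effectively discover Semantic Web sites and rank them.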

9.
10.
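As a web of data, the Semantic Web continuously gathers and organizes Web information, so related applications face the challenge of efficiently accessing the large-scale RDF data it contains. Using parallel processing to improve performance is one solution, and its core is the RDF data placement strategy and parallel query processing. Existing work has not systematically studied the classification and characteristics of RDF data placement strategies or their impact on query processing performance. This paper analyzes the various data placement strategies for RDF data and their impact on query processing performance, and uses LUBM benchmark results to evaluate the actual performance of typical RDF parallel processing strategies (a data placement strategy together with the corresponding parallel query processing), laying a foundation for more effective parallel processing strategies.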

11.
The Semantic Web envisions a World Wide Web in which data is described with rich semantics and applications can pose complex queries. To this point, researchers have defined new languages for specifying meanings for concepts and developed techniques for reasoning about them, using RDF as the data model. To flourish, the Semantic Web needs to provide interoperability, both between sites with different terminologies and with existing data and the applications operating on them. To achieve this, we are faced with two problems. First, most of the world’s data is available not in RDF but in XML; XML and the applications consuming it rely not only on the domain structure of the data, but also on its document structure. Hence, to provide interoperability between such sources, we must map between both their domain structures and their document structures. Second, data management practitioners often prefer to exchange data through local point-to-point data translations, rather than mapping to common mediated schemas or ontologies. This paper describes the Piazza system, which addresses these challenges. Piazza offers a language for mediating between data sources on the Semantic Web, and it maps both the domain structure and document structure. Piazza also enables interoperation of XML data with RDF data that is accompanied by rich OWL ontologies. Mappings in Piazza are provided at a local scale between small sets of nodes, and our query answering algorithm is able to chain sets of mappings together to obtain relevant data from across the Piazza network. We also describe an implemented scenario in Piazza and the lessons we learned from it.

12.
From the Semantic Web’s inception, a number of concurrent initiatives have given rise to multiple segments: large semantic datasets, exposed by query endpoints; online Semantic Web documents, in the form of RDF files; and semantically annotated web content (e.g., using RDFa), semantic sources in their own right. In various mobile application scenarios, online semantic data has proven to be useful. While query endpoints are most commonly exploited, they are mainly useful to expose large semantic datasets. Alternatively, mobile RDF stores are utilized to query local semantic data, but this requires the design-time identification and replication of relevant data. Instead, we present a mobile query service that supports on-the-fly and integrated querying of semantic data, originating from a largely unused portion of the Semantic Web, comprising online RDF files and semantics embedded in annotated webpages. To that end, our solution performs dynamic identification, retrieval and caching of query-relevant semantic data. We explore several data identification and caching alternatives, and investigate the utility of source metadata in optimizing these tasks. Further, we introduce a novel cache replacement strategy, fine-tuned to the described query dataset, and include explicit support for the Open World Assumption. An extensive experimental validation evaluates the query service and its alternative components.

13.
In the Semantic Web vision of the World Wide Web, content will not only be accessible to humans but will also be available in machine interpretable form as ontological knowledge bases. Ontological knowledge bases enable formal querying and reasoning and, consequently, a main research focus has been the investigation of how deductive reasoning can be utilized in ontological representations to enable more advanced applications. However, purely logical methods have not yet proven to be very effective for several reasons: First, there still is the unsolved problem of scalability of reasoning to Web scale. Second, logical reasoning has problems with uncertain information, which is abundant in Semantic Web data due to its distributed and heterogeneous nature. Third, the construction of ontological knowledge bases suitable for advanced reasoning techniques is complex, which ultimately results in a lack of such expressive real-world data sets with large amounts of instance data. From another perspective, the more expressive structured representations open up new opportunities for data mining, knowledge extraction and machine learning techniques. If we move towards the idea that part of the knowledge already lies in the data, inductive methods appear promising, in particular since inductive methods can inherently handle noisy, inconsistent, uncertain and missing data. While there has been broad coverage of inducing concept structures from less structured sources (text, Web pages), as in ontology learning, given the problems mentioned above, we focus on new methods for dealing with Semantic Web knowledge bases, relying on statistical inference on their standard representations.
We argue that machine learning research has to offer a wide variety of methods applicable to different expressivity levels of Semantic Web knowledge bases: ranging from weakly expressive but widely available knowledge bases in RDF to highly expressive first-order knowledge bases, this paper surveys statistical approaches to mining the Semantic Web. We specifically cover similarity and distance-based methods, kernel machines, multivariate prediction models, relational graphical models and first-order probabilistic learning approaches and discuss their applicability to Semantic Web representations. Finally we present selected experiments which were conducted on Semantic Web mining tasks for some of the algorithms presented before. This is intended to show the breadth and general potential of this exciting new research and application area for data mining.

14.
Semantic Web technologies must integrate with Web 2.0 services for both to leverage each other's strengths. We argue that the REST-based design methodologies [R.T. Fielding, R.N. Taylor, Principled design of the modern web architecture, ACM Trans. Internet Technol. (TOIT) 2 (2) (2002) 115–150] of the web present the ideal mechanism through which to align the publication of semantic data with the existing web architecture. We present the design and implementation of two solutions that combine REST-based design and RDF [D. Beckett (Ed.), RDF/XML Syntax Specification (Revised), W3C Recommendation, February 10, 2004] data access: one solution for integrating existing web services and one server-side solution for creating RDF REST services. Both of these solutions enable SPARQL [E. Prud’hommeaux, A. Seaborne (Eds.), SPARQL Query Language for RDF, W3C Working Draft, March 26, 2007] to be a unifying data access layer for aligning the Semantic Web and Web 2.0.

15.
Many RDF systems support reasoning with Datalog rules via materialisation, where all conclusions of RDF data and the rules are precomputed and explicitly stored in a preprocessing step. As the amount of RDF data used in applications keeps increasing, processing large datasets often requires distributing the data in a cluster of shared-nothing servers. While numerous distributed query answering techniques are known, distributed materialisation is less well understood. In this paper, we present several techniques that facilitate scalable materialisation in distributed RDF systems. First, we present a new distributed materialisation algorithm that aims to minimise communication and synchronisation in the cluster. Second, we present two new algorithms for partitioning RDF data, both of which aim to produce tightly connected partitions, but without loading complete datasets into memory. We evaluate our materialisation algorithm against two state-of-the-art distributed Datalog systems and show that our technique offers competitive performance, particularly when the rules are complex. Moreover, we analyse in depth the effects of data partitioning on reasoning performance and show that our techniques offer performance comparable or superior to the state of the art min-cut partitioning, but computing the partitions requires considerably less time and memory.

16.
17.
18.
Web search engines need to provide high throughput and short query latency. Recent results show that pipelined query processing over a term-wise partitioned inverted index may have superior throughput. However, the query processing latency and scalability with respect to the collection size are the main challenges associated with this method. In this paper, we evaluate the effect of inverted index skipping on the performance of pipelined query processing. Further, we introduce a novel idea of using Max-Score pruning within pipelined query processing and a new term assignment heuristic, partitioning by Max-Score. Our current results indicate a significant improvement over the state-of-the-art approach and lead to several further optimizations which include dynamic load balancing, intra-query concurrent processing and a hybrid combination between pipelined and non-pipelined execution. Lastly, we show how the state of term-wise partitioning relates to the industry standard document-wise partitioning. Even though there are situations where pipelined query processing is advantageous, document-wise partitioning is still the road to follow.

19.
The evidenced fact that “Linking is as powerful as computing” in a dynamic web context has led to evaluating Turing completeness for hypertext systems based on their linking model. The same evaluation can be applied to the Semantic Web domain too. RDF is the default data model of the Semantic Web links, so the evaluation comes back to whether or not RDF can support the required computational power at the linking level. RDF represents semantic relationships by explicitly naming the participating triples; however, enumeration is only one method amongst many for representing relations, and not always the most efficient or viable. In this paper we firstly consider that Turing completeness of binary-linked hypertext is realized if and only if the links are dynamic (functional). Ashman’s Binary Relation Model (BRM) showed that binary relations can most usefully be represented with Mili’s pE (predicate-expression) representation, and Moreau and Hall concluded that hypertext systems which use the pE representation as the basis for their linking (relation) activities are Turing-complete. Secondly we consider that RDF, as it is, is a static version of a general ternary relations model, called TRM. We then conclude that the current computing power of the Semantic Web depends on the dynamicity supported by its underlying TRM. The value of this is firstly that RDF’s triples can be considered within a framework and compared to alternatives, such as the TRM version of pE, designated pfE (predicate-function-expression). Secondly, that a system whose relations are represented with pfE is likewise going to be Turing-complete. Thus moving from RDF to a pfE representation of relations would give far greater power and flexibility within the Semantic Web applications.

20.
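Storage and management of RDF metadata in the Semantic Web   Total citations: 1, self-citations: 0, citations by others: 1
Wu Qinxia, Zhang Zhihong. Microcomputer Information (微计算机信息), 2007, 23(33): 144-145, 132
The first problem to solve in realizing the Semantic Web is the description of resources. RDF is the foundation for describing information resources, so managing and storing RDF data is a problem that must be solved. If RDF data is stored in a relational database, existing database resources can be used effectively to manage it. This paper uses a vertical schema format to construct the RDF data storage table and uses schema mapping to map RDF data to records in that table; it also gives a method for querying RDF data with RDF views, laying a foundation for semantic queries.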
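The vertical-schema idea described in this abstract can be sketched roughly as follows, using Python's built-in `sqlite3` module. This is only an illustration under stated assumptions, not the paper's actual table design or view definition: the table name `triples`, the view name `titled_works`, the function `build_store`, and the sample `ex:` / `dc:` terms are all hypothetical.

```python
import sqlite3


def build_store(triples):
    """Load (subject, predicate, object) tuples into one three-column table."""
    conn = sqlite3.connect(":memory:")
    # Vertical layout: every RDF statement becomes one row, whatever its predicate.
    conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    # Hypothetical view: resources that have both a title and a creator,
    # obtained by self-joining the triple table on the subject.
    conn.execute(
        """
        CREATE VIEW titled_works AS
        SELECT t.subject AS work, t.object AS title, c.object AS creator
        FROM triples AS t
        JOIN triples AS c ON c.subject = t.subject
        WHERE t.predicate = 'dc:title' AND c.predicate = 'dc:creator'
        """
    )
    return conn


SAMPLE = [
    ("ex:paper1", "dc:title", "RDF storage"),
    ("ex:paper1", "dc:creator", "ex:alice"),
    ("ex:paper2", "dc:title", "SPARQL basics"),
]

conn = build_store(SAMPLE)
rows = conn.execute(
    "SELECT work, title, creator FROM titled_works ORDER BY work"
).fetchall()
print(rows)
```

Because every statement lands in the same table, new predicates need no schema change; the cost is that queries touching several properties of a resource turn into self-joins, which a view can hide from the caller.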
