首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 250 毫秒
1.
In this paper we study queries over relational databases with integrity constraints (ICs). The main problem we analyze is OWA query answering, i.e., query answering over a database with ICs under open-world assumption. The kinds of ICs that we consider are inclusion dependencies and functional dependencies, in particular key dependencies; the query languages we consider are conjunctive queries and unions of conjunctive queries. We present results about the decidability of OWA query answering under ICs. In particular, we study OWA query answering both over finite databases and over unrestricted databases, and identify the cases in which such a problem is finitely controllable, i.e., when OWA query answering over finite databases coincides with OWA query answering over unrestricted databases. Moreover, we are able to easily turn the above results into new results about implication of ICs and query containment under ICs, due to the deep relationship between OWA query answering and these two classical problems in database theory. In particular, we close two long-standing open problems in query containment, since we prove finite controllability of containment of conjunctive queries both under arbitrary inclusion dependencies and under key and foreign key dependencies. The results of our investigation are very relevant in many research areas which have recently dealt with databases under an incomplete information assumption: e.g., data integration, data exchange, view-based information access, ontology-based information systems, and peer data management systems.  相似文献   

2.
Navigational features have been largely recognized as fundamental for graph database query languages. This fact has motivated several authors to propose RDF query languages with navigational capabilities. In this paper, we propose the query language nSPARQL that uses nested regular expressions to navigate RDF data. We study some of the fundamental properties of nSPARQL and nested regular expressions concerning expressiveness and complexity of evaluation. Regarding expressiveness, we show that nSPARQL is expressive enough to answer queries considering the semantics of the RDFS vocabulary by directly traversing the input graph. We also show that nesting is necessary in nSPARQL to obtain this last result, and we study the expressiveness of the combination of nested regular expressions and SPARQL operators. Regarding complexity of evaluation, we prove that given an RDF graph G and a nested regular expression E, this problem can be solved in time O(|G||E|).  相似文献   

3.
4.
The availability of large amounts of open, distributed, and structured semantic data on the web has no precedent in the history of computer science. In recent years, there have been important advances in semantic search and question answering over RDF data. In particular, natural language interfaces to online semantic data have the advantage that they can exploit the expressive power of Semantic Web data models and query languages, while at the same time hiding their complexity from the user. However, despite the increasing interest in this area, there are no evaluations so far that systematically evaluate this kind of systems, in contrast to traditional question answering and search interfaces to document spaces. To address this gap, we have set up a series of evaluation challenges for question answering over linked data. The main goal of the challenge was to get insight into the strengths, capabilities, and current shortcomings of question answering systems as interfaces to query linked data sources, as well as benchmarking how these interaction paradigms can deal with the fact that the amount of RDF data available on the web is very large and heterogeneous with respect to the vocabularies and schemas used. Here, we report on the results from the first and second of such evaluation campaigns. We also discuss how the second evaluation addressed some of the issues and limitations which arose from the first one, as well as the open issues to be addressed in future competitions.  相似文献   

5.
The SPARQL LeftJoin abstract operator is not distributive over Union; this limits the algebraic manipulation of graph patterns, which in turn restricts the ability to create query plans for distributed processing or query optimization. In this paper, we present semQA, an algebraic extension for the SPARQL query language for RDF, which overcomes this issue by transforming graph patterns through the use of an idempotent disjunction operator Or as a substitute for Union. This permits the application of a set of equivalences that transform a query into distinct forms. We further present an algorithm to derive the solution set of the original query from the solution set of a query where Union has been substituted by Or. We also analyze the combined complexity of SPARQL, proving it to be NP-complete. It is also shown that the SPARQL query language is not, in the general case, fixed-parameter tractable. Experimental results are presented to validate the query evaluation methodology presented in this paper against the SPARQL standard, to corroborate the complexity analysis, and to illustrate the gains in processing cost reduction that can be obtained through the application of semQA.  相似文献   

6.
Since the beginning of the Semantic Web initiative, significant efforts have been invested in finding efficient ways to publish, store, and query metadata on the Web. RDF and SPARQL have become the standard data model and query language, respectively, to describe resources on the Web. Large amounts of RDF data are now available either as stand-alone datasets or as metadata over semi-structured (typically XML) documents. The ability to apply RDF annotations over XML data emphasizes the need to represent and query data and metadata simultaneously. We propose XR, a novel hybrid data model capturing the structural aspects of XML data and the semantics of RDF, also enabling us to reason about XML data. Our model is general enough to describe pure XML or RDF datasets, as well as RDF-annotated XML data, where any XML node can act as a resource. This data model comes with the XRQ query language that combines features of both XQuery and SPARQL. To demonstrate the feasibility of this hybrid XML-RDF data management setting, and to validate its interest, we have developed an XR platform on top of well-known data management systems for XML and RDF. In particular, the platform features several XRQ query processing algorithms, whose performance is experimentally compared.  相似文献   

7.
Starting from the XQuery language we define XBind, an XML analog of relational conjunctive queries as well as a related class of XML integrity constraints (dependencies). We identify a fragment of XBind for which containment is decidable, in fact Π2p-complete, and a further fragment for which containment is NP-complete. We extend the containment algorithm to take XML dependencies into account. We give an algorithm for the reformulation of XBind queries under combinations of GAV and LAV XQuery views, as well as additional dependencies. We prove a completeness theorem which guarantees that under certain conditions, our algorithm will find a minimal reformulation if one exists. Moreover, we identify conditions when this algorithm achieves optimal complexity bounds. Our results on containment and reformulation depend on certain restrictions on the query and constraint languages. We calibrate the results by showing that lifting these restrictions significantly changes the complexity of the problems.  相似文献   

8.
We describe a new connection between the problem of finding rigid matrices, as posed by Valiant (MFCS 1977), and the problem of proving lower bounds for linear locally correctable codes. Our result shows that proving linear lower bounds on locally correctable codes with super-logarithmic query complexity will give new constructions of rigid matrices. The interest in constructing rigid matrices is their connection to circuit lower bounds.  相似文献   

9.
Conjunctive queries (CQs) are at the core of query languages encountered in many logic-based research fields such as AI, or database systems. The majority of existing work assumes set semantics but often in real applications the manipulation of duplicate tuples is required. One of the major problems that arises as part of advanced features of query optimization, data integration, query reformulation and many other research topics is testing for containment of such queries. In this work, we investigate the complexity of query containment problem for CQs under bag semantics (i.e. duplicate tuples are allowed in both the database and the results of queries) and under bag-set semantics (i.e. duplicates are allowed in the result of the queries but not in the database). We derive complexity results for these problems for five major subclasses of CQs; and we also find necessary conditions for CQ query containment. The general case of these problems remains open.  相似文献   

10.
11.
RDF is the data interchange layer for the Semantic Web. In order to manage the increasing amount of RDF data, an RDF repository should provide not only the necessary scalability and efficiency, but also sufficient inference capabilities. Though existing RDF repositories have made progress towards these goals, there is still ample space for improving the overall performance. In this paper, we propose a native RDF repository, System Π, to pursue a better tradeoff among system scalability, query efficiency, and inference capabilities. System Π takes a hypergraph representation for RDF as the data model for its persistent storage, which effectively avoids the costs of data model transformation when accessing RDF data. Based on this native storage scheme, a set of efficient semantic query processing techniques are designed. First, several indices are built to accelerate RDF data access including a value index, a labeling scheme for transitive closure computation, and three triple indices. Second, we propose a hybrid inference strategy under the pD * semantics to support inference for OWL-Lite with a relatively low computational complexity. Finally, we extend the SPARQL algebra to explicitly express inference semantics in logical query plan by defining some new algebra operators. In addition, MD5 hash value of URI and schema level cache are introduced as practical implementation techniques. The results of performance evaluation on the LUBM benchmark and a real data set show that System Π has a better combined metric value than other comparable systems.  相似文献   

12.
In this paper, we introduce a fuzzy language to extract information from the web extending the web query language WebSQL [1]. These extensions are based on two observations: the inadequacy of traditional Boolean query languages for web documents, and the need to move beyond the notion of query providing just a set of answers in order to provide a better data presentation through answers' restructuring. In order to address the first issue, we consider fuzzy sets to express imprecision in data, queries and answers. In our case, data imprecision comes from the data classification provided by several search engines. Query imprecision occurs in weighting values provided at query definition time. Answer imprecision allows to filter and rank the answers. To address the second point, we provide an answer restructuring language to model the restructuring phase that follows the query phase. The restructuring language allows creation/deletion of links and page creation. Thus several answer organizations are possible as a result to the same query. The resulting language extends in a uniform framework WebSQL. Then we provide a mapping for the language constructs into an extended relational algebra called SAMEW[2] expressing similarity-based queries over imprecisely classified data, queries involving navigation among web pages and answer restructurings. Finally, we study the optimization of similarity-based queries using equivalence and containment rules holding for SAMEWand presenting several algorithms for query evaluation.  相似文献   

13.
We present a new method for proving strong lower bounds in communication complexity. This method is based on the notion of the conditional information complexity of a function which is the minimum amount of information about the inputs that has to be revealed by a communication protocol for the function. While conditional information complexity is a lower bound on communication complexity, we show that it also admits a direct sum theorem. Direct sum decomposition reduces our task to that of proving conditional information complexity lower bounds for simple problems (such as the AND of two bits). For the latter, we develop novel techniques based on Hellinger distance and its generalizations.Our paradigm leads to two main results:(1) An improved lower bound for the multi-party set-disjointness problem in the general communication complexity model, and a nearly optimal lower bound in the one-way communication model. As a consequence, we show that for any real k>2, approximating the kth frequency moment in the data stream model requires essentially Ω(n1−2/k) space; this resolves a conjecture of Alon et al. (J. Comput. System Sci. 58(1) (1999) 137).(2) A lower bound for the Lp approximation problem in the general communication model; this solves an open problem of Saks and Sun (in: Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), 2002, pp. 360-369). As a consequence, we show that for p>2, approximating the Lp norm to within a factor of nε in the data stream model with constant number of passes requires Ω(n1−4ε−2/p) space.  相似文献   

14.
The Entity-Relationship (ER) model is a fundamental tool for database design, recently extended and employed in knowledge representation and reasoning due to its expressiveness and comprehensibility. We address the problem of answering conjunctive queries under constraints representing schemata expressed in an extended version of the Entity-Relationship model. This extended model, called ER+, comprises is-a constraints among entities and relationships, plus functional and mandatory participation constraints. In particular, it allows for arbitrary permutations of the roles in is-a among relationships. A key notion that ensures high tractability in ER+ schemata is separability, i.e., the absence of interaction between the functional participation constraints and the other constructs of ER+. We provide a precise syntactic characterization of separable ER+ schemata by means of a necessary and sufficient condition. We present a complete complexity analysis of the conjunctive query answering problem under separable ER+ schemata, and also under several sublanguages of ER+. We show that the addition of so-called negative constraints does not increase the complexity of query answering. With such constraints, our model properly generalizes the most widely adopted tractable ontology languages, including those in the DL-Lite family.  相似文献   

15.
RDF查询语言到SQL语言的转换原理及其实现方法   总被引:2,自引:0,他引:2  
RDF查询语言的优点是具有语义性,缺点是对于海量信息的存储和查找的效率都很低.而关系数据库对海量信息的存储和查找的效率皆很高,但是其查询语言SQL却缺乏语义信息.为了使信息查询既有RDF的语义性又有关系数据库的高性能,提出将RDF查询语言到SQL语言的转换原理,并在此基础上实现一个对用户透明的、建立在关系数据库之上的RDF查询引擎.其优点是:可以利用关系数据库来存储和查询RDF信息,提高其海量存储和查找效率;对存储在不同的关系数据库中的关系数据,能够利用RDF的查找特性进行异质数据库之间的信息交换及信息融合.  相似文献   

16.
Semistructured data occur in situations where information lacks a homogeneous structure and is incomplete. Yet, up to now the incompleteness of information has not been reflected by special features of query languages. Our goal is to investigate the principles of queries that allow for incomplete answers. We do not present, however, a concrete query language. Queries over classical structured data models contain a number of variables and constraints on these variables. An answer is a binding of the variables by elements of the database such that the constraints are satisfied. In the present paper, we loosen this concept in so far as we allow also answers that are partial; that is, not all variables in the query are bound by such an answer. Partial answers make it necessary to refine the model of query evaluation. The first modification relates to the satisfaction of constraints: in some circumstances we consider constraints involving unbound variables as satisfied. Second, in order to prevent a proliferation of answers, we only accept answers that are maximal in the sense that there are no assignments that bind more variables and satisfy the constraints of the query. Our model of query evaluation consists of two phases, a search phase and a filter phase. Semistructured databases are essentially labeled directed graphs. In the search phase, we use a query graph containing variables to match a maximal portion of the database graph. We investigate three different semantics for query graphs, which give rise to three variants of matching. For each variant, we provide algorithms and complexity results. In the filter phase, the maximal matchings resulting from the search phase are subjected to constraints, which may be weak or strong. Strong constraints require all their variables to be bound, while weak constraints do not. We describe a polynomial algorithm for evaluating a special type of queries with filter constraints, and assess the complexity of evaluating other queries for several kinds of constraints. In the final part, we investigate the containment problem for queries consisting only of search constraints under the different semantics.  相似文献   

17.
以RDF结构为基础的数据网的发展中,高效数据检索成为关键问题之一。形式化查询语言(如SPARQL)因其语法的复杂性及查询本体的相关性阻碍其效用的发挥,迫切需要新的方法或工具实现以自然语言为基础(如关键字检索)的检索。形式化查询语言是检索这类结构化数据的有效方式,用户习惯自然语言为基础的检索方式。因而如何自动将关键词为基础的检索方式转换成以形式化查询为基础的检索方式是实现数据网的重要一环。关联数据的自然语言查询方法自动将自然语言查询转换成SPARQL查询,提高系统的有效性和效率。文中在抽象转换度量模型的基础上,以本体为基础构建查询语义图及实现语义消歧,构建SPARQL查询。实验结果表明,该方法具有更高的召回率、精度及更低的时间消耗。  相似文献   

18.
19.
In this paper we give tight quantum query complexity bounds of some important linear algebra problems. We prove Θ(n2) quantum query bounds for verify the determinant, rank, matrix inverse and the matrix power problem.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号