Similar Documents
A total of 20 similar documents were found.
1.
Bridging the gap between OWL and relational databases
Despite similarities between the Web Ontology Language (OWL) and schema languages traditionally used in relational databases, systems based on these languages exhibit quite different behavior in practice. The schema statements in relational databases are usually interpreted as integrity constraints and are used to check whether the data is structured according to the schema. OWL allows for axioms that resemble integrity constraints; however, these axioms are interpreted under the standard first-order semantics and not as checks. This often leads to confusion and is inappropriate in certain data-centric applications. To explain the source of this confusion, in this paper we compare OWL and relational databases w.r.t. their schema languages and basic computational problems. Based on this comparison, we extend OWL with integrity constraints that capture the intuition behind similar statements in relational databases. We show that, if the integrity constraints are satisfied, they need not be considered while answering a broad range of positive queries. Finally, we discuss several algorithms for checking integrity constraint satisfaction, each of which is suited to a different type of OWL knowledge base.
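A toy sketch of the contrast described above, with hypothetical employee data: the same statement "every employee works in some department" fails as a closed-world integrity check but raises no violation under OWL's open-world, first-order reading.

```python
# Hypothetical data; only meant to contrast the two readings of the statement
# "every employee works in some department".

employees = {"alice", "bob"}
works_in = {("alice", "sales")}          # bob has no recorded department

def violations_as_integrity_constraint():
    """Database reading: the statement is a check against the stored data.
    Employees with no recorded department are violations."""
    return {e for e in employees if not any(emp == e for emp, _ in works_in)}

def inferences_under_owl_semantics():
    """OWL reading: the statement is a first-order axiom. Missing facts are
    not violations; a reasoner simply infers that some (unnamed) department
    exists for each employee without one."""
    return {e: "_:anonymousDept" for e in violations_as_integrity_constraint()}

print(violations_as_integrity_constraint())   # {'bob'}  -> the IC check fails
print(inferences_under_owl_semantics())       # {'bob': '_:anonymousDept'} -> no failure
```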

2.
In this paper we study queries over relational databases with integrity constraints (ICs). The main problem we analyze is OWA query answering, i.e., query answering over a database with ICs under the open-world assumption. The kinds of ICs that we consider are inclusion dependencies and functional dependencies, in particular key dependencies; the query languages we consider are conjunctive queries and unions of conjunctive queries. We present results about the decidability of OWA query answering under ICs. In particular, we study OWA query answering both over finite databases and over unrestricted databases, and identify the cases in which such a problem is finitely controllable, i.e., when OWA query answering over finite databases coincides with OWA query answering over unrestricted databases. Moreover, we are able to easily turn the above results into new results about implication of ICs and query containment under ICs, due to the deep relationship between OWA query answering and these two classical problems in database theory. In particular, we close two long-standing open problems in query containment, since we prove finite controllability of containment of conjunctive queries both under arbitrary inclusion dependencies and under key and foreign key dependencies. The results of our investigation are very relevant in many research areas which have recently dealt with databases under an incomplete information assumption: e.g., data integration, data exchange, view-based information access, ontology-based information systems, and peer data management systems.

3.
This paper presents a framework for querying inconsistent databases in the presence of functional dependencies. Most of the works dealing with the problem of extracting reliable information from inconsistent databases are based on the notion of repair, a minimal set of tuple insertions and deletions which leads the database to a consistent state (called repaired database), and the notion of consistent query answer, a query answer that can be obtained from every repaired database. In this work, both the notion of repair and that of query answer differ from the original ones. In the presence of functional dependencies, tuple deletions are the only operations that are performed in order to restore the consistency of an inconsistent database. However, deleting a tuple to remove an integrity violation potentially eliminates useful information in that tuple. In order to cope with this problem, we adopt a notion of repair, based on tuple updates, which allows us to better preserve information in the source database. A drawback of the notion of consistent query answer is that it does not allow us to discriminate among non-consistent answers, namely answers which can be obtained from a non-empty proper subset of the repaired databases. To obtain more informative query answers, we propose the notion of probabilistic query answer, that is, query answers are tuples associated with probabilities. This new semantics of query answering over inconsistent databases allows us to give a measure of uncertainty to query answers. We show that the problem of computing probabilistic query answers is FP^{#P}-complete. We also propose a technique for computing probabilistic answers to arbitrary relational algebra queries.
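As a rough illustration of the probabilistic-answer idea (not the paper's algorithm, which uses update-based repairs), the toy sketch below takes a tuple's answer probability to be the fraction of deletion-based repairs of a key-violating relation that retain it; the relation and the key are made up.

```python
from collections import defaultdict
from itertools import product

# Relation with key on the first column; tuples sharing a key are in conflict.
relation = [("p1", "red"), ("p1", "blue"), ("p2", "green")]

groups = defaultdict(list)
for t in relation:
    groups[t[0]].append(t)

# Each deletion-based repair keeps exactly one tuple per conflicting key group.
repairs = [set(choice) for choice in product(*groups.values())]

def answer_probability(tupl):
    """Fraction of repairs in which the tuple survives (uniform over repairs)."""
    return sum(tupl in r for r in repairs) / len(repairs)

for t in relation:
    print(t, answer_probability(t))
# ('p1', 'red') 0.5, ('p1', 'blue') 0.5, ('p2', 'green') 1.0
```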

4.
An XML document may contain inconsistencies that violate predefined integrity constraints, which causes the data inconsistency problem. In this paper, we consider how to get consistent data from an inconsistent XML document. There are two basic concepts for this problem: a repair is data that is consistent with the integrity constraints and minimally differs from the original document, and consistent data is the data common to every possible repair. First, we give a general constraint model for XML, which can express the commonly discussed integrity constraints, including functional dependencies, keys and multivalued dependencies. Next, we provide a repair framework for inconsistent XML documents with three basic update operations: node insertion, node deletion and node value modification. Following this approach, we introduce the concept of a repair for an inconsistent XML document, discuss the chase method for generating repairs, and prove some important properties of the chase. Finally, we give a method to obtain the greatest lower bound of all possible repairs, which is sufficient for consistent data. We also implement prototypes of our method and evaluate our framework and algorithms experimentally.

5.
Minimal-change integrity maintenance using tuple deletions
We address the problem of minimal-change integrity maintenance in the context of integrity constraints in relational databases. We assume that integrity-restoration actions are limited to tuple deletions. We focus on two basic computational issues: repair checking (is a database instance a repair of a given database?) and consistent query answers [in: ACM Symposium on Principles of Database Systems (PODS), 1999, 68] (is a tuple an answer to a given query in every repair of a given database?). We study the computational complexity of both problems, delineating the boundary between the tractable and the intractable cases. We consider denial constraints, general functional and inclusion dependencies, as well as key and foreign key constraints. Our results shed light on the computational feasibility of minimal-change integrity maintenance. The tractable cases should lead to practical implementations. The intractability results highlight the inherent limitations of any integrity enforcement mechanism, e.g., triggers or referential constraint actions, as a way of performing minimal-change integrity maintenance.
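The sketch below illustrates the repair-checking problem for deletion-only repairs under a single key constraint (a special case of a denial constraint); the data and the key are hypothetical, and the check is brute force rather than one of the paper's algorithms.

```python
# A candidate is a repair of the original instance iff it is a consistent
# subset of the original and no deleted tuple can be re-added without
# re-introducing a violation (minimal change = maximal consistent subset).

def violates_key(instance):
    """Key constraint on the first attribute: no two tuples share a key."""
    keys = [t[0] for t in instance]
    return len(keys) != len(set(keys))

def is_repair(original, candidate):
    if not candidate <= original or violates_key(candidate):
        return False
    return all(violates_key(candidate | {t}) for t in original - candidate)

I = {("p1", "red"), ("p1", "blue"), ("p2", "green")}
print(is_repair(I, {("p1", "red"), ("p2", "green")}))   # True
print(is_repair(I, {("p2", "green")}))                  # False: not maximal
```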

6.
Preference queries are relational algebra or SQL queries that contain occurrences of the winnow operator (find the most preferred tuples in a given relation). Such queries are parameterized by specific preference relations. Semantic optimization techniques make use of integrity constraints holding in the database. In the context of semantic optimization of preference queries, we identify two fundamental properties: containment of preference relations relative to integrity constraints and satisfaction of order axioms relative to integrity constraints. We show numerous applications of those notions to preference query evaluation and optimization. As integrity constraints, we consider constraint-generating dependencies, a class generalizing functional dependencies. We demonstrate that the problems of containment and satisfaction of order axioms can be captured as specific instances of constraint-generating dependency entailment. This makes it possible to formulate necessary and sufficient conditions for the applicability of our techniques as constraint validity problems. We characterize the computational complexity of such problems.
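A minimal sketch of the winnow operator on an in-memory relation, with a made-up preference relation (same make, strictly lower price); it is only meant to make the operator concrete, not to reproduce the paper's optimization techniques.

```python
def winnow(relation, prefers):
    """Return the tuples to which no other tuple is strictly preferred."""
    return [t for t in relation if not any(prefers(s, t) for s in relation)]

cars = [("vw", 9000), ("vw", 12000), ("bmw", 20000)]

# Example preference: same make, strictly lower price.
prefers = lambda s, t: s[0] == t[0] and s[1] < t[1]

print(winnow(cars, prefers))   # [('vw', 9000), ('bmw', 20000)]
```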

7.
We introduce in this paper a class of constraints for describing how an XML document can evolve, namely XML update constraints. For these constraints, we study the implication problem, giving algorithms and complexity results for constraints of varying expressive power. Besides classical constraint implication, we also consider an instance-based approach in which we take into account data. More precisely, we study implication with respect to a current tree instance, resulting from a series of unknown updates. The main motivation of our work is reasoning about data integrity under update restrictions in contexts where owners may lose control over their data, such as in publishing or exchange.

8.
Integrity constraints were initially defined to verify the correctness of the data that is stored in a database. They were used to restrict the modifications that can be applied to a database. However, there are many other applications in which integrity constraints can play an important role. For example, the semantic query optimization method developed by Chakravarthy, Grant, and Minker for definite deductive databases uses integrity constraints during query processing to prevent the exploration of search space that is bound to fail. In this paper, we generalize the semantic query optimization method to apply to negated atoms. The generalized method is referred to as semantic compilation. This exploration has led to two significant results. First, semantic compilation provides an alternative search space for negative query literals. The alternative search space can find answers in cases for which negation-as-finite-failure and constructive negation cannot. Second, we show how semantic compilation can be used to transform a disjunctive database (with or without functions) and denial constraints without negation into a new disjunctive database that complies with the integrity constraints.

9.
In this paper, we address the problem of managing inconsistent databases, i.e., databases violating integrity constraints. We propose a general logic framework for computing repairs and consistent answers over inconsistent databases. A repair for a possibly inconsistent database is a minimal set of insert and delete operations which makes the database consistent, whereas a consistent answer is a set of tuples derived from the database, satisfying all integrity constraints. In our framework, different types of rules defining general integrity constraints, repair constraints (i.e., rules defining conditions on the insertion or deletion of atoms), and prioritized constraints (i.e., rules defining priorities among updates and repairs) are considered. We propose a technique based on the rewriting of constraints into (prioritized) extended disjunctive rules with two different forms of negation (negation as failure and classical negation). The disjunctive program can be used for two different purposes: to compute "repairs" for the database and produce consistent answers, i.e., a maximal set of atoms which do not violate the constraints. We show that our technique is sound, complete (each preferred stable model defines a repair and each repair is derived from a preferred stable model), and more general than techniques previously proposed.

10.
《Information Systems》1997,22(8):423-446
In today's technologically diverse corporate environment, it is common to find several different databases being used to accomplish the organization's operational data management functions. Providing interoperability among these databases is important to the successful operation of the organization. One approach to providing interoperability among heterogeneous database systems is to define one or more schemas which represent a coherent view of the underlying databases. In the past, most approaches have used schematic knowledge about the underlying databases to generate integrated representations of the databases. In this paper we present a seven-step methodology for utilizing integrity constraint knowledge from heterogeneous databases. Specifically, we describe how we can generate a set of integrity constraints applicable at the integrated level from constraints specified on local databases. We introduce the concept of constraint-based relationships between objects in heterogeneous databases and describe the role that these relationships play in integrity constraint integration. Finally, we describe how the integrated set of constraints generated using our methodology can be used to facilitate semantic query processing in a heterogeneous database environment.

11.
The flexibility of the XML data model allows a more natural representation of uncertain data than the relational model. Matching a twig pattern against XML data is a fundamental problem in querying information from XML documents. For a probabilistic XML document, each twig answer has a probability value because of the uncertainty of the data. Twig answers with small probability values are useless to users, who usually only want the answers with the k largest probability values. For this task, existing algorithms for ordinary XML documents are not directly applicable, because probabilistic XML requires handling probability distributional nodes and efficiently calculating the top-k probabilities of answers. In this paper, we address the problem of finding twig answers with top-k probability values directly against probabilistic XML documents. We propose a new encoding scheme for probabilistic XML called PEDewey. Based on this encoding scheme, we design two algorithms for finding answers with top-k probabilities for twig queries: ProTJFast processes probabilistic XML data based on element streams in document order, while PTopKTwig is based on element streams ordered by path probability values. Experiments have been conducted to study the performance of these algorithms.

12.
We address the probabilistic generalization of weighted flow time on parallel machines. We present some results for situations which ask for “long-term robust” schedules of n jobs (tasks) on m parallel machines (processors): on any given day, only a random subset of jobs needs to be processed. The goal is to design robust a priori schedules (before we know which jobs need to be processed) which, on a long-term horizon, are optimal (or near optimal) with respect to total weighted flow time. The originality of this work is that probabilities are explicitly associated with the data: in addition to the classical properties of a task (processing time and weight), we consider a probability of presence. After motivating this investigation, we analyze the computational complexity, analytical properties, and solution procedures for these problems. Special care is also devoted to assessing experimentally the performance of a priori strategies.
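A back-of-the-envelope sketch of the objective under simplifying assumptions (a single machine, independent job presence, and a fixed a priori sequence, whereas the paper treats parallel machines): the expected total weighted flow time of an a priori schedule when each job shows up with its own presence probability.

```python
def expected_weighted_flow_time(jobs):
    """jobs: list of (processing_time, weight, presence_probability),
    given in the a priori processing order. On any given day, only the
    present jobs run, in that same order."""
    total, expected_elapsed = 0.0, 0.0
    for t, w, p in jobs:
        # If the job is present, it waits for the expected work of the jobs
        # scheduled before it that happen to be present, then runs for t.
        total += w * p * (expected_elapsed + t)
        expected_elapsed += p * t
    return total

# Made-up instance: (processing time, weight, presence probability).
jobs = [(3, 2, 0.9), (5, 1, 0.5), (2, 4, 1.0)]
print(expected_weighted_flow_time(jobs))   # 38.05
```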

13.
For problems in which decisions are taken prior to observing the realization of underlying random events, probabilistic constraints are an important modeling tool if reliability is a concern. A key concept for numerically dealing with probabilistic constraints is that of p-efficient points. By adopting a dual point of view, we develop a solution framework that includes and extends various existing formulations. The unifying approach is built on a recent generation of bundle methods, known as bundle methods with on-demand accuracy, characterized by versatility and flexibility. Numerical results for several difficult problems confirm the interest of the approach.

14.
The goal of Knowledge Compilation is to represent a Boolean expression in a format in which it can answer a range of “online-queries” in PTIME. The online-query of main interest to us is model counting, because of its application to query evaluation on probabilistic databases, but other online-queries can be supported as well, such as testing for equivalence, testing for implication, etc. In this paper we study the following problem: given a database query q, decide whether its lineage can be compiled efficiently into a given target language. We consider four target languages, of strictly increasing expressive power (when the size of compilation is restricted to be polynomial in the data size): read-once Boolean formulae, OBDD, FBDD and d-DNNF. For each target, we study the class of database queries that admit polynomial size representation: these queries can also be evaluated in PTIME over probabilistic databases. When queries are restricted to conjunctive queries without self-joins, it was known that these four classes collapse to the class of hierarchical queries, which is also the class of PTIME queries over probabilistic databases. Our main result in this paper is that, in the case of Unions of Conjunctive Queries (UCQ), these classes form a strict hierarchy. Thus, unlike conjunctive queries without self-joins, the expressive power of UCQ differs considerably with respect to these target compilation languages. Moreover, we give a complete characterization of the first two target languages, based on the query’s syntax.
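To make the connection between compilation targets and PTIME probability computation concrete, here is a hedged sketch for the simplest target, read-once formulae: since every variable occurs at most once, AND/OR nodes combine independent subformulas, so the lineage probability can be computed bottom-up in linear time. The tree representation and the example lineage are assumptions, not the paper's code.

```python
def prob(node, p):
    """node is ('var', name) | ('and', children) | ('or', children);
    p maps each variable name to its marginal probability.
    Correct only for read-once formulas, where children are independent."""
    kind = node[0]
    if kind == 'var':
        return p[node[1]]
    child_probs = [prob(c, p) for c in node[1]]
    if kind == 'and':
        result = 1.0
        for q in child_probs:
            result *= q
        return result
    # 'or' over independent children: 1 - prod(1 - q)
    result = 1.0
    for q in child_probs:
        result *= (1.0 - q)
    return 1.0 - result

# Read-once lineage (x1 AND y1) OR (x2 AND y2) of a hypothetical query.
f = ('or', [('and', [('var', 'x1'), ('var', 'y1')]),
            ('and', [('var', 'x2'), ('var', 'y2')])])
print(prob(f, {'x1': 0.5, 'y1': 0.4, 'x2': 0.2, 'y2': 0.9}))   # 0.344
```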

15.
We consider the problem of retrieving consistent answers over databases that might be inconsistent with respect to a set of integrity constraints. In particular, we concentrate on sets of constraints that consist of key dependencies, and we give an algorithm that computes the consistent answers for a large and practical class of conjunctive queries. Given a query q, the algorithm returns a first-order query Q (called a query rewriting) such that for every (potentially inconsistent) database I, the consistent answers for q can be obtained by evaluating Q directly on I.
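A small illustration of the consistent-answer semantics that such rewritings capture (hypothetical relation and query, not the paper's rewriting algorithm): under a key dependency, an answer is consistent exactly when it survives in every repair, i.e., when every tuple in some key group produces it.

```python
from collections import defaultdict

# Relation R(id, name, city) with key id; ids 2 and 3 are in conflict.
R = [(1, "ann", "rome"),
     (2, "ann", "oslo"), (2, "bob", "oslo"),
     (3, "ann", "bern"), (3, "ann", "lima")]

groups = defaultdict(list)
for t in R:
    groups[t[0]].append(t)

# Query: cities of customers named 'ann'. A city is a consistent answer iff
# every tuple sharing some key has name 'ann' and that city, so the answer
# holds no matter which tuple of the group a repair keeps.
consistent = {
    city
    for key, ts in groups.items()
    for (_, _, city) in ts
    if all(n == "ann" and c == city for (_, n, c) in ts)
}
print(consistent)   # {'rome'}: 'oslo', 'bern' and 'lima' fail in some repair
```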

16.
In this paper, we extend job scheduling models to include aspects of history-dependent scheduling, where setup times for a job are affected by the aggregate activities of all predecessors of that job. Traditional approaches to machine scheduling typically address objectives and constraints that govern the relative sequence of jobs being executed using available resources. This paper optimises the operations of multiple unrelated resources to address sequential and history-dependent job scheduling constraints along with time window restrictions. We denote this consolidated problem as the general precedence scheduling problem (GPSP). We present several applications of the GPSP and show that many problems in the literature can be represented as special cases of history-dependent scheduling. We design new ways to model this class of problems and then proceed to formulate it as an integer program. We develop specialized algorithms to solve such problems. An extensive computational analysis over a diverse family of problem data instances demonstrates the efficacy of the novel approaches and algorithms introduced in this paper.

17.
Control-loop performance assessment methods have been evolving over the past two decades, with many different monitor algorithms being used to single out specific problems and determine the operating mode. However, a change in operating mode may affect multiple monitors, resulting in the possibility of conflicting assessments. Data-driven Bayesian methods were previously proposed that use multiple monitors to yield probabilistic assessments; however, training data for Bayesian methods requires complete knowledge of the underlying operating modes. This paper proposes an approach based on proportionality parameters θ to address the problem of incomplete mode information in the training data; values in θ can be used to fill in missing information, and by varying θ one can determine the boundaries on a probabilistic diagnosis. Two diagnostic approaches are considered: the first is a direct probability approach, which can only be applied when historical data on the operating mode is sufficient and representative; the second is a likelihood approach, which can be applied to more general cases, including when the historical data is too limited to adequately represent mode frequency. In order to represent mode frequency, the likelihood approach takes into account prior probabilities of operating modes. The proposed methods are evaluated in two simulated chemical processes.

18.
Digital watermarking for relational databases emerged as a candidate solution to provide copyright protection and tamper detection and to maintain the integrity of data. Sectors such as defense, finance, and human resources require reliable schemes, especially for data alteration and integrity checking. Many watermarking techniques have been proposed in the literature to address the above-mentioned problems, but they are limited to numeric attributes. Moreover, these techniques do not yield satisfactory results in the case of small numeric values. To overcome these limitations, we propose a watermarking scheme in which the third most significant bit of numeric and nonnumeric attributes is replaced with the watermark bit without loss of originality. The proposed scheme is suitable for the above-mentioned sectors, including commercial banks, where fraud and forgery cases are common.
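A minimal sketch of the bit-replacement step described above; the 8-bit width, the function names, and the example value are assumptions, and real schemes additionally choose which tuples and attributes to mark (typically using a secret key).

```python
def embed_bit(value, wm_bit, width=8):
    """Replace the 3rd most significant bit of a non-negative integer
    (interpreted on a fixed bit width) with the watermark bit."""
    pos = width - 3                      # bit index of the 3rd MSB
    cleared = value & ~(1 << pos)        # clear that bit
    return cleared | (wm_bit << pos)     # set it to the watermark bit

def extract_bit(value, width=8):
    """Read back the embedded bit from the 3rd most significant position."""
    return (value >> (width - 3)) & 1

v = 200                       # 0b11001000, 3rd MSB is 0
marked = embed_bit(v, 1)      # becomes 0b11101000
print(marked, extract_bit(marked))   # 232 1
```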

19.
Semantic integrity constraints are used for enforcing the integrity of the database as well as for improving the efficiency of database utilization. Although semantic integrity constraints are usually much more static than the data itself, changes in the data semantics may necessitate corresponding changes in the constraint base. In this paper we address the problems related to maintaining a consistent and non-redundant set of constraints satisfied by the database in the case of updates to the constraint base. We consider implication constraints as semantic integrity constraints. The constraints are represented as conjunctions of inequalities. We present a methodology to determine whether a constraint is redundant or contradictory with respect to a set of constraints. The methodology is based on a partitioning of the constraint base, which improves the efficiency of algorithms that check whether a constraint is redundant or contradictory with respect to a constraint base.
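A toy sketch of the redundancy/contradiction test for the special case of single-variable bound constraints (the paper handles general conjunctions of inequalities and a partitioned constraint base): a new constraint is contradictory if it admits no value together with the base, and redundant if the base already implies it. All names and values below are made up.

```python
import math

# Each constraint is (variable, low, high), meaning low <= variable <= high.
base = [("salary", 0, 100000), ("age", 18, 65)]

def admitted_interval(var, constraints):
    """Intersection of all bounds on var in the constraint base."""
    lo, hi = -math.inf, math.inf
    for v, l, h in constraints:
        if v == var:
            lo, hi = max(lo, l), min(hi, h)
    return lo, hi

def classify(new, constraints):
    var, l, h = new
    lo, hi = admitted_interval(var, constraints)
    if max(lo, l) > min(hi, h):
        return "contradictory"          # no value satisfies base + new
    if lo >= l and hi <= h:
        return "redundant"              # base already implies new
    return "neither"

print(classify(("age", 70, 80), base))       # contradictory
print(classify(("age", 0, 100), base))       # redundant
print(classify(("salary", 0, 50000), base))  # neither
```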

20.
A unified approach to ranking in probabilistic databases
Ranking is a fundamental operation in data analysis and decision support, and it plays an even more crucial role if the dataset being explored exhibits uncertainty. This has led to much work in recent years on understanding how to rank the tuples in a probabilistic dataset. In this article, we present a unified approach to ranking and top-k query processing in probabilistic databases by viewing it as a multi-criterion optimization problem and by deriving a set of features that capture the key properties of a probabilistic dataset that dictate the ranked result. We contend that a single, specific ranking function may not suffice for probabilistic databases, and we instead propose two parameterized ranking functions, called PRF^ω and PRF^e, that generalize or can approximate many of the previously proposed ranking functions. We present novel generating-function-based algorithms for efficiently ranking large datasets according to these ranking functions, even if the datasets exhibit complex correlations modeled using probabilistic and/xor trees or Markov networks. We further propose that the parameters of the ranking function be learned from user preferences, and we develop an approach to learn those parameters. Finally, we present a comprehensive experimental study that illustrates the effectiveness of our parameterized ranking functions, especially PRF^e, at approximating other ranking functions, and the scalability of our proposed algorithms for exact or approximate ranking.
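A hedged sketch of the generating-function idea for rank distributions, restricted to independent tuples (the paper also handles and/xor trees and Markov networks): for a tuple t, Pr(rank(t) = i) is p_t times the coefficient of x^(i-1) in the product of (1 - p_j + p_j x) over the tuples scored above t, and a parameterized ranking score weights these probabilities by a user-chosen function. The weight function and numbers below are made up.

```python
def rank_distribution(p_higher):
    """Coefficients c[k] = Pr(exactly k higher-scored tuples exist),
    computed by expanding prod_j (1 - p_j + p_j * x) one factor at a time."""
    coeffs = [1.0]
    for p in p_higher:
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * (1 - p)        # this higher-scored tuple is absent
            nxt[k + 1] += c * p          # this higher-scored tuple is present
        coeffs = nxt
    return coeffs

def prf_score(p_t, p_higher, w):
    """Parameterized score: sum_i w(i) * Pr(rank(t) = i)."""
    return p_t * sum(w(k + 1) * c
                     for k, c in enumerate(rank_distribution(p_higher)))

# Tuple with existence probability 0.8, two higher-scored tuples, and a
# top-2 weight function (w(i) = 1 for i <= 2, else 0).
print(prf_score(0.8, [0.5, 0.3], w=lambda i: 1.0 if i <= 2 else 0.0))  # 0.68
```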
