Similar literature
Found 20 similar articles (search time: 15 ms)
1.
An approach to estimating the effectiveness of index usage when searching semistructured databases consisting of OEM documents is presented. In addition to estimating the optimality of the hierarchy from the standpoint of evaluating conjunctive regular path queries, this approach allows one to take into account arbitrary distributions of query probabilities. Algorithms for index construction are given, and estimates of their complexity are obtained. These estimates demonstrate the efficiency of the approach and the practical applicability of the suggested algorithms.

2.
In this paper we study the problem of providing controlled access to confidential data stored in semistructured databases. More specifically, we focus on privacy violations via data inferences that occur when domain knowledge is combined with non-private data. We propose a formal model, called the Privacy Information Flow Model, to represent the information flow and the privacy requirements. These privacy requirements are enforced by the Privacy Mediator. The Privacy Mediator guarantees that users are not able to logically entail information that violates the privacy requirements. We present an inference algorithm that is sound and complete. The inference algorithm is developed for a tree-like, semistructured data model, selection-projection queries, and domain knowledge represented as Horn-clause constraints.

3.
The large volume and nature of the data available to casual users and programs motivate the increasing interest of the database community in studying flexible and efficient techniques for extracting and querying semistructured data. On the other hand, efficient methods have been discovered for solving the so-called model-checking problem for some modal logics. The aim of this paper is to show how some of these methods can be used for querying semistructured data. To do so, we show that semistructured data can be naturally seen as Kripke Transition Systems. To keep the presentation independent of a specific language, we introduce a graphical query language that includes some of the features of the query languages based on graphs and patterns. We show how to associate CTL formulas with queries of this language. This allows us to see the problem of solving a query as an instance of the model-checking problem for CTL, which can be solved in polynomial time. We have tested the method by using a model-checker, and have studied the applicability of the method to some existing languages for semistructured databases.
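As a rough illustration of the model-checking view of querying (not the paper's graphical language or its translation into CTL), the sketch below treats a small semistructured database as a directed graph and computes the nodes that satisfy the CTL formula EF p, that is, the nodes from which some path reaches a node where p holds. The graph, the node names, and the function `states_satisfying_ef` are hypothetical and chosen only for this example.

```python
def states_satisfying_ef(edges, satisfying):
    """Return all nodes from which some path reaches a node in `satisfying`.

    This is the standard backward fixpoint for the CTL formula EF p:
    start from the nodes where p holds and repeatedly add predecessors.
    """
    # Build a predecessor map so the graph can be walked backwards.
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(dst, set()).add(src)

    result = set(satisfying)
    frontier = list(satisfying)
    while frontier:
        node = frontier.pop()
        for pred in predecessors.get(node, ()):
            if pred not in result:
                result.add(pred)
                frontier.append(pred)
    return result


# Hypothetical document graph: edges point from parent to child.
edges = [
    ("root", "book"),
    ("book", "author"),
    ("root", "paper"),
    ("paper", "title"),
]

# "Which nodes have an 'author' node somewhere below them (or are one)?"
answer = states_satisfying_ef(edges, {"author"})
```

Each node and edge is visited at most once, which is consistent with the polynomial-time bound mentioned in the abstract, although this sketch covers only a single CTL operator.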

4.
Traditional database query languages such as datalog and SQL allow the user to specify only mandatory requirements on the data to be retrieved from a database. In many applications, it may be natural to express not only mandatory requirements but also preferences on the data to be retrieved. Lacroix and Lavency [10] extended SQL with a notion of preference and showed how the resulting query language could still be translated into the domain relational calculus. We explore the use of preference in databases in the setting of datalog. We introduce the formalism of preference datalog programs (PDPs) as preference logic programs without uninterpreted function symbols for this purpose. PDPs extend datalog not only with constructs to specify which predicate is to be optimized and the criterion for optimization but also with constructs to specify which predicate is to be relaxed and the criterion to be used for relaxation. We show that all of the soft requirements in Reference [10] can be directly encoded in PDPs. We first develop a naively-pruned bottom-up evaluation procedure that is sound and complete for computing answers to normal and relaxation queries when the PDPs are stratified; we then show how the evaluation scheme can be extended to the case when the programs are not necessarily stratified; and finally we develop an extension of the magic templates method for datalog [14] that constructs an equivalent but more efficient program for bottom-up evaluation.
Kannan Govindarajan, Ph.D.: He obtained his bachelor's degree in Computer Science and Engineering from the Indian Institute of Technology, Madras, and he completed his Ph.D. degree in Computer Science at the State University of New York at Buffalo. His dissertation research was on optimization and relaxation techniques for logic languages. His interests lie in the areas of programming languages, databases, and distributed systems. He currently leads the trading community effort in the E-speak Operation at Hewlett Packard Company. Prior to that, he was a member of the Java Products Group at Oracle Corporation.
Bharat Jayaraman, Ph.D.: He is a Professor in the Department of Computer Science at the State University of New York at Buffalo. He obtained his bachelor's degree in Electronics from the Indian Institute of Technology, Madras (1975), and his Ph.D. from the University of Utah (1981). His research interests are in programming languages and declarative modeling of complex systems. Dr. Jayaraman has published over 50 papers in refereed conferences and journals. He has served on the program committees of several conferences in the area of programming languages, and he is presently on the Editorial Board of the Journal of Functional and Logic Programming.
Surya Mantha, Ph.D.: He is a manager in the Communications and Software Services Group of Pittiglio Rabin Todd & McGrath (PRTM), a management consulting firm serving high technology industries. He obtained a bachelor's degree in Computer Science and Engineering from the Indian Institute of Technology, Kanpur, an MBA in Finance and Competitive Strategy from the University of Rochester, and a Ph.D. in Computer Science from the University of Utah (1991). His research interests are in the modeling of complex business processes, inter-enterprise application integration, and business strategy. Dr. Mantha has two US patents, and has published over 10 research papers. Prior to joining PRTM, he was a researcher and manager in the Architecture and Document Services Technology Center at Xerox Corporation in Rochester, New York.

5.
We study queries over databases with external functions, from a language-independent perspective. The input and output types of the external functions can be atomic values, flat relations, nested relations, etc. We propose a new notion of data-independence for queries on databases with external functions, which naturally extends the notion of generic queries on relational databases without external functions. In contrast to previous such notions, ours can also be applied to queries expressed in query languages with iterations. Next, we propose two natural notions of computability for queries over databases with external functions, and prove that they are equivalent, under reasonable assumptions. Thus, our definition of computability is robust. Finally, based on this equivalence result, we give examples of complete query languages with external functions. A byproduct of the equivalence result is the fact that Relational Machines (Abiteboul and V. Vianu, 1991; Abiteboul et al., 1992) are complete on nested relations, whereas they are known not to be complete on flat relations.

6.
We show that some relational queries, which we call quantified queries, are not well supported in distributed environments. We give a formal definition of quantified queries, propose a language in which to express such queries, and provide a procedure to compute answers in this new language in the context of distributed databases. The proposed language is made up of high-level, declarative operators (called generalised quantifiers), and therefore it can be used in combination with several distributed frameworks. Our approach is designed to be as general as possible; it assumes horizontally partitioned relations, but nothing else, so no data placement or replication is used. We present an implementation and algorithms for the new language, propose some basic optimisations, and give experimental results which show that the new approach is efficient and scales well.

7.
Multimedia databases have emerged to cope with the huge amount of multimedia data that results from technological advancement. However, more intelligent techniques are required to satisfy the different query requirements of multimedia users. This study extends the query capability of a multimedia database through the integration of a fuzzy rule-based system. In addition to fuzzy semantic rules, which deduce new information from the data stored in the database, fuzzy spatial and temporal relations, which are inherent to multimedia applications, are defined in the rule-based system. Users can formulate fuzzy semantic, spatial, temporal, and spatiotemporal queries, resulting in the deduction of new information using the rules defined in the rule-based system. With some practical examples, the paper presents how a fuzzy rule-based system integrated with a fuzzy multimedia database improves the query capabilities of the database system. © 2011 Wiley Periodicals, Inc.

8.
A reduced cover set of the set of full reducer semijoin programs for an acyclic query graph for a distributed database system is given. An algorithm is presented that determines the minimum-cost full reducer program. The computational complexity of finding the optimal full reducer for a single relation is of the same order as that of finding the optimal full reducer for all relations. The optimization algorithm is able to handle query graphs where more than one attribute is common between the relations. A method for determining the optimum profitable semijoin program is presented. A low-cost algorithm which determines a near-optimal profitable semijoin program is outlined. This is done by converting a semijoin program into a partial order graph. This graph also allows one to maximize the concurrent processing of the semijoins. It is shown that the minimum response time is given by the largest-cost path of the partial order graph. This reducibility is used as a post-optimizer for the SSD-1 query optimization algorithm. It is shown that the least upper bound on the length of any profitable semijoin program is N(N-1) for a query graph of N nodes.
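For readers unfamiliar with the basic operation behind these programs, the following minimal sketch shows a single semijoin on relations held as lists of dictionaries, including the case of more than one common attribute. It does not reproduce the paper's cost model or optimization algorithm; the function `semijoin` and the sample relations are hypothetical.

```python
def semijoin(r, s, attrs):
    """Return the rows of r that match at least one row of s on all of attrs.

    Only the projection of s onto attrs is needed, which is why a semijoin
    can reduce r at one site by shipping a small set of key values.
    """
    s_keys = {tuple(row[a] for a in attrs) for row in s}
    return [row for row in r if tuple(row[a] for a in attrs) in s_keys]


# Hypothetical relations that share two attributes, "cust" and "region".
orders = [
    {"cust": 1, "region": "EU", "item": "disk"},
    {"cust": 2, "region": "US", "item": "tape"},
    {"cust": 3, "region": "EU", "item": "cpu"},
]
customers = [
    {"cust": 1, "region": "EU", "name": "Ada"},
    {"cust": 3, "region": "US", "name": "Bob"},
]

# Keep only the orders that have a matching customer on both attributes.
reduced_orders = semijoin(orders, customers, ["cust", "region"])
```

Here only the first order survives, because customer 3 is registered in a different region than the order that refers to it.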

9.
In this paper, we identify a novel and interesting type of query, the contextual ranking query, which returns the ranks of query tuples among some context tuples given in the query. Contextual ranking queries are useful for OLAP and decision support applications in non-traditional data exploration. They provide a mechanism to quickly identify where tuples stand within the context. In this paper, we extend the SQL language to express contextual ranking queries and propose a general partition-based framework for processing them. In this framework, we use a novel method that utilizes bitmap indices built on ranking functions. This method can efficiently identify a small number of candidate tuples, and thus achieves lower cost than alternative methods. We analytically investigate the advantages and drawbacks of these methods, according to a preliminary cost model. Experimental results suggest that the algorithm using bitmap indices on ranking functions can be substantially more efficient than other methods.
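To make the notion of a contextual rank concrete, here is a naive sketch that scans all context tuples. It assumes that rank 1 means the highest score and that tied tuples share a rank; neither convention is stated in the abstract, and the bitmap-index method the paper proposes is not shown. The function `contextual_rank`, the scoring function, and the data are hypothetical.

```python
def contextual_rank(query_tuple, context_tuples, score):
    """Rank of query_tuple among context_tuples under the ranking function score.

    Rank 1 is the best (highest) score; the rank is one plus the number of
    context tuples that score strictly higher than the query tuple.
    """
    query_score = score(query_tuple)
    return 1 + sum(1 for t in context_tuples if score(t) > query_score)


def profit(t):
    """Hypothetical ranking function: profit of a sales record."""
    return t["revenue"] - t["cost"]


context = [
    {"store": "A", "revenue": 90, "cost": 40},  # profit 50
    {"store": "B", "revenue": 70, "cost": 40},  # profit 30
    {"store": "C", "revenue": 30, "cost": 20},  # profit 10
]
query = {"store": "Q", "revenue": 60, "cost": 30}  # profit 30

rank = contextual_rank(query, context, profit)
```

The scan costs one score evaluation per context tuple, which is the baseline that an index-based method would aim to beat by examining only a few candidate tuples.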

10.
The execution of logic queries in a distributed database environment is studied. Conventional optimization strategies, such as the early evaluation of selection conditions and the clustering of processing to manipulate and exchange large sets of tuples, are redefined in view of the additional difficulties due to logic queries, in particular recursive rules. In order to allow efficient processing of these logic queries, several program transformation techniques that attempt to minimize distribution costs based on the idea of semijoins and generalized semijoins in conventional databases are presented. Although local computation of semijoins is not possible for the general case, classes of programs are indicated for which these transformations succeed in producing set-oriented computation. Processes evaluating the recursive program in a distributed network are described, and an efficient method for testing the termination of the computation is developed. The approach is compared with sequential as well as dataflow-oriented evaluation.

11.
It has been previously proposed that a query to a database of time signals can be accelerated by searching over partial data in Fourier space. It is proposed here that such queries can be accelerated further by employing composite Fourier filtering. In particular, many fractional power filters are trained on sets of vectors from the database. In a single comparison, information about the entire set of vectors is available. Query times are shortened mainly due to the ability to examine several vectors in the database simultaneously.

12.
Although spatio-temporal databases have received considerable attention recently, there has been little work on processing range sum queries on the historical records of moving objects despite their importance. Since direct access to a huge amount of data to answer range sum queries incurs a prohibitive computation cost, materialization techniques based on existing index structures are suggested. A simple but effective solution is to apply the materialization technique to the MVR-tree, known as the most efficient structure for window queries with spatio-temporal conditions. Aggregate structures based on other index structures such as the HR-tree and the 3DR-tree do not provide satisfactory query performance. In this paper, we propose a new index structure called the Adaptively Partitioned Aggregate R-Tree (APART) and query processing algorithms to efficiently process range sum queries in many situations. Our experimental results show that the performance of the APART is typically 1.3 times better than that of its competitor for a wide range of scenarios.

13.
Efficient fuzzy ranking queries in uncertain databases   Total citations: 1, self-citations: 1, citations by others: 0
Recently, uncertain data have received considerable attention along with technical advances in geographical tracking, sensor networks, RFID, etc. Ranking queries over uncertain data have also become a research focus of uncertain data management. With the rapidly growing applications of fuzzy set theory, many queries involving fuzzy conditions now appear. These fuzzy conditions are widely applied for querying over uncertain data. For instance, in a weather monitoring system, weather data are inherently uncertain due to measurement errors. Weather data depicting heavy rain are desired, where "heavy" is ambiguous in the fuzzy query. However, fuzzy queries cannot ensure returning the expected results from uncertain databases. In this paper, we study a novel kind of ranking query, Fuzzy Ranking queries (FRanking queries), which extend the traditional notion of ranking queries. FRanking queries are able to handle fuzzy queries submitted by users and return the k results which are the most likely to satisfy fuzzy queries in uncertain databases. Due to fuzzy query conditions, the ranks of tuples cannot be evaluated by existing ranking functions. We propose the Fuzzy Ranking Function to calculate tuples' ranks in uncertain databases for both attribute-level and tuple-level uncertainty models. Our ranking function takes both the uncertainty and the fuzzy semantics into account. FRanking queries are formally defined based on the Fuzzy Ranking Function. For answering FRanking queries, we present a pruning method which safely prunes unnecessary tuples to reduce the search space. To further improve the efficiency, we design an efficient algorithm, namely the Incremental Membership Algorithm (IMA), which efficiently answers FRanking queries by evaluating the ranks of incremental tuples under each threshold for the fuzzy set. We demonstrate the effectiveness and efficiency of our methods through theoretical analysis and experiments with synthetic and real datasets.
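The abstract does not define the Fuzzy Ranking Function, so the sketch below is only one plausible way to combine fuzzy membership with attribute-level uncertainty: it scores each tuple by its expected membership in the fuzzy set "heavy rain" and returns the top k. The membership thresholds, the scoring rule, and all names (`heavy_rain_membership`, `expected_membership`, `top_k`) are assumptions made for illustration and should not be read as the paper's method.

```python
def heavy_rain_membership(mm_per_hour, low=10.0, high=50.0):
    """Assumed linear membership in the fuzzy set 'heavy rain'.

    Returns 0 at or below `low`, 1 at or above `high`, linear in between.
    """
    if mm_per_hour <= low:
        return 0.0
    if mm_per_hour >= high:
        return 1.0
    return (mm_per_hour - low) / (high - low)


def expected_membership(alternatives, membership):
    """Expected membership of an uncertain attribute.

    `alternatives` is a list of (value, probability) pairs whose
    probabilities are assumed to sum to 1.
    """
    return sum(prob * membership(value) for value, prob in alternatives)


def top_k(readings, k, membership):
    """Names of the k readings with the highest expected membership."""
    ranked = sorted(
        readings,
        key=lambda name: expected_membership(readings[name], membership),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical uncertain rainfall readings (mm per hour) from three stations.
readings = {
    "A": [(50.0, 0.75), (10.0, 0.25)],  # expected membership 0.75
    "B": [(30.0, 1.0)],                 # expected membership 0.5
    "C": [(10.0, 1.0)],                 # expected membership 0.0
}

best_two = top_k(readings, 2, heavy_rain_membership)
```

This version evaluates every tuple in full; the pruning and incremental evaluation described in the abstract would avoid that work.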

14.
This paper presents some applications of the partial evaluation method to query optimization in deductive databases. A Horn clause transformation is used for the partial evaluation of a query in an intensional database, and its application to multiple query processing is discussed. Three strategies are presented for the compatible case, the ordered case, and the crossed case. In each case, partial evaluation is used to preprocess the intensional database in order to obtain subqueries which directly access an extensional database.

15.
Answering queries in indefinite systems is a difficult problem both computationally, since it involves non-Horn clauses and factoring, and conceptually, concerning producing beliefs for formulas not derivable from the system. To provide a basis for reasonable beliefs, we propose new criteria as an alternative to the Full Information Principle. Then an approach to producing stable beliefs, called the Plausible World Assumption (PWA), is introduced. It is shown how a set of non-Horn clauses can be transformed into a set of so-called singleton-head-rules such that evaluation of a given query is reduced to processing a set of Horn clauses relevant to the query. Finally, algorithms are presented for computing facts and beliefs for atomic queries in accord with the PWA. This method is shown to be more efficient than the known techniques for query evaluation in indefinite systems.

16.
It is known that standard query languages for constraint databases lack the power to express connectivity properties. Such properties are important in the context of geographical databases, where one naturally wishes to ask queries about connectivity (What are the connected components of a given set?) or reachability (Is there a path from A to B that lies entirely in a given region?). No existing constraint query languages that allow closed-form evaluation can express these properties. In the first part of the paper, we show that, in principle, there is no obstacle to getting closed languages that can express connectivity and reachability queries. In fact, we show that adding any topological property to standard languages like FO+Lin and FO+Poly results in a closed language. In the second part of the paper, we look for tractable closed languages for expressing reachability and connectivity queries. We introduce path logic, which allows one to state properties of paths with respect to given regions. We show that it is closed, has polynomial time data complexity for linear and polynomial constraints, and can express a large number of reachability properties beyond simple connectivity. Query evaluation in the logic involves obtaining a discrete abstraction of a continuous path, and model-checking of temporal formulae on the discrete structure.

17.
Due to their great benefits for many database applications, skyline queries have received considerable attention in recent decades. Skyline queries attempt to assist users by identifying the set of data items which represent the best results that meet the conditions of a given query. Most of the existing skyline techniques concentrate on identifying skylines over a single relation. However, in distributed databases, processing skyline queries requires accessing multiple relations which might be located at different sites. Consequently, data items from these multiple relations need to be joined, and thus transferring these data items from one site to another is unavoidable. Moreover, the previous techniques also assume that the values of the dimensions for every data item are present (complete), which is not always true, as some values may be missing. In this paper, we propose an approach for processing skyline queries in incomplete distributed databases. The approach derives skylines from multiple relations, where dominated data items are removed before joining the relations to reduce the processing time and the network cost. The experimental results illustrate that our proposed approach outperforms the previous approaches in terms of processing time and network cost.
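The dominance test that underlies any skyline computation can be sketched as follows. This is a deliberately simplified illustration: it assumes a single relation with complete values where smaller is better in every dimension, so it does not handle the missing values or the distributed joins that the paper addresses. The functions `dominates` and `skyline` and the sample data are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every dimension and better in at least one.

    Smaller values are assumed to be better in all dimensions.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (price, distance to the beach in km).
hotels = [(50, 8), (80, 2), (60, 9), (90, 3), (70, 5)]

best = skyline(hotels)
```

Removing dominated items locally in this way before shipping data is the kind of early pruning the abstract describes, although with incomplete data the dominance relation needs a more careful definition than the one used here.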

18.
Information Systems, 1999, 24(7): 569-595
This paper introduces and studies the relational meta algebra, a statically typed extension of the relational algebra to allow for meta programming in databases. In this meta algebra one can manipulate database relations involving not only stored data values (as in classical relational databases) but also stored relational algebra expressions. Topics discussed include modeling of advanced database applications involving “procedural data”; desirability as well as limitations of a strict typing discipline in this context; equivalence with a first-order calculus; and global expressive power and non-redundancy of the proposed formalism.

19.
Query processing in uncertain databases has become increasingly important due to the wide existence of uncertain data in many real applications. Unlike the handling of precise data, uncertain query processing needs to consider the data uncertainty and answer queries with confidence guarantees. In this paper, we formulate and tackle an important query, namely the probabilistic inverse ranking (PIR) query, which retrieves the possible ranks of a given query object in an uncertain database with confidence above a probability threshold. We present effective pruning methods to reduce the PIR search space, which can be seamlessly integrated into an efficient query procedure. Moreover, we tackle the problem of PIR query processing in high-dimensional spaces by reducing high-dimensional uncertain data to a lower-dimensional space. Furthermore, we study three interesting and useful aggregate PIR queries, that is, MAX, top-m, and AVG PIRs. We also study an important query type, PIR with an uncertain query object (namely UQ-PIR), and design specific rules to facilitate the pruning. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed approaches over both real and synthetic data sets, under various experimental settings.

20.
We give a general framework for approximate query processing in semistructured databases. We focus on regular path queries, which are an integral part of most of the query languages for semistructured databases. To enable approximations, we allow the regular path queries to be distorted. The distortions are expressed in the system by using weighted regular expressions, which correspond to weighted regular transducers. After defining the notion of weighted approximate answers, we show how to compute them in order of their proximity to the query. In the new approximate setting, query containment has to be redefined in order to take into account the quantitative proximity information in the query answers. For this, we define approximate containment and its variants, k-containment and reliable containment. Then, we give an optimal algorithm for deciding k-containment. Regarding reliable approximate containment, we show that it is polynomial-time equivalent to the notorious limitedness problem in distance automata.
