期刊界 (All Journals): journal search and academic search

1.

Filtering Data Streams for Entity-Based Continuous Queries

Cheng Reynold Kao Ben Kwan Alan Prabhakar Sunil Tu Yicheng 《Knowledge and Data Engineering, IEEE Transactions on》2010,22(2):234-248

The idea of allowing query users to relax their correctness requirements in order to improve performance of a data stream management system (e.g., location-based services and sensor networks) has been recently studied. By exploiting the maximum error (or tolerance) allowed in query answers, algorithms for reducing the use of system resources have been developed. In most of these works, however, query tolerance is expressed as a numerical value, which may be difficult to specify. We observe that in many situations, users may not be concerned with the actual value of an answer, but rather which object satisfies a query (e.g., "who is my nearest neighbor?"). In particular, an entity-based query returns only the names of objects that satisfy the query. For these queries, it is possible to specify a tolerance that is "nonvalue-based." In this paper, we study fraction-based tolerance, a type of nonvalue-based tolerance, where a user specifies the maximum fractions of a query answer that can be false positives and false negatives. We develop fraction-based tolerance for two major classes of entity-based queries: 1) nonrank-based query (e.g., range queries) and 2) rank-based query (e.g., k-nearest-neighbor queries). These definitions provide users with an alternative to specify the maximum tolerance allowed in their answers. We further investigate how these definitions can be exploited in a distributed stream environment. We design adaptive filter algorithms that allow updates to be dropped conditionally at the data stream sources without affecting the overall query correctness. Extensive experimental results show that our protocols reduce the use of network and energy resources significantly.
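The notion of fraction-based tolerance in the abstract above can be illustrated with a minimal sketch. The exact definitions used in the paper are not given in this listing, so the denominators chosen here (false positives relative to the returned answer, false negatives relative to the true answer) and the function names `tolerance_fractions` and `within_tolerance` are assumptions made for illustration only.

```python
def tolerance_fractions(returned, truth):
    """Return (false-positive fraction, false-negative fraction).

    Assumption: false positives are measured against the returned answer,
    false negatives against the true answer. The paper may define these
    fractions differently.
    """
    returned, truth = set(returned), set(truth)
    false_pos = returned - truth   # objects reported but not actually satisfying the query
    false_neg = truth - returned   # objects satisfying the query but not reported
    fp_frac = len(false_pos) / len(returned) if returned else 0.0
    fn_frac = len(false_neg) / len(truth) if truth else 0.0
    return fp_frac, fn_frac


def within_tolerance(returned, truth, max_fp, max_fn):
    """Check an entity-based answer against user-specified maximum fractions."""
    fp_frac, fn_frac = tolerance_fractions(returned, truth)
    return fp_frac <= max_fp and fn_frac <= max_fn


# A range query whose true answer is {a, b, c, d}; the system reports {a, b, c, e}.
truth = {"a", "b", "c", "d"}
returned = {"a", "b", "c", "e"}
print(tolerance_fractions(returned, truth))
print(within_tolerance(returned, truth, max_fp=0.3, max_fn=0.3))
```

Because only object names are compared, the user never has to state a numerical error bound on the underlying values, which is the point the abstract makes about nonvalue-based tolerance.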

2.

How to Comprehend Queries Functionally

Torsten Grust Marc H. Scholl 《Journal of Intelligent Information Systems》1999,12(2-3):191-218

Compilers and optimizers for declarative query languages use some form of intermediate language to represent user-level queries. The advent of compositional query languages for orthogonal type systems (e.g., OQL) calls for internal query representations beyond extensions of relational algebra. This work adopts a view of query processing which is greatly influenced by ideas from the functional programming domain. A uniform formal framework is presented which covers all query translation phases, including user-level query language compilation, query optimization, and execution plan generation. We pursue the type-based design (based on initial algebras) of a core functional language which is then developed into an intermediate representation that fits the needs of advanced query processing. Based on the principle of structural recursion we extend the language by monad comprehensions (which provide us with a calculus-style sublanguage that proves to be useful during the optimization of nested queries) and combinators (abstractions of the query operators implemented by the underlying target query engine). Due to its functional nature, the language is susceptible to program transformation techniques that were developed by the functional programming as well as the functional data model communities. We show how database query processing can substantially benefit from these techniques.

3.

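Query processing based on schema integration semantics (total citations: 1; self-citations: 0; citations by others: 1)

Shi Xiangbin, Zhang Bin, Yu Ge, Zheng Huaiyuan 《Journal of Software (软件学报)》1998,9(5):321-326

In multidatabase systems that adopt an object-oriented model as the common data model, query processing based on schema integration semantics must not only translate queries against the integrated schema into queries against the export schemas, but also semantically minimize the data needed to answer user queries and guarantee the correctness of object references. To achieve this goal, the paper proposes several new concepts, query processing rules based on schema integration semantics, and a query processing method for path expressions.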

4.

Incomplete deductive databases (total citations: 1; self-citations: 0; citations by others: 1)

Tomasz Imielinski 《Annals of Mathematics and Artificial Intelligence》1991,3(2-4):259-293

We investigate the complexity of query processing in databases which have both incompletely specified data and deductive rules. The paper is divided into two parts: in the first we consider databases in which incompletely specified data occurs only in the database intension; in the second we consider databases in which incomplete information is represented only in database extension. We prove that, in general, the query processing problem for databases with incomplete intensions is undecidable. A number of classes of rules for which all conjunctive queries can be processed in polynomial time are then characterized. For databases with incomplete extensions we prove a number of CoNP completeness results. For instance, we demonstrate that processing disjunctions which are restricted to individual columns of database predicates can, in general, be as bad as processing arbitrary disjunctions (i.e. CoNP-complete). This falsifies the conjecture that such limited disjunctions could be computationally beneficial. We also show two simple examples of situations in which query processing is guaranteed to be polynomial. These situations are linked to certain assumptions about database updates. Finally, we provide a summary of the data complexity of queries depending on the type of database extension, intension, query sublanguage and Open World vs Closed World assumption. Research supported by NSF grant DCR 85-04140. More precisely, we can say this only in the presence of the closed world assumption [18].

5.

Diagnosing and correcting design inconsistencies in source code with logical abduction

Sergio Castro Angela Lozano 《Science of Computer Programming》2011,76(12):1113-1129

Correcting design decay in source code is not a trivial task. Diagnosing and subsequently correcting inconsistencies between a software system’s code and its design rules (e.g., database queries are only allowed in the persistence layer) and coding conventions can be complex, time-consuming and error-prone. Providing support for this process is therefore highly desirable, but of a far greater complexity than suggesting basic corrective actions for simplistic implementation problems (like the “declare a local variable for non-declared variable” suggested by Eclipse). We present an abductive reasoning approach to inconsistency correction that consists of (1) a means for developers to document and verify a system’s design and coding rules, (2) an abductive logic reasoner that hypothesizes possible causes of inconsistencies between the system’s code and the documented rules and (3) a library of corrective actions for each hypothesized cause. This work builds on our previous work, where we expressed design rules as equality relationships between sets of source code artifacts (e.g., the set of methods in the persistence layer is the same as the set of methods that query the database). In this paper, we generalize our approach to design rules expressed as user-defined binary relationships between two sets of source code artifacts (e.g., every state changing method should invoke a persistence method). We illustrate our approach on the design of IntensiVE, a tool suite that enables defining sets of source code artifacts intensionally (by means of logic queries) and verifying relationships between such sets.

6.

Trip planning queries with location privacy in spatial databases

Subarna Chowdhury Soma Tanzima Hashem Muhammad Aamir Cheema Samiha Samrose 《World Wide Web》2017,20(2):205-236

Privacy has become a major concern for the users of location-based services (LBSs) and researchers have focused on protecting user privacy for different location-based queries. In this paper, we propose techniques to protect location privacy of users for trip planning (TP) queries, a novel type of query in spatial databases. A TP query enables a user to plan a trip with the minimum travel distance, where the trip starts from a source location, goes through a sequence of points of interest (POIs) (e.g., restaurant, shopping center), and ends at a destination location. Due to privacy concerns, users may not wish to disclose their exact locations to the location-based service provider (LSP). In this paper, we present the first comprehensive solution for processing TP queries without disclosing a user’s actual source and destination locations to the LSP. Our system protects the user’s privacy by sending either a false location or a cloaked location of the user to the LSP but provides exact results of the TP queries. We develop a novel technique to refine the search space as an elliptical region using geometric properties, which is the key idea behind the efficiency of our algorithms. To further reduce the processing overhead while computing a trip from a large POI database, we present an approximation algorithm for privacy-preserving TP queries. Extensive experiments show that the proposed algorithms evaluate TP queries in real time with the desired level of location privacy.

7.

Access to indexed hierarchical databases using a relational query language

Chung C.-W. McCloskey K.E. 《Knowledge and Data Engineering, IEEE Transactions on》1993,5(1):155-161

An efficient means of accessing indexed hierarchical databases using a relational query language is presented. The purpose is to achieve an effective sharing of heterogeneous distributed databases. Translation of hierarchical data to an equivalent relational data definition, translation of a relational query language statement to an equivalent program that can be processed by a hierarchical database management system, and automatic selection of secondary indexes of hierarchical databases are investigated. A major portion of the result has been implemented, and the performance of the implemented system is analyzed. The performance of the system is satisfactory for a wide range of test data and test queries. It is shown that the utilization of the secondary index significantly enhances the efficiency in accessing hierarchical databases.

8.

Ultrawrap: SPARQL execution on relational data

《Journal of Web Semantics》2013

The Semantic Web’s promise of web-wide data integration requires the inclusion of legacy relational databases, i.e. the execution of SPARQL queries on RDF representation of the legacy relational data. We explore a hypothesis: existing commercial relational databases already subsume the algorithms and optimizations needed to support effective SPARQL execution on existing relationally stored data. The experiment is embodied in a system, Ultrawrap, that encodes a logical representation of the database as an RDF graph using SQL views and a simple syntactic translation of SPARQL queries to SQL queries on those views. Thus, in the course of executing a SPARQL query, the SQL optimizer uses the SQL views that represent a mapping of relational data to RDF, and optimizes its execution. In contrast, related research is predicated on incorporating optimizing transforms as part of the SPARQL to SQL translation, and/or executing some of the queries outside the underlying SQL environment. Ultrawrap is evaluated using two existing benchmark suites that derive their RDF data from relational data through a Relational Database to RDF (RDB2RDF) Direct Mapping; the evaluation is repeated for each of three major relational database management systems. Empirical analysis reveals two existing relational query optimizations that, if applied to the SQL produced from a simple syntactic translation of SPARQL queries (with bound predicate arguments) to SQL, consistently yield query execution time that is comparable to that of SQL queries written directly for the relational representation of the data. The analysis further reveals the two optimizations are not uniquely required to achieve a successful wrapper system. The evidence suggests effective wrappers will be those that are designed to complement the optimizer of the target database.
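The general idea described above, exposing relational rows as RDF-style triples through a SQL view and answering a SPARQL pattern with a plain SQL query over that view, can be sketched with Python's standard `sqlite3` module. This is only an illustration under stated assumptions: the `person` table, the `triples` view, the IRI-like strings and the hand-written SQL are all hypothetical and do not reproduce Ultrawrap's actual view definitions or translation rules.

```python
import sqlite3


def run_demo():
    """Build a toy relational table, wrap it in a triple view, and query it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        INSERT INTO person VALUES (1, 'Ada', 'London'), (2, 'Alan', 'Manchester');

        -- Hypothetical triple view: one (s, p, o) row per column value.
        CREATE VIEW triples AS
            SELECT 'person/' || id AS s, 'person#name' AS p, name AS o FROM person
            UNION ALL
            SELECT 'person/' || id AS s, 'person#city' AS p, city AS o FROM person;
        """
    )
    # Hand translation of the SPARQL pattern
    #   SELECT ?n WHERE { ?s <person#name> ?n . ?s <person#city> "London" }
    # into a self-join on the view; each triple pattern becomes one alias.
    sql = """
        SELECT t1.o AS n
        FROM triples AS t1
        JOIN triples AS t2 ON t1.s = t2.s
        WHERE t1.p = 'person#name'
          AND t2.p = 'person#city'
          AND t2.o = 'London'
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


print(run_demo())
```

Since the predicates in the query are bound constants, a relational optimizer has the opportunity to discard the branches of the `UNION ALL` that cannot match, which is the kind of existing optimization the abstract refers to; whether a given engine does so is not something this sketch demonstrates.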

9.

A bi-labeling based XPath processing system

Yi Chen Susan B. Davidson Yifeng Zheng 《Information Systems》2010

We present BLAS, a Bi-LAbeling based XPath processing System. BLAS uses two labeling schemes to speed up query processing: P-labeling for processing consecutive child (or parent) axis traversals, and D-labeling for processing descendant (or ancestor) axis traversals. XML data are stored in labeled form and indexed. Algorithms are presented for translating XPath queries to SQL expressions. BLAS reduces the number of joins in the SQL query translated from a given XPath query and reduces the number of disk accesses required to execute the SQL query compared with the traditional XPath processing using D-labeling alone. We also propose an approximate P-labeling scheme and the corresponding query translation algorithm to handle XML data trees that contain a large number of distinct tag names, and/or are very deep. This extension captures a spectrum of XPath-to-SQL query translation schemes, ranging from existing schemes that do not use P-labels to the one that uses exact P-labels. Experimental results demonstrate the efficiency of the BLAS system.
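The abstract does not spell out how D-labels are built. A common way to support descendant and ancestor tests is to give each node a (start, end) interval from a depth-first traversal, so that a node is a descendant of another exactly when its interval is strictly contained in the other's. The sketch below assumes that interval scheme; the function names and the tuple-based tree encoding are hypothetical and are not taken from BLAS.

```python
from itertools import count


def d_label(tree):
    """Assign DFS interval labels; returns a list of (tag, start, end) in document order.

    `tree` is a nested tuple (tag, [children]). The interval scheme is an
    assumption for illustration, not necessarily the labeling used by BLAS.
    """
    counter = count(1)
    labels = []

    def visit(node):
        tag, children = node
        start = next(counter)
        slot = len(labels)
        labels.append(None)          # reserve the position to keep document order
        for child in children:
            visit(child)
        labels[slot] = (tag, start, next(counter))

    visit(tree)
    return labels


def is_descendant(node, ancestor):
    """True when node's interval lies strictly inside ancestor's interval."""
    return ancestor[1] < node[1] and node[2] < ancestor[2]


def descendants(labels, ancestor_tag, descendant_tag):
    """Evaluate a path of the form //ancestor_tag//descendant_tag with interval tests only."""
    return [
        d for a in labels if a[0] == ancestor_tag
        for d in labels if d[0] == descendant_tag and is_descendant(d, a)
    ]


doc = ("book", [("chapter", [("title", []), ("para", [])]), ("appendix", [])])
labels = d_label(doc)
print(labels)
print(descendants(labels, "chapter", "title"))
```

In a relational encoding, the two comparisons in `is_descendant` become a join condition between two rows of a node table, which is why each descendant step in an XPath query costs a join when only this kind of label is available.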

10.

Rewriting rules to permeate complex similarity and fuzzy queries within a relational database system

Penzo W. 《Knowledge and Data Engineering, IEEE Transactions on》2005,17(2):255-270

In recent years, the availability of complex data repositories (e.g., multimedia, genomic, semistructured databases) has paved the way to new potentials as to data querying. In this scenario, similarity and fuzzy techniques have proven to be successful principles for effective data retrieval. However, most proposals are domain specific and lack a general and integrated approach to deal with generalized complex queries, i.e., queries where multiple conditions are expressed, possibly on complex as well as on traditional data. To overcome such limitations, much work has been devoted to the development of middleware systems to support query processing on multiple repositories. Along similar lines, we present a formal framework to permeate complex similarity and fuzzy queries within a relational database system. As an example, we focus on multimedia data, which is represented in an integrated view with common database data. We have designed an application layer that relies on an algebraic query language, extended with MM-tailored operators, and that maps complex similarity and fuzzy queries to standard SQL statements that can be processed by a relational database system, exploiting standard facilities of modern extensible RDBMS. To show the applicability of our proposal, we implemented a prototype that provides the user with rich query capabilities, ranging from traditional database queries to complex queries gathering a mixture of Boolean, similarity, and fuzzy predicates on the data.

11.

GeMDA: A Multidimensional Data Partitioning Technique for Multiprocessor Database Systems

Yu-Lung Lo Kien A. Hua Honesty C. Young 《Distributed and Parallel Databases》2001,9(3):211-236

Several studies have repeatedly demonstrated that both the performance and scalability of a shared-nothing parallel database system depend on the physical layout of data across the processing nodes of the system. Today, data is allocated in these systems using horizontal partitioning strategies. This approach has a number of drawbacks. If a query involves the partitioning attribute, then typically only a small number of the processing nodes can be used to speed up the execution of this query. On the other hand, if the predicate of a selection query includes an attribute other than the partitioning attribute, then the entire data space must be searched. Again, this results in a waste of computing resources. In recent years, several multidimensional data declustering techniques have been proposed to address these problems. However, these schemes are too restrictive (e.g., FX, ECC), or optimized for a certain type of queries (e.g., DM, HCAM). In this paper, we introduce a new technique which is flexible, and performs well for general queries. We prove its optimality properties, and present experimental results showing that our scheme outperforms DM and HCAM by a significant margin.

12.

Semantics preserving SPARQL-to-SQL translation (total citations: 2; self-citations: 0; citations by others: 2)

Artem Shiyong Farshad 《Data & Knowledge Engineering》2009,68(10):973-1000

Most existing RDF stores, which serve as metadata repositories on the Semantic Web, use an RDBMS as a backend to manage RDF data. This motivates us to study the problem of translating SPARQL queries into equivalent SQL queries, which can then be optimized and evaluated by the relational query engine, with their results returned as SPARQL query solutions. The main contributions of our research are: (i) We formalize a relational algebra based semantics of SPARQL, which bridges the gap between SPARQL and SQL query languages, and prove that our semantics is equivalent to the mapping-based semantics of SPARQL; (ii) Based on this semantics, we propose the first provably semantics preserving SPARQL-to-SQL translation for SPARQL triple patterns, basic graph patterns, optional graph patterns, alternative graph patterns, and value constraints; (iii) Our translation algorithm is generic and can be directly applied to existing RDBMS-based RDF stores; and (iv) We outline a number of simplifications for the SPARQL-to-SQL translation to generate simpler and more efficient SQL queries and extend our defined semantics and translation to support the bag semantics of a SPARQL query solution. The experimental study showed that our proposed generic translation can serve as a good alternative to existing schema-dependent translations in terms of efficient query evaluation and/or ensured query result correctness.

13.

Schema mapping and query translation in heterogeneous P2P XML databases

Angela Bonifati Elaine Chang Terence Ho Laks V. S. Lakshmanan Rachel Pottinger Yongik Chung 《The VLDB Journal The International Journal on Very Large Data Bases》2010,19(2):231-256

Peers in a peer-to-peer data management system often have heterogeneous schemas and no mediated global schema. To translate queries across peers, we assume each peer provides correspondences between its schema and a small number of other peer schemas. We focus on query reformulation in the presence of heterogeneous XML schemas, including data–metadata conflicts. We develop an algorithm for inferring precise mapping rules from informal schema correspondences. We define the semantics of query answering in this setting and develop a query translation algorithm. Our translation handles an expressive fragment of XQuery and works both along and against the direction of mapping rules. We describe the HePToX heterogeneous P2P XML data management system which incorporates our results. We report the results of extensive experiments on HePToX on both synthetic and real datasets. We demonstrate our system's utility and scalability on different P2P distributions.

14.

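Research and implementation of key technologies for a heterogeneous database encryption and decryption system (total citations: 2; self-citations: 0; citations by others: 2)

Hao Wenning, Zhao Enlai, Liu Yudong, Huang Ya, Liu Juntao 《Journal of Computer Applications (计算机应用)》2010,30(9):2339-2343

Encrypting data is an effective way to protect the confidentiality of information. To address the poor compatibility of typical encryption and decryption systems with heterogeneous databases, and their limited modes of ciphertext query, a new database encryption approach is proposed: supported by domain metadata, it uses an object-relational mapping model to shield the differences among heterogeneous databases, and it builds ciphertext indexes to enable flexible and varied ciphertext queries. A heterogeneous database encryption and decryption system is designed and implemented. Experimental results and theoretical analysis show that the system supports encryption and decryption for multiple types of databases, provides multiple modes of ciphertext query, and improves the security of database encryption.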

15.

Rapidly finding CAD features using database optimization

《Computer aided design》2015

Automatic feature recognition aids downstream processes such as engineering analysis and manufacturing planning. Not all features can be defined in advance; a declarative approach allows engineers to specify new features without having to design algorithms to find them. Naive translation of declarations leads to executable algorithms with high time complexity. Database queries are also expressed declaratively; there is a large literature on optimizing query plans for efficient execution of database queries. Our earlier work investigated applying such technology to feature recognition, using a testbed interfacing a database system (SQLite) to a CAD modeler (CADfix). Feature declarations were translated into SQL queries which were then executed. The current paper extends this approach, using the PostgreSQL database, and provides several new insights: (i) query optimization works quite differently in these two databases, (ii) with care, an approach to query translation can be devised that works well for both databases, and (iii) when finding various simple common features, linear time performance can be achieved with respect to model size, with acceptable times for real industrial models. Further results also show how (i) lazy evaluation can be used to reduce the work performed by the CAD modeler, and (ii) estimating the time taken to compute various geometric operations can further improve the query plan. Experimental results are presented to validate our main conclusions.

16.

Query verification schemes for cloud-hosted databases: a brief survey

Faizal Riaz-ud-Din Robin Doss 《International Journal of Parallel, Emergent and Distributed Systems》2016,31(6):543-561

Database query verification schemes provide correctness guarantees for database queries. Typically such guarantees are required and advisable where queries are executed on untrusted servers. This need to verify query results, even though they may have been executed on one’s own database, is something new that has arisen with the advent of cloud services. The traditional model of hosting one’s own databases on one’s own servers did not require such verification because the hardware and software were both entirely within one’s control, and therefore fully trusted. However, with the economical and technological benefits of cloud services beckoning, many are now considering outsourcing both data and execution of database queries to the cloud, despite obvious risks. This survey paper provides an overview of the field of database query verification and explores the current state of the art in terms of query execution and correctness guarantees provided for query results. We also provide indications towards future work in the area.

17.

Correcting queries for XML

Sara Cohen Tali Brodianskiy 《Information Systems》2009,34(8):690-710

It has been observed that queries over XML data sources are often unsatisfiable. Unsatisfiability may stem from several different sources, e.g., the user may be insufficiently familiar with the labels appearing in the documents, or may not be intimately aware of the hierarchical structure of the documents. To deal with query and document mismatches, previous research has considered returning answers that maximally satisfy (in some sense) the query, instead of only returning strictly satisfying answers. However, this breaks the golden database rule that only strictly satisfying answers are returned when querying. Indeed, the relationship between the query and answers is no longer clear when unsatisfying answers are returned. To reinstate the golden database rule, this article proposes a framework for automatically correcting queries over XML. This framework generates similar satisfiable queries when the user query is unsatisfiable. The user can then choose a satisfiable query of interest, and receive exactly satisfying answers to this query.

18.

The price of validity in dynamic networks

《Journal of Computer and System Sciences》2007,73(3):245-264

Massive-scale self-administered networks like Peer-to-Peer and Sensor Networks have data distributed across thousands of participant hosts. These networks are highly dynamic with short-lived hosts being the norm rather than an exception. In recent years, researchers have investigated best-effort algorithms to efficiently process aggregate queries (e.g., sum, count, average, minimum and maximum) on these networks. Unfortunately, query semantics for best-effort algorithms are ill-defined, making it hard to reason about guarantees associated with the result returned. In this paper, we specify a correctness condition, Single-Site Validity, with respect to which the above algorithms are best-effort. We present a class of algorithms that guarantee validity in dynamic networks. Experiments on real-life and synthetic network topologies validate the performance of our algorithms, revealing the hitherto unknown price of validity.

19.

Evaluating refined queries in top-k retrieval systems (total citations: 2; self-citations: 0; citations by others: 2)

Kaushik Chakrabarti Ortega-Binderberger M. Mehrotra S. Porkaew K. 《Knowledge and Data Engineering, IEEE Transactions on》2004,16(2):256-270

In many applications, users specify target values for certain attributes/features without requiring exact matches to these values in return. Instead, the result is typically a ranked list of "top k" objects that best match the specified feature values. User subjectivity is an important aspect of such queries, i.e., which objects are relevant to the user and which are not depends on the perception of the user. Due to the subjective nature of top-k queries, the answers returned by the system to a user query often do not satisfy the user's need right away, either because the weights and the distance functions associated with the features do not accurately capture the user's perception or because the specified target values do not fully capture her information need or both. In such cases, the user would like to refine the query and resubmit it in order to get back a better set of answers. While there has been a lot of research on query refinement models, there is no work that we are aware of on supporting refinement of top-k queries efficiently in a database system. Done naively, each "refined" query can be treated as a "starting" query and evaluated from scratch. We explore alternative approaches that significantly reduce the cost of evaluating refined queries by exploiting the observation that the refined queries are not modified drastically from one iteration to another. Our experiments over a real-life multimedia data set show that the proposed techniques save more than 80 percent of the execution cost of refined queries over the naive approach and are more than an order of magnitude faster than a simple sequential scan.
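The naive baseline mentioned in the abstract, evaluating every refined query from scratch, can be sketched as follows. The weighted absolute-difference distance, the feature names and the function names are assumptions chosen for illustration; the paper's distance functions and its incremental techniques are not described in this listing and are not reproduced here.

```python
import heapq


def weighted_distance(obj, target, weights):
    """Weighted sum of per-feature absolute differences (an assumed distance function)."""
    return sum(w * abs(obj[f] - target[f]) for f, w in weights.items())


def top_k(objects, target, weights, k):
    """Naive top-k: score every object with a full scan and keep the k closest."""
    return heapq.nsmallest(
        k, objects, key=lambda o: weighted_distance(o, target, weights)
    )


objects = [
    {"id": 1, "price": 10, "size": 5},
    {"id": 2, "price": 12, "size": 1},
    {"id": 3, "price": 30, "size": 5},
]
target = {"price": 10, "size": 5}

# Starting query: both features weighted equally.
start = top_k(objects, target, {"price": 1.0, "size": 1.0}, k=2)
print([o["id"] for o in start])

# Refined query: the user signals that price matters much less than size.
# The naive approach simply rescans all objects with the new weights.
refined = top_k(objects, target, {"price": 0.1, "size": 1.0}, k=2)
print([o["id"] for o in refined])
```

The refined query differs from the starting query only in one weight, yet the naive approach rescores every object; that redundancy is what the abstract says the proposed techniques exploit.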

20.

Normalization and optimization of schema mappings

Georg Gottlob Reinhard Pichler Vadim Savenkov 《The VLDB Journal The International Journal on Very Large Data Bases》2011,20(2):277-302

Schema mappings are high-level specifications that describe the relationship between database schemas. They are an important tool in several areas of database research, notably in data integration and data exchange. However, a concrete theory of schema mapping optimization including the formulation of optimality criteria and the construction of algorithms for computing optimal schema mappings is completely lacking to date. The goal of this work is to fill this gap. We start by presenting a system of rewrite rules to minimize sets of source-to-target tuple-generating dependencies. Moreover, we show that the result of this minimization is unique up to variable renaming. Hence, our optimization also yields a schema mapping normalization. By appropriately extending our rewrite rule system, we also provide a normalization of schema mappings containing equality-generating target dependencies. An important application of such a normalization is in the area of defining the semantics of query answering in data exchange, since several definitions in this area depend on the concrete syntactic representation of the mappings. This is, in particular, the case for queries with negated atoms and for aggregate queries. The normalization of schema mappings allows us to eliminate the effect of the concrete syntactic representation of the mapping from the semantics of query answering. We discuss in detail how our results can be fruitfully applied to aggregate queries.