Found 20 similar documents; search time: 0 ms.
1.
Kian-Lee Tan, Cheng Hian Goh, Beng Chin Ooi 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(3):261-278
In many decision-making scenarios, decision makers require rapid feedback to their queries, which typically involve aggregates.
The traditional blocking execution model can no longer meet the demands of these users. One promising approach in the literature, called online aggregation, evaluates an aggregation query progressively as follows: as soon as certain data have been evaluated, approximate answers
are produced with their respective running confidence intervals; as more data are examined, the answers and their corresponding
running confidence intervals are refined. In this paper, we extend this approach to handle nested queries with aggregates
(i.e., at least one inner query block is an aggregate query) by providing users with (approximate) answers progressively as
the inner aggregation query blocks are evaluated. We address the new issues posed by nested queries. In particular, the answer
space begins with a superset of the final answers and is refined as the aggregates from the inner query blocks are refined.
For the intermediary answers to be meaningful, they have to be interpreted with the aggregates from the inner queries. We
also propose a multi-threaded model in evaluating such queries: each query block is assigned to a thread, and the threads can be evaluated concurrently and independently.
The time slice across the threads is nondeterministic in the sense that the user controls the relative rate at which these subqueries are being evaluated. For enumerative nested queries, we propose a priority-based evaluation strategy to present answers that are certainly in the final answer
space first, before presenting those whose validity may be affected as the inner query aggregates are refined. We implemented
a prototype system using Java and evaluated our system. Results for nested queries with a single level and with multiple levels of nesting
are reported. Our results show the effectiveness of the proposed mechanisms in providing progressive feedback that reduces
the initial waiting time of users significantly without sacrificing the quality of the answers.
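The running-estimate-plus-confidence-interval loop at the heart of online aggregation can be sketched as follows. This is a minimal illustration of the general idea, not the authors' system; the CLT-based normal interval and the function name are our own assumptions:

```python
import math

def online_avg(stream, z=1.96):
    """Yield a (running AVG estimate, confidence half-width) pair after
    each row, using a CLT-based normal approximation: as more rows are
    examined, the interval around the true average tightens."""
    n = 0
    total = 0.0
    sq_total = 0.0
    for x in stream:
        n += 1
        total += x
        sq_total += x * x
        mean = total / n
        if n > 1:
            var = max(sq_total - n * mean * mean, 0.0) / (n - 1)
            half = z * math.sqrt(var / n)
        else:
            half = float("inf")
        yield mean, half
```

A caller can stop consuming the generator as soon as the interval is tight enough, which is what gives the user rapid initial feedback instead of a blocking answer.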
Received April 25, 2000 / Accepted June 27, 2000
2.
We optimize relational queries using connection hypergraphs (CHGs). All operations including value-passing between SQL blocks
can be set-oriented. By introducing partial evaluations, operations in nested queries can be reordered. For a query
using views, we merge CHGs for the views and the query into one CHG and then apply query optimization. Furthermore, we may
simulate magic sets methods elegantly in a CHG. Sideways information-passing strategies (SIPS) in a CHG amount to partial
evaluations of SIPS paths. We introduce the maximum SIPS strategy, which performs SIPS for all bindings and all SIPS paths
for a query. The new method has several advantages. First, the maximum SIPS strategy can be more efficient than the previous
SIPS based on simple heuristics. Second, it is conceptually simple and easy to implement. Third, the processing strategies
may be incorporated with the search space for query execution plans, which is a proven optimization strategy introduced by
System R. Fourth, it provides a general framework of query optimization and may potentially be used to optimize next-generation
database systems.
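The set-oriented flavor of sideways information passing can be illustrated with a toy two-relation example. This is a hypothetical sketch under our own names and data layout; the actual method operates on SIPS paths in a connection hypergraph, not on Python lists:

```python
def sips_join(outer, inner, outer_key, inner_key):
    """Collect the outer relation's binding values once, pass them
    sideways to restrict the inner scan, then hash-join: the inner
    relation is filtered set-at-a-time rather than once per outer tuple."""
    bindings = {row[outer_key] for row in outer}
    index = {}
    for row in inner:
        if row[inner_key] in bindings:          # sideways-passed restriction
            index.setdefault(row[inner_key], []).append(row)
    return [(o, i) for o in outer for i in index.get(o[outer_key], [])]
```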
Received September 1, 1993 / Accepted January 8, 1996
3.
Optimizing multiple dimensional queries simultaneously in multidimensional databases
Weifa Liang, Maria E. Orlowska, Jeffrey X. Yu 《The VLDB Journal The International Journal on Very Large Data Bases》2000,8(3-4):319-338
Some significant progress related to multidimensional data analysis has been achieved in the past few years, including the
design of fast algorithms for computing datacubes, selecting some precomputed group-bys to materialize, and designing efficient
storage structures for multidimensional data. However, little work has been carried out on multidimensional query optimization
issues. In particular, the response time (or evaluation cost) for answering several related dimensional queries simultaneously is crucial for OLAP applications. Recently, Zhao et al. first explored this problem by presenting three heuristic algorithms.
In this paper we first consider in detail two cases of the problem in which all the queries are either hash-based star joins
or index-based star joins only. In the case of the hash-based star join, we devise a polynomial approximation algorithm which
delivers a plan whose evaluation cost is $O(n^{\epsilon})$ times the optimal, where $n$ is the number of queries and $\epsilon$ is a fixed constant. We also present an exponential algorithm which delivers a plan with the optimal evaluation cost. In the case of the index-based
star join, we present a heuristic algorithm which delivers a plan whose evaluation cost is n times the optimal, and an exponential algorithm which delivers a plan with the optimal evaluation cost. We then consider
a general case in which both hash-based star-join and index-based star-join queries are included. For this case, we give a
possible improvement on the work of Zhao et al., based on an analysis of their solutions. We also develop another heuristic
and an exact algorithm for the problem. We finally conduct a performance study by implementing our algorithms. The experimental
results demonstrate that the solutions delivered for the restricted cases are always within a factor of two of the optimal, which
confirms our theoretical upper bounds. Actually these experiments produce much better results than our theoretical estimates.
To the best of our knowledge, this is the only development of polynomial algorithms for the first two cases which are able
to deliver plans with deterministic performance guarantees in terms of the qualities of the plans generated. The previous
approaches including that of [ZDNS98] may generate a feasible plan for the problem in these two cases, but they do not provide
any performance guarantee, i.e., the plans generated by their algorithms can be arbitrarily far from the optimal one.
Received: July 21, 1998 / Accepted: August 26, 1999
4.
Chiang Lee, Chi-Sheng Shih, Yaw-Huei Chen 《The VLDB Journal The International Journal on Very Large Data Bases》2001,9(4):327-343
Traditional algorithms for optimizing the execution order of joins are no longer valid when selections and projections involve methods and become very expensive operations. Selections and projections may even be more costly than joins, so that they
are pulled above joins, rather than pushed down in a query tree. In this paper, we take a fundamental look at how to approach
query optimization from a top-down design perspective, rather than trying to force one model to fit into another. We present
a graph model which is designed to characterize execution plans. Each edge and each vertex of the graph is assigned a weight
to model execution plans. We also design algorithms that use these weights to optimize the execution order of operations.
A cost model of these algorithms is developed. Experiments are conducted on the basis of this cost model. The results show
that our algorithms are superior to similar work proposed in the literature.
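The cost/selectivity trade-off behind placing expensive selections can be made concrete with the classic rank metric. This is not the paper's weighted-graph algorithm, only a sketch of the general idea it competes with; the field names are our own:

```python
def order_expensive_predicates(preds):
    """Order expensive selections by the classic rank metric
    (selectivity - 1) / per-tuple cost, ascending: cheap, highly
    selective predicates run first, while costly, unselective ones are
    deferred (possibly above joins in the query tree)."""
    return sorted(preds, key=lambda p: (p["selectivity"] - 1.0) / p["cost"])
```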
Received 20 April 1999 / Accepted 9 August 2000 / Published online 20 April 2001
5.
Abstract. Providing a customized result set based upon a user preference is the ultimate objective of many content-based image retrieval
systems. There are two main challenges in meeting this objective: First, there is a gap between the physical characteristics
of digital images and the semantic meaning of the images. Secondly, different people may have different perceptions on the
same set of images. To address both these challenges, we propose a model, named Yoda, that conceptualizes content-based querying
as the task of soft classifying images into classes. These classes can overlap, and their members are different for different
users. The “soft” classification is hence performed for each and every image feature, including both physical and semantic
features. Subsequently, each image will be ranked based on the weighted aggregation of its classification memberships. The
weights are user-dependent, and hence different users would obtain different result sets for the same query. Yoda employs
a fuzzy-logic based aggregation function for ranking images. We show that, in addition to some performance benefits, fuzzy
aggregation is less sensitive to noise and can support disjunctive queries as compared to weighted-average aggregation used
by other content-based image retrieval systems. Finally, since Yoda heavily relies on user-dependent weights (i.e., user profiles)
for the aggregation task, we utilize the users' relevance feedback to improve the profiles using genetic algorithms (GA).
Our learning mechanism requires fewer user interactions, and results in a faster convergence to the user's preferences as
compared to other learning techniques.
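The contrast between weighted-average ranking and fuzzy aggregation can be shown with the standard min/max operators. This is a generic sketch of fuzzy-logic aggregation, not Yoda's actual membership functions:

```python
def weighted_average(scores, weights):
    """Conventional ranking: every score, including a noisy outlier,
    shifts the result."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

def fuzzy_and(scores):
    """Fuzzy conjunction (min t-norm): rank by the weakest match."""
    return min(scores)

def fuzzy_or(scores):
    """Fuzzy disjunction (max t-conorm): lets a query ask for images
    matching feature A *or* feature B, which a weighted average
    cannot express."""
    return max(scores)
```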
Correspondence to: Y.-S. Chen (E-mail: yishinc@usc.edu)
This research has been funded in part by NSF grants EEC-9529152 (IMSC ERC) and IIS-0082826, NIH-NLM R01-LM07061, DARPA and
USAF under agreement nr. F30602-99-1-0524, and unrestricted cash gifts from NCR, Microsoft, and Okawa Foundation.
6.
Boris Chidlovskii, Uwe M. Borghoff 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(1):2-17
Abstract. In meta-searchers accessing distributed Web-based information repositories, performance is a major issue. Efficient query
processing requires an appropriate caching mechanism. Unfortunately, standard page-based as well as tuple-based caching mechanisms
designed for conventional databases are not efficient on the Web, where keyword-based querying is often the only way to retrieve
data. In this work, we study the problem of semantic caching of Web queries and develop a caching mechanism for conjunctive
Web queries based on signature files. Our algorithms handle both semantic containment and intersection relations between a query and the corresponding cache
items. We also develop a cache replacement strategy for situations in which cached items differ in size and in their contribution to partial query answers. We report experimental results and show how the caching mechanism is realized in the
Knowledge Broker system.
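The signature-file containment test for conjunctive keyword queries can be sketched in a few lines. This is a generic superimposed-coding scheme under our own parameters (64-bit signatures, 3 hash bits per keyword), not the paper's exact encoding:

```python
import hashlib

def signature(words, bits=64, k=3):
    """Superimposed coding: set k deterministic hash bits per keyword
    and OR them into one machine word per conjunctive query."""
    sig = 0
    for w in words:
        for i in range(k):
            digest = hashlib.md5(f"{w}:{i}".encode()).digest()
            sig |= 1 << (int.from_bytes(digest[:4], "big") % bits)
    return sig

def may_subsume(cached_sig, query_sig):
    """A cached conjunctive query (fewer conjuncts, hence more general)
    can semantically contain the new query only if its signature bits
    are a subset of the query's: the test has no false negatives,
    only occasional false positives that a verification step filters."""
    return cached_sig & ~query_sig == 0
```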
Received June 15, 1999 / Accepted December 24, 1999
7.
J. Claussen, A. Kemper, D. Kossmann, C. Wiesner 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(3):190-213
Decision support queries typically involve several joins, a grouping with aggregation, and/or sorting of the result tuples.
We propose two new classes of query evaluation algorithms that can be used to speed up the execution of such queries. The
algorithms are based on (1) early sorting and (2) early partitioning – or a combination of both. The idea is to push the sorting and/or the partitioning down to the leaves, i.e., the base relations,
of the query evaluation plans (QEPs) and thereby avoid sorting or partitioning large intermediate results generated by the
joins. Both early sorting and early partitioning are used in combination with hash-based algorithms for evaluating the join(s)
and the grouping. To enable early sorting, the sort order generated at an early stage of the QEP is retained through an arbitrary
number of so-called order-preserving hash joins. To make early partitioning applicable to a large class of decision support queries, we generalize the so-called hash teams
proposed by Graefe et al. [GBC98]. Hash teams make it possible to perform several hash-based operations (join and grouping) on the same
attribute in one pass without repartitioning intermediate results. Our generalization consists of indirectly partitioning
the input data. Indirect partitioning means partitioning the input data on an attribute that is not directly needed for the
next hash-based operation, and it involves the construction of bitmaps to approximate the partitioning for the attribute that
is needed in the next hash-based operation. Our performance experiments show that such QEPs based on early sorting, early partitioning, or both in combination perform significantly better than conventional strategies for many common classes of decision support
queries.
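The benefit of sharing a partitioning attribute between a join and a grouping can be sketched in miniature. This toy in-memory version (our own names and data layout, with SUM as the aggregate) only illustrates why no repartitioning is needed; real hash teams manage spilled partitions on disk:

```python
from collections import defaultdict

def hash_team_join_and_group(r, s, key, agg_col):
    """Join r and s on `key` and compute SUM(agg_col) grouped by the
    same `key` in a single pass over each input: because the join and
    the grouping share their partitioning attribute, the intermediate
    join result never needs to be repartitioned."""
    index = defaultdict(list)
    for row in r:
        index[row[key]].append(row)
    groups = defaultdict(float)
    for row in s:
        for _match in index.get(row[key], ()):  # join probe ...
            groups[row[key]] += row[agg_col]    # ... feeds the group directly
    return dict(groups)
```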
Received April 4, 2000 / Accepted June 23, 2000
8.
9.
10.
Subspace-based line detection (SLIDE) is a novel approach for straight line fitting that has recently been suggested by Aghajan
and Kailath. It is based on an analogy made between a straight line in an image and a planar propagating wavefront impinging
on an array of sensors. Efficient sensor array processing algorithms are used to detect the parameters of the line. SLIDE
is computationally cheaper than the Hough transform, but it has been unclear whether this efficiency comes free of cost.
In particular, it has not been known how the breakpoints of SLIDE relate to those of the Hough transform. We compare the failure
modes and limitations of the two algorithms and demonstrate that SLIDE is significantly less robust than the Hough transform.
11.
Practical volumetric sculpting
We present a sculpture metaphor for rapid shape prototyping. The sculpted shape is an isosurface of a spatially sampled scalar field. The user can deposit
material wherever he desires in space and then iteratively refine it, using a tool to add, remove, paint, or smooth some material.
We allow the use of free-form tools that can be designed inside the application. We also propose a technique to mimic local
deformations so that we can use the tool as a stamp to make imprints on an existing shape. We focus on the rendering quality
too, exploiting lighting variations and environment textures that simulate good-quality highlights on the surface. Both greatly
enhance the shape estimation, which is a crucial step in this iterative design process, in our opinion. The use of stereo
also greatly eases the understanding of spatial relationships. Our current implementation is based on GLUT and runs both on Unix-based systems, such as Irix and Linux, and on Windows. We obtain interactive response times,
strongly related to the size of the tool. The performance issues and limitations are discussed.
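The "deposit material into a sampled scalar field" step can be sketched on a tiny grid. A toy version with a linear-falloff spherical tool of our own choosing (the paper's free-form tools are themselves sampled fields):

```python
import math

def deposit(field, center, radius, strength):
    """Add material to a sampled scalar field with a spherical tool and
    linear falloff; the sculpted surface is an isosurface of the field,
    so repeated strokes accumulate and blend smoothly."""
    for x in range(len(field)):
        for y in range(len(field[0])):
            for z in range(len(field[0][0])):
                d = math.dist((x, y, z), center)
                if d < radius:
                    field[x][y][z] += strength * (1.0 - d / radius)
```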
12.
13.
Published online: 15 March 2002
14.
In this paper, we present a novel approach for multimedia data indexing and retrieval that is machine independent and highly
flexible for sharing multimedia data across applications. Traditional multimedia data indexing and retrieval problems have
been attacked using the central data server as the main focus, and most of the indexing and query-processing for retrieval
are highly application dependent. This precludes the use of created indices and query processing mechanisms for multimedia
data which, in general, have a wide variety of uses across applications. The approach proposed in this paper addresses three
issues: 1. multimedia data indexing; 2. inference or query processing; and 3. combining indices and inference or query mechanism
with the data to facilitate machine independence in retrieval and query processing. We emphasize the third issue, as typically
multimedia data are huge in size and require intra-data indexing. We describe how the proposed approach addresses various
problems faced by the application developers in indexing and retrieval of multimedia data. Finally, we present two applications
developed based on the proposed approach: video indexing, and video content authorization for presentation.
15.
Abstract. In this paper we study the ability of shared object types to implement Consensus in asynchronous shared-memory systems where
at most one process may crash. More specifically, we consider the following question: let T be a set of object types that can be used to solve one-resilient Consensus among n processes. Can T always be used to solve one-resilient Consensus among n - 1 processes? We prove that for n = 3 the answer is negative, even if T consists only of deterministic types. (This strengthens an earlier result by the first author proving the same fact for nondeterministic types.) We also prove that, in contrast, for n ≥ 4 the answer to the above question is affirmative.
Received: July 1997 / Accepted: May 2000
16.
Approximate query processing using wavelets
Kaushik Chakrabarti, Minos Garofalakis, Rajeev Rastogi, Kyuseok Shim 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(2-3):199-223
Approximate query processing has emerged as a cost-effective approach for dealing with the huge data volumes and stringent
response-time requirements of today's decision support systems (DSS). Most work in this area, however, has so far been limited
in its query processing scope, typically focusing on specific forms of aggregate queries. Furthermore, conventional approaches
based on sampling or histograms appear to be inherently limited when it comes to approximating the results of complex queries
over high-dimensional DSS data sets. In this paper, we propose the use of multi-dimensional wavelets as an effective tool
for general-purpose approximate query processing in modern, high-dimensional applications. Our approach is based on building
wavelet-coefficient synopses of the data and using these synopses to provide approximate answers to queries. We develop novel query processing algorithms
that operate directly on the wavelet-coefficient synopses of relational tables, allowing us to process arbitrarily complex
queries entirely in the wavelet-coefficient domain. This guarantees extremely fast response times since our approximate query execution engine
can do the bulk of its processing over compact sets of wavelet coefficients, essentially postponing the expansion into relational
tuples until the end-result of the query. We also propose a novel wavelet decomposition algorithm that can build these synopses
in an I/O-efficient manner. Finally, we conduct an extensive experimental study with synthetic as well as real-life data sets
to determine the effectiveness of our wavelet-based approach compared to sampling and histograms. Our results demonstrate
that our techniques: (1) provide approximate answers of better quality than either sampling or histograms; (2) offer query
execution-time speedups of more than two orders of magnitude; and (3) guarantee extremely fast synopsis construction times
that scale linearly with the size of the data.
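The synopsis idea can be sketched in one dimension with the unnormalized Haar transform: decompose, keep only the largest-magnitude coefficients, and answer from the reconstruction. A minimal sketch of the general technique, not the paper's multi-dimensional, I/O-efficient algorithms; input length is assumed to be a power of two:

```python
def haar_decompose(data):
    """Unnormalized 1-D Haar transform: [overall average] followed by
    detail coefficients, coarsest level first."""
    coeffs = []
    while len(data) > 1:
        avgs = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        details = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        coeffs = details + coeffs
        data = avgs
    return data + coeffs

def haar_reconstruct(coeffs):
    """Invert haar_decompose by expanding one resolution level at a time."""
    data = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(data)]
        data = [v for a, d in zip(data, details) for v in (a + d, a - d)]
        pos += len(details)
    return data

def wavelet_synopsis(data, keep):
    """Retain only the `keep` largest-magnitude coefficients (zeroing the
    rest); reconstructing from the synopsis approximates the original."""
    c = haar_decompose(data)
    top = set(sorted(range(len(c)), key=lambda i: abs(c[i]), reverse=True)[:keep])
    return haar_reconstruct([c[i] if i in top else 0.0 for i in range(len(c))])
```

The wavelet-domain query algorithms in the paper go further: they operate on the sparse coefficient sets directly, postponing this expansion until the final result.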
Received: 7 August 2000 / Accepted: 1 April 2001 / Published online: 7 June 2001
17.
On fast microscopic browsing of MPEG-compressed video
Boon-Lock Yeo 《Multimedia Systems》1999,7(4):269-281
MPEG has been established as a compression standard for efficient storage and transmission of digital video. However, users
are limited to VCR-like (and tedious) functionalities when viewing MPEG video. The usefulness of MPEG video is presently limited
by the lack of tools available for fast browsing, manipulation and processing of MPEG video.
In this paper, we first address the problem of rapid access to individual shots and frames in MPEG video. We build upon the
compressed-video-processing framework proposed in [1, 8], and propose new and fast algorithms based on an adaptive mixture
of approximation techniques for extracting spatially reduced image sequences of uniform quality from MPEG video across different frame types and under different motion activities in the scenes. The algorithms
execute faster than real time on a Pentium personal computer. We demonstrate how the reduced images facilitate fast and convenient
shot- and frame-level video browsing and access, shot-level editing and annotation, without the need for frequent decompression
of MPEG video. We further propose methods for reducing the auxiliary data size associated with the reduced images through
exploitation of spatial and temporal redundancy. We also address how the reduced images lead to computationally efficient algorithms for video analysis based
on intra- and inter-shot processing for video database and browsing applications. The algorithms, tools for browsing and techniques
for video processing presented in this paper have been used by many in IBM Research on more than 30 h of MPEG-1 video for
video browsing and analysis.
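The spatial reduction can be sketched in the pixel domain: one value per 8x8 block, which is what the DC coefficient of each DCT block represents. This is only an uncompressed-domain illustration of the output; the paper's point is extracting these values from MPEG data without full decompression:

```python
def reduced_image(pixels, block=8):
    """Build a spatially reduced image with one value per 8x8 block,
    matching the block averages that the DC coefficients of MPEG's DCT
    blocks encode (I-frames give them directly; P-/B-frames need the
    motion-compensated approximations developed in the paper)."""
    out = []
    for by in range(0, len(pixels), block):
        row = []
        for bx in range(0, len(pixels[0]), block):
            vals = [pixels[y][x]
                    for y in range(by, min(by + block, len(pixels)))
                    for x in range(bx, min(bx + block, len(pixels[0])))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out
```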
18.
Query processing over object views of relational data
Gustav Fahl, Tore Risch 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(4):261-281
This paper presents an approach to object view management for relational databases. Such a view mechanism makes it possible for users to transparently work with data in
a relational database as if it was stored in an object-oriented (OO) database. A query against the object view is translated
to one or several queries against the relational database. The results of these queries are then processed to form an answer
to the initial query. The approach is not restricted to a ‘pure’ object view mechanism for the relational data, since the
object view can also store its own data and methods. Therefore it must be possible to process queries that combine local data
residing in the object view with data retrieved from the relational database. We discuss the key issues when object views
of relational databases are developed, namely: how to map relational structures to sub-type/supertype hierarchies in the view,
how to represent relational database access in OO query plans, how to provide the concept of object identity in the view,
how to handle the fact that the extension of types in the view depends on the state of the relational database, and how to
process and optimize queries against the object view. The results are based on experiences from a running prototype implementation.
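The core translation step — an object-view query rewritten against the base relational schema — can be sketched with a toy mapping. All names here (the mapping layout, the view type, the columns) are our own illustrative assumptions, not the prototype's design:

```python
def translate(oo_query, mapping):
    """Translate a simple object-view selection into SQL over the base
    relational schema; `mapping` records, per view type, the base table
    and the attribute-to-column map."""
    m = mapping[oo_query["type"]]
    cols = ", ".join(m["columns"][a] for a in oo_query["select"])
    preds = " AND ".join(f"{m['columns'][a]} = {v!r}"
                         for a, v in oo_query["where"].items())
    sql = f"SELECT {cols} FROM {m['table']}"
    return f"{sql} WHERE {preds}" if preds else sql
```

A full implementation must additionally synthesize object identifiers for the result and merge in locally stored view data, as the abstract describes.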
Edited by: M.T. Özsu. Received April 12, 1995 / Accepted April 22, 1996
19.
20.
R. Braumandl, M. Keidl, A. Kemper, D. Kossmann, A. Kreutz, S. Seltzsam, K. Stocker 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(1):48-71
We present the design of ObjectGlobe, a distributed and open query processor for Internet data sources. Today, data is published
on the Internet via Web servers which have, if at all, very localized query processing capabilities. The goal of the ObjectGlobe
project is to establish an open marketplace in which data and query processing capabilities can be distributed and used by any kind of Internet application. Furthermore, ObjectGlobe integrates cycle providers (i.e., machines) which carry out query processing operators. The overall goal is to make it possible to execute a query
with – in principle – unrelated query operators, cycle providers, and data sources. Such an infrastructure can serve as enabling
technology for scalable e-commerce applications, e.g., B2B and B2C market places, to be able to integrate data and data processing
operations of a large number of participants. One of the main challenges in the design of such an open system is to ensure
privacy and security. We discuss the ObjectGlobe security requirements, show how basic components such as the optimizer and
runtime system need to be extended, and present the results of performance experiments that assess the additional cost for
secure distributed query processing. Another challenge is quality of service management so that users can constrain the costs
and running times of their queries.
Received: 30 October 2000 / Accepted: 14 March 2001 / Published online: 7 June 2001