期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Query processing over object views of relational data 总被引：2，自引：0，他引：2

Gustav Fahl Tore Risch 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(4):261-281

This paper presents an approach to object view management for relational databases. Such a view mechanism makes it possible for users to transparently work with data in a relational database as if it was stored in an object-oriented (OO) database. A query against the object view is translated to one or several queries against the relational database. The results of these queries are then processed to form an answer to the initial query. The approach is not restricted to a ‘pure’ object view mechanism for the relational data, since the object view can also store its own data and methods. Therefore it must be possible to process queries that combine local data residing in the object view with data retrieved from the relational database. We discuss the key issues when object views of relational databases are developed, namely: how to map relational structures to sub-type/supertype hierarchies in the view, how to represent relational database access in OO query plans, how to provide the concept of object identity in the view, how to handle the fact that the extension of types in the view depends on the state of the relational database, and how to process and optimize queries against the object view. The results are based on experiences from a running prototype implementation. Edited by: M.T. ?zsu. Received April 12, 1995 / Accepted April 22, 1996 相似文献

2.

Approximate query processing using wavelets 总被引：7，自引：0，他引：7

Kaushik Chakrabarti Minos Garofalakis Rajeev Rastogi Kyuseok Shim 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(2-3):199-223

Approximate query processing has emerged as a cost-effective approach for dealing with the huge data volumes and stringent response-time requirements of today's decision support systems (DSS). Most work in this area, however, has so far been limited in its query processing scope, typically focusing on specific forms of aggregate queries. Furthermore, conventional approaches based on sampling or histograms appear to be inherently limited when it comes to approximating the results of complex queries over high-dimensional DSS data sets. In this paper, we propose the use of multi-dimensional wavelets as an effective tool for general-purpose approximate query processing in modern, high-dimensional applications. Our approach is based on building wavelet-coefficient synopses of the data and using these synopses to provide approximate answers to queries. We develop novel query processing algorithms that operate directly on the wavelet-coefficient synopses of relational tables, allowing us to process arbitrarily complex queries entirely in the wavelet-coefficient domain. This guarantees extremely fast response times since our approximate query execution engine can do the bulk of its processing over compact sets of wavelet coefficients, essentially postponing the expansion into relational tuples until the end-result of the query. We also propose a novel wavelet decomposition algorithm that can build these synopses in an I/O-efficient manner. Finally, we conduct an extensive experimental study with synthetic as well as real-life data sets to determine the effectiveness of our wavelet-based approach compared to sampling and histograms. Our results demonstrate that our techniques: (1) provide approximate answers of better quality than either sampling or histograms; (2) offer query execution-time speedups of more than two orders of magnitude; and (3) guarantee extremely fast synopsis construction times that scale linearly with the size of the data. Received: 7 August 2000 / Accepted: 1 April 2001 Published online: 7 June 2001 相似文献

3.

Query processing and optimization in Oracle Rdb 总被引：2，自引：0，他引：2

Gennady Antoshenkov Mohamed Ziauddin 《The VLDB Journal The International Journal on Very Large Data Bases》1996,5(4):229-237

This paper contains an overview of the technology used in the query processing and optimization component of Oracle Rdb, a relational database management system originally developed by Digital Equipment Corporation and now under development by Oracle Corporation. Oracle Rdb is a production system that supports the most demanding database applications, runs on multiple platforms and in a variety of environments. Edited by C. Mohan / Received August 1994 / Acceped August 1995 相似文献

4.

ObjectGlobe: Ubiquitous query processing on the Internet

R. Braumandl M. Keidl A. Kemper D. Kossmann A. Kreutz S. Seltzsam K. Stocker 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(1):48-71

We present the design of ObjectGlobe, a distributed and open query processor for Internet data sources. Today, data is published on the Internet via Web servers which have, if at all, very localized query processing capabilities. The goal of the ObjectGlobe project is to establish an open marketplace in which data and query processing capabilities can be distributed and used by any kind of Internet application. Furthermore, ObjectGlobe integrates cycle providers (i.e., machines) which carry out query processing operators. The overall picture is to make it possible to execute a query with – in principle – unrelated query operators, cycle providers, and data sources. Such an infrastructure can serve as enabling technology for scalable e-commerce applications, e.g., B2B and B2C market places, to be able to integrate data and data processing operations of a large number of participants. One of the main challenges in the design of such an open system is to ensure privacy and security. We discuss the ObjectGlobe security requirements, show how basic components such as the optimizer and runtime system need to be extended, and present the results of performance experiments that assess the additional cost for secure distributed query processing. Another challenge is quality of service management so that users can constrain the costs and running times of their queries. Received: 30 October 2000 / Accepted: 14 March 2001 Published online: 7 June 2001 相似文献

5.

Advanced data processing in KRISYS: modeling concepts, implementation techniques, and client/server issues

Stefan Deßloch Theo Härder Nelson Mattos Bernhard Mitschang Joachim Thomas 《The VLDB Journal The International Journal on Very Large Data Bases》1998,7(2):79-95

The increasing power of modern computers is steadily opening up new application domains for advanced data processing such as engineering and knowledge-based applications. To meet their requirements, concepts for advanced data management have been investigated during the last decade, especially in the field of object orientation. Over the last couple of years, the database group at the University of Kaiserslautern has been developing such an advanced database system, the KRISYS prototype. In this article, we report on the results and experiences obtained in the course of this project. The primary objective for the first version of KRISYS was to provide semantic features, such as an expressive data model, a set-oriented query language, deductive as well as active capabilities. The first KRISYS prototype became completely operational in 1989. To evaluate its features and to stabilize its functionality, we started to develop several applications with the system. These experiences marked the starting point for an overall redesign of KRISYS. Major goals were to tune KRISYS and its query-processing facilities to a suitable client/server environment, as well as to provide elaborate mechanisms for consistency control comprising semantic integrity constraints, multi-user synchronization, and failure recovery. The essential aspects of the resulting client/server architecture are embodied by the client-side data management needed to effectively support advanced applications and to gain the required system performance for interactive work. The project stages of KRISYS properly reflect the essential developments that have taken place in the research on advanced database systems over the last years. Hence, the subsequent discussions will bring up a number of important aspects with regard to advanced data processing that are of significant general importance, as well as of general applicability to database systems. Received June 18, 1996 / Accepted November 11, 1997 相似文献

6.

Exploiting early sorting and early partitioning for decision support query processing

J. Claussen A. Kemper D. Kossmann C. Wiesner 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(3):190-213

Decision support queries typically involve several joins, a grouping with aggregation, and/or sorting of the result tuples. We propose two new classes of query evaluation algorithms that can be used to speed up the execution of such queries. The algorithms are based on (1) early sorting and (2) early partitioning– or a combination of both. The idea is to push the sorting and/or the partitioning to the leaves, i.e., the base relations, of the query evaluation plans (QEPs) and thereby avoid sorting or partitioning large intermediate results generated by the joins. Both early sorting and early partitioning are used in combination with hash-based algorithms for evaluating the join(s) and the grouping. To enable early sorting, the sort order generated at an early stage of the QEP is retained through an arbitrary number of so-called order-preserving hash joins. To make early partitioning applicable to a large class of decision support queries, we generalize the so-called hash teams proposed by Graefe et al. [GBC98]. Hash teams allow to perform several hash-based operations (join and grouping) on the same attribute in one pass without repartitioning intermediate results. Our generalization consists of indirectly partitioning the input data. Indirect partitioning means partitioning the input data on an attribute that is not directly needed for the next hash-based operation, and it involves the construction of bitmaps to approximate the partitioning for the attribute that is needed in the next hash-based operation. Our performance experiments show that such QEPs based on early sorting, early partitioning, or both in combination perform significantly better than conventional strategies for many common classes of decision support queries. Received April 4, 2000 / Accepted June 23, 2000 相似文献

7.

A hierarchical representation of form documents for identification and retrieval

Pınar Duygulu Volkan Atalay 《International Journal on Document Analysis and Recognition》2002,5(1):17-27

In this paper, we present a logical representation for form documents to be used for identification and retrieval. A hierarchical structure is proposed to represent the structure of a form by using lines and the XY-tree approach. The approach is top-down and no domain knowledge such as the preprinted data or filled-in data is used. Geometrical modifications and slight variations are handled by this representation. Logically identical forms are associated to the same or similar hierarchical structure. Identification and the retrieval of similar forms are performed by computing the edit distances between the generated trees. Received: August 21, 2001 / Accepted: November 5, 2001 相似文献

8.

The impact of object technology on commercial transaction processing

Edward E. Cobb 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(3):173-190

Businesses today are searching for information solutions that enable them to compete in the global marketplace. To minimize risk, these solutions must build on existing investments, permit the best technology to be applied to the problem, and be manageable. Object technology, with its promise of improved productivity and quality in application development, delivers these characteristics but, to date, its deployment in commercial business applications has been limited. One possible reason is the absence of the transaction paradigm, widely used in commercial environments and essential for reliable business applications. For object technology to be a serious contender in the construction of these solutions requires: – technology for transactional objects. In December 1994, the Object Management Group adopted a specification for an object transaction service (OTS). The OTS specifies mechanisms for defining and manipulating transactions. Though derived from the X/Open distributed transaction processing model, OTS contains additional enhancements specifically designed for the object environment. Similar technology from Microsoft appeared at the end of 1995. – methodologies for building new business systems from existing parts. Business process re-engineering is forcing businesses to improve their operations which bring products to market. Workflow computing, when used in conjunction with “object wrappers” provides tools to both define and track execution of business processes which leverage existing applications and infrastructure. – an execution environment which satisfies the requirements of the operational needs of the business. Transaction processing (TP) monitor technology, though widely accepted for mainframe transaction processing, has yet to enjoy similar success in the client/server marketplace. Instead the database vendors, with their extensive tool suites, dominate. As object brokers mature they will require many of the functions of today's TP monitors. Marrying these two technologies can produce a robust execution environment which offers a superior alternative for building and deploying client/server applications. Edited by Andreas Reuter, Received February 1995 / Revised August 1995 / Accepted May 1996 相似文献

9.

Fast joins using join indices 总被引：1，自引：0，他引：1

Zhe Li Kenneth A. Ross 《The VLDB Journal The International Journal on Very Large Data Bases》1999,8(1):1-24

Two new algorithms, “Jive join” and “Slam join,” are proposed for computing the join of two relations using a join index. The algorithms are duals: Jive join range-partitions input relation tuple ids and then processes each partition, while Slam join forms ordered runs of input relation tuple ids and then merges the results. Both algorithms make a single sequential pass through each input relation, in addition to one pass through the join index and two passes through a temporary file, whose size is half that of the join index. Both algorithms require only that the number of blocks in main memory is of the order of the square root of the number of blocks in the smaller relation. By storing intermediate and final join results in a vertically partitioned fashion, our algorithms need to manipulate less data in memory at a given time than other algorithms. The algorithms are resistant to data skew and adaptive to memory fluctuations. Selection conditions can be incorporated into the algorithms. Using a detailed cost model, the algorithms are analyzed and compared with competing algorithms. For large input relations, our algorithms perform significantly better than Valduriez's algorithm, the TID join algorithm, and hash join algorithms. An experimental study is also conducted to validate the analytical results and to demonstrate the performance characteristics of each algorithm in practice. Received July 21, 1997 / Accepted June 8, 1998 相似文献

10.

Dictionary-based order-preserving string compression

Gennady Antoshenkov 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(1):26-39

As no database exists without indexes, no index implementation exists without order-preserving key compression, in particular, without prefix and tail compression. However, despite the great potentials of making indexes smaller and faster, application of general compression methods to ordered data sets has advanced very little. This paper demonstrates that the fast dictionary-based methods can be applied to order-preserving compression almost with the same freedom as in the general case. The proposed new technology has the same speed and a compression rate only marginally lower than the traditional order-indifferent dictionary encoding. Procedures for encoding and generating the encode tables are described covering such order-related features as ordered data set restrictions, sensitivity and insensitivity to a character position, and one-symbol encoding of each frequent trailing character sequence. The experimental results presented demonstrate five-folded compression on real-life data sets and twelve-folded compression on Wisconsin benchmark text fields. Edited by M.T. Ozsu. Received 1 February 1995 / Accepted 1 November 1995 相似文献

11.

A graphical user interface for Boolean query specification

Steve Jones Shona McInnes Mark S. Staveley 《International Journal on Digital Libraries》1999,2(2-3):207-223

Online information repositories commonly provide keyword search facilities through textual query languages based on Boolean logic. However, there is evidence to suggest that the syntactic demands of such languages can lead to user errors and adversely affect the time that it takes users to form queries. Users also face difficulties because of the conflict in semantics between AND and OR when used in Boolean logic and English language. Analysis of usage logs for the New Zealand Digital Library (NZDL) show that few Boolean queries contain more than three terms, use of the intersection operator dominates and that query refinement is common. We suggest that graphical query languages, in particular Venn-like diagrams, can alleviate the problems that users experience when forming Boolean expressions with textual languages. A study of the utility of Venn diagrams for query specification indicates that with little or no training users can interpret and form Venn-like diagrams in a consistent manner which accurately correspond to Boolean expressions. We describe VQuery, a Venn-diagram based user interface to the New Zealand Digital Library (NZDL). In a study which compared VQuery with a standard textual Boolean interface, users took significantly longer to form queries and produced more erroneous queries when using VQuery. We discuss the implications of these results and suggest directions for future work. Received: 15 December 1997 / Revised: June 1999 相似文献

12.

Using citing information to understand the logical structure of document images

Shuhua Wang Yang Cao Shijie Cai 《International Journal on Document Analysis and Recognition》2001,4(1):27-34

The most noticeable characteristic of a construction tender document is that its hierarchical architecture is not obviously expressed but is implied in the citing information. Currently available methods cannot deal with such documents. In this paper, the intra-page and inter-page relationships are analyzed in detail. The creation of citing relationships is essential to extracting the logical structure of tender documents. The hierarchy of tender documents naturally leads to extracting and displaying the logical structure as tree structure. This method is successfully implemented in VHTender, and is the key to the efficiency and flexibility of the whole system. Received February 28, 2000 / Revised October 20, 2000 相似文献

13.

Algebraic query optimisation for database programming languages 总被引：1，自引：0，他引：1

Alexandra Poulovassilis Carol Small 《The VLDB Journal The International Journal on Very Large Data Bases》1996,5(2):119-132

A major challenge still facing the designers and implementors of database programming languages (DBPLs) is that of query optimisation. We investigate algebraic query optimisation techniques for DBPLs in the context of a purely declarative functional language that supports sets as first-class objects. Since the language is computationally complete issues such as non-termination of expressions and construction of infinite data structures can be investigated, whilst its declarative nature allows the issue of side effects to be avoided and a richer set of equivalences to be developed. The language has a well-defined semantics which permits us to reason formally about the properties of expressions, such as their equivalence with other expressions and their termination. The support of a set bulk data type enables much prior work on the optimisation of relational languages to be utilised. In the paper we first give the syntax of our archetypal DBPL and briefly discuss its semantics. We then define a small but powerful algebra of operators over the set data type, provide some key equivalences for expressions in these operators, and list transformation principles for optimising expressions. Along the way, we identify some caveats to well-known equivalences for non-deductive database languages. We next extend our language with two higher level constructs commonly found in functional DBPLs: set comprehensions and functions with known inverses. Some key equivalences for these constructs are provided, as are transformation principles for expressions in them. Finally, we investigate extending our equivalences for the set operators to the analogous operators over bags. Although developed and formally proved in the context of a functional language, our findings are directly applicable to other DBPLs of similar expressiveness. Edited by Matthias Jarke, Jorge Bocca, Carlo Zaniolo. Received September 15, 1994 / Accepted September 1, 1995 相似文献

14.

MIL primitives for querying a fragmented world

Peter A. Boncz Martin L. Kersten 《The VLDB Journal The International Journal on Very Large Data Bases》1999,8(2):101-119

In query-intensive database application areas, like decision support and data mining, systems that use vertical fragmentation have a significant performance advantage. In order to support relational or object oriented applications on top of such a fragmented data model, a flexible yet powerful intermediate language is needed. This problem has been successfully tackled in Monet, a modern extensible database kernel developed by our group. We focus on the design choices made in the Monet interpreter language (MIL), its algebraic query language, and outline how its concept of tactical optimization enhances and simplifies the optimization of complex queries. Finally, we summarize the experience gained in Monet by creating a highly efficient implementation of MIL. Received November 10, 1998 / Accepted March 22, 1999 相似文献

15.

The Advanced Video Information System: data structures and query processing

Sibel Adalı K. Selçuk Candan Su-Shing Chen Kutluhan Erol V.S. Subrahmanian 《Multimedia Systems》1996,4(4):172-186

We describe how video data can be organized and structured so as to facilitate efficient querying. We develop a formal model for video data and show how spatial data structures, suitably modified, provide an elegant way of storing such data. We develop algorithms to process various kinds of video queries and show that, in most cases, the complexity of these algorithms is linear. A prototype system, called the Advanced Video Information System (AVIS), based on these concepts, has been designed at the University of Maryland. 相似文献

16.

Industrial bank check processing: the A2iA CheckReaderTM

Nikolai Gorski Valery Anisimov Emmanuel Augustin Olivier Baret Sergey Maximov 《International Journal on Document Analysis and Recognition》2001,3(4):196-206

This paper presents the current state of the A2iA CheckReaderTM – a commercial bank check recognition system. The system is designed to process the flow of payment documents associated with the check clearing process: checks themselves, deposit slips, money orders, cash tickets, etc. It processes document images and recognizes document amounts whatever their style and type – cursive, hand- or machine printed – expressed as numerals or as phrases. The system is adapted to read payment documents issued in different English- or French-speaking countries. It is currently in use at more than 100 large sites in five countries and processes daily over 10 million documents. The average read rate at the document level varies from 65 to 85% with a misread rate corresponding to that of a human operator (1%). Received October 13, 2000 / Revised December 4, 2000 相似文献

17.

A logical view of structured files

Serge Abiteboul Sophie Cluet Tova Milo 《The VLDB Journal The International Journal on Very Large Data Bases》1998,7(2):96-114

Structured data stored in files can benefit from standard database technology. In particular, we show here how such data can be queried and updated using declarative database languages. We introduce the notion of structuring schema, which consists of a grammar annotated with database programs. Based on a structuring schema, a file can be viewed as a database structure, queried and updated as such. For queries, we show that almost standard database optimization techniques can be used to answer queries without having to construct the entire database. For updates, we study in depth the propagation to the file of an update specified on the database view of this file. The problem is not feasible in general and we present a number of negative results. The positive results consist of techniques that allow to propagate updates efficiently under some reasonable locality conditions on the structuring schemas. Received November 1, 1995 / Accepted June 20, 1997 相似文献

18.

Supporting efficient multimedia database exploration

Wen-Syan Li K.Selçuk Candan Kyoji Hirata Yoshinori Hara 《The VLDB Journal The International Journal on Very Large Data Bases》2001,9(4):312-326

Due to the fuzziness of query specification and media matching, multimedia retrieval is conducted by way of exploration. It is essential to provide feedback so that users can visualize query reformulation alternatives and database content distribution. Since media matching is an expensive task, another issue is how to efficiently support exploration so that the system is not overloaded by perpetual query reformulation. In this paper, we present a uniform framework to represent statistical information of both semantics and visual metadata for images in the databases. We propose the concept of query verification, which evaluates queries using statistics, and provides users with feedback, including the strictness and reformulation alternatives of each query condition as well as estimated numbers of matches. With query verification, the system increases the efficiency of the multimedia database exploration for both users and the system. Such statistical information is also utilized to support progressive query processing and query relaxation. Received: 9 June 1998/ Accepted: 21 July 2000 Published online: 4 May 2001 相似文献

19.

Plausible clocks: constant size logical clocks for distributed systems

Francisco J. Torres-Rojas Mustaque Ahamad 《Distributed Computing》1999,12(4):179-195

Summary. In a Distributed System with N sites, the precise detection of causal relationships between events can only be done with vector clocks of size N. This gives rise to scalability and efficiency problems for logical clocks that can be used to order events accurately. In this paper we propose a class of logical clocks called plausible clocks that can be implemented with a number of components not affected by the size of the system and yet they provide good ordering accuracy. We develop rules to combine plausible clocks to produce more accurate clocks. Several examples of plausible clocks and their combination are presented. Using a simulation model, we evaluate the performance of these clocks. We also present examples of applications where constant size clocks can be used. Received: January 1997 / Accepted: January 1999 相似文献

20.

Using AVL trees for fault-tolerant group key management 总被引：1，自引：0，他引：1

Ohad Rodeh Kenneth P. Birman Danny Dolev 《International Journal of Information Security》2002,1(2):84-99

In this paper we describe an efficient algorithm for the management of group keys for group communication systems. Our algorithm is based on the notion of key graphs, previously used for managing keys in large Internet-protocol multicast groups. The standard protocol requires a centralized key server that has knowledge of the full key graph. Our protocol does not delegate this role to any one process. Rather, members enlist in a collaborative effort to create the group key graph. The key graph contains n keys, of which each member learns log₂n of them. We show how to balance the key graph, a result that is applicable to the centralized protocol. We also show how to optimize our distributed protocol, and provide a performance study of its capabilities. Published online: 26 October 2001 相似文献