20 similar documents found; search time: 593 ms
1.
Ada Wai-chee Fu Polly Mei-shuen Chan Yin-Ling Cheung Yiu Sang Moon 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(2):154-173
Abstract. For some multimedia applications, it has been found that domain objects cannot be represented as feature vectors in a multidimensional
space. Instead, pair-wise distances between data objects are the only input. To support content-based retrieval, one approach
maps each object to a k-dimensional (k-d) point and tries to preserve the distances among the points. Then, existing spatial access index methods such as the R-trees
and KD-trees can support fast searching on the resulting k-d points. However, information loss is inevitable with such an approach since the distances between data objects can only
be preserved to a certain extent. Here we investigate the use of a distance-based indexing method. In particular, we apply
the vantage point tree (vp-tree) method. There are two important problems for the vp-tree method that warrant further investigation,
the n-nearest neighbors search and the updating mechanisms. We study an n-nearest neighbors search algorithm for the vp-tree, which is shown by experiments to scale up well with the size of the dataset
and the desired number of nearest neighbors, n. Experiments also show that searching in the vp-tree is more efficient than in the R*-tree and the M-tree. Next, we propose solutions for the update problem for the vp-tree, and show by experiments that the algorithms are
efficient and effective. Finally, we investigate the problem of selecting vantage points, propose a few alternative methods,
and study their impact on the number of distance computations.
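As an illustration of the distance-based approach, the sketch below builds a vantage point tree from a pairwise distance function alone and answers n-nearest-neighbor queries with triangle-inequality pruning. The function names and the median-split rule are our own illustrative choices, not details taken from the paper.

```python
import heapq

def build_vp_tree(points, dist):
    """Recursively build a vantage point tree using only pairwise distances."""
    if not points:
        return None
    vp = points[0]                      # vantage point (a random choice also works)
    rest = points[1:]
    if not rest:
        return (vp, 0.0, None, None)
    ds = [dist(vp, p) for p in rest]
    mu = sorted(ds)[len(ds) // 2]       # median distance splits inside/outside
    inside  = [p for p, d in zip(rest, ds) if d <  mu]
    outside = [p for p, d in zip(rest, ds) if d >= mu]
    return (vp, mu, build_vp_tree(inside, dist), build_vp_tree(outside, dist))

def nn_search(node, q, dist, n, best=None):
    """n-nearest-neighbor search; prunes subtrees via the triangle inequality."""
    if best is None:
        best = []                       # max-heap of (-distance, point)
    if node is None:
        return best
    vp, mu, inside, outside = node
    d = dist(q, vp)
    if len(best) < n:
        heapq.heappush(best, (-d, vp))
    elif d < -best[0][0]:
        heapq.heapreplace(best, (-d, vp))
    # search the half containing q first; visit the other only if it may help
    first, second, boundary = (inside, outside, mu - d) if d < mu else (outside, inside, d - mu)
    best = nn_search(first, q, dist, n, best)
    tau = -best[0][0]                   # current n-th nearest distance
    if len(best) < n or boundary < tau:
        best = nn_search(second, q, dist, n, best)
    return best
```

For example, over the integers 0..99 with absolute difference as the metric, the three nearest neighbors of 50.2 are 49, 50, and 51.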
Received June 9, 1998 / Accepted January 31, 2000
2.
Bogdan S. Chlebus Leszek Gasieniec Alan Gibbons Andrzej Pelc Wojciech Rytter 《Distributed Computing》2002,15(1):27-38
We consider the problem of distributed deterministic broadcasting in radio networks of unknown topology and size. The network
is synchronous. If a node u can be reached from two nodes which send messages in the same round, none of the messages is received by u. Such messages block each other and node u either hears the noise of interference of messages, enabling it to detect a collision, or does not hear anything at all, depending on the model. We assume that nodes know neither the topology nor the size of
the network, nor even their immediate neighborhood. The initial knowledge of every node is limited to its own label. Such
networks are called ad hoc multi-hop networks. We study the time of deterministic broadcasting under this scenario.
For the model without collision detection, we develop a linear-time broadcasting algorithm for symmetric graphs, which is
optimal, and an algorithm for arbitrary n-node graphs, working in time O(n^{11/6}). Next we show that broadcasting with acknowledgement is not possible in this model at all.
For the model with collision detection, we develop efficient algorithms for broadcasting and for acknowledged broadcasting
in strongly connected graphs.
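The collision rule of the model without collision detection can be made concrete with a small round simulator; the data structures below (adjacency dict, transmitter set) are hypothetical, chosen only to illustrate the rule that a node hears a message exactly when a single neighbor transmits.

```python
def radio_round(graph, transmitters, messages, informed):
    """One synchronous round in a multi-hop radio network WITHOUT collision
    detection: node v receives a message only if exactly ONE of its
    neighbours transmits; two or more simultaneous senders block each
    other and v hears nothing at all."""
    newly = {}
    for v in graph:
        senders = [u for u in graph[v] if u in transmitters]
        if len(senders) == 1:                 # unique sender: message heard
            newly[v] = messages[senders[0]]
        # len(senders) >= 2: collision, v cannot even detect it
    for v, m in newly.items():
        informed.setdefault(v, m)
    return informed
```

In a network where node c neighbors both a and b, c learns nothing in a round where a and b both transmit, but hears the message once only a transmits.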
Received: January 2000 / Accepted: June 2001
3.
Distance-based outliers: algorithms and applications
Edwin M. Knorr Raymond T. Ng Vladimir Tucakov 《The VLDB Journal The International Journal on Very Large Data Bases》2000,8(3-4):237-253
This paper deals with finding outliers (exceptions) in large, multidimensional datasets. The identification of outliers can
lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card fraud, and even the
analysis of performance statistics of professional athletes. Existing methods that we have seen for finding outliers can only
deal efficiently with two dimensions/attributes of a dataset. In this paper, we study the notion of DB (distance-based) outliers. Specifically, we show that (i) outlier detection can be done efficiently for large datasets, and for k-dimensional datasets with large values of k (e.g., k ≥ 5); and (ii) outlier detection is a meaningful and important knowledge discovery task.
First, we present two simple algorithms, both having a complexity of O(k N^2), k being the dimensionality and N being the number of objects in the dataset. These algorithms readily support datasets with many more than two attributes.
Second, we present an optimized cell-based algorithm that has a complexity that is linear with respect to N, but exponential with respect to k. We provide experimental results indicating that this algorithm significantly outperforms the two simple algorithms for k ≤ 4. Third, for datasets that are mainly disk-resident, we present another version of the cell-based algorithm that guarantees
at most three passes over a dataset. Again, experimental results show that this algorithm is by far the best for k ≤ 4. Finally, we discuss our work on three real-life applications, including one on spatio-temporal data (e.g., a video surveillance
application), in order to confirm the relevance and broad applicability of DB outliers.
Received February 15, 1999 / Accepted August 1, 1999
4.
In this paper, we present a correlation scheme that incorporates a color ring-projection representation for the automatic
inspection of defects in textured surfaces. The proposed color ring projection transforms a 2-D color image into a 1-D color
pattern as a function of radius. For a search window of width W, data dimensionality is reduced from O(W^2) in the 2-D image to O(W) in the 1-D ring-projection space. The complexity of computing a correlation function is significantly reduced accordingly.
Since the color ring-projection representation is invariant to rotation, the proposed method can be applied for both isotropic
and oriented textures at arbitrary orientations. Experiments on regular textured surfaces have shown the efficacy of the proposed
method.
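A minimal grayscale sketch of the ring-projection idea (the paper's version operates on color images): pixels of a window are averaged per integer radius, reducing O(W^2) data to O(W) values, and the result is unchanged by rotations about the window center. All names are illustrative.

```python
import math

def ring_projection(image, cx, cy, W):
    """Collapse a 2-D window into a 1-D pattern indexed by radius: entry r
    is the mean pixel value over the ring at distance r from the window
    centre (cx, cy), so O(W^2) pixels reduce to O(W) values."""
    rmax = W // 2
    sums = [0.0] * (rmax + 1)
    counts = [0] * (rmax + 1)
    for y in range(cy - rmax, cy + rmax + 1):
        for x in range(cx - rmax, cx + rmax + 1):
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                r = int(round(math.hypot(x - cx, y - cy)))
                if r <= rmax:
                    sums[r] += image[y][x]
                    counts[r] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

Rotation invariance can be checked directly: a 9x9 window and its 90-degree rotation produce the same 1-D pattern, since rotation about the center permutes pixels within each ring.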
Received: 30 March 2000 / Accepted: 24 July 2001
Correspondence to: D.-M. Tsai (e-mail: iedmtsai@saturn.yzu.edu.tw)
5.
Optimizing multiple dimensional queries simultaneously in multidimensional databases
Weifa Liang Maria E. Orlowska Jeffrey X. Yu 《The VLDB Journal The International Journal on Very Large Data Bases》2000,8(3-4):319-338
Some significant progress related to multidimensional data analysis has been achieved in the past few years, including the
design of fast algorithms for computing datacubes, selecting some precomputed group-bys to materialize, and designing efficient
storage structures for multidimensional data. However, little work has been carried out on multidimensional query optimization
issues. In particular, the response time (or evaluation cost) for answering several related dimensional queries simultaneously
is crucial to OLAP applications. Recently, Zhao et al. first explored this problem by presenting three heuristic algorithms.
In this paper we first consider in detail two cases of the problem in which all the queries are either hash-based star joins
or index-based star joins only. In the case of the hash-based star join, we devise a polynomial approximation algorithm which
delivers a plan whose evaluation cost is O(n^ε) times the optimal, where n is the number of queries and ε is a fixed constant. We also present an exponential algorithm which delivers a plan with the optimal evaluation cost. In the case of the index-based
star join, we present a heuristic algorithm which delivers a plan whose evaluation cost is n times the optimal, and an exponential algorithm which delivers a plan with the optimal evaluation cost. We then consider
a general case in which both hash-based star-join and index-based star-join queries are included. For this case, we give a
possible improvement on the work of Zhao et al., based on an analysis of their solutions. We also develop another heuristic
and an exact algorithm for the problem. We finally conduct a performance study by implementing our algorithms. The experimental
results demonstrate that the solutions delivered for the restricted cases are always within twice the optimal, which
confirms our theoretical upper bounds. Actually these experiments produce much better results than our theoretical estimates.
To the best of our knowledge, this is the only development of polynomial algorithms for the first two cases which are able
to deliver plans with deterministic performance guarantees in terms of the qualities of the plans generated. The previous
approaches including that of [ZDNS98] may generate a feasible plan for the problem in these two cases, but they do not provide
any performance guarantee, i.e., the plans generated by their algorithms can be arbitrarily far from the optimal one.
Received: July 21, 1998 / Accepted: August 26, 1999
6.
We consider the problem of scheduling a set of pages on a single broadcast channel using time-multiplexing. In a perfectly periodic schedule, time is divided into equal-size slots, and each page is transmitted in a time slot precisely every fixed interval of time (the period of the page). We study the case in which each page i has a given demand probability p_i, and the goal is to design a perfectly periodic schedule that minimizes the average time a random client waits until its
page is transmitted. We seek approximate polynomial solutions. Approximation bounds are obtained by comparing the costs of
a solution provided by an algorithm and a solution to a relaxed (non-integral) version of the problem. A key quantity in our
methodology is a fraction that depends on the maximum demand probability. The best known polynomial algorithm to date guarantees an approximation ratio expressed in terms of this fraction. In this paper, we develop a tree-based methodology for perfectly periodic scheduling, and using new techniques, we derive
algorithms with better bounds. For small values of this fraction, our best algorithm guarantees a better approximation ratio. On the other hand, we show that the integrality gap between the cost of any perfectly periodic schedule and the cost of
the fractional problem is bounded below in terms of the same fraction. We also provide algorithms with good performance guarantees for large values of this fraction.
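The cost being minimized can be stated concretely: a client requesting page i at a uniformly random time waits T_i/2 slots on average, where T_i is the page's period, so a schedule's expected wait is the demand-weighted sum of half-periods. A small sketch, in our own notation:

```python
def periods(schedule):
    """Return each page's period, checking that the given cycle is perfectly
    periodic: equal gaps between consecutive occurrences, wrapping around."""
    L = len(schedule)
    slots = {}
    for t, page in enumerate(schedule):
        slots.setdefault(page, []).append(t)
    T = {}
    for page, ts in slots.items():
        gaps = {((ts[(i + 1) % len(ts)] - ts[i]) % L) or L for i in range(len(ts))}
        assert len(gaps) == 1, f"page {page} is not perfectly periodic"
        T[page] = gaps.pop()
    return T

def average_wait(schedule, demand):
    """Expected wait of a random client: sum over pages of p_i * T_i / 2."""
    T = periods(schedule)
    return sum(demand[page] * T[page] / 2 for page in demand)
```

For the cycle A B A C with demands 0.5, 0.25, 0.25, page A has period 2 and B, C period 4, giving an expected wait of 1.5 slots.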
Received: December 2001 / Accepted: September 2002
7.
Yuh-Jzer Joung 《Distributed Computing》2002,15(3):155-175
Summary. Group mutual exclusion occurs naturally in situations where a resource can be shared by processes of the same group, but
not by processes of different groups. For example, suppose data is stored in a CD-jukebox. Then when a disc is loaded for
access, users that need data on the disc can concurrently access the disc, while users that need data on a different disc
have to wait until the current disc is unloaded.
The design issues for group mutual exclusion have been modeled as the Congenial Talking Philosophers problem, and solutions for shared-memory models have been proposed [12,14]. As in ordinary mutual exclusion and many other
problems in distributed systems, however, techniques developed for shared memory do not necessarily apply to message passing
(and vice versa). So in this paper we investigate solutions for Congenial Talking Philosophers in computer networks where
processes communicate by asynchronous message passing. We first present a solution that is a straightforward adaptation from
Ricart and Agrawala's algorithm for ordinary mutual exclusion. Then we show that the simple modification suffers a severe
performance degradation that could cause the system to behave as though only one process of a group can be in the critical
section at a time. We then present a more efficient and highly concurrent distributed algorithm for the problem, the first
such solution in computer networks.
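As a toy illustration of the group mutual exclusion property itself (in shared memory, not the message-passing setting the paper targets), the class below admits any number of processes of the currently loaded group and blocks other groups until the session empties; all names are our own.

```python
import threading

class GroupLock:
    """Toy group mutual exclusion: any number of processes of the SAME group
    (e.g. users of the currently loaded disc) may be inside concurrently;
    a process of a different group waits until the room empties."""
    def __init__(self):
        self._cond = threading.Condition()
        self._group = None          # group currently in session, or None
        self._inside = 0

    def enter(self, group):
        with self._cond:
            while self._group not in (None, group):
                self._cond.wait()
            self._group = group
            self._inside += 1

    def leave(self):
        with self._cond:
            self._inside -= 1
            if self._inside == 0:
                self._group = None      # "unload the disc"
                self._cond.notify_all()
```

Two processes of group disc1 enter concurrently; a disc2 process blocks until both leave.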
Received: August 2000 / Accepted: November 2001
8.
A bin picking system based on depth from defocus
It is generally accepted that to develop versatile bin-picking systems capable of grasping and manipulation operations, accurate
3-D information is required. To accomplish this goal, we have developed a fast and precise range sensor based on active depth from defocus (DFD). This sensor is used in conjunction with a three-component vision system, which is able to recognize and evaluate the
attitude of 3-D objects. The first component performs scene segmentation using an edge-based approach. Since edges are used
to detect the object boundaries, a key issue consists of improving the quality of edge detection. The second component attempts
to recognize the object placed on the top of the object pile using a model-driven approach in which the segmented surfaces
are compared with those stored in the model database. Finally, the attitude of the recognized object is evaluated using an
eigenimage approach augmented with range data analysis. The full bin-picking system will be outlined, and a number of experimental
results will be examined.
Received: 2 December 2000 / Accepted: 9 September 2001
Correspondence to: O. Ghita
9.
Motion picture films are susceptible to local degradations such as dust spots. Other deteriorations are global such as intensity
and spatial jitter. It is obvious that motion needs to be compensated for before the detection/correction of such local and
dynamic defects. Therefore, we propose a hierarchical motion estimation method ideally suited for high resolution film sequences.
This recursive block-based motion estimator relies on an adaptive search strategy and Radon projections to improve processing
speed. The localization of dust particles then becomes straightforward. Thus, it is achieved by simple inter-frame differences
between the current image and motion compensated successive and preceding frames. However, the detection of spatial and intensity
jitter requires a specific process taking advantage of the high temporal correlation in the image sequence. In this paper,
we present our motion compensation-based algorithms for removing dust spots, spatial and intensity jitter in degraded motion
pictures. Experimental results are presented showing the usefulness of our motion estimator for film restoration at reasonable
computational costs.
Received: 9 July 2000 / Accepted: 13 January 2002
Correspondence to: S. Boukir
10.
We present a shared memory algorithm that allows a set of f+1 processes to wait-free “simulate” a larger system of n processes that may also exhibit up to f stopping failures.
Applying this simulation algorithm to the k-set-agreement problem enables conversion of an arbitrary k-fault-tolerant n-process solution for the k-set-agreement problem into a wait-free (k+1)-process solution for the same problem. Since the (k+1)-process k-set-agreement problem has been shown to have no wait-free solution [5,18,26], this transformation implies that there is no
k-fault-tolerant solution to the n-process k-set-agreement problem, for any n.
More generally, the algorithm satisfies the requirements of a fault-tolerant distributed simulation. The distributed simulation implements a notion of fault-tolerant reducibility between decision problems. This paper defines these notions and gives examples of their application to fundamental distributed
computing problems.
The algorithm is presented and verified in terms of I/O automata. The presentation has a great deal of interesting modularity,
expressed by I/O automaton composition and both forward and backward simulation relations. Composition is used to include
a safe agreement module as a subroutine. Forward and backward simulation relations are used to view the algorithm as implementing a multi-try snapshot strategy.
The main algorithm works in snapshot shared memory systems; a simple modification of the algorithm that works in read/write
shared memory systems is also presented.
Received: February 2001 / Accepted: February 2001
11.
Summary. This work considers the problem of performing t tasks in a distributed system of p fault-prone processors. This problem, called do-all herein, was introduced by Dwork, Halpern and Waarts. The solutions presented here are for the model of computation that abstracts
a synchronous message-passing distributed system with processor stop-failures and restarts. We present two new algorithms
based on a new aggressive coordination paradigm by which multiple coordinators may be active as the result of failures. The
first algorithm is tolerant of stop-failures and does not allow restarts. Its available processor steps (work) complexity is and its message complexity is . Unlike prior solutions, our algorithm uses redundant broadcasts when encountering failures and, for p = t and large f, it achieves better work complexity. This algorithm is used as the basis for another algorithm that tolerates stop-failures
and restarts. This new algorithm is the first solution for the do-all problem that efficiently deals with processor restarts. Its available processor steps is , and its message complexity is , where f is the total number of failures.
Received: October 1998 / Accepted: September 2000
12.
Data overload is a generic and tremendously difficult problem that has only grown with each new wave of technological capabilities.
As a generic and persistent problem, three observations are in need of explanation: Why is data overload so difficult to address?
Why has each wave of technology exacerbated, rather than resolved, data overload? How are people, as adaptive responsible
agents in context, able to cope with the challenge of data overload? In this paper, first we examine three different characterisations
that have been offered to capture the nature of the data overload problem and how they lead to different proposed solutions.
As a result, we propose that (a) data overload is difficult because of the context sensitivity problem – meaning lies not
in data, but in relationships of data to interests and expectations – and (b) new waves of technology exacerbate data overload
when they ignore or try to finesse context sensitivity. The paper then summarises the mechanisms of human perception and cognition
that enable people to focus on the relevant subset of the available data despite the fact that what is interesting depends
on context. By focusing attention on the root issues that make data overload a difficult problem and on people’s fundamental
competence, we have identified a set of constraints that all potential solutions must meet. Notable among these constraints
is the idea that organisation precedes selectivity. These constraints point toward regions of the solution space that have
been little explored. In order to place data in context, designers need to display data in a conceptual space that depicts
the relationships, events and contrasts that are informative in a field of practice.
13.
K. Selçuk Candan Eric Lemar V.S. Subrahmanian 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(2):131-153
Abstract. Though there has been extensive work on multimedia databases in the last few years, there is no prevailing notion of a multimedia
view, nor are there techniques to create, manage, and maintain such views. Visualizing the results of a dynamic multimedia
query or materializing a dynamic multimedia view corresponds to assembling and delivering an interactive multimedia presentation
in accordance with the visualization specifications. In this paper, we suggest that a non-interactive multimedia presentation
is a set of virtual objects with associated spatial and temporal presentation constraints. A virtual object is either an object, or the result of a query.
As queries may have different answers at different points in time, scheduling the presentation of such objects is nontrivial.
We then develop a probabilistic model of interactive multimedia presentations, extending the non-interactive model described
earlier. We also develop a probabilistic model of interactive visualization where the probabilities reflect the user profiles,
or the likelihood of certain user interactions. Based on this probabilistic model, we develop three utility-theoretic based
types of prefetching algorithms that anticipate how users will interact with the presentation. These prefetching algorithms
allow efficient visualization of the query results in accordance with the underlying specification. We have built a prototype
system that incorporates these algorithms. We report on the results of experiments conducted on top of this implementation.
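A greedy stand-in for the utility-theoretic idea: rank candidate virtual objects by probability-of-access times on-demand fetch cost, and prefetch the top of the ranking into a limited buffer. The dictionary fields are hypothetical, not the paper's model.

```python
def choose_prefetch(candidates, buffer_slots):
    """Pick which virtual objects to prefetch: rank by expected utility,
    probability-of-access * cost-if-fetched-on-demand, a greedy stand-in
    for a utility-theoretic prefetching policy."""
    ranked = sorted(candidates, key=lambda c: c['prob'] * c['cost'], reverse=True)
    return [c['id'] for c in ranked[:buffer_slots]]
```

A rarely taken but expensive branch can outrank a likely but cheap one, which is exactly what the expected-utility product captures.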
Received June 10, 1998 / Accepted November 10, 1999
14.
Handling message semantics with Generic Broadcast protocols
Summary. Message ordering is a fundamental abstraction in distributed systems. However, ordering guarantees are usually purely “syntactic,”
that is, message “semantics” is not taken into consideration despite the fact that in several cases semantic information about
messages could be exploited to avoid ordering messages unnecessarily. In this paper we define the Generic Broadcast problem, which orders messages only if needed, based on the semantics of the messages. The semantic information about messages
is introduced by conflict relations. We show that Reliable Broadcast and Atomic Broadcast are special instances of Generic
Broadcast. The paper also presents two algorithms that solve Generic Broadcast.
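The role of the conflict relation can be sketched as a delivery rule: only conflicting pairs are forced into a common (here, timestamp) order, while other messages keep their local arrival order. With an empty relation this degenerates to Reliable Broadcast behavior and with the total relation to Atomic Broadcast, as the abstract notes. Timestamps are assumed unique; this is an illustration, not either of the paper's algorithms.

```python
def generic_deliver(received, conflicts):
    """Deliver messages so that every pair of CONFLICTING messages comes out
    in timestamp order, while non-conflicting messages keep arrival order.
    Assumes unique 'ts' values; conflicts is a symmetric predicate."""
    pending = list(received)
    delivered = []
    while pending:
        for m in pending:
            # m is deliverable once no conflicting message with a smaller
            # timestamp is still pending
            if not any(conflicts(m, o) and o['ts'] < m['ts']
                       for o in pending if o is not m):
                delivered.append(m)
                pending.remove(m)
                break
    return [m['id'] for m in delivered]
```

With a write-write conflict relation, the reads are delivered as they arrive while the writes are totally ordered among themselves.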
Received: August 2000 / Accepted: August 2001
15.
I/O scheduling for digital continuous media
A growing set of applications require access to digital video and audio. In order to provide playback of such continuous
media (CM), scheduling strategies for CM data servers (CMS) are necessary. In some domains, particularly defense and industrial process control, the timing requirements of these applications
are strict and essential to their correct operation. In this paper we develop a scheduling strategy for multiple access to
a CMS such that the timing guarantees are maintained at all times. First, we develop a scheduling strategy for the steady state,
i.e., when there are no changes in playback rate or operation. We derive an optimal Batched SCAN (BSCAN) algorithm that requires minimum buffer space to schedule concurrent accesses. The scheduling strategy incorporates two key
constraints: (1) data fetches from the storage system are assumed to be in integral multiples of the block size, and (2) playback
guarantees are ensured for frame-oriented streams when each frame can span multiple blocks. We discuss modifications to the
scheduling strategy to handle compressed data like motion-JPEG and MPEG.
Second, we develop techniques to handle dynamic changes brought about by VCR-like operations executed by applications. We define a suite of primitive VCR-like operations that can be executed. We show that an unregulated change in the BSCAN schedule, in response to VCR-like operations, will affect playback guarantees. We develop two general techniques to ensure playback guarantees while responding
to VCR-like operations: passive and active accumulation. Using user response time as a metric we show that active accumulation algorithms
outperform passive accumulation algorithms. An optimal response-time algorithm in a class of active accumulation strategies
is derived. The results presented here are validated by extensive simulation studies.
16.
Boris Chidlovskii Uwe M. Borghoff 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(1):2-17
Abstract. In meta-searchers accessing distributed Web-based information repositories, performance is a major issue. Efficient query
processing requires an appropriate caching mechanism. Unfortunately, standard page-based as well as tuple-based caching mechanisms
designed for conventional databases are not efficient on the Web, where keyword-based querying is often the only way to retrieve
data. In this work, we study the problem of semantic caching of Web queries and develop a caching mechanism for conjunctive
Web queries based on signature files. Our algorithms cope with both semantic containment and intersection relations between a query and the corresponding cache
items. We also develop a cache replacement strategy for situations where cached items differ in size and in their contribution
to partial query answers. We report results of experiments and show how the caching mechanism is realized in the
Knowledge Broker system.
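A minimal sketch of signature-file matching for conjunctive keyword queries: each query's keywords are superimposed into a bit signature, and a cached query can subsume a new one only if its signature bits are a subset of the new query's. Hash choices and signature sizes below are arbitrary; the filter admits false positives but no false negatives.

```python
def signature(words, bits=64, k=3):
    """Superimposed-coding signature: hash every keyword to k bit positions
    and OR them together (the hash seeds are an arbitrary choice here)."""
    sig = 0
    for w in words:
        for seed in range(k):
            sig |= 1 << (hash((w, seed)) % bits)
    return sig

def may_contain(query_sig, cached_sig):
    """A cached conjunctive query can only subsume the new query if every
    signature bit of the cached query also appears in the new query's
    signature (more conjuncts mean a narrower result); false positives
    are possible, false negatives are not."""
    return cached_sig & query_sig == cached_sig
```

A cached result for "web AND cache" is a candidate superset for the narrower query "web AND cache AND semantics"; the exact containment check then runs only on such candidates.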
Received June 15, 1999 / Accepted December 24, 1999
17.
Summary. Long-lived and adaptive implementations of mutual exclusion and renaming in the read/write shared memory model are presented.
An implementation of a task is adaptive if the step complexity of any operation in the implementation is a function of the number of processes that take steps concurrently
with the operation. The renaming algorithm assigns a new unique id in the range to any process whose initial unique name is taken from a set of size N, for an arbitrary N and where k is the number of processes that actually take steps or hold a name while the new name is being acquired. The step complexity
of acquiring a new name is , while the step complexity of releasing a name is 1. The space complexity of the algorithm is where n is an upper bound on the number of processes that may be active at the same time (acquiring or holding new names), which
could be N in the worst case.
Both the system response time and the worst case number of operations per process in the presented mutual-exclusion algorithm
are adaptive. Both algorithms rely on the basic building block of a long-lived and adaptive splitter. While the adaptive-splitter
satisfies a slightly different set of properties than the Moir-Anderson splitter [MA95], it is adaptive and long-lived. In
addition, the new splitter properties enable the construction of a non-blocking long-lived (2k-1)-renaming algorithm (which is optimal in the size of the new name space). We believe that the mechanisms introduced in
our splitter implementation are interesting on their own, and might be used in other adaptive and long-lived constructions.
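For reference, a splitter in the style of Moir and Anderson can be written in a few lines; here plain Python attributes stand in for the shared registers X and Y, and atomicity of the individual reads and writes is assumed. This sketches the classic construction, not the paper's modified long-lived splitter.

```python
class Splitter:
    """Splitter in the Moir-Anderson style: every arriving process leaves
    with 'stop', 'right', or 'down', and at most one process may stop.
    X is a multi-writer register, Y a boolean register."""
    def __init__(self):
        self.X = None
        self.Y = False

    def split(self, pid):
        self.X = pid          # announce myself
        if self.Y:
            return 'right'    # someone already passed the gate
        self.Y = True         # close the gate
        if self.X == pid:
            return 'stop'     # no one overwrote X: I win this splitter
        return 'down'         # a concurrent process interfered
```

A solo process stops; a later process goes right; and an interleaving where a second write to X lands between the first process's steps sends the first process down.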
Received: March 2000 / Accepted: July 2001
18.
On fast microscopic browsing of MPEG-compressed video
Boon-Lock Yeo 《Multimedia Systems》1999,7(4):269-281
MPEG has been established as a compression standard for efficient storage and transmission of digital video. However, users
are limited to VCR-like (and tedious) functionalities when viewing MPEG video. The usefulness of MPEG video is presently limited
by the lack of tools available for fast browsing, manipulation and processing of MPEG video.
In this paper, we first address the problem of rapid access to individual shots and frames in MPEG video. We build upon the
compressed-video-processing framework proposed in [1, 8], and propose new and fast algorithms based on an adaptive mixture
of approximation techniques for extracting spatially reduced image sequences of uniform quality from MPEG video across different frame types and also under different motion activities in the scenes. The algorithms
execute faster than real time on a Pentium personal computer. We demonstrate how the reduced images facilitate fast and convenient
shot- and frame-level video browsing and access, shot-level editing and annotation, without the need for frequent decompression
of MPEG video. We further propose methods for reducing the auxiliary data size associated with the reduced images through
exploitation of spatial and temporal redundancy. We also address how the reduced images lead to computationally efficient algorithms for video analysis based
on intra- and inter-shot processing for video database and browsing applications. The algorithms, tools for browsing and techniques
for video processing presented in this paper have been used by many in IBM Research on more than 30 h of MPEG-1 video for
video browsing and analysis.
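The spatially reduced images can be approximated, for illustration, by one value per 8x8 block: the block mean, which for an I-frame corresponds up to scaling to the block's DC coefficient and so is recoverable without full decoding. This sketch operates on a plain 2-D array rather than an actual MPEG stream.

```python
def reduced_image(frame, block=8):
    """Spatially reduced image: one value per block x block tile, namely the
    tile's mean pixel value (the DC term of the tile's DCT, up to scaling).
    The output is 1/64 the size of the input for the default block of 8."""
    h, w = len(frame), len(frame[0])
    return [[sum(frame[y + dy][x + dx]
                 for dy in range(block) for dx in range(block)) / block ** 2
             for x in range(0, w - block + 1, block)]
            for y in range(0, h - block + 1, block)]
```

A 16x16 frame whose left half is black (0) and right half bright (80) reduces to a 2x2 image that still shows the left/right contrast, which is what makes such thumbnails usable for shot browsing.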
19.
Concurrency control in hierarchical multidatabase systems
Sharad Mehrotra Henry F. Korth Avi Silberschatz 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(2):152-172
Over the past decade, significant research has been done towards developing transaction management algorithms for multidatabase
systems. Most of this work assumes a monolithic architecture of the multidatabase system with a single software module that
follows a single transaction management algorithm to ensure the consistency of data stored in the local databases. This monolithic
architecture is not appropriate in a multidatabase environment where the system spans multiple different organizations that
are distributed over various geographically distant locations. In this paper, we propose an alternative multidatabase transaction
management architecture, where the system is hierarchical in nature. Hierarchical architecture has consequences on the design
of transaction management algorithms. An implication of the architecture is that the transaction management algorithms followed
by a multidatabase system must be composable – that is, it must be possible to incorporate individual multidatabase systems as elements in a larger multidatabase system.
We present a hierarchical architecture for a multidatabase environment and develop techniques for concurrency control in such
systems.
Edited by R. Sacks-Davis. Received June 27, 1994 / Accepted September 26, 1995
20.
Summary. In a shared-memory distributed system, n independent asynchronous processes communicate by reading and writing to shared variables. An algorithm is adaptive (to total contention) if its step complexity depends only on the actual number, k, of active processes in the execution; this number is unknown in advance and may change in different executions of the algorithm.
Adaptive algorithms are inherently wait-free, providing fault-tolerance in the presence of an arbitrary number of crash failures and different processes' speed.
A wait-free adaptive collect algorithm with O(k) step complexity is presented, together with its applications in wait-free adaptive algorithms for atomic snapshots, immediate
snapshots and renaming.
Received: August 1999 / Accepted: August 2001