
1.

Efficient histogramming on hypercube SIMD machines

《Computer Vision, Graphics, and Image Processing》1990,49(1):104-120

This paper considers the histogramming problem on a hypercube. An N-PE hypercube is used to process an N^{1/2} × N^{1/2} digitized image in which each pixel has a gray-level value between 0 and M − 1. In general, M, the range of gray-level values, is much smaller than N, the number of pixels being processed. Our algorithm generates the histogram of the image in O(log M * log N) time using radix sort and efficient data movement operations. This technique can be implemented on butterfly, shuffle-exchange and fat pyramid organizations.
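
A sequential sketch of the sort-then-count idea may help fix the notation. It is only an illustration, not the hypercube algorithm from the paper: the function name `histogram_by_sorting` is hypothetical, and Python's built-in sort stands in for the parallel radix sort.

```python
from typing import List, Sequence


def histogram_by_sorting(pixels: Sequence[int], M: int) -> List[int]:
    """Return counts[g] = number of pixels with gray level g, for 0 <= g < M.

    Assumption: every pixel value lies in the range 0..M-1.
    """
    counts = [0] * M
    ordered = sorted(pixels)  # stand-in for the radix sort step
    i = 0
    while i < len(ordered):
        # Equal gray levels are adjacent after sorting, so each run
        # length is exactly the histogram entry for that level.
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        counts[ordered[i]] = j - i
        i = j
    return counts
```

For example, `histogram_by_sorting([2, 0, 2, 1, 2], 4)` counts one 0, one 1, three 2s and no 3s.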

2.

Geometric pattern matching for point sets in the plane under similarity transformations (total citations: 1; self-citations: 0; citations by others: 1)

Dror Aiger Klara Kedem 《Information Processing Letters》2009,109(16):935-940

We consider the following geometric pattern matching problem: Given two sets of points in the plane, P and Q, and some (arbitrary) δ>0, find a similarity transformation T (translation, rotation and scale) such that h(T(P),Q)<δ, where h(⋅,⋅) is the directional Hausdorff distance with L_∞ as the underlying metric; or report that none exists. We are only interested in the decision problem, not in minimizing the Hausdorff distance, since in the real world, where our applications come from, δ is determined by the practical uncertainty in the position of the points (pixels). Similarity transformations have not been dealt with in the context of the Hausdorff distance and we fill the gap here. We present efficient algorithms for this problem imposing a reasonable separation restriction on the points in the set Q. If the L_∞ distance between every pair of points in Q is at least 8δ, then the problem can be solved in O(mn² log n) time, where m and n are the numbers of points in P and Q respectively. If the L_∞ distance between every pair of points in Q is at least cδ, for some c, 0<c<1, we present a randomized approximate solution with expected runtime O(n² c^{-4} ε^{-8} log⁴ mn), where ε>0 controls the approximation. Our approximation is on the size of the subset, B⊆P, such that h(T(B),Q)<δ and |B|>(1−ε)|P| with high probability.
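
To make the decision criterion h(T(P),Q)<δ concrete, here is a minimal brute-force sketch that checks one given candidate transformation. It does not search for T and is not the algorithm of the paper; the function names and the parameterisation of T (scale, rotation angle, translation) are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def linf(a: Point, b: Point) -> float:
    """L_infinity (Chebyshev) distance between two planar points."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def directed_hausdorff_linf(P: List[Point], Q: List[Point]) -> float:
    """h(P, Q): the largest distance from a point of P to its nearest point of Q."""
    return max(min(linf(p, q) for q in Q) for p in P)


def apply_similarity(P: List[Point], scale: float, theta: float,
                     tx: float, ty: float) -> List[Point]:
    """Scale by `scale`, rotate by `theta` radians, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty)
            for x, y in P]


def is_match(P: List[Point], Q: List[Point], delta: float,
             scale: float, theta: float, tx: float, ty: float) -> bool:
    """Decide whether the given candidate T satisfies h(T(P), Q) < delta."""
    return directed_hausdorff_linf(apply_similarity(P, scale, theta, tx, ty), Q) < delta
```

The check costs O(mn) distance evaluations per candidate, so the difficulty addressed by the abstract lies in finding a suitable T, not in verifying it.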

3.

Optimal Speed-Up Parallel Image Template Matching Algorithms on Processor Arrays with a Reconfigurable Bus System

Horng-Ren Tsai Shi-Jinn Horng Shun-Shan Tsai Shung-Shing Lee Tzong-Wann Kao Chia-Ho Chen 《Computer Vision and Image Understanding》1998,71(3):393-412

The image template matching problem is a fundamental problem with many practical applications in image processing, pattern recognition, and computer vision. It is a useful operation for filtering, edge detection, image registration, and object detection [13]. In this paper, we first design two parallel image template matching algorithms, running in O[(M²/p²) log log M] and O[(M²/p²) + (M/p) log log p] time, on a 3-D processor array with a reconfigurable bus system using p²N² processors, with each processor containing O(1) and O(M/p) restricted memory, respectively, for 1 ≤ p ≤ M ≤ N, an N×N digital image, and an M×M template. By increasing the number of processors, these two algorithms can be run in O(M²/p²) time using p² M^{1/c} N² and p^{2+1/c} N² processors, respectively, where c is a constant and c ≥ 1. Furthermore, O(1) time can also be obtained from these two algorithms by using M^{2+1/c} N² processors. These results improve the best known bounds and achieve both optimality and optimal speed-up in their time and processor complexities.

4.

Computing the subset partial order for dense families of sets

Amr Elmasry 《Information Processing Letters》2009,109(18):1082-1086

We give an algorithm to compute the subset partial order (called the subset graph) for a family F of sets containing k sets with N elements in total and domain size n. Our algorithm requires O(nk²/log k) time and space on a Pointer Machine. When F is dense, i.e. N=Θ(nk), the algorithm requires O(N²/log² N) time and space. We give a construction for a dense family whose subset graph is of size Θ(N²/log² N), indicating the optimality of our algorithm for dense families. The subset graph can be dynamically maintained when F undergoes set insertions and deletions in O(nk/log k) time per update (that is sub-linear in N for the case of dense families). If we assume words of b ≤ k bits, allow bits to be packed in words, and use bitwise operations, the above running time and space requirements can be reduced by a factor of b log(k/b+1)/log k and b² log(k/b+1)/log k respectively.
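
As a point of reference for what the subset graph contains, the following naive sketch compares all pairs of sets using bit-packed membership vectors. It is a baseline under stated assumptions, not the algorithm of the paper: the name `subset_graph` is hypothetical, and Python integers stand in for packed machine words.

```python
from typing import Dict, Hashable, List, Set, Tuple


def subset_graph(family: List[Set[Hashable]]) -> List[Tuple[int, int]]:
    """Return all pairs (i, j), i != j, with family[i] a subset of family[j].

    Equal sets yield both (i, j) and (j, i).
    """
    # Assign each domain element a bit position.
    index: Dict[Hashable, int] = {}
    for s in family:
        for x in s:
            index.setdefault(x, len(index))

    # Pack each set into one integer bitmask.
    masks = [sum(1 << index[x] for x in s) for s in family]

    edges = []
    for i, a in enumerate(masks):
        for j, b in enumerate(masks):
            # a is a subset of b exactly when a has no bit outside b.
            if i != j and a & b == a:
                edges.append((i, j))
    return edges
```

With k sets this performs k² mask comparisons, each touching about n/b words for word size b, which is the kind of cost the bounds in the abstract improve on.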

5.

A novel constant degree and constant congestion DHT scheme for peer-to-peer networks (total citations: 3; self-citations: 0; citations by others: 3)

LI Dongsheng & LU Xicheng (School of Computer, National University of Defense Technology, Changsha, China) 《Science in China Series F (English Edition)》2005,48(4):421-436

1 Introduction and related work. In recent years, peer-to-peer computing has attracted significant attention from both industry and academia[1-3]. The core component of many proposed peer-to-peer systems is the distributed hash table (DHT) scheme[4,5], which uses a hash table-like interface to publish and look up data objects. Many proposed DHT schemes[6-15] are based on some traditional interconnection topology: Chord[6], Tapestry[7,8], Pastry[9] are based on hypercube topolog…

6.

Practical constructive schemes for deterministic shared-memory access

A. Pietracaprina F. P. Preparata 《Theory of Computing Systems》1997,30(2):3-37

We present three explicit schemes for distributing M variables among N memory modules, where M = Θ(N^{1.5}), M = Θ(N²), and M = Θ(N³), respectively. Each variable is replicated into a constant number of copies stored in distinct modules. We show that N processors, directly accessing the memories through a complete interconnection, can read/write any set of N variables in worst-case time O(N^{1/3}), O(N^{1/2}), and O(N^{2/3}), respectively for the three schemes. The access times for the last two schemes are optimal with respect to the particular redundancy values used by such schemes. The address computation can be carried out efficiently by each processor without recourse to a complete memory map and requiring only O(1) internal storage.

7.

A space- and time-efficient local-spin spin lock

Yong-Jik Kim James H. Anderson 《Information Processing Letters》2002,84(1):47-55

A simple code transformation is presented that reduces the space complexity of Yang and Anderson's local-spin mutual exclusion algorithm. In both the original and the transformed algorithm, only atomic read and write instructions are used; each process generates Θ(log N) remote memory references per lock request, where N is the number of processes. The transformed algorithm uses Θ(N) distinct variables, which is clearly optimal.

8.

Dynamic Matrix Rank with Partial Lookahead

Telikepalli Kavitha 《Theory of Computing Systems》2014,55(1):229-249

We consider the problem of maintaining information about the rank of a matrix M under changes to its entries. For an n×n matrix M, we show an amortized upper bound of O(n^{ω−1}) arithmetic operations per change for this problem, where ω<2.373 is the exponent for matrix multiplication, under the assumption that there is a lookahead of up to Θ(n) locations. That is, we know up to the next Θ(n) locations (i₁, j₁), (i₂, j₂), …, whose entries are going to change, in advance; however, we do not know the new entries in these locations in advance. We get the new entries in these locations in a dynamic manner. The dynamic matrix rank problem was first studied by Frandsen and Frandsen, who showed an upper bound of O(n^{1.575}) and a lower bound of Ω(n) for this problem, and later Sankowski showed an upper bound of O(n^{1.495}) for this problem when allowing randomization and a small probability of error. These algorithms do not assume any lookahead. For the dynamic matrix rank problem with lookahead, Sankowski and Mucha showed a randomized algorithm (with a small probability of error) that is more efficient than these algorithms.

9.

Cache-oblivious index for approximate string matching

Wing-Kai Hon Tak-Wah Lam 《Theoretical computer science》2011,412(29):3579-3588

This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((n log^k n)/B) disk pages and finds all k-error matches with O((|P|+occ)/B + log^k n log log_B n) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω(|P|+occ+poly(log n)) I/Os. The second index reduces the space to O((n log n)/B) disk pages, and the I/O complexity is O((|P|+occ)/B + log^{k(k+1)} n log log n).

10.

A new factorization of the mass matrix for optimal serial and parallel calculation of multibody dynamics

Amir Fijany Roy Featherstone 《Multibody System Dynamics》2013,29(2):169-187

This paper describes a new factorization of the inverse of the joint-space inertia matrix M. In this factorization, M^{-1} is directly obtained as the product of a set of sparse matrices wherein, for a serial chain, only the inversion of a block-tridiagonal matrix is needed. In other words, this factorization reduces the inversion of a dense matrix to that of a block-tridiagonal one. As a result, this factorization leads to both an optimal serial and an optimal parallel algorithm, that is, a serial algorithm with a complexity of O(N) and a parallel algorithm with a time complexity of O(log N) on a computer with O(N) processors. The novel feature of this algorithm is that it first calculates the interbody forces. Once these forces are known, the accelerations are easily calculated. We discuss the extension of the algorithm to the task of calculating the forward dynamics of a kinematic tree consisting of a single main chain plus any number of short side branches. We also show that this new factorization of M^{-1} leads to a new factorization of the operational-space inverse inertia, Λ^{-1}, in the form of a product involving sparse matrices. We show that this factorization can be exploited for optimal serial and parallel computation of Λ^{-1}, that is, a serial algorithm with a complexity of O(N) and a parallel algorithm with a time complexity of O(log N) on a computer with O(N) processors.

11.

New bounds for the controller problem

Yuval Emek Amos Korman 《Distributed Computing》2011,24(3-4):177-186

The (M, W)-controller, originally studied by Afek, Awerbuch, Plotkin, and Saks, is a basic distributed tool that provides an abstraction for managing the consumption of a global resource in a distributed dynamic network. The input to the controller arrives online in the form of requests presented at arbitrary nodes. A request presented at node u corresponds to the “desire” of some entity to consume one unit of the global resource at u, and the controller should handle this request within finite time either by granting it with a permit or by denying it. Initially, M permits (corresponding to M units of the global resource) are stored at a designated root node. Throughout the execution, permits can be transported from place to place along the network's links so that they can be granted to requests presented at various nodes; when a permit is granted to some request, it is eliminated from the network. The fundamental rule of an (M, W)-controller is that a request should not be denied unless it is certain that at least M − W permits are eventually granted. The most efficient (M, W)-controller known to date has message complexity $O(N \log^{2} N \log \frac{M}{W + 1})$, where N is the number of nodes that ever existed in the network (the dynamic network may undergo node insertions and deletions). In this paper we establish two new lower bounds on the message complexity of the controller problem. We first prove a simple lower bound stating that any (M, W)-controller must send $\Omega(N \log \frac{M}{W + 1})$ messages. Second, for the important case when W is proportional to M (this is the common case in most applications), we use a surprising reduction from the (centralized) monotonic labeling problem to show that any (M, W)-controller must send Ω(N log N) messages. In fact, under a long-standing conjecture regarding the complexity of the monotonic labeling problem, this lower bound is improved to a tight Ω(N log² N). The proof of this lower bound requires that N = O(M), which turns out to be somewhat inevitable due to a new construction of an (M, M/2)-controller with message complexity O(N log² M).

12.

Scalable 2D Convex Hull and Triangulation Algorithms for Coarse Grained Multicomputers

《Journal of Parallel and Distributed Computing》1999,56(1):47-70

In this paper we describe scalable parallel algorithms for building the convex hull and a triangulation of n coplanar points. These algorithms are designed for the coarse grained multicomputer model: p processors with O(n/p) ≫ O(1) local memory each, connected to some arbitrary interconnection network. They scale over a large range of values of n and p, assuming only that n ≥ p^{1+ε} (ε > 0), and require time O((T_sequential/p) + T_s(n, p)), where T_s(n, p) refers to the time of a global sort of n data on a p processor machine. Furthermore, they involve only a constant number of global communication rounds. Since computing either 2D convex hull or triangulation requires time T_sequential = Θ(n log n), these algorithms either run in optimal time, Θ((n log n)/p), or in sort time, T_s(n, p), for the interconnection network in question. These results become optimal when T_sequential/p dominates T_s(n, p) or for interconnection networks like the mesh for which optimal sorting algorithms exist.

13.

Simulations among multidimensional turing machines

Michael C. Loui 《Theoretical computer science》1982,21(2):145-161

For all d ≥ 1 and all e > d, every deterministic multihead e-dimensional Turing machine of time complexity T(n) can be simulated on-line by a deterministic multihead d-dimensional Turing machine in time O(T(n)^{1+1/d−1/e} (log T(n))^{O(1)}). This simulation almost achieves the known lower bound

Ω(T(n)^{1+1/d−1/e})

on the time required. The simulation is interpreted in terms of dynamic embeddings among arrays with local access.

14.

Computing the Map of Geometric Minimal Cuts

Jinhui Xu Lei Xu Evanthia Papadopoulou 《Algorithmica》2014,68(4):805-834

In this paper we consider the following problem of computing a map of geometric minimal cuts (called MGMC problem): Given a graph G=(V,E) and a planar rectilinear embedding of a subgraph H=(V_H, E_H) of G, compute the map of geometric minimal cuts induced by axis-aligned rectangles in the embedding plane. The MGMC problem is motivated by the critical area extraction problem in VLSI designs and finds applications in several other fields. In this paper, we propose a novel approach based on a mix of geometric and graph algorithm techniques for the MGMC problem. Our approach first shows that unlike the classic min-cut problem on graphs, the number of all rectilinear geometric minimal cuts is bounded by a low polynomial, O(n³). Our algorithm for identifying geometric minimal cuts runs in O(n³ log n (log log n)³) expected time, which can be reduced to O(n log n (log log n)³) when the maximum size of the cut is bounded by a constant, where n=|V_H|. Once geometric minimal cuts are identified, we show that the problem can be reduced to computing the L_∞ Hausdorff Voronoi diagram of axis aligned rectangles. We present the first output-sensitive algorithm to compute this diagram, which runs in O((N+K) log² N log log N) time and O(N log² N) space, where N is the number of rectangles and K is the complexity of the Hausdorff Voronoi diagram. Our approach settles several open problems regarding the MGMC problem.

15.

A new quantum claw-finding algorithm for three functions

Kazuo Iwama Akinori Kawachi 《New Generation Computing》2003,21(4):319-327

For k functions f₁, ..., f_k, a k-tuple (x₁, ..., x_k) such that f₁(x₁) = ... = f_k(x_k) is called a claw of f₁, ..., f_k. In this paper, we construct a new quantum claw-finding algorithm for three functions that is efficient when the number M of intermediate solutions is small. The known quantum claw-finding algorithm for three functions requires O(N^{7/8} log N) queries to find a claw, but our algorithm requires O(N^{3/4} log N) queries if M ≤ √N and O(N^{7/12} M^{1/3} log N) queries otherwise. Thus, our algorithm is more efficient if M ≤ N^{7/8}. Kazuo Iwama, Ph.D.: Professor of Informatics, Kyoto University, Kyoto 606-8501, Japan. Received BE, ME, and Ph.D. degrees in Electrical Engineering from Kyoto University in 1978, 1980 and 1985, respectively. His research interests include algorithms, complexity theory and quantum computation. Editorial board of Information Processing Letters and Parallel Computing. Council Member of European Association for Theoretical Computer Science (EATCS). Akinori Kawachi: Received B.Eng. and M.Info. from Kyoto University in 2000 and 2002, respectively. His research interests are quantum computation and distributed computation.

16.

Fast computation of smallest enclosing circle with center on a query line segment

Arindam Karmakar 《Information Processing Letters》2008,108(6):343-346

Here we propose an efficient algorithm for computing the smallest enclosing circle whose center is constrained to lie on a query line segment. Our algorithm preprocesses a given set of n points P={p₁,p₂,…,p_n} such that for any query line or line segment L, it efficiently locates a point c on L that minimizes the maximum distance among the points in P from c. Roy et al. [S. Roy, A. Karmakar, S. Das, S.C. Nandy, Constrained minimum enclosing circle with center on a query line segment, in: Proc. of the 31st Mathematical Foundation of Computer Science, 2006, pp. 765-776] have proposed an algorithm that solves the query problem in O(log² n) time using O(n log n) preprocessing time and O(n) space. Our algorithm improves the query time to O(log n), but the preprocessing time and space complexities are both O(n²).
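
The query itself can be illustrated without any preprocessing. The sketch below is not the data structure from the paper: it answers one query numerically by ternary search, relying on the fact that the maximum distance from c to the points of P is a convex function of the position of c along the segment. The function name and the iteration count are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def constrained_enclosing_circle(points: List[Point], a: Point, b: Point,
                                 iterations: int = 200) -> Tuple[Point, float]:
    """Approximate the smallest circle enclosing `points` with center on segment ab."""

    def center(t: float) -> Point:
        # Point on the segment at parameter t in [0, 1].
        return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

    def radius(t: float) -> float:
        # Distance from the candidate center to the farthest input point.
        cx, cy = center(t)
        return max(math.hypot(px - cx, py - cy) for px, py in points)

    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        # Convexity of radius(t) lets us discard one outer third each round.
        if radius(m1) <= radius(m2):
            hi = m2
        else:
            lo = m1
    t = (lo + hi) / 2
    return center(t), radius(t)
```

Each query costs O(n) per iteration here, which is the cost the preprocessing described in the abstract is designed to avoid.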

17.

Parallel integer sorting and simulation amongst CRCW models

Sanjeev Saxena 《Acta Informatica》1996,33(5):607-619

In this paper a general technique for reducing processors in simulation without any increase in time is described. This results in an O(√(log n)) time algorithm for simulating one step of PRIORITY on TOLERANT with processor-time product of O(n log log n); the same as that for simulating PRIORITY on ARBITRARY. This is used to obtain an O(log n/log log n + √(log n) (log log m − log log n)) time algorithm for sorting n integers from the set {0,...,m − 1}, m ≥ n, with a processor-time product of O(n log log m log log n) on a TOLERANT CRCW PRAM. New upper and lower bounds for the ordered chaining problem on an allocated COMMON CRCW model are also obtained. The algorithm for ordered chaining takes O(log n/log log n) time on an allocated PRAM of size n. It is shown that this result is best possible (up to a constant multiplicative factor) by obtaining a lower bound of Ω(r log n/(log r + log log n)) for finding the first (leftmost one) live processor on an allocated-COMMON PRAM of size n of r-slow virtual processors (one processor simulates r processors of allocated PRAM). As a result, for the ordered chaining problem, the “processor-time product” has to be at least Ω(n log n/log log n) for any poly-logarithmic time algorithm. The algorithm for the ordered-chaining problem results in an O(log N/log log N) time algorithm for (stable) sorting of n integers from the set {0,...,m − 1} with n processors on a COMMON CRCW PRAM; here N = max(n, m). In particular, if m = n^{O(1)}, then sorting takes Θ(log n/log log n) time on both TOLERANT and COMMON CRCW PRAMs. The processor-time product for TOLERANT is O(n (log log n)²). The algorithm for COMMON uses n processors.

18.

Lower bounds on the rotation distance of binary trees

Fabrizio Luccio Antonio Mesa Enriquez 《Information Processing Letters》2010,110(21):934-938

The rotation distance d(S,T) between two binary trees S, T of n vertices is the minimum number of rotations to transform S into T. While it is known that d(S,T) ≤ 2n−6, a well-known conjecture states that there are trees for which this bound is sharp for any value of n ≥ 11. We are unable to prove the conjecture, but we give here some simple criteria for lower bound evaluation, leading, for example, to the identification of some “regular” tree structures for which d(S,T)=3n/2−O(1), or d(S,T)=5n/3−O(1).

19.

Nearly tight bounds on the number of Hamiltonian circuits of the hypercube and generalizations

Tomás Feder Carlos Subi 《Information Processing Letters》2009,109(5):267-272

It has been shown that for every perfect matching M of the d-dimensional n-vertex hypercube, d ≥ 2, n = 2^d, there exists a second perfect matching M′ such that the union of M and M′ forms a Hamiltonian circuit of the d-dimensional hypercube. We prove a generalization of a special case of this result when there are two dimensions that do not get used by M. It is known that the number M_d of perfect matchings of the d-dimensional hypercube satisfies […] and, in particular, (2d/n)^{n/2} (n/2)! ≤ M_d ≤ (d!)^{n/(2d)}. It has also been shown that the number H_d of Hamiltonian circuits of the hypercube satisfies 1 ≤ lim_{d→∞} (log H_d)/(log M_d) ≤ 2. We finally strengthen this result to a nearly tight bound ((d log 2/(e log log d))(1 − o_n(1))) ≤ H_d ≤ (d!)^{n/(2d)} ((d−1)!)^{n/(2(d−1))}/2, proving that lim_{d→∞} (log H_d)/(log M_d) = 2. This means that the bound H_d ≥ M_d is improved to a nearly tight […], so the number of Hamiltonian circuits in the hypercube is nearly quadratic in the number of perfect matchings. The proofs are based on a result for graphs that are the Cartesian product of squares and arbitrary bipartite regular graphs that have a Hamiltonian cycle. We also study a labeling scheme related to matchings.

20.

Parallel Algorithms for Counting and Randomly Generating Integer Partitions

《Journal of Parallel and Distributed Computing》1996,34(1):29-35

This paper presents parallel algorithms for determining the number of partitions of a given integer N, where the partitions may be subject to restrictions, such as being composed of distinct parts, of a given number of parts, and/or of parts belonging to a specified set. We present a series of adaptive algorithms suitable for varying numbers of processors. The fastest of these algorithms computes the number of partitions of n with largest part equal to k, for 1 ≤ k ≤ n ≤ N, in time O(log²(N)) using O(N²/log N) processors. Parallel logarithmic time algorithms that generate partitions uniformly at random, using these quantities, are also presented.
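
The quantities named in the abstract can be tabulated sequentially with a standard recurrence. The sketch below is a plain dynamic program, not the parallel algorithm of the paper; the function name is hypothetical. Writing p(n, k) for the number of partitions of n with largest part exactly k, removing one part of size k leaves a partition of n − k with largest part at most k, which gives p(n, k) = p(n − 1, k − 1) + p(n − k, k) with p(0, 0) = 1.

```python
from typing import List


def partitions_by_largest_part(N: int) -> List[List[int]]:
    """Return a table p with p[n][k] = number of partitions of n whose
    largest part equals k, for 0 <= k <= n <= N."""
    p = [[0] * (N + 1) for _ in range(N + 1)]
    p[0][0] = 1  # the empty partition
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            # Entries with k > n - k stay 0, so p[n - k][k] is safe to read.
            p[n][k] = p[n - 1][k - 1] + p[n - k][k]
    return p
```

Summing row n over all k gives the total number of partitions of n; for n = 5 the row is 1, 2, 2, 1, 1 for k = 1 to 5, matching the seven partitions of 5.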