首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
Coalition structure generation is a central problem in characteristic function games. Most algorithmic work to date can be classified into one of three broad categories: anytime algorithms, design-to-time algorithms and heuristic algorithms [5]. This paper focuses on the former two approaches. Both design-to-time and anytime algorithms have pros and cons. While design-to-time algorithms guarantee finding an optimal solution, they must be run to completion in order to generate any solution. Anytime algorithms; however, permit premature termination while providing solutions of ever increasing quality along with quality guarantees. Design-to-time algorithms have a better worst case runtime (O(3 n ) for n agents) compared to the current anytime algorithms (O(n n ) for n agents), but do not provide the flexibility of anytime algorithms. In this paper we present the first design-to-time constant factor approximation algorithms for coalition structure generation that guarantee high quality solutions quickly. We show how our approach can be used as an anytime algorithm, which combines both the worst case runtime of the design-to-time algorithms and the flexibility of the anytime algorithms. This results in the first anytime algorithm for coalition structure generation which has the same worst case time complexity of the current best design-to-time algorithms.  相似文献   

2.
Assume that we wish to expand the product h = fg of two formal power series f and g. Classically, there are two types of algorithms to do this: zealous algorithms first expand f and g up to order n, multiply the results and truncate at order n. Lazy algorithms on the contrary compute the coefficients of f, g and h gradually and they perform no more computations than strictly necessary at each stage. In particular, at the moment we compute the coefficient hiof ziin h, only f0, , fiand g0, , giare known.Lazy algorithms have the advantage that the coefficients of f and g may actually depend on “previous" coefficients of h, as long as they are computed before they are needed in the multiplication, i.e. the coefficients fiand gimay depend on h0, , hi − 1. For this reason, lazy algorithms are extremely useful when solving functional equations in rings of formal power series. However, lazy algorithms have the disadvantage that the classical asymptotically fast multiplication algorithms on polynomials—such as the divide and conquer algorithm and fast Fourier multiplication—cannot be used.In a previous paper, we therefore introduced relaxed algorithms, which share the property concerning the resolution of functional equations with lazy algorithms, but perform slightly more computations than lazy algorithms during the computation of a given coefficient of h. These extra computations anticipate the computations of the next coefficients of h and dramatically improve the asymptotic time complexities of such algorithms.In this paper, we survey several classical and new zealous algorithms for manipulating formal power series, including algorithms for multiplication, division, resolution of differential equations, composition and reversion. Next, we give various relaxed algorithms for these operations. All algorithms are specified in great detail and we prove theoretical time and space complexity bounds. Most algorithms have been experimentally implemented in C++ and we provide benchmarks. We conclude by some suggestions for future developments and a discussion of the fitness of the lazy and relaxed approaches for specific applications.This paper is intended both for those who are interested in the most recent algorithms for the manipulation of formal power series and for those who want to actually implement a power series library into a computer algebra system.  相似文献   

3.
In this paper, we present several new and generalized parallel dense matrix multiplication algorithms of the form C = αAB + β C on two-dimensional process grid topologies. These algorithms can deal with rectangular matrices distributed on rectangular grids. We classify these algorithms coherently into three categories according to the communication primitives used and thus we offer a taxonomy for this family of related algorithms. All these algorithms are represented in the data distribution independent approach and thus do not require a specific data distribution for correctness. The algorithmic compatibility condition result shown here ensures the correctness of the matrix multiplication. We define and extend the data distribution functions and introduce permutation compatibility and algorithmic compatibility. We also discuss a permutation compatible data distribution (modified virtual 2D data distribution). We conclude that no single algorithm always achieves the best performance on different matrix and grid shapes. A practical approach to resolve this dilemma is to use poly-algorithms. We analyze the characteristics of each of these matrix multiplication algorithms and provide initial heuristics for using the poly-algorithm. All these matrix multiplication algorithms have been tested on the IBM SP2 system. The experimental results are presented in order to demonstrate their relative performance characteristics, motivating the combined value of the taxonomy and new algorithms introduced here. © 1997 by John Wiley & Sons, Ltd.  相似文献   

4.
In this paper, we propose new adaptive algorithms for the extraction and tracking of the least (minor) or eventually, principal eigenvectors of a positive Hermitian covariance matrix. The main advantage of our proposed algorithms is their low computational complexity and numerical stability even in the minor component analysis case. The proposed algorithms are considered fast in the sense that their computational cost is O(np) flops per iteration where n is the size of the observation vector and p<n is the number of eigenvectors to estimate.We consider OJA-type minor component algorithms based on the constraint and non-constraint stochastic gradient technique. Using appropriate fast orthogonalization procedures, we introduce new fast algorithms that extract the minor (or principal) eigenvectors and guarantee good numerical stability as well as the orthogonality of their weight matrix at each iteration. In order to have a faster convergence rate, we propose a normalized version of these algorithms by seeking the optimal step-size. Our algorithms behave similarly or even better than other existing algorithms of higher complexity as illustrated by our simulation results.  相似文献   

5.
Ways to sparse representation: An overview   总被引:1,自引:0,他引:1  
Many algorithms have been proposed to find sparse representations over redundant dictionaries or transforms. This paper gives an overview of these algorithms by classifying them into three categories: greedy pursuit algorithms, l p norm regularization based algorithms, and iterative shrinkage algorithms. We summarize their pros and cons as well as their connections. Based on recent evidence, we conclude that the algorithms of the three categories share the same root: l p norm regularized inverse problem. Finally, several topics that deserve further investigation are also discussed. Supported by the Joint Research Fund for Overseas Chinese Young Scholars of the National Natural Science Foundation of China (Grant No. 60528004) and the Key Project of the National Natural Science Foundation of China (Grant No. 60528004)  相似文献   

6.
ABSTRACT

Some of the most popular public key encryption algorithms use exponentiation as their core operation, which can be mostly broken into several modular squaring operations. In this paper, we present GF(p) modular squaring algorithms and efficiently implement them on hardware. We present different algorithms, two for squaring and one for reduction combined with the squaring, to provide a general modular squaring algorithm. The algorithms are implemented through datapaths that uses redundant Carry-Save Adders, making the computation time independent form the operands precision. The proposed algorithms are compared with each other as well as with the existing modular squaring algorithms. The experimental results are obtained by synthesizing the hardware designs for FPGA Virtex5 chip (xc5vlx50ff1153 technology), which showed interesting results and made our ideas very attractive.  相似文献   

7.
We present efficient algorithms for solving several fundamental graph-theoretic problems on a Linear Array with a Reconfigurable Pipelined Bus System (LARPBS), one of the recently proposed models of computation based on optical buses. Our algorithms include finding connected components, minimum spanning forest, biconnected components, bridges and articulation points for an undirected graph. We compute the connected components and minimum spanning forest of a graph in O(log n) time using O(m+n) processors where m and n are the number of edges and vertices in the graph and m=O(n 2) for a dense graph. Both the processor and time complexities of these two algorithms match the complexities of algorithms on the Arbitrary and Priority CRCW PRAM models which are two of the strongest PRAM models. The algorithms for these two problems published by Li et al. [7] have been considered to be the most efficient on the LARPBS model till now. Their algorithm [7] for these two problems require O(log n) time and O(n 3/log n) processors. Hence, our algorithms have the same time complexity but require less processors. Our algorithms for computing biconnected components, bridges and articulation points of a graph run in O(log n) time on an LARPBS with O(n 2) processors. No previous algorithm was known for these latter problems on the LARPBS.  相似文献   

8.
Abstract. Blockwise access to data is a central theme in the design of efficient external memory (EM) algorithms. A second important issue, when more than one disk is present, is fully parallel disk I/ O. In this paper we present a simple, deterministic simulation technique which transforms certain Bulk Synchronous Parallel (BSP) algorithms into efficient parallel EM algorithms. It optimizes blockwise data access and parallel disk I/ O and, at the same time, utilizes multiple processors connected via a communication network or shared memory. We obtain new improved parallel EM algorithms for a large number of problems including sorting, permutation, matrix transpose, several geometric and GIS problems including three-dimensional convex hulls (two-dimensional Voronoi diagrams), and various graph problems. We show that certain parallel algorithms known for the BSP model can be used to obtain EM algorithms that meet well known I /O complexity lower bounds for various problems, including sorting.  相似文献   

9.
Abstract. We investigate a variant of on-line edge-coloring in which there is a fixed number of colors available and the aim is to color as many edges as possible. We prove upper and lower bounds on the performance of different classes of algorithms for the problem. Moreover, we determine the performance of two specific algorithms, First-Fit and Next-Fit . Specifically, algorithms that never reject edges that they are able to color are called fair algorithms. We consider the four combinations of fair/ not fair and deterministic/ randomized. We show that the competitive ratio of deterministic fair algorithms can vary only between approximately 0.4641 and 1/2, and that Next-Fit is worst possible among fair algorithms. Moreover, we show that no algorithm is better than 4/7-competitive. If the graphs are all k -colorable, any fair algorithm is at least 1/2-competitive. Again, this performance is matched by Next-Fit while the competitive ratio for First-Fit is shown to be k/(2k-1) , which is significantly better, as long as k is not too large.  相似文献   

10.
It is well known that the Earliest-Deadline-First (EDF) and the Least-Laxity-First (LLF) algorithms are optimal algorithms for the problem of preemptively scheduling jobs that arrive over time on a single machine to minimize the maximum lateness (1|r j ,pmtn|L max ). It was not previously known what other online algorithms are optimal for this problem. As this problem is fundamental in machine scheduling, it deserves a thorough investigation. In this paper, the concept of compound laxity is introduced, and a complete characterization of all optimal online algorithms for this problem is derived.  相似文献   

11.
Zhou  Jukai  Liu  Tong  Zhu  Jingting 《Multimedia Tools and Applications》2019,78(23):33415-33434

K-means clustering is one of the most popular clustering algorithms and has been embedded in other clustering algorithms, e.g. the last step of spectral clustering. In this paper, we propose two techniques to improve previous k-means clustering algorithm by designing two different adjacent matrices. Extensive experiments on public UCI datasets showed the clustering results of our proposed algorithms significantly outperform three classical clustering algorithms in terms of different evaluation metrics.

  相似文献   

12.
Several classes of sequential algorithms to approximate themaximum acyclic subgraph problem are examined. The equivalentfeedback arc set problem isNP-complete and there are only a few classes of graphs for which it is known to be inP. Thus, approximation algorithms are very important for this problem. Our goal is to determine how effectively the various sequential algorithms parallelize. Of the sequential algorithms we study, natural decision problems based on several of them are provedP-complete. Parallel implementations usingO(log ¦V¦) time and ¦E¦ processors on an EREW PRAM exist for the other algorithms. Interestingly, the parallelizable algorithms appear very similar to some of theinherently sequential algorithms. Thus, for approximating the maximum acyclic subgraph problem small algorithmic changes drastically alter parallel complexity, unlessNC equalsP.  相似文献   

13.
《国际计算机数学杂志》2012,89(1-4):243-267
Large sparse nonsymmetric problems of the form A u = b are frequently solved using restarted conjugate gradient-type algorithms such as the popular GCR and GMRES algorithms. In this study we define a new class of algorithms which generate the same iterates as the standard GMRES algorithm but require as little as half of the computational expense. This performance improvement is obtained by using short economical three-term recurrences to replace the long recurrence used by GMRES. The new algorithms are shown to have good numerical properties in typical cases, and the new algorithms may be easily modified to be as numerically safe as standard GMRES. Numerical experiments with these algorithms are given in Part II, in which we demonstrate the improved performance of the new schemes on different computer architectures.  相似文献   

14.
We introduce a new class of dynamic graph algorithms called quasi-fully dynamic algorithms , which are much more general than backtracking algorithms and are much simpler than fully dynamic algorithms. These algorithms are especially suitable for applications in which a certain core connected portion of the graph remains fixed, and fully dynamic updates occur on the remaining edges in the graph. We present very simple quasi-fully dynamic algorithms with O(log n) worst-case time per operation for 2-edge connectivity and O(log n) amortized time per operation for cycle equivalence. The former is deterministic while the latter is Monte-Carlo-type randomized. For 2-vertex connectivity, we give a deterministic quasi-fully dynamic algorithm with O(log 3 n) amortized time per operation. The quasi-fully dynamic algorithm we present for cycle equivalence (which has several applications in optimizing compilers) is of special interest since the algorithm is quite simple, and no special-purpose incremental or backtracking algorithm is known for this problem. Received October 26, 1998; revised October 1, 1999, and April 15, 2001.  相似文献   

15.
The problem of performing a global combine (summation) operation on a distributed memory computer using a two-dimensional mesh interconnect with wormhole routing is considered. We present algorithms that are asymptotically optimal for short vectors (O(log(p)) for p processing nodes) and for long vectors (O(n) for n data elements per node), as well as hybrid algorithms that are superior for intermediate n. The algorithms are analyzed using detailed performance models that include the effects of link conflicts and other characteristics of the underlying communication system. The models are validated using experimental data from the Intel Touchstone DELTA computer. We show that no one algorithm is optimal for all vector lengths; rather, each of the presented algorithms is superior under some circumstances.  相似文献   

16.
The block-cyclic data distribution is commonly used to organize array elements over the processors of a coarse-grained distributed memory parallel computer. In many scientific applications, the data layout must be reorganized at run-time in order to enhance locality and reduce remote memory access overheads. In this paper we present a general framework for developing array redistribution algorithms. Using this framework, we have developed efficient algorithms that redistribute an array from one block-cyclic layout to another. Block-cyclic redistribution consists of index set computation , wherein the destination locations for individual data blocks are calculated, and data communication , wherein these blocks are exchanged between processors. The framework treats both these operations in a uniform and integrated way. We have developed efficient and distributed algorithms for index set computation that do not require any interprocessor communication. To perform data communication in a conflict-free manner, we have developed direct indirect and hybrid algorithms. In the direct algorithm, a data block is transferred directly to its destination processor. In an indirect algorithm, data blocks are moved from source to destination processors through intermediate relay processors. The hybrid algorithm is a combination of the direct and indirect algorithms. Our framework is based on a generalized circulant matrix formalism of the redistribution problem and a general purpose distributed memory model of the parallel machine. Our algorithms sustain excellent performance over a wide range of problem and machine parameters. We have implemented our algorithms using MPI, to allow for easy portability across different HPC platforms. Experimental results on the IBM SP-2 and the Cray T3D show superior performance over previous approaches. When the block size of the cyclic data layout changes by a factor of K , the redistribution can be performed in O( log K) communication steps. This is true even when K is a prime number. In contrast, previous approaches take O(K) communication steps for redistribution. Our framework can be used for developing scalable redistribution libraries, for efficiently implementing parallelizing compiler directives, and for developing parallel algorithms for various applications. Redistribution algorithms are especially useful in signal processing applications, where the data access patterns change significantly between computational phases. They are also necessary in linear algebra programs, to perform matrix transpose operations. Received June 1, 1997; revised March 10, 1998.  相似文献   

17.
Parallel algorithms for finding polynomial Roots on OTIS-torus   总被引:1,自引:0,他引:1  
We present two parallel algorithms for finding all the roots of an N-degree polynomial equation on an efficient model of Optoelectronic Transpose Interconnection System (OTIS), called OTIS-2D torus. The parallel algorithms are based on the iterative schemes of Durand–Kerner and Ehrlich methods. We show that the algorithm for the Durand–Kerner method requires (N 0.75+0.5N 0.25−1) electronic moves + 2(N 0.5−1) OTIS moves using N processors. The parallel algorithm for Ehrlich method is shown to run in (N 0.75+0.5N 0.25−1) electronic moves + 2(N 0.5−1) OTIS moves with the same number of processors. The algorithms have lower AT cost than the algorithms presented in Jana (Parallel Comput 32:301–312, 2006). The scalability of the algorithms is also discussed.  相似文献   

18.
We consider the total weighted completion time scheduling problem for parallel identical machines and precedence constraints, P| prec|\sum w i C i . This important and broad class of problems is known to be NP-hard, even for restricted special cases, and the best known approximation algorithms have worst-case performance that is far from optimal. However, little is known about the experimental behavior of algorithms for the general problem. This paper represents the first attempt to describe and evaluate comprehensively a range of weighted completion time scheduling algorithms. We first describe a family of combinatorial scheduling algorithms that optimally solve the single-machine problem, and show that they can be used to achieve good performance for the multiple-machine problem. These algorithms are efficient and find schedules that are on average within 1.5\percent of optimal over a large synthetic benchmark consisting of trees, chains, and instances with no precedence constraints. We then present several ways to create feasible schedules from nonintegral solutions to a new linear programming relaxation for the multiple-machine problem. The best of these linear programming-based approaches finds schedules that are within 0.2\percent of optimal over our benchmark. Finally, we describe how the scheduling phase in profile-based program compilation can be expressed as a weighted completion time scheduling problem and apply our algorithms to a set of instances extracted from the SPECint95 compiler benchmark. For these instances with arbitrary precedence constraints, the best linear programming-based approach finds optimal solutions in 78\percent of cases. Our results demonstrate that careful experimentation can help lead the way to high quality algorithms, even for difficult optimization problems. Received October 30, 1998; revised March 28, 2001.  相似文献   

19.
Dehne  Dittrich  Hutchinson 《Algorithmica》2008,36(2):97-122
Abstract. External memory (EM) algorithms are designed for large-scale computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. Typical EM algorithms are specially crafted for the EM situation. In the past, several attempts have been made to relate the large body of work on parallel algorithms to EM, but with limited success. The combination of EM computing, on multiple disks, with multiprocessor parallelism has been posted as a challenge by the ACM Working Group on Storage I/ O for Large-Scale Computing. In this paper we provide a simulation technique which produces efficient parallel EM algorithms from efficient BSP-like parallel algorithms. The techniques obtained can accommodate one or multiple processors on the EM target machine, each with one or more disks, and they also adapt to the disk blocking factor of the target machine. When applied to existing BSP-like algorithms, our simulation technique produces improved parallel EM algorithms for a large number of problems.  相似文献   

20.
We present two new algorithms, Arc Length and Peer Count, for choosing a peer uniformly at random from the set of all peers in Chord (Proceedings of the ACM SIGCOMM 2001 Technical Conference, 2001). We show analytically that, in expectation, both algorithms have latency O(log n) and send O(log n) messages. Moreover, we show empirically that the average latency and message cost of Arc Length is 10.01log n and that the average latency and message cost of Peer Count is 20.02log n. To the best of our knowledge, these two algorithms are the first fully distributed algorithms for choosing a peer uniformly at random from the set of all peers in a Distributed Hash Table (DHT). Our motivation for studying this problem is threefold: to enable data collection by statistically rigorous sampling methods; to provide support for randomized, distributed algorithms over peer-to-peer networks; and to support the creation and maintenance of random links, and thereby offer a simple means of improving fault-tolerance. Research of S. Lewis, J. Saia and M. Young was partially supported by NSF grant CCR-0313160 and Sandia University Research Program grant No. 191445.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号