首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Some well-known primitive operations, such as compare-and-swap, can be used, together with read and write, to implement any object in a wait-free manner. However, this paper shows that, for a large class of objects, including counters, queues, stacks, and single-writer snapshots, wait-free implementations using only these primitive operations and a large class of other primitive operations cannot be space efficient: the number of base objects required is at least linear in the number of processes that share the implemented object. The same lower bounds are obtained for implementations of starvation-free mutual exclusion using only primitive operations from this class. For wait-free implementations of a closely related class of one-time objects, lower bounds on the tradeoff between time and space are presented.  相似文献   

2.
This paper presents lower bounds on the time and space complexity of implementations that use k-compare&swap (k-CAS) synchronization primitives. We prove that using k-CAS primitives can improve neither the time nor the space complexity of implementations of widely used concurrent objects, such as counter, stack, queue, and collect. Surprisingly, overly restrictive use of k-CAS may even increase the space complexity required by such implementations. We prove a lower bound of Omega (log_2 n) on the round complexity of implementations of a collect object using read, write, and k-CAS, for any k, where n is the number of processes in the system. There is an implementation of collect with O(log_2 n) round complexity that uses only reads and writes. Thus, our lower bound establishes that k-CAS is no stronger than read and write for collect implementation round complexity. For k-CAS operations that return the values of all the objects they access, we prove that the total step complexity of implementing key objects such as counters, stacks, and queues is Omega (n log_k n). We also prove that k-CAS cannot improve the space complexity of implementing many objects (including counter, stack, queue, and single-writer snapshot). An implementation has to use at least n base objects even if k-CAS is allowed, and if all operations (other than read) swap exactly k base objects, then it must use at least k cdot n base objects.  相似文献   

3.
We give matching upper and lower bounds of \(\varTheta(\min(\frac{\log m}{\log \log m},\, n))\) for the individual step complexity of a wait-free m-valued adopt-commit object implemented using multi-writer registers for n anonymous processes. While the upper bound is deterministic, the lower bound holds for randomized adopt-commit objects as well. Our results are based on showing that adopt-commit objects are equivalent, up to small additive constants, to a simpler class of objects that we call conflict detectors. Our anonymous lower bound also applies to the individual step complexity of m-valued wait-free anonymous consensus, even for randomized algorithms with global coins against an oblivious adversary. The upper bound can be used to slightly improve the cost of randomized consensus against an oblivious adversary. For deterministic non-anonymous implementations of adopt-commit objects, we show a lower bound of \(\varOmega(\min(\frac{\log m}{\log \log m}, \frac{\sqrt{\log n}}{\log \log n}))\) and an upper bound of \(O(\min(\frac{\log m}{\log \log m},\, \log n))\) on the worst-case individual step complexity. For randomized non-anonymous implementations, we show that any execution contains at least one process whose steps exceed the deterministic lower bound.  相似文献   

4.
We introduce SubPolyhedra (SubPoly), a new family of numerical abstract domains to infer and propagate linear inequalities. The key insight is that the reduced product of linear equalities and intervals produces powerful yet scalable analyses. Abstract domains in SubPoly are as expressive as Polyhedra, but they drop some of the deductive power to achieve scalability. The cost/precision ratio of abstract domains in the SubPoly family can be fine-tuned according to the precision one wants to retain at join points, and the algorithm used to infer the tighter bounds on intervals. We implemented SubPoly on the top of Clousot{{\tt Clousot}}, a generic abstract interpreter for .Net. Clousot{{\tt .Net.\,Clousot}} with SubPoly analyzes very large and complex code bases in few minutes. SubPoly can efficiently capture linear inequalities among hundreds of variables, a result well beyond the state-of-the-art implementations of Polyhedra.  相似文献   

5.
This paper is an in-depth study of qualitative physical reasoning about one particular scenario: using a box to carry a collection of objects from one place to another. Specifically we consider the plan, plan1 “Load objects uCargo into box oBox one by one; carry oBox from location l1 to location l2”. We present qualitative constraints on the shape, starting position, and material properties of uCargo and oBox and on the characteristics of the motion that suffice to make it virtually certain that plan1 can be successfully executed. We develop a theory, consisting mostly of first-order statements together with two default rules, that supports an inference of the form “If conditions XYZ hold, and the agent attempts to carry out plan1 then presumably he will succeed”. Our theory is elaboration tolerant in the sense that carrying out the analogous inference for carrying objects in boxes with lids, in boxes with small holes, or on trays can reuse much of the same knowledge. The theory integrates reasoning about continuous time, Euclidean space, commonsense dynamics of solid objects, and semantics of partially specified plans.  相似文献   

6.
Summary Thesnapshot object is an important tool for constructing wait-free asynchronous algorithms. We relate the snapshot object to thelattice agreement decision problem. It is shown that any algorithm for solving lattice agreement can be transformed into an implementation of a snapshot object. The overhead cost of this transformation is only a linear number of read and write operations on atomic single-writer multi-reader registers. The transformation uses an unbounded amount of shared memory. We present a deterministic algorithm for lattice agreement that usedO (log2 n) operations on 2-processorTest & Set registers, plusO (n) operations on atomic single-writer multi-reader registers. The shared objects are used by the algorithm in adynamic mode, that is, the identity of the processors that access each of the shared objects is determined dynamically during the execution of the algorithm. By a randomized implementation of 2-processorsTest & Set registers from atomic registers, this algorithm implies a randomized algorthm for lattice agreement that uses an expected number ofO (n) operations on (dynamic) atomic single-writer multi-reader registers. Combined with our transformation this yields implementations of atomic snapshots with the same complexity.Cambridge Research Laboratory, Digital Equipment Corporation Hagit Attiya received the B.Sc. degreeiin Mathematics and Computer Science from the Hebrew University of Jerusalem, in 1981, the M.Sc. and Ph.D. degrees in Computer Science from the Hebrew University of Jerusalem, in 1983 and 1987, respectively. She is presently a senior lecturer at the departtment of Computer Science at the Technion, Israel Institute of Technology. Prior to this, she has been a post-doctoral research associate at the Laboratory for Computer Science at M.I.T. Her general research interests are distributed computation and theoretical computer science. More specific interests include fault-tolerance, timing-based and asynchronous algorithms. Maurice Herlihy received the A.B. degree in Mathematics from Harvard University, and the M.S. and the Ph.D. degrees in Computer Science from M.I.T. From 1984 to 1989 he was a faculty member in the Computer Science Department at Carnegie Mellon University in Pittsburgh, PA. In 1989 he joined the research staff at the Digital Equipment Corporation's Cambridge Research Laboratory in Cambridge MA. Since 1994, he has been on the faculty at the Computer Science Department at Brown University. Dr. Herlihy's research interests encompass practical and theoretical aspects of distributed and concurrent computation. Ophir achman received a B.A. in computer science from the Technion, Haifa, Israel in 1989 and M.Sc. in computer science from the Technion, Haifa, Israel, in 1992. He is now studying for a D.Sc. in computer science at the Technion. His currentarea of research is distributed computing, and in particular, asynchronous shared memory systems.This work appeared in preliminary form in proceedings ofthe 6th International Workshop on Distributed Algorithms [12]. This research was partially supported by grant No. 92-0233 from the United States-Israel Binational Science Foundation (BSF), Jerusalem, Technion V.P.R. funds — B. and G. Greenberg Research Fund (Ottawa), and the fund for the promotion of research in the TechnionPart of the work of this author was performed while visiting DEC Cambridge Research Laboratory  相似文献   

7.
The classical pattern matching paradigm is that of seeking occurrences of one string in another, where both strings are drawn from an alphabet set Σ. In the parameterized pattern matching model, a consistent renaming of symbols from Σ is allowed in a match. The parameterized matching paradigm has proven useful in problems in software engineering, computer vision, and other applications. In classical pattern matching, both the text and pattern are strings. Applications such as searching in xml or searching in hypertext require searching strings in non-linear structures such as trees or graphs. There has been work in the literature on exact and approximate parameterized matching, as well as work on exact and approximate string matching on non-linear structures. In this paper we explore parameterized matching in non-linear structures. We prove that exact parameterized matching on trees can be computed in linear time for alphabets in an O(n)-size integer range, and in time O(nlogm) in general, where n is the tree size and m the pattern length. These bounds are optimal in the comparison model. We also show that exact parameterized matching on directed acyclic graphs (DAGs) is NP-complete.  相似文献   

8.
In this paper we continue previous studies on the computational efficiency of spiking neural P systems, under the assumption that some pre-computed resources of exponential size are given in advance. Specifically, we give a deterministic solution for each of two well known PSPACE-complete problems: QSAT and Q3SAT. In the case of QSAT, the answer to any instance of the problem is computed in a time which is linear with respect to both the number n of Boolean variables and the number m of clauses that compose the instance. As for Q3SAT, the answer is computed in a time which is at most cubic in the number n of Boolean variables.  相似文献   

9.
Let f be a univariate polynomial with real coefficients, fR[X]. Subdivision algorithms based on algebraic techniques (e.g., Sturm or Descartes methods) are widely used for isolating the real roots of f in a given interval. In this paper, we consider a simple subdivision algorithm whose primitives are purely numerical (e.g., function evaluation). The complexity of this algorithm is adaptive because the algorithm makes decisions based on local data. The complexity analysis of adaptive algorithms (and this algorithm in particular) is a new challenge for computer science. In this paper, we compute the size of the subdivision tree for the SqFreeEVAL algorithm.The SqFreeEVAL algorithm is an evaluation-based numerical algorithm which is well-known in several communities. The algorithm itself is simple, but prior attempts to compute its complexity have proven to be quite technical and have yielded sub-optimal results. Our main result is a simple O(d(L+lnd)) bound on the size of the subdivision tree for the SqFreeEVAL algorithm on the benchmark problem of isolating all real roots of an integer polynomial f of degree d and whose coefficients can be written with at most L bits.Our proof uses two amortization-based techniques: first, we use the algebraic amortization technique of the standard Mahler-Davenport root bounds to interpret the integral in terms of d and L. Second, we use a continuous amortization technique based on an integral to bound the size of the subdivision tree. This paper is the first to use the novel analysis technique of continuous amortization to derive state of the art complexity bounds.  相似文献   

10.
Rumor spreading in social networks   总被引:1,自引:0,他引:1  
Social networks are an interesting class of graphs likely to become of increasing importance in the future, not only theoretically, but also for its probable applications to ad hoc and mobile networking. Rumor spreading is one of the basic mechanisms for information dissemination in networks; its relevance stemming from its simplicity of implementation and effectiveness. In this paper, we study the performance of rumor spreading in the classic preferential attachment model of Bollobás et al. which is considered to be a valuable model for social networks. We prove that, in these networks: (a) The standard PUSH-PULL strategy delivers the message to all nodes within O(log2n) rounds with high probability; (b) by themselves, PUSH and PULL require polynomially many rounds. (These results are under the assumption that m, the number of new links added with each new node is at least 2. If m=1 the graph is disconnected with high probability, so no rumor spreading strategy can work.) Our analysis is based on a careful study of some new properties of preferential attachment graphs which could be of independent interest.  相似文献   

11.
We study a simple technique, originally presented by Herlihy (ACM Trans. Program. Lang. Syst. 15(5):745–770, 1993), for executing concurrently, in a wait-free manner, blocks of code that have been programmed for sequential execution and require significant synchronization in order to be performed in parallel. We first present an implementation of this technique, called Sim, which employs a collect object. We describe a simple implementation of a collect object from a single shared object that supports atomic Add (or XOR) in addition to read; this implementation has step complexity O(1). By plugging in to Sim this implementation, Sim exhibits constant step complexity as well. This allows us to derive lower bounds on the step complexity of implementations of several shared objects, like Add, XOR, collect, and snapshot objects, from LL/SC objects. We then present a practical version of Sim, called PSim, which is implemented in a real shared-memory machine. From a theoretical perspective, PSim has worse step complexity than Sim, its theoretical analog; in practice though, we experimentally show that PSim is highly-efficient: it outperforms several state-of-the-art lock-based and lock-free synchronization techniques, and this given that it is wait-free, i.e. that it satisfies a stronger progress condition than all the algorithms that it outperforms. We have used PSim to get highly-efficient wait-free implementations of stacks and queues.  相似文献   

12.
13.
Although the problem of data server placement in parallel and distributed systems has been studied extensively, most of the existing work assumes there is no competition between servers. Hence, their goal is to minimize read, update and storage cost. In this paper, we study the server placement problem in which a new server has to compete with existing servers for user requests. Therefore, in addition to minimizing cost, we also need to maximize the benefit of building a new server.Our major results include three parts. First, for tree-structured systems, we propose an O(|V|3k) time dynamic programming algorithm to find the optimal placement of k extra servers that maximizes the benefit in a tree with |V| nodes. We also propose an O(|V|3) time dynamic programming algorithm to find the optimal placement of extra servers that maximizes the benefit, without any constraint on the number of extra servers. Second, for general connected graphs, we prove that the server placement problems are NP-complete, and present three greedy heuristic algorithms, called Greedy Add, Greedy Remove and Greedy Add-Remove, to solve them. Third, we show that if the number of requests a server can handle (i.e., server capacity) is bounded, the server placement problem is NP-complete even for tree networks. We then derive a variation of the same set of greedy heuristic algorithms, with consideration of server capacity constraint, to solve the problem.Our experiment results demonstrate that the greedy algorithms achieve good results, when compared with the upper bounds found by a linear programming algorithm. Greedy Add performs best in the unconstrained model, yielding a benefit within 12% difference from the theoretical upper bound in average. For the constrained model, Greedy Remove performs best for smaller network sizes, while Greedy Add-Remove performs best for larger network sizes. On average, the heuristic algorithms yield a benefit within 13% difference from the theoretical upper bound in the constrained model.  相似文献   

14.
This new version of TaylUR is based on a completely new core, which now is able to compute the numerical values of all of a complex-valued function's partial derivatives up to an arbitrary order, including mixed partial derivatives.

New version program summary

Program title: TaylURCatalogue identifier: ADXR_v3_0Program summary URL:http://cpc.cs.qub.ac.uk/summaries/ADXR_v3_0.htmlProgram obtainable from: CPC Program Library, Queen's University, Belfast, N. IrelandLicensing provisions: GPLv2No. of lines in distributed program, including test data, etc.: 6750No. of bytes in distributed program, including test data, etc.: 19 162Distribution format: tar.gzProgramming language: Fortran 95Computer: Any computer with a conforming Fortran 95 compilerOperating system: Any system with a conforming Fortran 95 compilerClassification: 4.12, 4.14Catalogue identifier of previous version: ADXR_v2_0Journal reference of previous version: Comput. Phys. Comm. 176 (2007) 710Does the new version supersede the previous version?: YesNature of problem: Problems that require potentially high orders of partial derivatives with respect to several variables or derivatives of complex-valued functions, such as e.g. momentum or mass expansions of Feynman diagrams in perturbative QFT, and which previous versions of this TaylUR [1,2] cannot handle due to their lack of support for mixed partial derivatives.Solution method: Arithmetic operators and Fortran intrinsics are overloaded to act correctly on objects of a defined type taylor, which encodes a function along with its first few partial derivatives with respect to the user-defined independent variables. Derivatives of products and composite functions are computed using multivariate forms [3] of Leibniz's rule where ν=(ν1,…,νd), , , Dνf=|ν|f/(ν1x1?νdxd), and μ<ν iff either |μ|<|ν| or |μ|=|ν|,μ1=ν1,…,μk=νk,μk+1<νk+1 for some k∈{0,…,d−1}, and of Fàa di Bruno's formula where the sum is over , . An indexed storage system is used to store the higher-order derivative tensors in a one-dimensional array. The relevant indices (k1,…,ks;λ1,…,λs) and the weights occurring in the sums in Leibniz's and Fàa di Bruno's formula are precomputed at startup and stored in static arrays for later use.Reasons for new version: The earlier version lacked support for mixed partial derivatives, but a number of projects of interest required them.Summary of revisions: The internal representation of a taylor object has changed to a one-dimensional array which contains the partial derivatives in ascending order, and in lexicographic order of the corresponding multiindex within the same order. The necessary mappings between multiindices and indices into the taylor objects' internal array are computed at startup. To support the change to a genuinely multivariate taylor type, the DERIVATIVE function is now implemented via an interface that accepts both the older format derivative(f,mu,n) and also a new format derivative(f,mu(:))=Dμf that allows access to mixed partial derivatives. Another related extension to the functionality of the module is the HESSIAN function that returns the Hessian matrix of second derivatives of its argument. Since the calculation of all mixed partial derivatives can be very costly, and in many cases only some subset is actually needed, a masking facility has been added. Calling the subroutine DEACTIVATE_DERIVATIVE with a multiindex as an argument will deactivate the calculation of the partial derivative belonging to that multiindex, and of all partial derivatives it can feed into. Similarly, calling the subroutine ACTIVATE_DERIVATIVE will activate the calculation of the partial derivative belonging to its argument, and of all partial derivatives that can feed into it. Moreover, it is possible to turn off the computation of mixed derivatives altogether by setting Diagonal_taylors to .TRUE.. It should be noted that any change of Diagonal_taylors or Taylor_order invalidates all existing taylor objects. To aid the better integration of TaylUR into the HPSrc library [4], routines SET_DERIVATIVE and SET_ALL_DERIVATIVES are provided as a means of manually constructing a taylor object with given derivatives.Restrictions: Memory and CPU time constraints may restrict the number of variables and Taylor expansion order that can be achieved. Loss of numerical accuracy due to cancellation may become an issue at very high orders.Unusual features: These are the same as in previous versions, but are enumerated again here for clarity. The complex conjugation operation assumes all independent variables to be real. The functions REAL and AIMAG do not convert to real type, but return a result of type taylor (with the real/imaginary part of each derivative taken) instead. The user-defined functions VALUE, REALVALUE and IMAGVALUE, which return the value of a taylor object as a complex number, and the real and imaginary part of this value, respectively, as a real number are also provided. Fortran 95 intrinsics that are defined only for arguments of real type (ACOS, AINT, ANINT, ASIN, ATAN, ATAN2, CEILING, DIM, FLOOR, INT, LOG10, MAX, MAXLOC, MAXVAL, MIN, MINLOC, MINVAL, MOD, MODULO, NINT, SIGN) will silently take the real part of taylor-valued arguments unless the module variable Real_args_warn is set to .TRUE., in which case they will return a quiet NaN value (if supported by the compiler) when called with a taylor argument whose imaginary part exceeds the module variable Real_args_tol. In those cases where the derivative of a function becomes undefined at certain points (as for ABS, AINT, ANINT, MAX, MIN, MOD, and MODULO), while the value is well defined, the derivative fields will be filled with quiet NaN values (if supported by the compiler).Additional comments: This version of TaylUR is released under the second version of the GNU General Public License (GPLv2). Therefore anyone is free to use or modify the code for their own calculations. As part of the licensing, it is requested that any publications including results from the use of TaylUR or any modification derived from it cite Refs. [1,2] as well as this paper. Finally, users are also requested to communicate to the author details of such publications, as well as of any bugs found or of required or useful modifications made or desired by them.Running time: The running time of TaylUR operations grows rapidly with both the number of variables and the Taylor expansion order. Judicious use of the masking facility to drop unneeded higher derivatives can lead to significant accelerations, as can activation of the Diagonal_taylors variable whenever mixed partial derivatives are not needed.Acknowledgments: The author thanks Alistair Hart for helpful comments and suggestions. This work is supported by the Deutsche Forschungsgemeinschaft in the SFB/TR 09.References:
[1]
G.M. von Hippel, TaylUR, an arbitrary-order diagonal automatic differentiation package for Fortran 95, Comput. Phys. Comm. 174 (2006) 569.
[2]
G.M. von Hippel, New version announcement for TaylUR, an arbitrary-order diagonal automatic differentiation package for Fortran 95, Comput. Phys. Comm. 176 (2007) 710.
[3]
G.M. Constantine, T.H. Savits, A multivariate Faa di Bruno formula with applications, Trans. Amer. Math. Soc. 348 (2) (1996) 503.
[4]
A. Hart, G.M. von Hippel, R.R. Horgan, E.H. Müller, Automated generation of lattice QCD Feynman rules, Comput. Phys. Comm. 180 (2009) 2698, doi:10.1016/j.cpc.2009.04.021, arXiv:0904.0375.
  相似文献   

15.
16.
17.
HiggsBounds is a computer code that tests theoretical predictions of models with arbitrary Higgs sectors against the exclusion bounds obtained from the Higgs searches at LEP and the Tevatron. The included experimental information comprises exclusion bounds at 95% C.L. on topological cross sections. In order to determine which search topology has the highest exclusion power, the program also includes, for each topology, information from the experiments on the expected exclusion bound, which would have been observed in case of a pure background distribution. Using the predictions of the desired model provided by the user as input, HiggsBounds determines the most sensitive channel and tests whether the considered parameter point is excluded at the 95% C.L. HiggsBounds is available as a Fortran 77 and Fortran 90 code. The code can be invoked as a command line version, a subroutine version and an online version. Examples of exclusion bounds obtained with HiggsBounds are discussed for the Standard Model, for a model with a fourth generation of quarks and leptons and for the Minimal Supersymmetric Standard Model with and without CP-violation. The experimental information on the exclusion bounds currently implemented in HiggsBounds will be updated as new results from the Higgs searches become available.

Program summary

Program title: HiggsBoundsCatalogue identifier: AEFF_v1_0Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEFF_v1_0.htmlProgram obtainable from: CPC Program Library, Queen's University, Belfast, N. IrelandLicensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.htmlNo. of lines in distributed program, including test data, etc.: 55 733No. of bytes in distributed program, including test data, etc.: 1 986 213Distribution format: tar.gzProgramming language: Fortran 77, Fortran 90 (two code versions are offered).Computer: HiggsBounds can be built with any compatible Fortran 77 or Fortran 90 compiler. The program has been tested on x86 CPUs running under Linux (Ubuntu 8.04) and with the following compilers: The Portland Group Inc. Fortran compilers (pgf77, pgf90), the GNU project Fortran compilers (g77, gfortran).Operating system: LinuxRAM: minimum of about 6000 kbytes (dependent on the code version)Classification: 11.1External routines: HiggsBounds requires no external routines/libraries. Some sample programs in the distribution require the programs FeynHiggs 2.6.x or CPsuperH2 to be installed (see “Subprograms used”).Subprograms used:
Cat IdTitleReference
ADKT_v2_0FeynHiggsv2.6.5CPC 180(2009)1426
ADSR_v2_0CPsuperH2.0CPC 180(2009)312
Full-size table
  相似文献   

18.
The computation of eigenvalues and eigenvectors of symmetric tridiagonal matrices arises frequently in applications; often as one of the steps in the solution of Hermitian and symmetric eigenproblems. While several accurate and efficient methods for the tridiagonal eigenproblem exist, their corresponding implementations usually target uni-processors or large distributed memory systems. Our new eigensolver MR3-SMP is instead specifically designed for multi-core and many-core general purpose processors, which today have effectively replaced uni-processors. We show that in most cases MR3-SMP is faster and achieves better speedups than state-of-the-art eigensolvers for uni-processors and distributed-memory systems.  相似文献   

19.
We present a software package that guesses formulae for sequences of, for example, rational numbers or rational functions, given the first few terms. We implement an algorithm due to Bernhard Beckermann and George Labahn, together with some enhancements to render our package efficient. Thus we extend and complement Christian Krattenthaler’s program Rate.m, the parts concerned with guessing of Bruno Salvy and Paul Zimmermann’s GFUN, the univariate case of Manuel Kauers’ Guess.m and Manuel Kauers’ and Christoph Koutschan’s qGeneratingFunctions.m.  相似文献   

20.
In online makespan minimization, the jobs characterized by their processing time arrive one-by-one and each has to be assigned to one of the m uniformly related machines. The goal is to minimize the length of the schedule. We prove new combinatorial lower bounds for m=4 and m=5, and computer-assisted lower bounds for m≤11.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号