20 similar articles found (search time: 31 ms)
The wavelet tree has become a very useful data structure for efficiently representing and querying large volumes of data in many different domains, from bioinformatics to geographic information systems. One problem with wavelet trees is their construction time. In this paper, we introduce two algorithms that reduce the time complexity of a wavelet tree’s construction by taking advantage of the now-ubiquitous multicore machines. Our first algorithm constructs all the levels of the wavelet tree in parallel with O(n) time and \(O(n\lg \sigma + \sigma \lg n)\) bits of working space, where n is the size of the input sequence and \(\sigma \) is the size of the alphabet. Our second algorithm constructs the wavelet tree in a domain decomposition fashion, using our first algorithm in each segment, reaching \(O(\lg n)\) time and \(O(n\lg \sigma + p\sigma \lg n/\lg \sigma )\) bits of extra space, where p is the number of available cores. Both algorithms are practical and report good speedup for large real datasets.
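The idea of building every level independently can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a levelwise (pointerless) layout with integer symbols in the range 0..σ−1, and the names `build_level` and `build_wavelet_tree` are hypothetical. It uses Python's stable `sorted` where a linear-time counting sort would be needed to reach the stated bounds, and CPython threads only illustrate the independence of the levels rather than deliver real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from math import ceil, log2


def build_level(seq, level, num_levels):
    """Return the bitmap of one level, computed from the input sequence alone."""
    shift = num_levels - level  # number of bits below the length-`level` prefix
    # A stable sort by the top `level` bits groups the symbols into the
    # nodes of this level while keeping their original relative order.
    ordered = sorted(seq, key=lambda c: c >> shift)
    bit = num_levels - level - 1  # the bit that this level stores
    return [(c >> bit) & 1 for c in ordered]


def build_wavelet_tree(seq, sigma, workers=4):
    """Build all level bitmaps; no level depends on another level's output."""
    num_levels = max(1, ceil(log2(sigma)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda lv: build_level(seq, lv, num_levels),
                             range(num_levels)))


levels = build_wavelet_tree([3, 0, 2, 1, 3, 0], sigma=4)
```

Because each call to `build_level` reads only the input sequence, the levels can be handed to separate cores without synchronization, which is the property the first algorithm exploits.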

Constructions of quantum caps in projective space PG(r, 4) by recursive methods and computer search are discussed. For each even n satisfying \(n\ge 282\) and each odd z satisfying \(z\ge 275\), a quantum n-cap and a quantum z-cap in \(PG(k-1, 4)\) with suitable k are constructed, and \([[n,n-2k,4]]\) and \([[z,z-2k,4]]\) quantum codes are derived from the constructed quantum n-cap and z-cap, respectively. For \(n\ge 282\) with \(n\ne 286, 756, 5040\), or for \(z\ge 275\), the results on the sizes of quantum caps and quantum codes are new, and all the obtained quantum codes are optimal codes according to the quantum Hamming bound. While constructing quantum caps, we also obtain many large caps in PG(r, 4) for \(r\ge 11\). These results concerning large caps provide improved lower bounds on the maximal sizes of caps in PG(r, 4) for \(r\ge 11\).

We study the following energy-efficient scheduling problem. We are given a set of n jobs which have to be scheduled by a single processor whose speed can be varied dynamically. Each job \(J_j\) is characterized by a processing requirement (work) \(p_j\), a release date \(r_j\), and a deadline \(d_j\). We are also given a budget of energy E which must not be exceeded and our objective is to maximize the throughput (i.e., the number of jobs which are completed on time). We show that the problem can be solved optimally via dynamic programming in \(O(n^4 \log n \log P)\) time when all jobs have the same release date, where P is the sum of the processing requirements of the jobs. For the more general case with agreeable deadlines where the jobs can be ordered so that, for every \(i < j\), it holds that \(r_i \le r_j\) and \(d_i \le d_j\), we propose an optimal dynamic programming algorithm which runs in \(O(n^6 \log n \log P)\) time. In addition, we consider the weighted case where every job \(J_j\) is also associated with a weight \(w_j\) and we are interested in maximizing the weighted throughput (i.e., the total weight of the jobs which are completed on time). For this case, we show that the problem becomes \(\mathcal{NP}\)-hard in the ordinary sense even when all jobs have the same release date and we propose a pseudo-polynomial time algorithm for agreeable instances.

Given a distributed system of \(n\) balls and \(n\) bins, how evenly can we distribute the balls to the bins, minimizing communication? The fastest non-adaptive and symmetric algorithm achieving a constant maximum bin load requires \(\varTheta (\log \log n)\) rounds, and any such algorithm running for \(r\in {\mathcal {O}}(1)\) rounds incurs a bin load of \(\varOmega ((\log n/\log \log n)^{1/r})\). In this work, we explore the fundamental limits of the general problem. We present a simple adaptive symmetric algorithm that achieves a bin load of 2 in \(\log ^* n+{\mathcal {O}}(1)\) communication rounds using \({\mathcal {O}}(n)\) messages in total. Our main result, however, is a matching lower bound of \((1-o(1))\log ^* n\) on the time complexity of symmetric algorithms that guarantee small bin loads. The essential preconditions of the proof are (i) a limit of \({\mathcal {O}}(n)\) on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls need not be globally consistent. In order to show that our technique indeed yields tight bounds, we provide for each assumption an algorithm violating it, in turn achieving a constant maximum bin load in constant time.

We consider broadcasting in random d-regular graphs by using a simple modification of the random phone call model introduced by Karp et al. (Proceedings of the FOCS ’00, 2000). In the phone call model, in every time step, each node calls a randomly chosen neighbour to establish a communication channel to this node. The communication channels can then be used bi-directionally to transmit messages. We show that, if we allow every node to choose four distinct neighbours instead of one, then the average number of message transmissions per node required to broadcast a message efficiently decreases exponentially. Formally, we present an algorithm that has time complexity \(O(\log n)\) and uses \(O(n\log \log n)\) transmissions per message. In contrast, we show for the standard model that every distributed algorithm in a restricted address-oblivious model that broadcasts a message in time \(O(\log n)\) requires \(\Omega (n \log n / \log d)\) message transmissions. Our algorithm efficiently handles limited communication failures, only requires rough estimates of the number of nodes, and is robust against limited changes in the size of the network. Our results have applications in peer-to-peer networks and replicated databases.

Let \(G=(V,E)\) be an unweighted undirected graph with n vertices and m edges, and let \(k>2\) be an integer. We present a routing scheme with a poly-logarithmic header size which, given a source s and a destination t at distance \(\varDelta \) from s, routes a message from s to t on a path whose length is \(O(k\varDelta +m^{1/k})\). The total space used by our routing scheme is \(mn^{O(1/\sqrt{\log n})}\), which is almost linear in the number of edges of the graph. We also present a routing scheme with \(n^{O(1/\sqrt{\log n})}\) header size and the same stretch (up to constant factors). In this routing scheme, the size of the routing table of every \(v\in V\) is at most \(kn^{O(1/\sqrt{\log n})}\deg (v)\), where \(\deg (v)\) is the degree of v in G. Our results are obtained by combining a general technique of Bernstein (2009), which was presented in the context of dynamic graph algorithms, with several new ideas and observations.

We consider scheduling of unit-length jobs with release times and deadlines, where the objective is to minimize the number of gaps in the schedule. Polynomial-time algorithms for this problem are known, yet they are rather inefficient, with the best algorithm running in time \(O(n^4)\) and requiring \(O(n^3)\) memory. We present a greedy algorithm that approximates the optimum solution within a factor of 2 and show that our analysis is tight. Our algorithm runs in time \(O(n^2 \log n)\) and needs only O(n) memory. In fact, the running time is \(O(n (g^*+1)\log n)\), where \(g^*\) is the minimum number of gaps.
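To make the objective concrete, the sketch below counts the gaps of a given schedule of unit-length jobs. It is not the greedy algorithm from the abstract. It assumes integer time slots, takes a gap to be a maximal run of idle slots lying between two busy slots, and assumes a job with release time r and deadline d may occupy any slot t with r ≤ t ≤ d; the function names are hypothetical.

```python
def count_gaps(busy_slots):
    """Number of maximal idle intervals between consecutive busy slots."""
    slots = sorted(set(busy_slots))
    # Each pair of consecutive busy slots that is not adjacent encloses one gap.
    return sum(1 for a, b in zip(slots, slots[1:]) if b - a > 1)


def is_feasible(jobs, schedule):
    """Check a schedule {job: slot} against jobs {job: (release, deadline)}."""
    slots = list(schedule.values())
    if len(set(slots)) != len(slots):  # one unit-length job per slot
        return False
    return all(jobs[j][0] <= t <= jobs[j][1] for j, t in schedule.items())


jobs = {"a": (1, 2), "b": (1, 6), "c": (5, 6)}
spread = {"a": 1, "b": 3, "c": 5}    # feasible, two gaps
compact = {"a": 2, "b": 3, "c": 5}   # feasible, one gap
```

In this toy instance no feasible schedule has zero gaps, since job a must finish by slot 2 and job c cannot start before slot 5 while only job b is available to fill slots 3 and 4.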

Network cost and a fixed degree are important factors in evaluating interconnection networks. In this paper, we propose the hierarchical Petersen network (HPN), which is constructed recursively and hierarchically using the Petersen graph as a basic module. The degree of HPN(n) is 5, and HPN(n) has \(10^n\) nodes and \(2.5 \times 10^n\) edges. We analyze its basic topological properties, routing algorithm, diameter, spanning tree, broadcasting algorithm and embedding. From the analysis, we prove that the diameter and network cost of HPN(n) are \(3\log _{10}N-1\) and \(15 \log _{10}N-1\), respectively, and that it contains a spanning tree of degree 4. In addition, we propose a link-disjoint one-to-all broadcasting algorithm and show that HPN(n) can be embedded into FP\(_k\) with expansion 1, dilation 2k and congestion 4. For most of the fixed-degree networks proposed, network cost and diameter require \(O(\sqrt{N})\) and the degree of the graph requires O(N). However, HPN(n) requires O(1) for the degree and \(O(\log _{10}N)\) for both diameter and network cost. As a result, the interconnection network suggested in this paper is superior to current fixed-degree and hierarchical networks in terms of network cost, diameter and degree.
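The node and edge counts quoted above are consistent with each other under the handshake lemma: if every one of the \(10^n\) nodes has degree 5, the graph has \(5 \cdot 10^n / 2 = 2.5 \times 10^n\) edges. The short sketch below simply evaluates the closed forms stated in the abstract, taking \(N = 10^n\) so that \(\log_{10} N = n\). The function name `hpn_stats` is hypothetical, and the 5-regularity is an assumption read off the stated degree, so the edge count is only meaningful where that assumption holds.

```python
def hpn_stats(n):
    """Evaluate the closed forms quoted for HPN(n): (nodes, edges, diameter)."""
    nodes = 10 ** n
    degree = 5                    # stated degree of HPN(n)
    edges = degree * nodes // 2   # handshake lemma, assuming a 5-regular graph
    diameter = 3 * n - 1          # 3 * log10(N) - 1 with N = 10**n
    return nodes, edges, diameter


for n in (2, 3, 4):
    print(n, hpn_stats(n))
```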

The problem of two edge-disjoint paths is to identify two paths \(Q_1\) and \(Q_2\) from source \(s \in V\) to target \(t \in V\) without any common arc in a directed connected graph \(G=(V, E)\). In this paper, we present an adaptive stabilizing algorithm for finding a pair of edge-disjoint paths from s to t in G in O(D) rounds with state-space complexity of \(O(\log n)\) bits per process, where n is the number of nodes and D is the diameter of the graph. The proposed algorithm is optimal with respect to its time complexity, and the total length of the shortest paths. In addition, it can also be used to solve the problem for undirected graphs. Since the proposed algorithm is stabilizing, it does not require initialization and is capable of withstanding transient faults. We view a fault that perturbs the state of the system but not its program as a transient fault. In addition, the proposed algorithm is adaptive since it is capable of dealing with topology changes in the form of addition/removal of arcs and/or nodes as well as changes in the directions of arcs provided that two edge-disjoint paths between s and t exist after the topology change.

In this paper we consider the time complexity of adding two n-bit numbers together within the tile self-assembly model. The (abstract) tile assembly model is a mathematical model of self-assembly in which system components are square tiles with different glue types assigned to tile edges. Assembly is driven by the attachment of singleton tiles to a growing seed assembly when the net force of glue attraction for a tile exceeds some fixed threshold. Within this framework, we examine the time complexity of computing the sum of two n-bit numbers, where the input numbers are encoded in an initial seed assembly, and the output sum is encoded in the final, terminal assembly of the system. We show that this problem, along with multiplication, has a worst case lower bound of \(\varOmega ( \sqrt{n} )\) in 2D assembly, and \(\varOmega (\root 3 \of {n})\) in 3D assembly. We further design algorithms for both 2D and 3D that meet this bound with worst case run times of \(O(\sqrt{n})\) and \(O(\root 3 \of {n})\) respectively, which beats the previous best known upper bound of O(n). Finally, we consider average case complexity of addition over uniformly distributed n-bit strings and show how we can achieve \(O(\log n)\) average case time with a simultaneous \(O(\sqrt{n})\) worst case run time in 2D. As additional evidence for the speed of our algorithms, we implement our algorithms, along with the simpler O(n) time algorithm, in a probabilistic run-time simulator and compare the timing results.

In many parallel and distributed multiprocessor systems, the processors are connected based on different types of interconnection networks. The topological structure of an interconnection network is typically modeled as a graph. Among the many kinds of network topologies, the crossed cube is one of the most popular. In this paper, we investigate the panpositionable panconnectedness problem with respect to the crossed cube. A graph G is r-panpositionably panconnected if for any three distinct vertices x, y, z of G and for any integer \(l_1\) satisfying \(r \le l_1 \le |V(G)| - r - 1\), there exists a path \(P = [x, P_1, y, P_2, z]\) in G such that (i) \(P_1\) joins x and y with \(l(P_1) = l_1\) and (ii) \(P_2\) joins y and z with \(l(P_2) = l_2\) for any integer \(l_2\) satisfying \(r \le l_2 \le |V(G)| - l_1 - 1\), where |V(G)| is the total number of vertices in G and \(l(P_1)\) (respectively, \(l(P_2)\)) is the length of path \(P_1\) (respectively, \(P_2\)). By mathematical induction, we demonstrate that the n-dimensional crossed cube \(CQ_n\) is n-panpositionably panconnected. This result indicates that embedding a path that joins x and z through an intermediate vertex y in \(CQ_n\) is extremely flexible. Moreover, applying our result, crossed cube problems such as panpositionable pancyclicity, panpositionably Hamiltonian connectedness, and panpositionable Hamiltonicity can be solved.

Two mobile agents, starting from different nodes of a network at possibly different times, have to meet at the same node. This problem is known as rendezvous. Agents move in synchronous rounds. Each agent has a distinct integer label from the set \(\{1,\ldots ,L\}\). Two main efficiency measures of rendezvous are its time (the number of rounds until the meeting) and its cost (the total number of edge traversals). We investigate tradeoffs between these two measures. A natural benchmark for both time and cost of rendezvous in a network is the number of edge traversals needed for visiting all nodes of the network, called the exploration time. Hence we express the time and cost of rendezvous as functions of an upper bound E on the time of exploration (where E and a corresponding exploration procedure are known to both agents) and of the size L of the label space. We present two natural rendezvous algorithms. Algorithm Cheap has cost O(E) (and, in fact, a version of this algorithm for the model where the agents start simultaneously has cost exactly E) and time O(EL). Algorithm Fast has both time and cost \(O(E\log L)\). Our main contributions are lower bounds showing that, perhaps surprisingly, these two algorithms capture the tradeoffs between time and cost of rendezvous almost tightly. We show that any deterministic rendezvous algorithm of cost asymptotically E (i.e., of cost \(E+o(E)\)) must have time \(\varOmega (EL)\). On the other hand, we show that any deterministic rendezvous algorithm with time complexity \(O(E\log L)\) must have cost \(\varOmega (E\log L)\).

In the typical model, a discrete-time coined quantum walk searching the 2D grid for a marked vertex achieves a success probability of \(O(1/\log N)\) in \(O(\sqrt{N \log N})\) steps, which with amplitude amplification yields an overall runtime of \(O(\sqrt{N} \log N)\). We show that making the quantum walk lackadaisical or lazy by adding a self-loop of weight \(4/N\) to each vertex speeds up the search, causing the success probability to reach a constant near 1 in \(O(\sqrt{N \log N})\) steps, thus yielding an \(O(\sqrt{\log N})\) improvement over the typical, loopless algorithm. This improved runtime matches the best known quantum algorithms for this search problem. Our results are based on numerical simulations since the algorithm is not an instance of the abstract search algorithm.
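The arithmetic behind the quoted \(O(\sqrt{\log N})\) improvement can be written out as follows. This is a back-of-the-envelope restatement of the figures in the abstract, using the standard fact that amplitude amplification needs on the order of \(1/\sqrt{p}\) repetitions to boost a success probability \(p\) to a constant; the labels \(T_{\text{loopless}}\) and \(T_{\text{lazy}}\) are introduced here only for illustration.

```latex
\begin{align*}
p_{\text{loopless}} &= \frac{1}{\log N}
  \quad\Longrightarrow\quad
  \text{repetitions} = O\!\left(\frac{1}{\sqrt{p_{\text{loopless}}}}\right)
                     = O\!\left(\sqrt{\log N}\right), \\
T_{\text{loopless}} &= O\!\left(\sqrt{N \log N}\right) \cdot O\!\left(\sqrt{\log N}\right)
                     = O\!\left(\sqrt{N}\,\log N\right), \\
T_{\text{lazy}} &= O\!\left(\sqrt{N \log N}\right)
  \quad\text{(constant success probability, no amplification needed)}, \\
\frac{\sqrt{N}\,\log N}{\sqrt{N \log N}} &= \sqrt{\log N}.
\end{align*}
```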

This paper proposes a cost-efficient quantum multiplier–accumulator unit. Because the multiplier is the major component of a multiplier–accumulator unit, the paper also presents a fast multiplication algorithm and designs a novel quantum multiplier based on it with optimum time complexity. We show that the proposed multiplication technique has time complexity \(O(3\log _{2}n+1)\), whereas the best known existing technique has \(O(n\log _{2} n)\), where n is the number of qubits. In addition, our design proposes three new quantum circuits: a quantum full-adder, a quantum ANDing circuit, which performs the ANDing operation, and a quantum accumulator. Moreover, the proposed quantum multiplier–accumulator unit is, to date, the first quantum multiplier–accumulator circuit in the literature, which has reduced garbage outputs and ancillary inputs to a great extent. The comparative study shows that the proposed quantum multiplier performs better than the existing multipliers in terms of depth, quantum gates, delays, area and power as the number of qubits increases. Moreover, the proposed quantum multiplier–accumulator unit performs better than the existing ones in terms of hardware and delay complexities, e.g., the proposed \((n\times n)\)-qubit quantum multiplier–accumulator unit requires \(O(n^{2})\) hardware and \(O(\log _{2}n)\) delay complexities, whereas the best known existing quantum multiplier–accumulator unit requires \(O(n^{3})\) hardware and \(O((n-1)^{2} +1+n)\) delay complexities. In addition, the proposed design achieves an improvement of 13.04, 60.08 and 27.2% for \(4\times 4\); 7.87, 51.8 and 27.1% for \(8\times 8\); 4.24, 52.14 and 27% for \(16\times 16\); 2.19, 52.15 and 27.26% for \(32 \times 32\); and 0.78, 52.18 and 27.28% for \(128 \times 128\)-qubit multiplications over the best known existing approach in terms of number of quantum gates, ancillary inputs and garbage outputs, respectively. Moreover, on average, the proposed design gains an improvement of 5.62% in terms of area and power consumption over the best known existing approach.

Let \(H_{1}, H_{2},\ldots ,H_{n}\) be separable complex Hilbert spaces with \(\dim H_{i}\ge 2\) and \(n\ge 2\). Assume that \(\rho \) is a state in \(H=H_1\otimes H_2\otimes \cdots \otimes H_n\). The state \(\rho \) is called strong-k-separable \((2\le k\le n)\) if \(\rho \) is separable for any k-partite division of H. In this paper, an entanglement witness criterion for strong-k-separability is obtained, which says that \(\rho \) is not strong-k-separable if and only if there exist a k-division space \(H_{m_{1}}\otimes \cdots \otimes H_{m_{k}}\) of H, a finite-rank linear elementary operator positive on product states \(\Lambda :\mathcal {B}(H_{m_{2}}\otimes \cdots \otimes H_{m_{k}})\rightarrow \mathcal {B}(H_{m_{1}})\) and a state \(\rho _{0}\in \mathcal {S}(H_{m_{1}}\otimes H_{m_{1}})\), such that \(\mathrm {Tr}(W\rho )<0\), where \(W=(\mathrm{Id}\otimes \Lambda ^{\dagger })\rho _{0}\) is an entanglement witness. In addition, several different methods of constructing entanglement witnesses for multipartite states are also given.

Two quantum correlations Q and \(Q_\mathcal P\) for \((m+n)\)-mode continuous-variable systems are introduced in terms of average distance between the reduced states under the local Gaussian positive operator-valued measurements, and analytical formulas of these quantum correlations for bipartite Gaussian states are provided. It is shown that the product states do not contain these quantum correlations, and conversely, all \((m+n)\)-mode Gaussian states with zero quantum correlations are product states. Generally, \(Q\ge Q_{\mathcal P}\), but for the symmetric two-mode squeezed thermal states, these quantum correlations are the same and a computable formula is given. In addition, Q is compared with Gaussian geometric discord for symmetric squeezed thermal states.

We revisit the analysis of the classical QuickSelect algorithm. Usually, the analysis deals with the mean number of key comparisons, but here we view keys as words produced by a source, and words are compared via their symbols in lexicographic order. Our probabilistic models belong to a broad category of information sources that encompasses memoryless (i.e., independent-symbols) and Markov sources, as well as many unbounded-correlation sources. The “realistic” cost of the algorithm is here the total number of symbol comparisons performed by the algorithm, and, in this context, the average-case analysis aims to provide estimates for the mean number of symbol comparisons. For the QuickSort algorithm, the known average-case complexity is \({\Theta } (n \log n)\) for key comparisons and \({\Theta }(n\log ^{2} n)\) for symbol comparisons. For QuickSelect algorithms, and with respect to key comparisons, the average-case complexity is \({\Theta }(n)\). In the present article, we prove that, with respect to symbol comparisons, QuickSelect’s average-case complexity remains \({\Theta }(n)\). In each case, we provide explicit expressions for the dominant constants, closely related to the probabilistic behaviour of the source.
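The difference between the two cost measures can be seen in a small instrumented QuickSelect. This is a minimal sketch for illustration, not the variant analysed in the article: it assumes a uniformly random pivot and three-way partitioning, and it counts one symbol comparison for each pair of symbols inspected until two words first differ. The names `symbol_compare` and `quickselect` are hypothetical.

```python
import random


def symbol_compare(a, b, counter):
    """Lexicographic three-way comparison that tallies both cost measures."""
    counter["keys"] += 1
    for x, y in zip(a, b):
        counter["symbols"] += 1          # one symbol pair inspected
        if x != y:
            return -1 if x < y else 1
    # All shared positions agree: the shorter word comes first.
    return (len(a) > len(b)) - (len(a) < len(b))


def quickselect(words, k, counter, rng):
    """Return the k-th smallest word (0-based) in lexicographic order."""
    items = list(words)
    while True:
        pivot = items[rng.randrange(len(items))]
        less, equal, greater = [], [], []
        for w in items:
            c = symbol_compare(w, pivot, counter)
            (less if c < 0 else equal if c == 0 else greater).append(w)
        if k < len(less):
            items = less
        elif k < len(less) + len(equal):
            return pivot
        else:
            k -= len(less) + len(equal)
            items = greater


words = ["banana", "apple", "cherry", "apricot", "blueberry"]
counter = {"keys": 0, "symbols": 0}
median = quickselect(words, 2, counter, random.Random(0))
```

Words that share long prefixes, such as "apple" and "apricot", cost several symbol comparisons for a single key comparison, which is why the symbol count depends on the source that produces the words.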

Spheroidal harmonics and modified Bessel functions have wide applications in scientific and engineering computing. Recursive methods are developed to compute the logarithmic derivatives, ratios, and products of the prolate spheroidal harmonics (\(P_n^m(x)\), \(Q_n^m(x)\), \(n\ge m\ge 0\), \(x>1\)), the oblate spheroidal harmonics (\(P_n^m(ix)\), \(Q_n^m(ix)\), \(n\ge m\ge 0\), \(x>0\)), and the modified Bessel functions (\(I_n(x)\), \(K_n(x)\), \(n\ge 0\), \(x>0\)) in order to avoid direct evaluation of these functions that may easily cause overflow/underflow for high degree/order and for extreme argument. Stability analysis shows the proposed recursive methods are stable for realistic degree/order and argument values. Physical examples in electrostatics are given to validate the recursive methods.

Speed scaling problems consider energy-efficient job scheduling in processors by adjusting the speed to reduce energy consumption, where power consumption is a convex function of speed (usually, \(P(s) =s^{\alpha }, \alpha =2,3\)). In this work, we study speed scaling problems considering memory/cache. Each job needs some time for memory operation when it is fetched from memory, and needs less time if fetched from the cache. The objective is to minimize energy consumption while satisfying the time constraints of the jobs. Two models are investigated, the non-cache model and the with-cache model. The non-cache model is a variant of the ideal model, where each job i needs a fixed \(c_i\) time for its memory operation; the with-cache model further considers the cache, a memory device with much faster access time but limited space. The uniform with-cache model is a special case of the with-cache model in which all \(c_i\) values are the same. We provide an \(O(n^3)\) time algorithm and an improved \(O(n^2\log n)\) time algorithm to compute the optimal solution in the non-cache model. For the with-cache model, we prove that it is NP-complete to compute the optimal solution. For the uniform with-cache model with agreeable jobs (later-released jobs do not have earlier deadlines), we derive an \(O(n^4)\) time algorithm to compute the optimal schedule, while for the general case we propose a \((2\alpha \frac{g}{g-1})^{\alpha }/2\)-approximation algorithm in a resource augmentation setting in which the memory operation time can accelerate by at most g times.
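The role of the convex power function can be illustrated for a single job. Running work w at constant speed over a duration t costs \(t \cdot (w/t)^{\alpha } = w^{\alpha }/t^{\alpha -1}\), and by convexity any uneven split of speeds over the same duration costs more. The sketch below is a simplification, not the paper's algorithm: it assumes that a fixed memory time c simply shortens the time available for computation and that no energy is charged during the memory operation. The names `energy` and `single_job_energy` are hypothetical.

```python
def energy(work, duration, alpha=3.0):
    """Energy to finish `work` at constant speed within `duration`, P(s) = s**alpha."""
    speed = work / duration
    return duration * speed ** alpha


def single_job_energy(work, window, mem_time, alpha=3.0):
    """One job in the non-cache model: the memory time is lost to computation.

    Assumption for this sketch: the memory operation takes `mem_time`
    regardless of speed and consumes no energy itself.
    """
    compute_time = window - mem_time
    if compute_time <= 0:
        raise ValueError("window too short for the memory operation")
    return energy(work, compute_time, alpha)


constant = energy(6, 3, alpha=2)                              # speed 2 for 3 time units
uneven = energy(4.5, 1.5, alpha=2) + energy(1.5, 1.5, alpha=2)  # speed 3, then speed 1
with_memory = single_job_energy(6, window=4, mem_time=1, alpha=2)
```

Here the constant-speed schedule uses less energy than the uneven one for the same work and total duration, which is why optimal schedules in such models run each job at a single speed.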
