Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Permuting a vector is a fundamental primitive which arises in many applications. In particular, rational permutations, which are defined by permutations of the bits of the binary representations of the vector indices, are widely used. Matrix transposition and bit-reversal are notable examples of rational permutations. In this paper we contribute a number of results regarding the execution of these permutations in cache hierarchies, with particular emphasis on the cache-oblivious setting. We first bound from below the work needed to execute a rational permutation with an optimal cache complexity. Then, we develop a cache-oblivious algorithm to perform any rational permutation, which exhibits optimal work and cache complexities under the tall cache assumption. We finally show that for certain families of rational permutations (including matrix transposition and bit reversal) no cache-oblivious algorithm can exhibit optimal cache complexity for all values of the cache parameters. This latter result specializes the one proved by Brodal and Fagerberg for general permutations to the case of rational permutations, and provides further evidence that the tall cache assumption is often necessary to attain cache optimality in the context of cache-oblivious algorithms.
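To make the definition of a rational permutation concrete, the following is a minimal sketch, not the cache-oblivious algorithm of the paper. It moves each element to the index obtained by permuting the bits of its original index. The function name `rational_permute` and the `bit_perm` encoding are assumptions introduced only for this illustration.

```python
def rational_permute(vec, bit_perm):
    """Permute vec (length 2**n) by permuting the bits of each index.

    bit_perm[j] is the source bit position that becomes destination bit j.
    (Hypothetical encoding chosen for this sketch.)
    """
    n = len(bit_perm)
    assert len(vec) == 1 << n, "vector length must be 2**len(bit_perm)"
    out = [None] * len(vec)
    for i, x in enumerate(vec):
        dest = 0
        for j, src in enumerate(bit_perm):
            # Copy bit `src` of the old index into bit `j` of the new index.
            dest |= ((i >> src) & 1) << j
        out[dest] = x
    return out


# Bit reversal on 3 index bits: destination bit j takes source bit 2 - j.
bit_reversed = rational_permute(list(range(8)), [2, 1, 0])

# Transposing a 2 x 4 row-major matrix is a rotation of the index bits.
transposed = rational_permute(list(range(8)), [2, 0, 1])
```

Both calls use the same routine; only the bit map changes, which is what makes transposition and bit reversal members of the same family.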

2.
The problem of caching shortest paths has been widely studied. All existing methods that address this problem assume that the condition of road networks does not change with time. In this paper, we study how to refresh a cache when one edge of the underlying road network (graph) changes. A bitmap-based cache structure is proposed to store and give access to shortest paths. Algorithms are then developed to detect the shortest paths that are affected by the change of an edge. After detecting affected paths, several heuristic-based refreshment strategies are proposed to update the cache. We have conducted a series of experiments to compare the performance of the proposed strategies. The results show that replacing affected shortest paths with new paths whose benefit values are the largest should be applied in shortest path caching applications such as navigation and map services.
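One way to picture the detection step is sketched below, under assumptions: each cached path stores a bitmap with one bit per edge, so the paths that traverse a changed edge can be found with a single mask test. The class `PathCache` and its methods are hypothetical and are not taken from the paper, whose cache layout may differ. The test only catches paths that use the changed edge, which is enough when a weight increases; a weight decrease can also make other cached paths suboptimal.

```python
class PathCache:
    """Hypothetical bitmap-indexed cache of shortest paths (illustration only)."""

    def __init__(self, edge_ids):
        # edge_ids maps an undirected edge (u, v) to its bit position.
        self.edge_ids = edge_ids
        self.paths = {}  # (source, target) -> (list of vertices, edge bitmap)

    def _bit(self, u, v):
        # Look the edge up in either orientation.
        key = (u, v) if (u, v) in self.edge_ids else (v, u)
        return 1 << self.edge_ids[key]

    def put(self, path):
        bitmap = 0
        for u, v in zip(path, path[1:]):
            bitmap |= self._bit(u, v)
        self.paths[(path[0], path[-1])] = (list(path), bitmap)

    def affected_by(self, u, v):
        """Return the keys of cached paths that traverse edge (u, v)."""
        mask = self._bit(u, v)
        return [key for key, (_, bitmap) in self.paths.items() if bitmap & mask]


cache = PathCache({("a", "b"): 0, ("b", "c"): 1, ("c", "d"): 2})
cache.put(["a", "b", "c"])
cache.put(["c", "d"])
stale = cache.affected_by("c", "b")
```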

3.
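Zuo Xiufeng (左秀峰), Shen Wanjie (沈万杰). Computer Science (《计算机科学》), 2017, 44(5): 232-234, 267
Path analysis is the most basic problem in network analysis, and its core is the computation of shortest paths. The Floyd algorithm is a classic algorithm for finding shortest paths. Analysis shows that two vertices may be joined by several shortest paths of equal weight, a case that the Floyd algorithm does not address. Taking undirected connected graphs as the object of study, the authors design a Floyd-based algorithm for finding multiple equivalent shortest paths and work through a real example. The results show that the Floyd-based algorithm can effectively solve the multiple-equivalent-shortest-path problem.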
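The idea of tracking ties inside Floyd's algorithm can be sketched as follows. This is a simplified illustration that only counts equal-weight shortest paths rather than listing them, and it assumes strictly positive edge weights. It is not the authors' algorithm, and the function name `count_shortest_paths` is hypothetical.

```python
import math


def count_shortest_paths(n, edges):
    """Floyd-style all-pairs distances plus the number of shortest paths.

    edges: iterable of (u, v, weight) for an undirected graph with
    positive weights and at most one edge per vertex pair.
    """
    dist = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    count = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        dist[u][v] = dist[v][u] = w
        count[u][v] = count[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Skip degenerate cases so row k and column k stay fixed.
                if k == i or k == j or i == j:
                    continue
                via = dist[i][k] + dist[k][j]
                if via == math.inf:
                    continue
                if via < dist[i][j]:
                    dist[i][j] = via
                    count[i][j] = count[i][k] * count[k][j]
                elif via == dist[i][j]:
                    # A tie: every i-k path combined with every k-j path
                    # is another shortest path from i to j.
                    count[i][j] += count[i][k] * count[k][j]
    return dist, count


# A 4-cycle with unit weights: opposite corners have two shortest paths.
dist, count = count_shortest_paths(4, [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)])
```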

4.
The grid graph shortest path problem has many applications. In this paper, we present practical mesh algorithms using a local cost-reducing operation for various forms of the grid graph shortest path problem. The algorithms are very simple and can easily mark the vertices on shortest paths between any two vertices. The time complexity of the algorithm is proportional to the maximum length of the shortest paths with a very small multiplicative constant. Also in this paper, we discuss the application of the parallel algorithms in automatic chromosome analysis to intelligently split touching chromosomes. We identify local features useful for finding a potential path to separate touching chromosomes. We then define a distance measure based on the local features and find the best splitting path to cut touching chromosomes. The splitting algorithm only uses local information and is highly parallel.

5.
Implementations of relational operators on GPU processors have resulted in order of magnitude speedups compared to their multicore CPU counterparts. Here we focus on the efficient implementation of string matching operators common in SQL queries. Due to different architectural features the optimal algorithm for CPUs might be suboptimal for GPUs. GPUs achieve high memory bandwidth by running thousands of threads, so it is not feasible to keep the working set of all threads in the cache in a naive implementation. In GPUs the unit of execution is a group of threads and in the presence of loops and branches, threads in a group have to follow the same execution path; if some threads diverge, then different paths are serialized. We study the cache memory efficiency of single- and multi-pattern string matching algorithms for conventional and pivoted string layouts in the GPU memory. We evaluate the memory efficiency in terms of memory access pattern and achieved memory bandwidth for different parallelization methods. To reduce thread divergence, we split string matching into multiple steps. We evaluate the different matching algorithms in terms of average- and worst-case performance and compare them against state-of-the-art CPU and GPU libraries. Our experimental evaluation shows that thread and memory efficiency affect performance significantly and that our proposed methods outperform previous CPU and GPU algorithms in terms of raw performance and power efficiency. The Knuth–Morris–Pratt algorithm is a good choice for GPUs because its regular memory access pattern makes it amenable to several GPU optimizations.
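For reference, a plain sequential version of Knuth–Morris–Pratt matching is sketched below. It is a textbook CPU formulation and not the GPU implementation evaluated in the paper; it is included only to show the regular, forward-only scan of the text that the abstract refers to. The function names are chosen for this example.

```python
def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    fail = failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        # On a mismatch, fall back in the pattern; the text index never moves back.
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # allow overlapping matches
    return matches


hits = kmp_search("abababca", "aba")
```

Because the text is read strictly left to right, each thread in a parallel variant would touch memory in a predictable order, which is the property the abstract highlights.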

6.
We present a parallel toolkit for pairwise distance computation in massive networks. Computing the exact shortest paths between a large number of vertices is a costly operation, and serial algorithms are not practical for billion-scale graphs. We first describe an efficient parallel method to solve the single source shortest path problem on commodity hardware with no shared memory. Using it as a building block, we introduce a new parallel algorithm to estimate the shortest paths between arbitrary pairs of vertices. Our method exploits data locality, produces highly accurate results, and allows batch computation of shortest paths with 7% average error in graphs that contain billions of edges. The proposed algorithm is up to two orders of magnitude faster than previously suggested algorithms and does not require large amounts of memory or expensive high-end servers. We further leverage this method to estimate the closeness and betweenness centrality metrics, which involve systems challenges dealing with indexing, joining, and comparing large datasets efficiently. In one experiment, we mined a real-world Web graph with 700 million nodes and 12 billion edges to identify the most central vertices and calculated more than 63 billion shortest paths in 6 h on a 20-node commodity cluster. Copyright © 2014 John Wiley & Sons, Ltd.

7.
Optimization and evaluation of shortest path queries (cited by: 1; self-citations: 0; other citations: 1)
We investigate the problem of how to evaluate efficiently a collection of shortest path queries on massive graphs that are too big to fit in the main memory. To evaluate a shortest path query efficiently, we introduce two pruning algorithms. These algorithms differ in the extent to which shortest path costs are materialized and in how the search space is pruned. By grouping shortest path queries properly, batch processing improves the performance of shortest path query evaluation. We also study extensively how fragment sizes, cache sizes, and query types affect the performance of a disk-based shortest path algorithm. The performance and scalability of the proposed techniques are evaluated with large road systems in the Eastern United States. To demonstrate that the proposed disk-based algorithms are viable, we show that their search times are significantly better than those of main-memory Dijkstra's algorithm.

8.
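Research on approximate computation of betweenness in complex social networks (cited by: 4; self-citations: 0; other citations: 4)
With the rapid development of computers and the Internet, Internet-oriented social network mining and analysis has become a new research topic. Social networks mined from the Internet are often very large, which places higher demands on the performance of network analysis algorithms. Betweenness, an important structural property of graphs, is widely used in graph-based clustering and classification algorithms, and reducing the cost of computing it is a pressing problem. A common approach is to approximate shortest-path lengths in order to reduce the complexity of network analysis algorithms, but existing approximation methods do not take into account the complex-network characteristics of real large-scale networks, and their approximation of shortest-path lengths [...] approximate computation method, whose basic idea is to exploit the structural properties of complex networks and use paths through the hub nodes of the network to approximate shortest paths, and then to obtain approximate betweenness values from these approximate shortest paths. This offers a novel approach to approximately estimating structural properties of graphs. A comparison with traditional betweenness computation methods and [...] analysis yields several useful conclusions that lay a foundation for further research.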

9.
We consider triply-nested loops of the type that occur in the standard Gaussian elimination algorithm, which we denote by GEP (or the Gaussian Elimination Paradigm). We present two related cache-oblivious methods, I-GEP and C-GEP, both of which reduce the number of cache misses incurred (or I/Os performed) by the computation over that performed by standard GEP by a factor of $\sqrt{M}$, where $M$ is the size of the cache. Cache-oblivious I-GEP computes in-place and solves most of the known applications of GEP including Gaussian elimination and LU-decomposition without pivoting and Floyd-Warshall all-pairs shortest paths. Cache-oblivious C-GEP uses a modest amount of additional space, but is completely general and applies to any code in GEP form. Both I-GEP and C-GEP produce system-independent cache-efficient code, and could potentially be used by optimizing compilers for loop transformation.
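The triply-nested loop shape described above can be sketched as follows. This is the plain iterative form only, not the cache-oblivious I-GEP or C-GEP recursion, and the update signature `f(x, u, v, w)` together with the `in_sigma` predicate are assumptions made for this illustration.

```python
import math


def gep(c, f, in_sigma=lambda i, j, k: True):
    """Generic triply-nested loop: update c[i][j] from c[i][k], c[k][j], c[k][k].

    f and in_sigma are hypothetical parameters for this sketch; in_sigma
    selects which (i, j, k) triples perform an update.
    """
    n = len(c)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if in_sigma(i, j, k):
                    c[i][j] = f(c[i][j], c[i][k], c[k][j], c[k][k])
    return c


def floyd_warshall(dist):
    # All-pairs shortest paths: relax every pair through intermediate vertex k.
    return gep([row[:] for row in dist], lambda x, u, v, w: min(x, u + v))


def eliminate(a):
    # Gaussian elimination without pivoting: only entries below and to the
    # right of the pivot are updated, so the upper triangle ends up holding U.
    return gep([row[:] for row in a],
               lambda x, u, v, w: x - u * v / w,
               lambda i, j, k: i > k and j > k)


shortest = floyd_warshall([[0, 3, math.inf], [3, 0, 1], [math.inf, 1, 0]])
reduced = eliminate([[2.0, 1.0], [4.0, 5.0]])
```

Both applications share the loop nest and differ only in the update function and the set of updated triples, which is what lets a single cache-efficient schedule serve all of them.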

10.
Load-balanced short-path routing in wireless networks (cited by: 1; self-citations: 0; other citations: 1)
We study routing algorithms on wireless networks that use only short paths, for minimizing latency, and achieve good load balance, for balancing the energy use. We consider the special case when all the nodes are located in a narrow strip with width at most $\sqrt{3}/2 \approx 0.86$ times the communication radius. We present algorithms that achieve good performance in terms of both measures simultaneously. In particular, the routing path is at most four times the shortest path length and the maximum load on any node is at most three times that of the most load-balanced algorithm without path-length constraint. In addition, our routing algorithms make routing decisions by only local information and, as a consequence, are more adaptive to topology changes due to dynamic node insertions/deletions or due to mobility.

11.
We present three genetic algorithms (GAs) for allocating irregular data sets to multiprocessors. These are a sequential hybrid GA, a coarse-grain GA and a fine-grain GA. The last two are based on models of natural evolution that are suitable for parallel implementation; they have been implemented on a hypercube and a Connection Machine. Experimental results show that the three GAs evolve good suboptimal solutions which are better than those produced by other methods. The GAs are also robust and do not show a bias towards particular problem configurations. The two parallel GAs have reasonable execution times, with the coarse-grain GA producing better solutions for the allocation of loosely synchronous computations.

12.
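A keyword-aware optimal route (KOR) query finds the route with the smallest time cost that covers all query keywords and satisfies a path-length constraint; it is commonly used in trip planning. Although existing optimized algorithms use various pruning strategies to shrink the search, they are essentially breadth-first searches: when looking for long paths the search space is still too large and execution time is long. To address this, a segment-expansion algorithm for keyword-aware optimal route queries (SE-KOR) is proposed. SE-KOR builds a keyword-vertex path from an inverted keyword index and splits the path into several segments that are expanded separately, which reduces the search space and thus shortens execution time. The algorithm prescribes the direction of path expansion, whereas existing pruning strategies do not control it; a local cost-threshold pruning is therefore proposed to steer expansion along the keyword-vertex path, and approximate dominance, pruning by the objective value of feasible solutions, and a global priority expansion strategy are combined to accelerate expansion. Experimental results show that, without loss of accuracy, the algorithm shortens execution time by at least 8.0%, 61.0%, and 57.7% under varying numbers of keywords, cost thresholds, and query-graph sizes, respectively.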

13.
The development of intelligent transportation systems (ITS) and the resulting need for the solution of a variety of dynamic traffic network models and management problems require faster‐than‐real‐time computation of shortest path problems in dynamic networks. Recently, a sequential algorithm was developed to compute shortest paths in discrete time dynamic networks from all nodes and all departure times to one destination node. The algorithm is known as algorithm DOT and has an optimal worst‐case running‐time complexity. This implies that no algorithm with a better worst‐case computational complexity can be discovered. Consequently, in order to derive algorithms to solve all‐to‐one shortest path problems in dynamic networks, one would need to explore avenues other than the design of sequential solution algorithms only. The use of commercially‐available high‐performance computing platforms to develop parallel implementations of sequential algorithms is an example of such avenue. This paper reports on the design, implementation, and computational testing of parallel dynamic shortest path algorithms. We develop two shared‐memory and two message‐passing dynamic shortest path algorithm implementations, which are derived from algorithm DOT using the following parallelization strategies: decomposition by destination and decomposition by transportation network topology. The algorithms are coded using two types of parallel computing environments: a message‐passing environment based on the parallel virtual machine (PVM) library and a multi‐threading environment based on the SUN Microsystems Multi‐Threads (MT) library. We also develop a time‐based parallel version of algorithm DOT for the case of minimum time paths in FIFO networks, and a theoretical parallelization of algorithm DOT on an ‘ideal’ theoretical parallel machine. 
The performance of the implementations is analyzed and evaluated using large transportation networks and two types of parallel computing platforms: a distributed network of Unix workstations and a SUN shared-memory machine containing eight processors. Satisfactory speed-ups in the running time of sequential algorithms are achieved, in particular for shared-memory machines. Numerical results indicate that shared-memory computers constitute the most appropriate type of parallel computing platform for the computation of dynamic shortest paths for real-time ITS applications.

14.
This paper proposes a scheduling algorithm to solve the problem of task scheduling in a cloud computing system with time-varying communication conditions. This algorithm converts the scheduling problem with communication changes into a directed acyclic graph (DAG) scheduling problem for existing fuzzy communication task nodes, that is, the scheduling problem for a communication-change DAG (CC-DAG). The CC-DAG contains both computation task nodes and communication task nodes. First, this paper proposes a weighted time-series network bandwidth model to solve the indefinite processing time (cost) problem for a fuzzy communication task node. This model can accurately predict the processing time of a fuzzy communication task node. Second, to address the scheduling order problem for the computation task nodes, a dynamic pre-scheduling search strategy (DPSS) is proposed. This strategy computes the essential paths for the pre-scheduling of the computation task nodes based on the actual computation costs (times) of the computation task nodes and the predicted processing costs (times) of the fuzzy communication task nodes during the scheduling process. The computation task node with the longest essential path is scheduled first because its completion time directly influences the completion time of the task graph. Finally, we demonstrate the proposed algorithm via simulation experiments. The experimental results show that the proposed DPSS improves the total execution time by between 11.5% and 21.2%. In view of the experimental results, the proposed algorithm provides a better-quality scheduling solution for scientific application task execution in the cloud computing environment than the HEFT, PEFT, and CEFT algorithms.
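The "longest path first" ordering can be illustrated with a small sketch. It assumes that the essential path of a task is the longest chain of computation costs plus predicted communication costs from that task to an exit task; this reading, the function name `priority_order`, and the example costs are assumptions for illustration and do not reproduce DPSS itself.

```python
from functools import lru_cache


def priority_order(comp_cost, comm_cost):
    """Order tasks by decreasing longest path to an exit task.

    comp_cost: {task: computation cost}
    comm_cost: {(u, v): predicted communication cost on DAG edge u -> v}
    """
    succs = {task: [] for task in comp_cost}
    for u, v in comm_cost:
        succs[u].append(v)

    @lru_cache(maxsize=None)
    def path_len(task):
        # Cost of the task itself plus the most expensive way to reach an exit.
        tail = max((comm_cost[(task, s)] + path_len(s) for s in succs[task]),
                   default=0)
        return comp_cost[task] + tail

    # Ties are broken by task name so that the order is deterministic.
    return sorted(comp_cost, key=lambda task: (-path_len(task), task))


# Hypothetical four-task graph: A feeds B and C, which both feed D.
order = priority_order(
    {"A": 2, "B": 3, "C": 1, "D": 2},
    {("A", "B"): 1, ("A", "C"): 4, ("B", "D"): 1, ("C", "D"): 1},
)
```

With strictly positive costs, a predecessor always has a longer path than its successors, so this ordering also respects the precedence constraints of the DAG.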

15.
The problem of planning a path for a point robot from a source point s to a destination point d so as to avoid a set of polygonal obstacles in the plane is considered. Using well-known methods, a shortest path from s to d can be computed with a time complexity of $O(n^2)$, where n is the total number of obstacle vertices. The focus here is on (a) planning paths faster at the expense of settling for suboptimal path lengths and (b) performance analysis of simple and/or well-known suboptimal methods. A method that enables a hierarchical implementation of any path planning algorithm with no increase in the worst-case time complexity is presented; this implementation enables fast planning of simple paths. Then methods are presented based on Voronoi diagrams, trapezoidal decomposition and triangulation, which compute (suboptimal) paths in $O(n \log n)$ time with preprocessing costs of $O(n \log n)$, $O(n^2)$ and $O(n \log n)$, respectively. Using existing navigational algorithms for unknown terrains, algorithms that run in $O(n \log n)$ time (after preprocessing) and yield suboptimal paths are presented. For all these algorithms, upper bounds on the path lengths are estimated in terms of the shortest of the obstacles, etc.


16.
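Efficient parallel finite-difference stencil algorithms are very important for solving large systems of linear equations. To address the poor data locality and the high synchronization and communication overhead of parallel finite-difference stencil algorithms, this work first improves the traditional finite-difference stencil algorithm and proposes a multi-layer symmetric-traversal finite-difference stencil algorithm. It then gives a serial algorithm that uses the order of iteration-space tiles as the execution order: by time-skewing the partition of the iteration space along the time axis, several iterations are computed inside each tile without changing the properties of the iterative algorithm, which improves data locality. Finally, a parallel algorithm based on iteration-space tiles is proposed; it partitions the iteration-space grid using an improved polyhedral model and reorders the grid tiles to reduce the cache miss rate and the number of communication start-ups and synchronizations. Theoretical analysis and experimental results show that the parallel model has better data locality, parallel efficiency, and scalability than the traditional domain decomposition method and the red-black ordering parallel algorithm.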

17.
Betweenness centrality is a fundamental measure in social network analysis, expressing the importance or influence of individual vertices (or edges) in a network in terms of the fraction of shortest paths that pass through them. Since exact computation in large networks is prohibitively expensive, we present two efficient randomized algorithms for betweenness estimation. The algorithms are based on random sampling of shortest paths and offer probabilistic guarantees on the quality of the approximation. The first algorithm estimates the betweenness of all vertices (or edges): all approximate values are within an additive factor \(\varepsilon \in (0,1)\) from the real values, with probability at least \(1-\delta \). The second algorithm focuses on the top-K vertices (or edges) with highest betweenness and estimates their betweenness value to within a multiplicative factor \(\varepsilon \), with probability at least \(1-\delta \). This is the first algorithm that can compute such an approximation for the top-K vertices (or edges). By proving upper and lower bounds to the VC-dimension of a range set associated with the problem at hand, we can bound the sample size needed to achieve the desired approximations. We obtain sample sizes that are independent of the number of vertices in the network and only depend on a characteristic quantity that we call the vertex-diameter, that is, the maximum number of vertices in a shortest path. In some cases, the sample size is completely independent of any quantitative property of the graph. An extensive experimental evaluation on real and artificial networks shows that our algorithms are significantly faster and much more scalable as the number of vertices grows than other algorithms with similar approximation guarantees.
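The sampling idea can be sketched for unweighted graphs as follows. The sketch draws a random vertex pair, picks one of its shortest paths uniformly at random, and credits the interior vertices; the sample size is simply passed in, whereas the paper derives it from the vertex-diameter. The function name `sample_betweenness` and its parameters are assumptions for this illustration, and the estimate is of normalised betweenness (the average over ordered vertex pairs).

```python
import random
from collections import deque


def sample_betweenness(adj, num_samples, rng=None):
    """Estimate normalised betweenness by sampling shortest paths.

    adj: {vertex: list of neighbours} for an unweighted graph.
    """
    rng = rng or random.Random()
    nodes = list(adj)
    estimate = {v: 0.0 for v in nodes}
    for _ in range(num_samples):
        s, t = rng.sample(nodes, 2)
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        dist, sigma, preds = {s: 0}, {s: 1}, {s: []}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    sigma[w] = 0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        if t not in dist:
            continue  # s and t are disconnected; the sample credits no vertex
        # Walk back from t, choosing each predecessor with probability
        # proportional to its path count: a uniform shortest s-t path.
        cur = t
        while True:
            weights = [sigma[p] for p in preds[cur]]
            cur = rng.choices(preds[cur], weights=weights)[0]
            if cur == s:
                break
            estimate[cur] += 1.0 / num_samples
    return estimate


# Path graph a - b - c: only b lies strictly inside a shortest path.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
est = sample_betweenness(graph, 3000, random.Random(7))
```

In this toy graph two of the six ordered pairs route through b, so the estimate for b should settle near 1/3 as the number of samples grows.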

18.
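Li Zhongfei (李忠飞), Yang Yajun (杨雅君), Wang Xin (王鑫). Journal of Software (《软件学报》), 2019, 30(3): 515-536
Shortest path queries are a very important class of problems in graph data management. This paper studies rule-based shortest path queries, a special class of shortest path query. Given a source and a destination, a rule-based shortest path query finds a shortest path from the source to the destination that passes through every vertex in a user-specified vertex set, with the visiting order of some of those vertices satisfying given partial-order rules. The problem is proved to be NP-hard. Existing work focuses on rule-based shortest paths over spatial datasets (where the shortest distance between two points is the Euclidean distance); it exhaustively enumerates all paths that satisfy the rules and then selects the shortest one as the answer. In real road networks, however, the distance between two points equals the length of the shortest path between them, which is often greater than the Euclidean distance; moreover, exhaustive enumeration causes a great deal of repeated computation. A forward search algorithm and several optimization techniques are therefore designed to solve the problem. Finally, extensive experiments on different real datasets verify the effectiveness of the algorithm. The results show that the algorithm finds solutions quickly and is considerably more efficient than existing algorithms.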

19.
This paper presents the first Learning Automaton-based solution to the dynamic single source shortest path problem. It involves finding the shortest path in a single-source stochastic graph topology where there are continuous probabilistic updates in the edge-weights. The algorithm is significantly more efficient than the existing solutions, and can be used to find the "statistical" shortest path tree in the "average" graph topology. It converges to this solution irrespective of whether there are new changes in edge-weights taking place or not. In such random settings, the proposed learning automata solution converges to the set of shortest paths. On the other hand, the existing algorithms will fail to exhibit such a behavior, and would recalculate the affected shortest paths after each weight-change. The important contribution of the proposed algorithm is that all the edges in a stochastic graph are not probed, and even if they are, they are not all probed equally often. Indeed, the algorithm attempts to almost always probe only those edges that will be included in the shortest path graph, while probing the other edges minimally. This increases the performance of the proposed algorithm. All the algorithms were tested in environments where edge-weights change stochastically, and where the graph topologies undergo multiple simultaneous edge-weight updates. Its superiority in terms of the average number of processed nodes, scanned edges and the time per update operation, when compared with the existing algorithms, was experimentally established. The algorithm is applicable in domains ranging from ground transportation to aerospace, from civilian applications to military, from spatial database applications to telecommunications networking.

20.
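Song Yu (宋煜), Zhang Shuai (张帅), Yan Yonghui (严永辉), Qian Zhuzhong (钱柱中). Computer Engineering (《计算机工程》), 2021, 47(3): 209-217, 226
In augmented reality applications, requests from multiple nearby users are likely to be similar or identical, so the same computation tasks are executed repeatedly. To address this problem, a computation-task caching system based on redundant-task elimination is designed. Task caches are placed on edge nodes so that the edge servers maintain a global cache in a self-organizing manner. Client request latency, user trajectories, node deployment, and total latency are modeled, and on this basis the deployment of edge-server computing resources at base stations is studied: average request latency is optimized under a given total deployment cost. The problem is transformed into an integer nonlinear programming problem, and the IDM algorithm for small and medium-scale scenarios and the LDM algorithm for large-scale scenarios are designed. Experimental results show that the average latency of the IDM algorithm differs from the reference optimal solution by only 5.85%, approximating the optimum well, and that the LDM algorithm, while sacrificing 9.20% in average latency, shortens running time by 98.15% compared with the IDM algorithm, greatly reducing running overhead.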
