Found 18 similar documents (search time: 218 ms)

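Improved Parameterized Algorithm for the P₂-Packing Problem   Total citations: 1; self-citations: 1; citations by others: 0

Wang Jianxin, Ning Dan, Feng Qilong, Chen Jian'er 《Journal of Software》2008,19(11):2879-2886

P₂-Packing is a typical NP-hard problem. The best known result for it is a parameterized algorithm with time complexity O^*(2^(5.301k)) and a kernel of size 15k. By further analyzing the structure of the P₂-packing problem, this paper proposes an improved kernelization algorithm that yields a kernel of size 7k, and on that basis presents a parameterized algorithm with time complexity O^*(2^(4.142k)), which greatly improves on the best result in the current literature.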

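A Method for Constructing the Multi-Objective Pareto Optimal Solution Set Using the Arena's Principle   Total citations: 14; self-citations: 0; citations by others: 14

Zheng Jinhua, Jiang Hao, Kuang Da, Shi Zhongzhi 《Journal of Software》2007,18(6):1287-1297

Targeting the characteristics of multi-objective evolution, this paper proposes a method for constructing the multi-objective Pareto optimal solution set using the arena's principle (AP), proves the correctness of the construction, and shows that its time complexity is O(rmN) (0 < m/N < 1). In theory, compared with Deb's algorithm and Jensen's algorithm (whose time complexities are O(rN²) and O(N log^(r-1) N), respectively), AP is better than Deb's algorithm; when the number of objectives r is large (e.g., r ≥ 5), AP is better than Jensen's algorithm; moreover, when m/N is small (e.g., m/N ≤ 50%), AP is more efficient than the other two algorithms. Comparative experiments show that AP has better CPU-time efficiency than the other two algorithms. In applications, AP can be integrated into any Pareto-based MOEA and can considerably improve the MOEA's running efficiency.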
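The abstract above does not spell out the procedure, so the following is only a minimal sketch of an arena-style ("king of the hill") construction of the non-dominated set, assuming minimization of every objective. The function names `dominates` and `arena_nondominated` are illustrative and not taken from the paper. In each round the current host is compared with every remaining challenger; a challenger that dominates the host takes its place, and the final host of the round joins the non-dominated set.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def arena_nondominated(points):
    """Sketch of an arena-style construction of the non-dominated set (assumed variant)."""
    remaining = list(points)
    front = []
    while remaining:
        host = remaining[0]
        survivors = []
        for challenger in remaining[1:]:
            if dominates(host, challenger):
                continue              # challenger loses and is eliminated
            if dominates(challenger, host):
                host = challenger     # challenger wins and becomes the new host
                continue
            survivors.append(challenger)  # neither dominates: challenger stays for later rounds
        # Survivors recorded before a host change must be re-checked against the final host.
        remaining = [p for p in survivors if not dominates(host, p)]
        front.append(host)
    return front
```

For example, `arena_nondominated([(3, 3), (4, 1), (2, 2)])` keeps `(2, 2)` and `(4, 1)` and drops `(3, 3)`, which `(2, 2)` dominates.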

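Connected Component Algorithms on Wormhole-Routed Meshes and Their Applications

Xu Yinlong, Wan Yingyu, Gu Xiaodong, Chen Guoliang 《Journal of Software》2001,12(2):233-240

Using the doubling technique, this paper proposes parallel algorithms for connected components and transitive closure with time complexity O(log²n) on an n×n two-dimensional mesh machine with wormhole routing, and on that basis proposes a parallel minimum spanning tree algorithm with time complexity O(log³n). All of these improve on the time-complexity lower bound of O(n) under store-and-forward routing. Compared with other algorithms running on distributed-memory parallel computers without bus connections, the time complexity of these connected component and transitive closure algorithms is optimal.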

Parallel Algorithms for the Reversal Distance of Signed Sequences on the PRAM and LARPBS Models

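Shen Yifei, Chen Guoliang, Zhang Qiangfeng 《Journal of Software》2007,18(11):2683-2690

New parallel algorithms for computing the reversal distance of signed genome permutations are given on two important parallel computing models. Based on the Hannenhalli-Pevzner theory, the parallel algorithms are designed in three main parts: constructing the breakpoint graph, counting the cycles in the breakpoint graph, and counting the hurdles in the breakpoint graph. On the CREW-PRAM model, the algorithm uses O(n²) processors and has time complexity O(log²n); on the linear array with a reconfigurable pipelined bus system (LARPBS) model, the algorithm uses O(n³) processors and has time complexity O(log n).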

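Bidirectional Selection Sort Algorithm

Yuan Guanwei 《Computer Systems & Applications》2016,25(1):171-174

To enrich the family of O(n²) sorting algorithms and better serve teaching, research, and everyday applications, a new sorting algorithm, bidirectional selection sort, is proposed. Mathematical analysis shows that its time complexity is O(n²) and its space complexity is O(1). Experimental comparison shows that, under the same conditions, its running time is on average 27% of that of bubble sort, 62% of that of simple selection sort, and 88% of that of straight insertion sort.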
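The abstract gives no pseudocode, so the sketch below shows the commonly known bidirectional (double-ended) selection sort, which may differ in detail from the paper's algorithm. Each pass scans the unsorted range once, then moves its minimum to the front and its maximum to the back. The name `bidirectional_selection_sort` is illustrative, and the function sorts a copy for convenience; the same swaps applied to the input list itself need only O(1) extra space.

```python
def bidirectional_selection_sort(items):
    """Return a sorted copy; each pass places one minimum and one maximum."""
    a = list(items)
    lo, hi = 0, len(a) - 1
    while lo < hi:
        i_min = i_max = lo
        # One scan of the unsorted range a[lo..hi] finds both extremes.
        for i in range(lo + 1, hi + 1):
            if a[i] < a[i_min]:
                i_min = i
            if a[i] > a[i_max]:
                i_max = i
        a[lo], a[i_min] = a[i_min], a[lo]
        # If the maximum was sitting at lo, the swap above moved it to i_min.
        if i_max == lo:
            i_max = i_min
        a[hi], a[i_max] = a[i_max], a[hi]
        lo += 1
        hi -= 1
    return a
```

The number of passes is about half that of plain selection sort, but each pass still scans the whole unsorted range, so the running time remains O(n²).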

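Parameterized Algorithm for the Individual Haplotype MSR Problem with Mate-Pairs

Xie Minzhu, Chen Jian'er, Wang Jianxin 《Journal of Software》2007,18(9):2070-2082

The individual haplotype MSR (minimum SNP removal) problem is the computational problem of removing the fewest SNP (single-nucleotide polymorphism) sites from an individual's genome sequencing fragment data in order to determine that individual's haplotypes. For this problem, Bafna et al. proposed an algorithm with time complexity O(2^k n²m), where m is the total number of DNA fragments, n is the total number of SNP sites, and k is the number of holes (sites with null values within a fragment) in a fragment. Because the number of holes in a mate-pair fragment can reach 100, Bafna's algorithm is usually infeasible when the fragment data contain mate-pairs. Based on the characteristics of fragment data, a new algorithm with time complexity O((n-1)(k₁-1)k₂2^2h+(k₁+1)^2h+nk₂+mk₁) is proposed, where k₁ is the maximum number of SNP sites covered by one fragment (no more than n), k₂ is the maximum number of fragments covering the same SNP site (usually no more than 19), and h is the maximum number of fragments that cover the same SNP site and take a null value at that site (no more than k₂). The time complexity of this algorithm is not directly related to k, the maximum number of holes in a fragment. It can still compute efficiently when mate-pair fragment data are present, and it has good scalability and high practical value.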

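Optimization and Efficient Parallelization of Dynamic Programming in RNA Secondary Structure Prediction   Total citations: 6; self-citations: 0; citations by others: 6

Tan Guangming, Feng Shengzhong, Sun Ninghui 《Journal of Software》2006,17(7):1501-1509

Methods based on the minimum free energy model are the main approach to RNA secondary structure prediction in computational biology, but the dynamic programming algorithm for computing the minimum free energy requires O(n⁴) time, where n is the length of the RNA sequence. There are currently two strategies for reducing the time complexity: restricting the size of internal loops in the secondary structure to at most k, which gives an O(n²×k²) algorithm; and Lyngso's method, which uses the energy rules of loops, does not restrict loop size, and obtains a near-optimal solution in O(n³) time. By using O(n) additional space, the redundant computation in evaluating internal loops is greatly reduced, so that the optimal solution can be obtained in O(n³) time, again without restricting loop size. However, the optimized algorithm is still very time-consuming, so a parallel program is implemented on a cluster system using an effective load-balancing method. Experimental results show that the parallel program achieves good speedup.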

Semi-Dynamic Rectangle Intersection Query Algorithms

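Gao Jingbo, Li Xinyou, Tang Zesheng, Zhou Xiaohui 《Journal of Software》1997,8(8):577-584

This paper discusses dynamic rectangle intersection query algorithms. It introduces two new algorithms for semi-dynamic rectangle queries, based on a one-dimensional data structure and a two-dimensional data structure, respectively. The one-dimensional query algorithm has query time complexity O(logM + k′), update time complexity O(logM logn), and space complexity O(nlogM/). The two-dimensional query algorithm has query time complexity O(log²M + k), update time complexity O(log²M logn), and space complexity O(nlog²M). Both algorithms were implemented, and a comparison of their performance shows that the one-dimensional query algorithm is efficient and practical.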

9.

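Hierarchical Cube Storage Structure in Data Warehouse Systems   Total citations: 11; self-citations: 0; citations by others: 11

Gao Hong, Li Jianzhong, Li Jinbao 《Journal of Software》2003,14(7):1258-1266

Range queries are an important operation supporting on-line analytical processing (OLAP) on data warehouses. In recent years, several Cube storage structures supporting range queries and data updates have been proposed. However, the space and time complexity of these storage structures is high, which makes them hard to use in practice. To address this, a hierarchical Cube storage structure, HDC (hierarchical data cube), and its associated algorithms are proposed. On HDC, the cost of a range query and the cost of a data update are both O(log^d n), and the overall performance is O((log n)^(2d)) (under the C_qC_u model) or O(K(log n)^d) (under the C_qn_q+C_un_u model). Theoretical analysis and experiments show that HDC's range query cost, data update cost, space cost, and overall performance are all better than those of all existing Cube storage structures.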

10.

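Optimal Parallel Algorithm for the Knapsack Problem   Total citations: 10; self-citations: 2; citations by others: 10

Li Qinghua, Li Kenli, Jiang Shengyi, Zhang Wei 《Journal of Software》2003,14(5):891-896

Using a divide-and-conquer strategy, a parallel algorithm for solving the knapsack problem on the SIMD shared-memory computer model is proposed. The algorithm allows the use of O((2^(n/4))^(1-ε)) parallel processor units, 0 ≤ ε ≤ 1, and O(2^(n/2)) memory cells, and solves the n-dimensional knapsack problem in O(2^(n/4)(2^(n/4))^ε) time, for a cost of O(2^(n/2)). Comparison with results in the existing literature shows that the algorithm improves on the corresponding results and is a cost-optimal parallel algorithm for the knapsack problem. The paper also points out errors in the main conclusions of related literature.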
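As a quick check of the stated cost, assuming the usual definition of cost as the number of processors multiplied by the running time, the processor and time bounds above multiply out as follows (constant factors hidden by the O-notation are omitted):

```latex
\[
\underbrace{\left(2^{n/4}\right)^{1-\varepsilon}}_{\text{processors}}
\times
\underbrace{2^{n/4}\left(2^{n/4}\right)^{\varepsilon}}_{\text{time}}
= 2^{\frac{n}{4}(1-\varepsilon) + \frac{n}{4} + \frac{n}{4}\varepsilon}
= 2^{n/2}
\]
```

The ε terms cancel, so the cost is O(2^(n/2)) for every choice of ε in [0, 1]: using fewer processors lengthens the running time by exactly the compensating factor.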

11.

More Efficient Topological Sort Using Reconfigurable Optical Buses

Li Jie, Pan Yi, Shen Hong 《The Journal of supercomputing》2003,24(3):251-258

Topological sort of an acyclic graph has many applications such as job scheduling and network analysis. Due to its importance, it has been tackled on many models. Dekel et al. [3] proposed an algorithm for solving the problem in O(log² N) time on the hypercube or shuffle-exchange networks with O(N³) processors. Chaudhuri [2] gave an O(log N) algorithm using O(N³) processors on a CRCW PRAM model. On the LARPBS (Linear Arrays with a Reconfigurable Pipelined Bus System) model, Li et al. [5] showed that the problem for a weighted directed graph with N vertices can be solved in O(log N) time by using N³ processors. In this paper, a more efficient topological sort algorithm is proposed on the same LARPBS model. We show that the problem can be solved in O(log N) time by using N³/log N processors. We show that the algorithm has better time and processor complexities than the best algorithm on the hypercube, and has the same time complexity but better processor complexity than the best algorithm on the CRCW PRAM model.

12.

A Compiler-Directed Approach to Network Latency Reduction for Distributed Shared Memory Multiprocessors

Sibabrata Ray  Hong Jiang  Qing Yang 《Journal of Parallel and Distributed Computing》1996,38(2):267

In distributed shared memory multiprocessor systems, parallel tasks communicate through sharing memory data. As the system size increases, such communication cost becomes the main factor that limits the overall parallelism and performance. In this paper, we propose a new solution to the problem through judiciously managing the relevant resources, namely, the shared data and the interconnection network (IN) through which the sharing is carried out. In this approach, communication cost is minimized by means of data migration/allocation which is based on analyzing general layered task graphs, sharing behavior of parallel tasks, and network topology. Our method is not applicable to read-only variables. Further, for the time being, the usefulness of the method is limited to multiprocessors where no cache coherence mechanism is implemented. Four typical interconnection topologies for multiprocessors are considered, namely, shared-bus, hierarchical-bus, 2-D mesh, and fat-tree structures. Efficient data allocation algorithms for each of the four network topologies are developed that make decisions on data allocation/migration at compile time. The complexity of one algorithm is O(np) for shared-bus and O(n²p) for the remaining three in a system with n processors executing a p-layer task graph for one shared variable. We have also given an algorithm to determine the optimal allocation/migration scheme for multiple shared variables. However, the cost of the algorithm becomes prohibitive when the number of shared variables is high. Therefore, a heuristic of low complexity is suggested. The heuristic is optimal for some topologies.

13.

Processor Allocation Using Partitioning in Mesh Connected Parallel Computers

Prasant Mohapatra 《Journal of Parallel and Distributed Computing》1996,39(2):181

Several processor allocation schemes for mesh connected parallel computers have been proposed in the literature. These schemes aim at improving system performance by reducing internal fragmentation or by enhancing submesh recognition ability. In this paper, we propose a system partitioning approach to reduce external fragmentation and thereby improve system performance. The target systems considered here are two-dimensional meshes where the side lengths are powers of 2. Processors are allocated to a partitioned mesh based on their submesh size requirements. The proposed scheme can be implemented in conjunction with any of the existing processor allocation schemes and thereby can also exploit the advantages offered by those schemes. The performance measurements are done through simulation experiments. Completion time for a fixed number of jobs, internal and external fragmentation, and system utilization are measured as performance indicators. It is observed that, in most cases, the proposed scheme demonstrates better performance than the previously proposed algorithms. Time complexity of the proposed scheme is less by a factor of n than the corresponding allocation scheme without partitioning, where n = log₂{min(w, h)}, and w and h are the width and height of a two-dimensional mesh.

14.

A New and Faster Gaussian Elimination Based Fault Tolerant Systolic Linear System Solver

K. Bhuvaneswari  K.N. Balasubramanya Murthy  C. Siva Ram Murthy 《Journal of Parallel and Distributed Computing》1997,44(2):107

This paper presents a new systolic algorithm for the complete solution of a system of N linear equations in (N²/2 + O(N)) time steps using 2N processing elements (PEs). It is based on a variant of the Gaussian elimination (GE) algorithm called the successive GE and is faster than any existing GE based algorithm using O(N) PEs. We also suggest two fault tolerant schemes that tolerate up to N PE failures. The first scheme is a time redundancy based approach with no hardware overhead and 100% time overhead. This scheme can tolerate up to N PE failures. The second scheme is based on algorithm based fault tolerance (ABFT) and uses N extra PEs to tolerate up to N − 1 PE failures with very little time overhead. The number of errors that can be detected/corrected in both schemes is more than that in any existing fault tolerant systolic array.

15.

Fast computation of sample entropy and approximate entropy in biomedicine   Total citations: 1; self-citations: 0; citations by others: 1

Pan YH  Wang YH  Liang SF  Lee KT 《Computer methods and programs in biomedicine》2011,104(3):382-396

Both sample entropy and approximate entropy are measurements of complexity. The two methods have received a great deal of attention in the last few years, and have been successfully verified and applied to biomedical applications and many others. However, the algorithms proposed in the literature require O(N²) execution time, which is not fast enough for online applications and for applications with long data sets. To accelerate computation, the authors of the present paper have developed a new algorithm that reduces the computational time to O(N^(3/2)) using O(N) storage. As biomedical data are often measured with integer-type data, the computation time can be further reduced to O(N) using O(N) storage. The execution times of the experimental results with ECG, EEG, RR, and DNA signals show a significant improvement of more than 100 times when compared with the conventional O(N²) method for N = 80,000 (N = length of the signal). Furthermore, an adaptive version of the new algorithm has been developed to speed up the computation for short data length. Experimental results show an improvement of more than 10 times when compared with the conventional method for N > 4000.
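For orientation, the conventional O(N²) computation that the abstract uses as its baseline can be sketched as a direct pairwise comparison of templates. This is not the paper's accelerated algorithm. The sketch assumes the widely used definition of sample entropy: embedding dimension `m`, an absolute tolerance `r`, Chebyshev distance, matches counted when the distance is at most `r`, and self-matches excluded. The function name and defaults are illustrative.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Brute-force sample entropy: ln(B / A), with B and A the match counts for lengths m and m + 1."""
    n = len(x)

    def count_matches(length):
        # Compare every pair (i < j) of the first n - m templates of the given length.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)       # matching template pairs of length m
    a = count_matches(m + 1)   # matching template pairs of length m + 1
    if a == 0 or b == 0:
        return math.inf        # undefined in this case; reported here as infinity (a convention)
    return math.log(b / a)
```

A perfectly regular signal such as `[0, 1] * 10` yields an entropy of 0, because every pair of templates that matches at length `m` also matches at length `m + 1`.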

16.

Optimal Computing the Chessboard Distance Transform on Parallel Processing Systems

Yu-Hua Lee  Shi-Jinn Horng 《Computer Vision and Image Understanding》1999,73(3):272

The distance transform (DT) is an image computation tool which can be used to extract the information about the shape and the position of the foreground pixels relative to each other. It converts a binary image into a grey-level image, where each pixel has a value corresponding to the distance to the nearest foreground pixel. The time complexity for computing the distance transform is fully dependent on the different distance metrics. Especially, the more exact the distance transform is, the worse execution time reached will be. Nowadays, quite often thousands of images are processed in a limited time. It seems quite impossible for a sequential computer to do such a computation for the distance transform in real time. In order to provide efficient distance transform computation, it is considerably desirable to develop a parallel algorithm for this operation. In this paper, based on the diagonal propagation approach, we first provide an O(N²) time sequential algorithm to compute the chessboard distance transform (CDT) of an N×N image, which is a DT using the chessboard distance metrics. Based on the proposed sequential algorithm, the CDT of a 2D binary image array of size N×N can be computed in O(log N) time on the EREW PRAM model using O(N²/log N) processors, O(log log N) time on the CRCW PRAM model using O(N²/log log N) processors, and O(log N) time on the hypercube computer using O(N²/log N) processors. Following the mapping as proposed by Lee and Horng, the algorithm for the medial axis transform is also efficiently derived. The medial axis transform of a 2D binary image array of size N×N can be computed in O(log N) time on the EREW PRAM model using O(N²/log N) processors, O(log log N) time on the CRCW PRAM model using O(N²/log log N) processors, and O(log N) time on the hypercube computer using O(N²/log N) processors. The proposed parallel algorithms are composed of a set of prefix operations. In each prefix operation phase, only increase (add-one) operation and minimum operation are employed. So, the algorithms are especially efficient in practical applications.
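To make the quantity concrete, here is a minimal sequential sketch that computes a chessboard distance transform with the classic two-pass raster scan. This is a swapped-in textbook technique, not the diagonal propagation approach of the paper. Following the abstract's wording, each pixel receives the chessboard distance to the nearest foreground pixel; treating any truthy entry as foreground and naming the function `chessboard_distance_transform` are assumptions made for illustration.

```python
def chessboard_distance_transform(image):
    """Two-pass chessboard distance to the nearest foreground (truthy) pixel."""
    rows, cols = len(image), len(image[0])
    far = rows + cols  # larger than any possible chessboard distance in the image
    dist = [[0 if image[r][c] else far for c in range(cols)] for r in range(rows)]

    def relax(r, c, offsets):
        # Take the minimum over already-processed 8-neighbours, adding one step
        # (the "add-one" and "minimum" operations).
        for dr, dc in offsets:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                dist[r][c] = min(dist[r][c], dist[rr][cc] + 1)

    # Forward pass: top-left to bottom-right, using the upper and left neighbours.
    for r in range(rows):
        for c in range(cols):
            relax(r, c, ((-1, -1), (-1, 0), (-1, 1), (0, -1)))
    # Backward pass: bottom-right to top-left, using the lower and right neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            relax(r, c, ((1, 1), (1, 0), (1, -1), (0, 1)))
    return dist
```

Each pixel is visited twice with a constant number of neighbour lookups, so the sketch runs in O(N²) time on an N×N image, the same bound the abstract states for its sequential algorithm.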

17.

A linear algebraic approach to multisequence shift-register synthesis   Total citations: 1; self-citations: 0; citations by others: 1

V. R. Sidorenko  G. Schmidt 《Problems of Information Transmission》2011,47(2):149-165

An efficient algorithm which synthesizes all shortest linear-feedback shift registers generating K given sequences with possibly different lengths over a field is derived, and its correctness is proved. The proposed algorithm generalizes the Berlekamp-Massey and Feng-Tzeng algorithms and is based on Massey’s ideas. The time complexity of the algorithm is O(KλN) ≲ O(KN²), where N is the length of a longest sequence and λ is the linear complexity of the sequences.

18.

Parallel Construction of (a, b)-Trees

Deo N.  Jain A.  Medidi M. 《Journal of Parallel and Distributed Computing》1994,23(3)

We present an optimal parallel algorithm for the construction of (a, b)-trees, a generalization of 2-3 trees, 2-3-4 trees, and B-trees. We show the existence of a canonical form for (a, b)-trees, with a very regular structure, which allows us to obtain a scalable parallel algorithm for the construction of a minimum-height (a, b)-tree with N keys in O(N/p + log log N) time using p ≤ N/log log N processors on the EREW-PRAM model, and in O(N/p) time using p ≤ N processors on the CREW model. We show that the average memory utilization for the canonical form is at least 50% better than that for the worst-case and is also better than that for a random (a, b)-tree. A significant feature of the proposed parallel algorithm is that its time-complexity depends neither on a nor on b, and hence our general algorithm is superior to earlier algorithms for parallel construction of B-trees.