共查询到18条相似文献,搜索用时 218 毫秒
1.
2.
用擂台赛法则构造多目标Pareto最优解集的方法 总被引:14,自引:0,他引:14
针对多目标进化的特点,提出了用擂台赛法则(arena's principle,简称AP)构造多目标Pareto最优解集的方法,论证了构造方法的正确性,分析了其时间复杂度为O(rmN)(0<m/N<1).理论上,当AP与Deb的算法以及Jensen的算法比较时(它们的时间复杂度分别为O(rN2)和O(Nlog(r-1)N)),AP优于Deb的算法;当目标数r较大时(如r≥5),AP优于Jensen的算法;此外,当m/N较小时(如m/N≤50%),AP的效率与其他两种算法比较具有优势.对比实验结果表明,AP具有比其他两种算法更好的CPU时间效率.在应用中,AP可以被集成到任何基于Pareto的MOEA中,并能在较大程度上提高MOEA的运行效率. 相似文献
3.
4.
分别在两种重要并行计算模型中给出计算有向基因组排列的反转距离新的并行算法.基于Hannenhalli和Pevzner理论,分3个主要部分设计并行算法:构建断点图、计算断点图中圈数、计算断点图中障碍的数目.在CREW-PRAM模型上,算法使用O(n2)处理器,时间复杂度为O(log2n);在基于流水光总线的可重构线性阵列系统(linear array with a reconfigurable pipelined bus system, LARPBS)模型上,算法使用O(n3)处理器,计算时间复杂度为O(logn). 相似文献
5.
6.
个体单体型MSR(minimum SNP removal)问题是指如何利用个体的基因测序片断数据去掉最少的SNP(single-nucleotide polymorphisms)位点,以确定该个体单体型的计算问题.对此问题,Bafna等人提出了时间复杂度为O(2kn2m)的算法,其中,m为DNA片断总数,n为SNP位点总数,k为片断中洞(片断中的空值位点)的个数.由于一个Mate-Pair片段中洞的个数可以达到100,因此,在片段数据中有Mate-Pair的情况下,Bafna的算法通常是不可行的.根据片段数据的特点提出了一个时间复杂度为O((n-1)(k1-1)k222h+(k1+1)2h+nk2+mk1)的新算法,其中,k1为一个片断覆盖的最大SNP位点数(不大于n),k2为覆盖同一SNP位点的片段的最大数(通常不大于19),h为覆盖同一SNP位点且在该位点取空值的片断的最大数(不大于k2).该算法的时间复杂度与片断中洞的个数的最大值k没有直接的关系,在有Mate-Pair片断数据的情况下仍然能够有效地进行计算,具有良好的可扩展性和较高的实用价值. 相似文献
7.
RNA二级结构预测中动态规划的优化和有效并行 总被引:6,自引:0,他引:6
基于最小自由能模型的方法是计算生物学中RNA二级结构预测的主要方法,而计算最小自由能的动态规划算法需要O(n4)的时间,其中n是RNA序列的长度.目前有两种降低时间复杂度的策略:限制二级结构中内部环的大小不超过k,得到O(n2×k2)算法;Lyngso方法根据环的能量规则,不限制环的大小,在O(n3)的时间内获得近似最优解.通过使用额外的O(n)的空间,计算内部环中的冗余计算大为减少,从而在同样不限制环大小的情况下,在O(n3)的时间内能够获得最优解.然而,优化后的算法仍然非常耗时,通过有效的负载平衡方法,在机群系统上实现并行程序.实验结果表明,并行程序获得了很好的加速比. 相似文献
8.
9.
区域查询是数据仓库上支持联机分析处理(on-line analytical processing,简称OLAP)的重要操作.近几年,人们提出了一些支持区域查询和数据更新的Cube存储结构.然而这些存储结构的空间复杂性和时间复杂性都很高,难以在实际中使用.为此,提出了一种层次式Cube存储结构HDC(hierarchical data cube)及其上的相关算法.HDC上区域查询的代价和数据更新代价均为O(logdn),综合性能为O((logn)2d)(使用CqCu模型)或O(K(logn)d)(使用Cqnq+Cunu模型).理论分析与实验表明,HDC的区域查询代价、数据更新代价、空间代价以及综合性能都优于目前所有的Cube存储结构. 相似文献
10.
11.
Topological sort of an acyclic graph has many applications such as job scheduling and network analysis. Due to its importance, it has been tackled on many models. Dekel et al. [3], proposed an algorithm for solving the problem in O(log2
N) time on the hypercube or shuffle-exchange networks with O(N
3) processors. Chaudhuri [2], gave an O(log N) algorithm using O(N
3) processors on a CRCW PRAM model. On the LARPBS (Linear Arrays with a Reconfigurable Pipelined Bus System) model, Li et al. [5] showed that the problem for a weighted directed graph with N vertices can be solved in O(log N) time by using N
3 processors. In this paper, a more efficient topological sort algorithm is proposed on the same LARPBS model. We show that the problem can be solved in O(log N) time by using N
3/log N processors. We show that the algorithm has better time and processor complexities than the best algorithm on the hypercube, and has the same time complexity but better processor complexity than the best algorithm on the CRCW PRAM model. 相似文献
12.
In distributed shared memory multiprocessor systems, parallel tasks communicate through sharing memory data. As the system size increases, such communication cost becomes the main factor that limits the overall parallelism and performance. In this paper, we propose a new solution to the problem through judiciously managing the relevant resource, namely, the shared data and the interconnection network (IN) through which the sharing is carried out. In this approach, communication cost is minimized by means of data migration/allocation which is based on analyzing general layered task graphs, sharing behavior of parallel tasks, and network topology. Our method is not applicable for read only variables. Further, for the time being, the usefulness of the method is limited to multiprocessors where no cache coherence mechanism is implemented. Four typical interconnection topologies for multiprocessors are considered, namely, shared-bus, hierarchical-bus, 2-D mesh, and fat-tree structures. Efficient data allocation algorithms for each of the four network topologies are developed that make decision on data allocation/migration at the compile time. The complexity of one algorithm isO(np) for shared-bus andO(n2p) for the remaining three in a system withnprocessors executing ap-layer task graph for one shared variable. We have also given an algorithm to determine optimal allocation/migration scheme for multiple shared variables. However, the cost of the algorithm become prohibitive when the number of shared variables is high. Therefore, a heuristic of low complexity is suggested. The heuristic is optimal for some topologies. 相似文献
13.
Several processor allocation schemes for mesh connected parallel computers have been proposed in the literature. These schemes aim at improving system performance by reducing internal fragmentation or by enhancing submesh recognition ability. In this paper, we propose a system partitioning approach to reduce external fragmentation and thereby improve system performance. The target systems considered here are two-dimensional meshes where the side lengths are powers of 2. Processors are allocated to a partitioned mesh based on their submesh size requirements. The proposed scheme can be implemented in conjunction with any of the existing processor allocation schemes and thereby can also exploit the advantages offered by those schemes. The performance measurements are done through simulation experiments. Completion time for a fixed number of jobs, internal and external fragmentation, and system utilization are measured as performance indicators. It is observed that, in most cases, the proposed scheme demonstrates better performance than the previously proposed algorithms. Time complexity of the proposed scheme is less by a factor ofnthan the corresponding allocation scheme without partitioning, wheren= log2{min(w,h)}, andwandhare the width and height of a two-dimensional mesh. 相似文献
14.
K. Bhuvaneswari K.N. Balasubramanya Murthy C. Siva Ram Murthy 《Journal of Parallel and Distributed Computing》1997,44(2):107
This paper presents a new systolic algorithm for thecompletesolution of a system ofNlinear equations in (N2/2 +O(N)) time steps using 2Nprocessing elements (PEs). It is based on a variant of the Gaussian elimination (GE) algorithm called the successive GE and is faster than any existing GE based algorithm usingO(N) PEs. We also suggest two fault tolerant schemes that tolerate up toNPE failures. The first scheme is a time redundancy based approach with no hardware overhead and 100% time overhead. This scheme can tolerate up toNPE failures. The second scheme is based on algorithm based fault tolerance (ABFT) and usesNextra PEs to tolerate up toN− 1 PE failures with very little time overhead. The number of errors that can be detected/corrected in both schemes is more than that in any existing fault tolerant systolic array. 相似文献
15.
Both sample entropy and approximate entropy are measurements of complexity. The two methods have received a great deal of attention in the last few years, and have been successfully verified and applied to biomedical applications and many others. However, the algorithms proposed in the literature require O(N2) execution time, which is not fast enough for online applications and for applications with long data sets. To accelerate computation, the authors of the present paper have developed a new algorithm that reduces the computational time to O(N3/2)) using O(N) storage. As biomedical data are often measured with integer-type data, the computation time can be further reduced to O(N) using O(N) storage. The execution times of the experimental results with ECG, EEG, RR, and DNA signals show a significant improvement of more than 100 times when compared with the conventional O(N2) method for N = 80,000 (N = length of the signal). Furthermore, an adaptive version of the new algorithm has been developed to speed up the computation for short data length. Experimental results show an improvement of more than 10 times when compared with the conventional method for N > 4000. 相似文献
16.
Thedistance transform(DT) is an image computation tool which can be used to extract the information about the shape and the position of the foreground pixels relative to each other. It converts a binary image into a grey-level image, where each pixel has a value corresponding to the distance to the nearest foreground pixel. The time complexity for computing the distance transform is fully dependent on the different distance metrics. Especially, the more exact the distance transform is, the worse execution time reached will be. Nowadays, quite often thousands of images are processed in a limited time. It seems quite impossible for a sequential computer to do such a computation for the distance transform in real time. In order to provide efficient distance transform computation, it is considerably desirable to develop a parallel algorithm for this operation. In this paper, based on the diagonal propagation approach, we first provide anO(N2) time sequential algorithm to compute thechessboard distance transform(CDT) of anN×Nimage, which is a DT using the chessboard distance metrics. Based on the proposed sequential algorithm, the CDT of a 2D binary image array of sizeN×Ncan be computed inO(logN) time on the EREW PRAM model usingO(N2/logN) processors,O(log logN) time on the CRCW PRAM model usingO(N2/log logN) processors, andO(logN) time on the hypercube computer usingO(N2/logN) processors. Following the mapping as proposed by Lee and Horng, the algorithm for the medial axis transform is also efficiently derived. The medial axis transform of a 2D binary image array of sizeN×Ncan be computed inO(logN) time on the EREW PRAM model usingO(N2/logN) processors,O(log logN) time on the CRCW PRAM model usingO(N2/log logN) processors, andO(logN) time on the hypercube computer usingO(N2/logN) processors. The proposed parallel algorithms are composed of a set of prefix operations. In each prefix operation phase, only increase (add-one) operation and minimum operation are employed. So, the algorithms are especially efficient in practical applications. 相似文献
17.
An efficient algorithm which synthesizes all shortest linear-feedback shift registers generating K given sequences with possibly different lengths over a field is derived, and its correctness is proved. The proposed algorithm
generalizes the Berlekamp-Massey and Feng-Tzeng algorithms and is based on Massey’s ideas. The time complexity of the algorithm
is O(KλN) ≲ O(KN
2), where N is the length of a longest sequence and λ is the linear complexity of the sequences. 相似文献
18.
We present an optimal parallel algorithm for the construction of (a, b)-trees-a generalization of 2-3 trees, 2-3-4 trees, and B-trees. We show the existence of a canonical form for (a, b)-trees, with a very regular structure, which allows us to obtain a scalable parallel algorithm for the construction of a minimum-height (a, b)-tree with N keys in O(N/p + log log N) time using p ≤ N/log log N processors on the EREW-PRAM model, and in O(N/p) time using p ≤ N processors on the CREW model. We show that the average memory utilization for the canonical form is at least 50% better than that for the worst-case and is also better than that for a random (a, b)-tree. A significant feature of the proposed parallel algorithm is that its time-complexity depends neither on a nor on b, and hence our general algorithm is superior to earlier algorithms for parallel construction of B-trees. 相似文献