Similar Documents
20 similar documents found.
1.
We consider algebraic functions that are rational functions of roots (of various degrees) of rational functions of indeterminates. We associate a cost C(d) with the extraction of a dth root and assume that C satisfies certain natural axioms. We show that the minimum cost of computing a finite set of algebraic functions of the form considered is C(d1) + … + C(dr), where d1, …, dr are the torsion orders of the Galois group of the extension generated by the functions.

2.
This study proposes a novel prediction approach for human breast and colon cancers using different feature spaces. The proposed scheme consists of two stages: the preprocessor and the predictor. In the preprocessor stage, the mega-trend diffusion (MTD) technique is employed to increase the samples of the minority class, thereby balancing the dataset. In the predictor stage, the machine-learning approaches of K-nearest neighbor (KNN) and support vector machines (SVM) are used to develop hybrid MTD-SVM and MTD-KNN prediction models. The MTD-SVM model provided the best values of accuracy, G-mean and Matthews correlation coefficient (96.71%, 96.70% and 71.98%) on the cancer/non-cancer, breast/non-breast cancer and colon/non-colon cancer datasets, respectively. We found that the hybrid MTD-SVM is the best with respect to prediction performance and computational cost. The MTD-KNN model achieved moderately better prediction than hybrid MTD-NB (Naïve Bayes), but at the expense of higher computing cost. The MTD-KNN model is faster than MTD-RF (random forest), but its prediction is not better than that of MTD-RF. To the best of our knowledge, these are the best results reported so far for these datasets. The proposed scheme indicates that the developed models can be used as a tool for the prediction of cancer, and may be useful for studying any sequential information such as protein or nucleic acid sequences.
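A minimal sketch of the two-stage idea (balance the minority class, then train an SVM). A plain Gaussian-jitter oversampler stands in for mega-trend diffusion, and the random features and parameters are purely illustrative, not the authors' code or data.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, matthews_corrcoef

def oversample_minority(X, y, noise=0.05, seed=None):
    """Stand-in for MTD: replicate minority samples with small jitter until classes balance."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    need = counts.max() - counts.min()
    Xm = X[y == minority]
    idx = rng.integers(0, len(Xm), size=need)
    synth = Xm[idx] + rng.normal(scale=noise * Xm.std(axis=0), size=(need, X.shape[1]))
    return np.vstack([X, synth]), np.concatenate([y, np.full(need, minority)])

# Illustrative random data in place of real sequence-derived features
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = (rng.random(300) < 0.2).astype(int)           # imbalanced labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
X_bal, y_bal = oversample_minority(X_tr, y_tr, seed=0)   # "preprocessor" stage
clf = SVC(kernel="rbf", C=1.0).fit(X_bal, y_bal)         # "predictor" stage
pred = clf.predict(X_te)
print(accuracy_score(y_te, pred), matthews_corrcoef(y_te, pred))
```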

3.
We present semi-streaming algorithms for basic graph problems that have optimal per-edge processing times and therefore surpass all previous semi-streaming algorithms for these tasks. The semi-streaming model, which is appropriate when dealing with massive graphs, forbids random access to the input and restricts the memory to O(n · polylog n) bits. In particular, the previously best per-edge processing times were O(α(n)) for finding the connected components and a bipartition, O(k²n) and O(n · log n) respectively for determining k-vertex and k-edge connectivity for any constant k, and O(log n) for computing a minimum spanning forest. We reduce all of these time bounds to O(1). Every presented algorithm determines a solution asymptotically as fast as the best corresponding algorithm known to date in the classical RAM model, which therefore cannot convert its advantages of unlimited memory and random access into superior computing times for these problems.
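As an illustration of one-pass, near-constant amortized per-edge work for connected components (a union-find sketch, not the paper's algorithm), assuming the stream delivers edges as (u, v) pairs over vertices 0..n-1:

```python
class DSU:
    """Union-find over n vertices: near-constant amortized time per edge, O(n) words of memory."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.rank[ra] += self.rank[ra] == self.rank[rb]

def connected_components(n, edge_stream):
    """Single pass over the edge stream; each edge is touched exactly once."""
    dsu = DSU(n)
    for u, v in edge_stream:
        dsu.union(u, v)
    return len({dsu.find(v) for v in range(n)})

print(connected_components(5, iter([(0, 1), (1, 2), (3, 4)])))  # -> 2
```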

4.
Topology optimization is often used in the conceptual design stage as a preprocessing tool to obtain the overall material distribution in the solution domain. The resulting topology is then used as an initial guess for shape optimization. It is always desirable to use fine computational grids to obtain high-resolution layouts that minimize the need for shape optimization and postprocessing (Bendsoe and Sigmund, Topology optimization: theory, methods and applications. Springer, Berlin Heidelberg New York 2003), but this approach results in high computation cost and is prohibitive for large structures. In the present work, parallel computing in combination with domain decomposition is proposed to reduce the computation time of such problems. The power-law approach is used as the material distribution method, and an optimality criteria-based optimizer is used for locating the optimum solution (Sigmund, Struct Multidiscip Optim 21:120–127, 2001; Rozvany and Olhoff, Topology optimization of structures and composites continua. Kluwer, Norwell 2000). The equilibrium equations are solved using a preconditioned conjugate gradient algorithm. These calculations have been done using a master–slave programming paradigm on a coarse-grain, multiple-instruction multiple-data, shared-memory architecture. In this study, by avoiding the assembly of the global stiffness matrix, the memory requirement and computation time have been reduced. The results of the current study show that the parallel computing technique is a valuable tool for solving computationally intensive topology optimization problems.
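A minimal sketch of the optimality-criteria density update commonly paired with the power-law (SIMP) material model. The compliance sensitivities dc are assumed to come from a finite-element solve that is omitted here, and all parameter names and values are illustrative, not taken from the paper.

```python
import numpy as np

def oc_update(x, dc, dv, volfrac, move=0.2, eta=0.5):
    """One optimality-criteria step: scale densities by (-dc/(lmid*dv))**eta,
    bisecting the Lagrange multiplier lmid until the volume constraint is met."""
    l1, l2 = 1e-9, 1e9
    xnew = x
    while (l2 - l1) / (l1 + l2) > 1e-3:
        lmid = 0.5 * (l1 + l2)
        xnew = np.clip(x * (np.maximum(-dc, 0) / (lmid * dv)) ** eta,
                       np.maximum(0.001, x - move),
                       np.minimum(1.0, x + move))
        if xnew.mean() > volfrac:
            l1 = lmid
        else:
            l2 = lmid
    return xnew

# Toy call with made-up sensitivities (a real run would recompute dc each iteration)
x = np.full(100, 0.4)
dc = -np.linspace(1.0, 2.0, 100)   # compliance sensitivities are negative
dv = np.ones(100)                  # volume sensitivities
print(oc_update(x, dc, dv, volfrac=0.4).mean())
```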

5.
Membrane computing is an emergent research area studying the behavior of living cells to define bio-inspired computing devices, also called P systems. Such devices provide polynomial-time solutions to NP-complete problems by trading time for space. The efficient simulation of P systems poses three major challenges: the intrinsic massive parallelism of P systems, an exponential computational workspace, and a nature that is not floating-point intensive. This paper analyzes the simulation of a family of recognizer P systems with active membranes that solves the satisfiability problem in linear time on three different architectures: a shared-memory multiprocessor, a distributed-memory system, and a manycore graphics processing unit (GPU). For efficient handling of the exponential workspace created by the P systems computation, we enable different data policies on those architectures to increase memory bandwidth and exploit data locality through tiling. Parallelism inherent to the target P system is also managed on each architecture to demonstrate that GPUs offer a valid alternative for high-performance computing at a considerably lower cost. Our results show execution time improvements exceeding 310× and 78×, respectively, for a much cheaper high-performance alternative.

6.
Maintaining a multi-version index on flash memory can generate a lot of updates and invalid pages. It is important to have an efficient garbage collection mechanism to ensure that the flash memory has a sufficient number of free blocks for storing new data versions and their index structures. In this paper, we study the important performance issues in using the purging-range query to reclaim blocks that store old data versions and invalid index entries as free blocks. To reduce the cost of processing the purging-range query, we propose the physical block labeling (PBL) scheme to provide a better estimate of the purging version number to be used for purging old data versions. To further enhance the performance of the garbage collection process, and at the same time maximize the deadspans of data versions and balance the wear levels of the blocks, we propose two placement schemes, called sequential placement (SQ) and frequency-based placement (FBP), for placing new data versions into free pages. As illustrated in the performance studies, both SQ and FBP can effectively balance the wear levels of the blocks. The deadspans of data versions are longer under FBP than under both SQ and RR, and the page reallocation cost is also lower under FBP, especially when the size of flash memory allocated for the database is limited. The experimental results also illustrate that PBL can effectively minimize the number of invocations of the purging-range query to one for reclaiming the required number of blocks in each garbage collection.

7.
We address the problem of finding the K best integer solutions of a linear integer network flow problem. We design an O(f(n,m,L,U) + KmS(n,m,L)) time and O(K+m) memory space algorithm to determine the K best integer solutions, in a directed network with n nodes, m arcs, maximum absolute value cost L, and an upper bound U on arc capacities and node supplies. Here f(n,m,L,U) is the best time needed to solve the minimum cost flow problem in a directed network, and S(n,m,L) is the best time to solve the single-source shortest path problem in a network with non-negative lengths. The introduced algorithm efficiently determines a "proper minimal cycle" by taking advantage of the relationship between the best solutions. In this way, we improve the theoretical as well as practical memory space bounds of the well-known method due to Hamacher. Our computational experiments confirm this result.

8.
Processor technology is still advancing dramatically and promises further enormous improvements in processing data over the next decade. On the other hand, much smaller advances in moving data are expected, so that the efficiency of many numerical software tools for partial differential equations (PDEs) is restricted by the cost of memory access. We demonstrate how data locality and pipelining can achieve a significant percentage of the available huge computing power, and we explain the influence of processor technology on recent and future numerical PDE simulation tools. As examples, we describe hardware-oriented concepts for adaptive error control, multigrid/domain decomposition schemes and incompressible flow solvers, and discuss their numerical and computational characteristics.

9.
Emerging non-volatile memories (e.g. STT-MRAM, OxRRAM and CBRAM) based on resistive switching are under intense research and development investigation by both academia and industry. They provide high performance such as fast write/read speed, low power and good endurance (e.g. >10^12), and could be used as both computing and storage memories beyond flash memories. However, the conventional access architecture based on one transistor plus one memory cell limits storage density, as the selection transistor must be large enough to supply sufficient current for the switching operation. This paper presents the design and analysis of a crossbar architecture based on complementary resistive switching non-volatile memory cells, with a particular focus on reliability and power performance. This architecture requires fewer selection transistors and a minimum number of contacts between memory cells and CMOS control circuits. The complementary cell and parallel data sensing mitigate the impact of sneak currents in the crossbar architecture and provide fast data access for computing purposes. We perform transient and statistical simulations of two memory technologies, STT-MRAM and OxRRAM, to validate the functionality of this design, using a 40 nm CMOS design kit and memory compact models developed from the relevant physics and experimental parameters.

10.
Computer Networks, 2003, 41(4):363–383
Layered video is a video-compression technique that encodes video data in multiple layers. It typically consists of a base layer and some additional layers that provide enhanced video quality. The multicasting operation of layered video consists of many receivers dynamically joining and leaving different multicast sessions of different layers depending on their network condition. A layered video multicasting system needs to satisfy: (i) bounded end-to-end delay from the video source to each receiver; (ii) minimum total cost; and (iii) minimum delay jitter between the various video streams received by each receiver. The problem of computing such data distribution paths is NP-complete. This paper presents a new heuristic algorithm, called the layered video multicast super-tree routing algorithm, with O(Rn²) time complexity and O(R²) message complexity, where n is the number of nodes in the network and R is the receiver group size. Our investigation shows that the multicast data paths computed by our algorithm can always satisfy the delay constraint with reasonably low total cost.

11.
Let A be a set of size m. Obtaining the first k ≤ m elements of A in ascending order can be done in optimal O(m + k log k) time. We present Incremental Quicksort (IQS), an algorithm (online on k) which incrementally gives the next smallest element of the set, so that the first k elements are obtained in optimal expected time for any k. Based on IQS, we present the Quickheap (QH), a simple and efficient priority queue for main and secondary memory. Quickheaps are comparable with classical binary heaps in simplicity, yet are more cache-friendly. This makes them an excellent alternative for a secondary memory implementation. We show that the expected amortized CPU cost per operation over a Quickheap of m elements is O(log m), and this translates into O((1/B) log(m/M)) I/O cost with main memory size M and block size B, in a cache-oblivious fashion. As a direct application, we use our techniques to implement classical Minimum Spanning Tree (MST) algorithms. We use IQS to implement Kruskal's MST algorithm and QHs to implement Prim's. Experimental results show that IQS, QHs, external QHs, and our Kruskal's and Prim's MST variants are competitive, and in many cases better in practice, than current state-of-the-art alternative (and much more sophisticated) implementations.
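A minimal sketch of the incremental-quickselect idea behind IQS (not the authors' exact code): a stack of pivot positions records which prefixes are already in final order, so each call to next() returns the next smallest element while doing only the partitioning work that has not been done yet.

```python
import random

class IncrementalQuickSort:
    """Yields elements of `a` in ascending order; partitioning is done lazily, so
    obtaining the first k elements costs O(m + k log k) expected time."""
    def __init__(self, a):
        self.a = list(a)
        self.stack = [len(self.a)]   # positions of pivots already in final place
        self.i = 0                   # index of the next element to report

    def next(self):
        a, s = self.a, self.stack
        while s[-1] != self.i:
            lo, hi = self.i, s[-1] - 1
            p = random.randint(lo, hi)            # random pivot
            a[p], a[hi] = a[hi], a[p]
            store = lo
            for j in range(lo, hi):               # standard partition
                if a[j] < a[hi]:
                    a[store], a[j] = a[j], a[store]
                    store += 1
            a[store], a[hi] = a[hi], a[store]
            s.append(store)                        # pivot now sits in its final position
        s.pop()
        self.i += 1
        return a[self.i - 1]

iqs = IncrementalQuickSort([5, 3, 8, 1, 9, 2])
print([iqs.next() for _ in range(3)])   # -> [1, 2, 3]
```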

12.
Computer Communications, 2002, 25(11-12):1066–1075
This paper describes the use of i-CAD, a prototype tool for automatically synthesizing application-specific intranets. i-CAD is a novel intranet computer-aided design software tool whose ultimate goal is to concurrently design hierarchical network topologies and data management (content distribution) systems for data-intensive multimedia intranets. The prototype software tool introduced here synthesizes a three-level intranet architecture that entails minimal installation cost and yet enables all of an intranet's clients to perform their tasks with acceptable performance. The tool chooses network technologies (hardware resources and protocols) based on requirements specified by the user and determines the topology. An evolutionary approach is used to search the design space for a minimal-cost three-level network. The experimental results for several network design problems described here indicate the effectiveness of the prototype tool in finding good designs from a large design space in a reasonable amount of time.
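A hedged skeleton of the evolutionary-search idea, not the i-CAD implementation: candidate designs are encoded as integer vectors that pick a technology tier per link, and the population evolves toward the cheapest design that passes a hypothetical performance check. Costs, demands, and the fitness penalty are all made up for illustration.

```python
import random

TECH_COST = [1, 3, 7]            # hypothetical cost per link technology tier

def cost(design):
    return sum(TECH_COST[t] for t in design)

def feasible(design, demand):
    # Hypothetical performance check: each link's tier must cover its demand level.
    return all(t >= d for t, d in zip(design, demand))

def evolve(n_links, demand, pop_size=30, generations=200):
    pop = [[random.randrange(3) for _ in range(n_links)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda d: cost(d) + (0 if feasible(d, demand) else 1000))
        survivors = pop[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n_links)
            child = a[:cut] + b[cut:]                 # one-point crossover
            child[random.randrange(n_links)] = random.randrange(3)   # mutation
            children.append(child)
        pop = survivors + children
    return pop[0]

demand = [0, 2, 1, 0, 2, 1]       # made-up per-link requirement levels
best = evolve(len(demand), demand)
print(best, cost(best))
```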

13.
We present processor-time optimal parallel algorithms for several problems on n × n digitized image arrays, on a mesh-connected array having p processors and a memory of size O(n²) words. The number of processors p can vary over the range [1, n^{3/2}] while providing optimal speedup for these problems. The class of image problems considered here includes labeling the connected components of an image; computing the convex hull, the diameter, and a smallest enclosing box of each component; and computing all closest neighbors. Such problems arise in medium-level vision and require global operations on image pixels. To achieve optimal performance, several efficient data-movement and reduction techniques are developed for the proposed organization.

14.
Tool path generation for a surface model with defects
A 3-axis tool path generation algorithm for free-form surface models containing defects such as gaps and overlaps is presented in this paper. To avoid the difficulty of computing a complete cutter location (CL) surface, the proposed approach generates a tool path by slicing CL-elements instead of a complete CL-surface. A key feature of the proposed approach is that it reduces the number of CL-elements to be sliced by utilizing the correspondence information between CL-elements and cutter contact (CC) elements. This feature significantly improves the computational efficiency of the proposed algorithm. Empirical tests show that the proposed approach is robust to the geometric defects of CAD models, gaps and overlaps, with near O(n) time complexity, where n is the number of slicing planes.

15.
Data compression can be used to simultaneously reduce the memory, communication and computation requirements of string comparison. In this paper we address the problem of computing the length of the longest common subsequence (LCS) between run-length-encoded (RLE) strings. We exploit RLE both to reduce the complexity of LCS computation from O(M×N) to O(mN + Mn − mn), where M and N are the lengths of the original strings and m and n the number of runs in their RLE representation, and to improve the inherent parallelism of the proposed algorithm, so that it may execute in O(m+n) steps on a systolic array of M+N units. We also discuss the application of the proposed algorithm to the related problem of edit distance (ED) computation.
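For reference, a minimal sketch of the classical O(M×N) dynamic program that the RLE-aware algorithm improves on; the run-length-exploiting version itself is more involved and is not reproduced here.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b, O(len(a)*len(b)) time,
    keeping only two DP rows in memory."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

print(lcs_length("aaabbb", "aabbba"))  # -> 5 ("aabbb")
```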

16.
A circular connected-(r, s)-out-of-(m, n):F lattice system consists of m×n components arranged in a cylindrical grid. Each of the m circles has n components, and the system fails if and only if there exists a grid of size r×s in which all components have failed. A circular connected-(r, s)-out-of-(m, n):F lattice system might be used in reliability models for 'feelers for measuring temperature in a reaction chamber' and a 'TFT liquid crystal display system with a 360° wide area'. In this study, we propose a new recursive algorithm for obtaining the reliability of a circular connected-(r, s)-out-of-(m, n):F lattice system. We evaluate the proposed algorithm in terms of computing time and memory capacity. Furthermore, a numerical experiment comparing our proposed algorithm with Yamamoto and Miyakawa's algorithm [Yamamoto, H., & Miyakawa, M. (1996). Reliability of circular connected-(r, s)-out-of-(m, n):F lattice system. Journal of the Operations Research Society of Japan, 39(3), 389–406] shows that our proposed algorithm is more effective for systems with a large n.
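Not the authors' recursive algorithm, but a brute-force Monte Carlo sketch of the same reliability quantity, usable as a sanity check for small systems. The component reliability p, the grid sizes, and the choice to wrap around in the direction of the n components per circle are assumptions for illustration.

```python
import random

def fails(state, r, s, m, n):
    """System fails iff some r×s block of failed (0) components exists; the s-direction
    (around each circle of n components) wraps, the m-direction does not."""
    for i in range(m - r + 1):
        for j in range(n):
            if all(state[i + di][(j + dj) % n] == 0
                   for di in range(r) for dj in range(s)):
                return True
    return False

def reliability_mc(m, n, r, s, p, trials=20000):
    """Estimate system reliability when each component works independently with probability p."""
    ok = 0
    for _ in range(trials):
        state = [[1 if random.random() < p else 0 for _ in range(n)] for _ in range(m)]
        ok += not fails(state, r, s, m, n)
    return ok / trials

print(reliability_mc(m=4, n=6, r=2, s=2, p=0.9))
```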

17.
This paper introduces a model for parallel computation, called the distributed random-access machine (DRAM), in which the communication requirements of parallel algorithms can be evaluated. A DRAM is an abstraction of a parallel computer in which memory accesses are implemented by routing messages through a communication network. A DRAM explicitly models the congestion of messages across cuts of the network. We introduce the notion of a conservative algorithm as one whose communication requirements at each step can be bounded by the congestion of pointers of the input data structure across cuts of a DRAM. We give a simple lemma that shows how to "shortcut" pointers in a data structure so that remote processors can communicate without causing undue congestion. We give O(lg n)-step, linear-processor, linear-space, conservative algorithms for a variety of problems on n-node trees, such as computing treewalk numberings, finding the separator of a tree, and evaluating all subexpressions in an expression tree. We give O(lg² n)-step, linear-processor, linear-space, conservative algorithms for problems on graphs of size n, including finding a minimum-cost spanning forest, computing biconnected components, and constructing an Eulerian cycle. Most of these algorithms use as a subroutine a generalization of the prefix computation to trees. We show that any such treefix computation can be performed in O(lg n) steps using a conservative variant of Miller and Reif's tree-contraction technique.
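A sketch of what a treefix computation produces, shown here as a simple sequential routine (the paper's contribution is performing it in O(lg n) conservative parallel steps): each node receives the operator applied along the path from the root down to itself. The tree encoding and operator are illustrative.

```python
from collections import deque

def treefix(children, root, values, op):
    """Rootward prefix computation on a tree: result[v] combines the values on the
    path from the root to v (inclusive) with the associative operator op."""
    result = {root: values[root]}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in children.get(u, []):
            result[v] = op(result[u], values[v])
            queue.append(v)
    return result

# Toy tree: 0 -> {1, 2}, 1 -> {3}
children = {0: [1, 2], 1: [3]}
values = {0: 1, 1: 2, 2: 3, 3: 4}
print(treefix(children, 0, values, lambda a, b: a + b))  # {0: 1, 1: 3, 2: 4, 3: 7}
```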

18.
We consider the problem of collectively locating a set of points within a set of disjoint polygonal regions when no preprocessing is allowed for either the points or the regions. This problem arises in geometric database systems. More specifically, it is equivalent to computing the inside join of geo-relational algebra, a conceptual model for geo-data management. We describe efficient algorithms for solving this problem based on plane-sweep and divide-and-conquer, requiring O(n log n + t) and O(n log² n + t) time, respectively, and O(n) space, where n is the total number of points and edges, and t is the number of reported (point, region) pairs. Since the algorithms are meant to be practically useful, we consider, besides the internal versions running completely in main memory, versions that run internally but use much less than linear space, and versions that run externally, that is, require only a constant amount of internal memory regardless of the amount of data to be processed. Comparing plane-sweep and divide-and-conquer, it turns out that divide-and-conquer can be expected to perform much better in the external case, even though it has a higher internal asymptotic worst-case complexity. An interesting theoretical by-product is a new general technique for handling arbitrarily large sets of objects clustered on a single x-coordinate within a planar divide-and-conquer algorithm, and a proof that the resulting "unbalanced" dividing does not lead to a more than logarithmic height of the tree of recursive calls.

19.
Computers & Chemistry, 1988, 12(1):21–25
An algorithm which can be used on microcomputers to compare two protein sequences is presented. It is based on the algorithm developed by Needleman & Wunsch (J. Mol. Biol. 48, 444). The original algorithm requires memory space of mn and computing time proportional to mn², where m and n are the lengths of the two sequences. Because of these space and time requirements, the algorithm is of limited use on microcomputers. The modified algorithm presented here reduces the computing time to being proportional to mn and the size of directly accessed memory to 2n, thus making this algorithm usable on many currently available microcomputers. The time-saving part of the algorithm will also improve the efficiency of computations on larger computers.
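A minimal sketch of the memory-saving idea for the alignment score: keep two rows of length n+1 instead of an m×n matrix. The simple match/mismatch/gap scoring scheme is an assumption, not the paper's exact penalties, and recovering the alignment itself needs extra work.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score using O(len(b)) memory."""
    prev = [j * gap for j in range(len(b) + 1)]   # row for the empty prefix of a
    for i, x in enumerate(a, 1):
        cur = [i * gap]
        for j, y in enumerate(b, 1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

print(nw_score("GATTACA", "GCATGCU"))  # score only; traceback is not kept
```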

20.
Correlations have been significant from the earliest days: in some cases they are essential because it is difficult to measure the quantity directly, and in other cases it is desirable to corroborate the results with other tests through correlations. Soft computing techniques are now being used as alternative statistical tools, and in recent years new techniques such as artificial neural networks, fuzzy inference systems, genetic algorithms, and their hybrids have been employed for developing predictive models to estimate the needed parameters. Determination of the permeability coefficient (k) of soils is very important for characterizing hydraulic conductivity, and it is difficult, expensive, time-consuming, and involves destructive tests. In this paper, the use of soft computing techniques such as ANNs (MLP, RBF, etc.) and ANFIS (adaptive neuro-fuzzy inference system) for prediction of the permeability of coarse-grained soils is described and compared. All of the constructed soft computing models exhibited high performance in predicting k. ANN models having three inputs and one output were applied successfully and gave reliable predictions of the permeability coefficient. All four ANN algorithms had almost the same prediction capability, with the accuracy of MLP being slightly higher than that of the RBF models. The ANFIS model for prediction of the permeability coefficient gave the most reliable predictions when compared with the ANN models, and the use of soft computing techniques will provide new approaches and methodologies for predicting some parameters in soil mechanics.
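A minimal, hedged sketch of one of the approaches described: an MLP regressor mapping three soil descriptors to the permeability coefficient. The feature names, the toy relation, and the synthetic data are purely illustrative and are not the paper's dataset or model settings.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Hypothetical inputs: grain size d10, void ratio e, fines content (all made up)
rng = np.random.default_rng(1)
X = rng.uniform([0.1, 0.3, 0.0], [2.0, 0.9, 0.2], size=(400, 3))
k = 0.01 * X[:, 0] ** 2 * X[:, 1] / (1 + 10 * X[:, 2])   # toy Hazen-like relation
k += rng.normal(scale=0.001, size=len(k))                # measurement noise

X_tr, X_te, k_tr, k_te = train_test_split(X, k, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0)
model.fit(X_tr, k_tr)
print("R2 on held-out data:", r2_score(k_te, model.predict(X_te)))
```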
