期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Optimization of speeded-up robust feature algorithm for hardware implementation

ShanShan Cai LeiBo Liu ShouYi Yin RenYan Zhou WeiLong Zhang ShaoJun Wei 《中国科学:信息科学(英文版)》2014,57(4):1-15

相似文献

2.

Analog memristive memory with applications in audio signal processing

DUAN ShuKai HU XiaoFang WANG LiDan LI ChuanDong 《中国科学:信息科学(英文版)》2014,(4):239-253

Since the development of the HP memristor, much attention has been paid to studies of memris- tive devices and applications, particularly memristor-based nonvolatile semiconductor memory. Owing to its unique properties, theoretically, one could restart a memristor-based computer immediately without the need for reloading the data. Further, current memories are mainly binary and can store only ones and zeros, whereas memristors have multilevel states, which means a single memristor unit can replace many binary transistors and realize higher-density memory. It is believed that memristors can also implement analog storage besides binary and multilevel information memory. In this paper, an implementation scheme for analog memristive memory is considered. A charge-controlled memristor model is derived and the corresponding SPICE model is constructed. Special write and read operations are demonstrated through numerical analysis and circuit simulations. In addition, an audio analog record/play system using a memristor crossbar array is designed. This system can provide great storage capacity （long recording time） and high audio quality with a simple small circuit structure. A series of computer simulations and analyses verify the effectiveness of the proposed scheme. 相似文献

3.

Binary trees as a computational framework

《Computer Languages, Systems and Structures》2013,39(4):163-181

We present a new set of algorithms for performing arithmetic computations on the set of natural numbers, represented as ordered rooted binary trees. We show formally that these algorithms are correct and discuss their time and space complexity in comparison to traditional arithmetic operations on bitstrings.Our binary tree algorithms follow the structure of a simple type language, similar to that of Gödel's System T.Generic implementations using Haskell's type class mechanism are shared between instances shown to be isomorphic to the set of natural numbers. This representation independence is illustrated by instantiating our computational framework to the language of balanced parenthesis languages.The self-contained source code of the paper is available at http://logic.cse.unt.edu/tarau/research/2012/jtypes.hs. 相似文献

4.

A General Multi-step Algorithm for Voxel Traversing Along a Line

Y. K. Liu H. Y. Song B. &#;alik 《Computer Graphics Forum》2008,27(1):73-80

Traversing voxels along a three dimensional (3D) line is one of the most fundamental algorithms for voxel‐based applications. This paper presents a new 6‐connectivity integer algorithm for this task. The proposed algorithm accepts voxels having different sizes in x, y and z directions. To explain the idea of the proposed approach, a 2D algorithm is firstly considered and then extended in 3D. This algorithm is a multi‐step as up to three voxels may be added in one iteration. It accepts both integer and floating‐point input. The new algorithm was compared to other popular voxel traversing algorithms. Counting the number of arithmetic operations showed that the proposed algorithm requires the least amount of operations per traversed voxel. A comparison of spent CPU time using either integer or floating‐point arithmetic confirms that the proposed algorithm is the most efficient. This algorithm is simple, and in compact form which also makes it attractive for hardware implementation. 相似文献

5.

Methods for fast morphological image transforms using bitmapped binary images

《CVGIP: Graphical Models and Image Processing》1992,54(3):252-258

In this paper we present new implementations for morphological binary image processing on a general-purpose computer, using a bitmap representation of binary images instead of representing binary images as bitplanes inserted in gray value images. The bitmap data representation is a very efficient one, both in terms of memory requirements and in terms of algorithmic efficiency because of the CPU operates on 32 pixels in parallel. The algorithms described in this paper are capable of performing the basic morphological image transforms using structuring elements of arbitrary size and shape. In order to speed up morphological operations with respect to commonly used, large, convex structuring elements, the logarithmic decomposition of structuring elements is used. Experiments indicate that the new algorithms are more than 30 times faster for pixelwise operations and about an order of magnitude faster for the basic morphological transforms than the fastest known software implementations. 相似文献

6.

Exact Algorithms for Linear Programming over Algebraic Extensions

Beling 《Algorithmica》2008,31(4):459-478

Abstract. We study the computational complexity of linear programs with coefficients that are real algebraic numbers under a Turing machine model of computation. After reviewing a method for exact representation of algebraic numbers under the Turing model, we show that the fundamental tasks of comparison and arithmetic can be performed in polynomial time. Our technique for establishing polynomial-time algorithms for comparison and arithmetic is distinct from the usual resultant-based approaches, and has the advantage that it provides a natural framework for analysis of the complexity of computational tasks, such as Gaussian elimination, that involve a sequence of arithmetic operations. Our main contribution is to show that a variant of the ellipsoid method can be used to solve linear programming in time polynomial in the encoding size of the problem coefficients and the degree of any algebraic extension that contains those coefficients. 相似文献

7.

Relative performance of micro and mainframe computers: Two case studies

《Computers & Industrial Engineering》1986,10(2):117-119

This article compares the relative performances from the user point of view of two typical opertions research algorithms implemented on an IBM PC microcomputer and on a CDC Cyber 6400 mainframe computer. The basis of comparison is the computation time expressed in CPU seconds. The first algorithm performs exclusively integer operations, while the second algorithm performs primarily real arithmetic and transcendental functions. For the integer algorithm the speed ratio is 2.14 or 2.65 depending on the accuracy of the PC, for the real algorithm the speed ratio is 3.6. In general, the speed ratio is approximately equal to 3 to 1, which is very favorable for the microcomputer, considering the availability of micro computers and the relative cost of CPU time on each machine. 相似文献

8.

A long accumulator like a carry-save adder

Stefan Siegel Jürgen Wolff von Gudenberg 《Computing》2012,94(2-4):203-213

Long accumulators for the exact summation of floating point numbers or products are well known tools in numerical analysis especially in algorithms verifying the result (C++ Toolbox for Verified Computing, Springer, New York, 1995). An exact dot product is one of the features of the upcoming interval standard (IEEE Interval Standard Working Group, P1788. http://grouper.ieee.org/groups/1788/). Usually an accumulator is realized as a memory block with operations to add floating point numbers and products. Several variants have been proposed to avoid carry rippling: use separate accumulators for positive and negative numbers, initialize the accu with a pattern not equal to zero, or perform a kind of carry look-ahead-technique. All these approaches are described in detail in Kulisch (Computer arithmetic and validity—theory, implementation, and applications. Series: de Gruyter Studies in Mathematics 33, 2008) and Bohlender (Computer arithmetic and self-validating numerical methods, Academic Press, San Diego, 1990). In this paper we propose a long accumulator similar to a carry-save adder. The main idea is to augment the long accumulator with cache information. The cache is used to store the carries or borrows, instead of propagating them through the whole accumulator every time. Due to the cache, operations are kept local in our approach. The full information of the exact result is represented by the accu and the cache. When we want to deliver the result we have to add the contents of the cache into the accumulator. In this paper we present an implementation in software and compare it with other approaches. Furthermore we discuss the advantages of this algorithm for a hardware implementation. 相似文献

9.

From binary to grey tone image processing using fuzzy logic concepts

Vartkes Goetcherian 《Pattern recognition》1980,12(1):7-15

The concept of fuzzy logic was introduced by Zadeh in 1965 as a tool for analysing complex systems and decision processes. In fuzzy set theory, the binary Boolean operations of OR and AND generalise to the min and max functions. Nakagawa(⁸) later introduced the idea of local min and max operators to image processing as the grey equivalents to the parallel binary operations of shrinking and expanding. In this paper the ideas of min and max are further investigated and their usefulness in image processing assessed by extending some already well known binary processes into grey level algorithms. The functions of min_π and max_π are introduced (deleted neighbourhood min_π and max_π) as the analogues of nearest neighbour “propagation” signals of binary images. Examples of grey edge detection, spatial filtering, object labelling and grey thinning algorithms are given. 相似文献

10.

Binary relations,matrices and inference developments

Jack Minker 《Information Systems》1978,3(1):37-47

The relationships among binary relations, directed graphs and boolean matrices are discussed. The development of new relations from given relations is described in terms of boolean matrix operations.The development of new relations from given relations is equivalent to inferring new relationships from the given facts and the operations on these facts. Given a directed graph, one can infer all the facts possible by completing the graph. An alternative method is to store the given facts and upon being queried, one derives the results required.Two algorithms are described which employ matrix operations to make explicit new relationships that are implicit as a result of the given facts and general rules which map sets of relations into new relations. The two algorithms are analyzed in terms of space and time requirements. Comments are made concerning the practicality of the approach. 相似文献

11.

RS-store: RDMA-enabled skiplist-based key-value store for efficient range query

Chenchen HUANG Huiqi HU Xuecheng Qi Xuan ZHOU Aoying ZHOU 《Frontiers of Computer Science》2021,15(6):156617

Many key-value stores use RDMA to optimize the messaging and data transmission between application layer and the storage layer, most of which only provide point-wise operations. Skiplist-based store can support both point operations and range queries, but its CPU-intensive access operations combined with the high-speed network will easily lead to the storage layer reaches CPU bottlenecks. The common solution to this problem is offloading some operations into the application layer and using RDMA bypassing CPU to directly perform remote access, but this method is only used in the hash tablebased store. In this paper, we present RS-store, a skiplist-based key-value store with RDMA, which can overcome the CPU handle of the storage layer by enabling two access modes: local access and remote access. In RS-store, we redesign a novel data structure R-skiplist to save the communication cost in remote access, and implement a latch-free concurrency control mechanism to ensure all the concurrency during two access modes. RS-store also supports client-active range query which can reduce the storage layer’s CPU consumption. At last, we evaluate RS-store on an RDMA-capable cluster. Experimental results show that RS-store achieves up to 2x improvements over RDMA-enabled RocksDB on the throughput and application’s scalability. 相似文献

12.

The Compatibility of Binary Characters on Phylogenetic Networks: Complexity and Parameterized Algorithms

Iyad A. Kanj Luay Nakhleh Ge Xia 《Algorithmica》2008,51(2):99-128

In a recent article, Nakhleh, Ringe and Warnow introduced perfect phylogenetic networks—a model of language evolution where languages do not evolve via clean speciation—and formulated a set of problems for their accurate reconstruction. Their new methodology assumes networks, rather than trees, as the correct model to capture the evolutionary history of natural languages. They proved the NP-hardness of the problem of testing whether a network is a perfect phylogenetic one for characters exhibiting at least three states, leaving open the case of binary characters, and gave a straightforward brute-force parameterized algorithm for the problem of running time O(3^k n), where k is the number of bidirectional edges in the network and n is its size. In this paper, we first establish the NP-hardness of the binary case of the problem. Then we provide a more efficient parameterized algorithm for this case running in time O(2^k n ²). The presented algorithm is very simple, and utilizes some structural results and elegant operations developed in this paper that can be useful on their own in the design of heuristic algorithms for the problem. The analysis phase of the algorithm is very elegant using amortized techniques to show that the upper bound on the running time of the algorithm is much tighter than the upper bound obtained under a conservative worst-case scenario assumption. Our results bear significant impact on reconstructing evolutionary histories of languages—particularly from phonological and morphological character data, most of which exhibit at most two states (i.e., are binary), as well as on the design and analysis of parameterized algorithms. Research of I.A. Kanj was supported in part by DePaul University Competitive Research Grant. 相似文献

13.

On-line scheme for computing rotation factors

《Journal of Parallel and Distributed Computing》1988,5(3):209-227

An integrated radix-2 on-line algorithm for computing rotation factors for matrix transformations is presented. The inputs are in sign-and-magnitude, floating-point representation and the outputs can be used in on-line signed-digit or in parallel form. The exponents are computed using conventional arithmetic while the significands are processed using on-line algorithms. The conventional result is obtained by using an on-the-fly conversion scheme. The rotation factors are computed in 10 + n clock cycles for n-bit significands. The clock period is kept small by the use of redundant adder schemes and low-precision estimates. The implementation and performance of the algorithm are discussed and compared with other approaches. 相似文献

14.

VRS51L3074上增强型算术单元的高效算术运算

李翔李智王勇军《单片机与嵌入式系统应用》2010,(1):35-37

Ramtron公司推出的VRS51L3074单片机拥有增强型算术单元,能够实现16位乘除法、乘加和移位等操作。本文分析了该单元的特性及使用要点,并给出利用该单元实现的2个实用算法——32位有符号整数开平方和16位二进制数转BCD码。实践表明．该单元可有效提高VRS51L3074处理复杂算术运算的效率。相似文献

15.

On the performance of transputer arrays for dense linear systems

J. Boreddy A. Paulraj 《Parallel Computing》1990,15(1-3):107-117

In this paper, computation and communication performance is evaluated for single and multitransputer arrays. Performance models are proposed for Occam program execution, under Transputer Development System TDS2. The performance features of normalised arithmetic, concurrent floating and integer arithmetic, logarithmic array indexing, and on-chip/off-chip RAM are studied. The startup time, byte transfer rate, asymptotic link bandwidth, and half performance message length are estimated for simultaneous operation of one, two, three, and four links at 10/20 MHz clock in unidirectional/bidirectional modes. The impact of various performance maximisation techniques on execution time is also addressed.

The matrix factorisation algorithms for dense linear systems are chosen as the focus for this study. The implementations include LUD, Householder, Gauss-Jordan, Choleski, and Givens methods. Floating point operations count alone is inadequate to estimate computation time; many other factors such as array indexing, load/store overhead, and loop overhead play a significant role in the transouter performance for the dense linear systems. The reduction in array indexing overheads in multitransputer arrays may result in superlinear speedups. 相似文献

16.

DSolving: a novel and efficient intelligent algorithm for large-scale sliding puzzles

GuiPing Wang Ren Li 《人工智能实验与理论杂志》2013,25(4):809-822

Abstract

Sliding puzzles are classic and ancient intellectual problems. Since the amount of states in a sliding puzzle equals to the factorial of the number of tiles including the blank tile, traditional algorithms are only effective for small-scale ones, e.g. 8-puzzle. This article proposes a novel and efficient algorithm called DSolving for large-scale sliding puzzles. DSolving adopts direct solving manner. It does not need to store the intermediate states. Therefore, theoretically, it can solve any scale sliding puzzle. For general number tiles except the after-mentioned ones, DSolving adopts an efficient method to quickly move them to their target locations along shortest paths. For the top-right 3 × 2 corner sub-puzzles beginning with the last two positions in each row, DSolving constructs a state transition table (STT), which can ensure those two number tiles in the top-right corner be moved and placed correctly. The last two rows of tiles are considered as several 2 × 3 sub-puzzles, and another STT is constructed to solve these 2 × 3 sub-puzzles. These two STTs reduce corresponding problems to simple table-look-up operations. Experimental results show that DSolving exhibits high time-efficiency and stability. It takes only 4–5 ms to solve a random instance of 20 × 20 sliding puzzle on a general personal computer. 相似文献

17.

Fast exponentiation by folding the signed-digit exponent in half

《国际计算机数学杂志》2012,89(10):1251-1259

For modern cryptographic systems, the public key cryptosystem such as RSA requires modular exponentiation (M ^E mod?N). The M, E and N are either as large as the 1024-bit integers or even larger, it is not a very good idea to directly compute M ^E mod?N. Recently, there are many techniques have been invented to solve the time-consuming computations of such time-consuming modular exponentiation. Among these useful algorithms, the “binary (square-and-multiply) algorithm” reduces the amount of modulo multiplications. As the “signed-digit representation algorithm” has the property of the nonzero digit occurrence probability equals to 1/3, taking this advantage, this method can more effectively decrease the amount of modular multiplications. Moreover, by using the technique of recording the common parts in the folded substrings, the “folding-exponent algorithm” can improve the efficiency of the binary algorithm, thus can further decrease the computational complexity of modular exponentiation. In this paper, a new modular exponentiation algorithm is proposed which based on the binary algorithm, signed-digit representation, and the folding-exponent technique. By using the parallel processing technique, in our proposed method, the modular multiplications and modular squaring can be executed in parallel, and thus lower down the computational complexity to k?+?3 multiplications. As modular squaring operation over GF(2ⁿ) is carried out by a simple cyclic right shift operation, the computational complexity of our proposed method can be further reduced to 29k/36?+?3 multiplications. 相似文献

18.

Approximation by interval Bezier curves 总被引：5，自引：0，他引：5

Sederberg T.W. Farouki R.T. 《Computer Graphics and Applications, IEEE》1992,12(5):87-95

相似文献

19.

An adaptive reversible data hiding scheme based on prediction error histogram shifting by exploiting signed-digit representation

Xie Xiao-Zhu Chang Chin-Chen Hu Yu-Chen 《Multimedia Tools and Applications》2020,79(33-34):24329-24346

A prediction error histogram shifting (PEHS)-based reversible data hiding scheme is proposed in this paper. A novel representation for the secret stream, called signed-digit representation, is proposed to improve the image quality. The secret binary stream is first converted into a signed-digit stream, which results in a high occurrence of ‘0’. Meanwhile, a block-wise-based prediction is performed on the original image to generate prediction errors, which lead to a sharp prediction error histogram. Then, the converted signed-digit stream is embedded into the prediction errors according to the improved histogram shifting (HS)-based scheme with multiple selected peak points, resulting in an adaptive embedding capacity. The experimental results validate that the proposed scheme outperforms state-of-the-art schemes in terms of embedding capacity while maintaining a good image quality.

相似文献

20.

A unified architecture for a public key cryptographic coprocessor

Yi Douglas L. Jussipekka Leiwo 《Journal of Systems Architecture》2008,54(10):1004-1016

This paper presents a unified architecture for public key cryptosystems that can support the operations of the Rivest–Shamir–Adleman cryptogram (RSA) and the elliptic curve cryptogram (ECC). A hardware solution is proposed for operations over finite fields GF(p) and GF(2^p). The proposed architecture presents a unified arithmetic unit which provides the functions of dual-field modular multiplication, dual-field modular addition/subtraction, and dual-field modular inversion. A new adder based on the signed-digit (SD) number representation is provided for carry-propagated and carry-less operations. The critical path of the proposed design is reduced compared with previous full adder implementation methods. Experimental results show that the proposed design can achieve a clock speed of 1 GHz using 776 K gates in a 0.09 μm CMOS standard cell technology, or 150 MHz using 5227 CLBs in a Xilinx Virtex 4 FPGA. While the different technologies, platforms and standards make a definitive comparison difficult, based on the performance of our proposed design, we achieve a performance improvement of between 30% and 250% when compared with existing designs. 相似文献