首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.

The Needleman-Wunsch (NW) is a dynamic programming algorithm used in the pairwise global alignment of two biological sequences. In this paper, three sets of parallel implementations of the NW algorithm are presented using a mixture of specialized software and hardware solutions: POSIX Threads-based, SIMD Extensions-based and a GPU-based implementations. The three implementations aim at improving the performance of the NW algorithm on large scale input without affecting its accuracy. Our experiments show that the GPU-based implementation is the best implementation as it achieves performance 72.5X faster than the sequential implementation, whereas the best performance achieved by the POSIX threads and the SIMD techniques are 2X and 18.2X faster than the sequential implementation, respectively.

  相似文献   

2.
Code cloning is one of the active research areas in the software engineering community. Specifically, researchers have conducted numerous empirical studies on code cloning and reported that 7 % to 23 % of the code in a typical software system has been cloned. However, there was less awareness of code clones in dynamically-typed languages and most studies are limited to statically-typed languages such as Java, C, and C++. In addition, most previous studies did not consider different application domains such as standalone projects or web applications. As a result, very little is known about clones in dynamically-typed languages, such as JavaScript, in different application domains. In this paper, we report a large-scale clone detection experiment in a dynamically-typed programming language, JavaScript, for different application domains: web pages and standalone projects. Our experimental results showed that unlike JavaScript standalone projects, JavaScript web applications have 95 % of inter-file clones and 91–97 % of widely scattered clones. We observed that web application developers created clones intentionally and such clones may not be as risky as claimed in previous studies. Understanding the risks of cloning in web applications requires further studies, as cloning may be due to either good or bad intentions. Also, we identified unique development practices such as including browser-dependent or device-specific code in code clones of JavaScript web applications. This indicates that features of programming languages and technologies affect how developers duplicate code.  相似文献   

3.
Multiplication of polynomials of large degrees is the predominant operation in lattice-based cryptosystems in terms of execution time. This motivates the study of its fast and efficient implementations in hardware. Also, applications such as those using homomorphic encryption need to operate with polynomials of different parameter sets. This calls for design of configurable hardware architectures that can support multiplication of polynomials of various degrees and coefficient sizes.In this work, we present the design and an FPGA implementation of a run-time configurable and highly parallelized NTT-based polynomial multiplication architecture, which proves to be effective as an accelerator for lattice-based cryptosystems. The proposed polynomial multiplier can also be used to perform Number Theoretic Transform (NTT) and Inverse NTT (INTT) operations. It supports 6 different parameter sets, which are used in lattice-based homomorphic encryption and/or post-quantum cryptosystems. We also present a hardware/software co-design framework, which provides high-speed communication between the CPU and the FPGA connected by PCIe standard interface provided by the RIFFA driver [1]. For proof of concept, the proposed polynomial multiplier is deployed in this framework to accelerate the decryption operation of Brakerski/Fan-Vercauteren (BFV) homomorphic encryption scheme implemented in Simple Encrypted Arithmetic Library (SEAL), by the Cryptography Research Group at Microsoft Research [2]. In the proposed framework, polynomial multiplication operation in the decryption of the BFV scheme is offloaded to the accelerator in the FPGA via PCIe bus while the rest of operations in the decryption are executed in software running on an off-the-shelf desktop computer. The hardware part of the proposed framework targets Xilinx Virtex-7 FPGA device and the proposed framework achieves the speedup of almost 7 ×  in latency for the offloaded operations compared to their pure software implementations, excluding I/O overhead.  相似文献   

4.
随着量子计算机的发展,传统的公钥加密方案,如RSA加密和椭圆曲线加密算法(Elliptic curve cryptography,ECC)受到了严重威胁。为了对抗量子攻击,基于格的密码学引起了关注,其中环错误学习(Ring-learning with error,R-LWE)格加密算法具有电路实现简单、抗量子攻击等优点,在硬件加密领域具有极大的应用潜力。本文从硬件应用的角度,提出并实现了一种R-LWE加密方案中多项式乘法的并行电路结构,采用了数论转换(Number theoretic transforms, NTT)方法,并使用了两个并行的蝶形运算单元。结果表明在增加较少硬件资源的情况下,本文设计的算法提升了42%的运算速度。  相似文献   

5.
The Applicative Programming System Architecture contains a novel Data Structure Memory (DSM) which supports fast access operations on compact linear data structures. Several problems that arise in implementations of applicative and functional programming languages can be solved efficiently using special data representations on the DSM. Each memory word in the DSM contains a very small local processor, and there is also a tree-structured communications network within the DSM. Therefore the DSM is a massively parallel SIMD machine. This paper describes a VLSI implementation of the DSM architecture and compares its performance with implementations on a conventional sequential computer and the NASA Massively Parallel Processor.  相似文献   

6.
基于集中式WLAN分离MAC架构的TKIP协议   总被引:1,自引:0,他引:1       下载免费PDF全文
实现强健安全网络(RSN)是大规模部署安全无线网络的最佳选择。该文分析了集中式WLAN结构的特点,比较了分离MAC方式与其他方式的不同之处,研究了TKIP协议封装原理。给出一种新的分离MAC功能的方法,在集中式WLAN分离MAC架构下设计实现了暂时密钥完整性协议(TKIP),描述了软件在嵌入式系统中的实现。对有待进一步解决的问题进行了讨论。  相似文献   

7.
In this paper, implementation and analysis of three different versions of pseudorandom bit generators (PRBG) based on elliptic curves over prime and binary fields is presented. Implementations are carried out so that the algorithms could be compared in terms of time complexity and sequences could be compared in terms of periodicity, since the periodicity of all the generated streams are not available in literature. Based on the results of implementation and analysis, the pseudorandom bit generators (PRBG) most suitable for software and hardware realisations of stream cipher are identified. The software implementations of PRBG are carried out using Mathematica and the implementations in VHDL are done using the Altera Quartus IIv6.0 simulation software. The Montgomery’s point multiplication method has also been discussed and implemented for comparison with the conventional point multiplication algorithm. Together with this, faster software algorithms for field inversion and point counting are discussed.  相似文献   

8.
Complex graphical user interfaces (GUIs) that support a large amount of user interaction require a fast response time, a rich set of building blocks for an esthetic look‐and‐feel, and a development environment that supports ongoing change. On the World Wide Web, client‐side technologies offer more of these features than do server‐side solutions. Java and JavaScript are the two most popular languages used for client‐side GUI implementations. Java implementations require a user to download a plug‐in that contains a virtual machine to execute the Java byte‐code. The installation and maintenance of this plug‐in is sometimes an unsurmountable barrier to using Java. JavaScript lacks some of the desirable features of Java, such as easy to use object‐oriented features and having a GUI class library, but does not require a plug‐in. We have enhanced JavaScript by implementing a new language Object‐JavaScript (OJS) and by providing an OJS library of GUI components, thus making it a viable alternative to Java. Copyright © 2000 John Wiley & Sons, Ltd.  相似文献   

9.
We demonstrate how both area coherence and parallelism can be exploited in order to speed up rendering operations on a SIMD square array of processors. Our algorithms take advantage of the method of differences, in order to incrementally compute the values of a linear polynomial function at discrete intervals and thus implement area rendering operations efficiently. We discuss how filling of convex polygons, hidden surface elimination and smooth shading can be implemented on an N × N processor array that supports planar arithmetic, that is, arithmetic operations performed on N × N matrices in parallel for all matrix elements. A major attraction of the method we present is that it is based on a SIMD processor array; such machines are now recognised as highly general purpose given the wide range of applications successfully implemented on them.  相似文献   

10.
In a variety of emerging networked computing system domains over the years, there have been bursts of activity on new medium access control (MAC) protocols, as new communication transceiver technologies with greater data‐movement performance or lower power dissipation have been introduced. To enable implementations flexible to evolving standards and improving application‐domain insight, such MAC protocols are typically initially implemented in software, and interface between applications or system software, typically executing on an embedded processor or microcontroller, and the evolving radio transceiver hardware. Many challenges exist in implementing MAC protocols across evolving or competing transceiver hardware implementations and processor architectures. Some of these challenges are peculiar to the requirements of MAC protocols, and others are a result of the plethora of system and processor architectures in the embedded systems domain. This article studies the challenges facing software implementations of MAC protocols running on embedded microcontrollers, and interfacing with radio transceiver hardware. Experience with an implementation of the IEEE 802.15.4 MAC across three hardware platforms with different processor, system, and systems software architectures is presented, focusing on implementation approach and interfaces. Pitfalls are pointed out, and guidelines are provided for ensuring that new MAC implementations are easily portable across processor architectures and transceiver hardware. Copyright © 2010 John Wiley & Sons, Ltd.  相似文献   

11.
Reuse in programming language development is an open research problem. Many authors have proposed frameworks for modular language development. These frameworks focus on maximizing code reuse, providing primitives for componentizing language implementations. There is also an open debate on combining feature-orientation with modular language development. Feature-oriented programming is a vision of computer programming in which features can be implemented separately, and then combined to build a variety of software products. However, even though feature-orientation and modular programming are strongly connected, modular language development frameworks are not usually meant primarily for feature-oriented language definition. In this paper we present a model of language development that puts feature implementation at the center, and describe its implementation in the Neverlang framework. The model has been evaluated through several languages implementations: in this paper, a state machine language is used as a means of comparison with other frameworks, and a JavaScript interpreter implementation is used to further illustrate the benefits that our model provides.  相似文献   

12.
Despite the popularity of JavaScript for client‐side web applications, there is a lack of effective software tools supporting JavaScript development and testing. The dynamic characteristics of JavaScript pose software engineering challenges such as program understanding and security. One important feature of JavaScript is that its objects support flexible mechanisms such as property changes at runtime and prototype‐based inheritance, making it difficult to reason about object behavior. We have performed an empirical study on real JavaScript applications to understand the dynamic behavior of JavaScript objects. We present metrics to measure behavior of JavaScript objects during execution (e.g., operations associated with an object, object size, and property type changes). We also investigated the behavioral patterns of observed objects to understand the coding or user interaction practices in JavaScript software. Copyright © 2015 John Wiley & Sons, Ltd.  相似文献   

13.
基于理想格构造的 Aigis-sig 数字签名方案具有实现效率高、签名长度短、抗量子攻击等优势。针对Aigis-sig方案,构造了一种改进的模乘计算元件,设计了一种基于快速数论变换(NTT)算法实现环上多项式运算的紧凑硬件架构;同时以此架构为基础,提出了Aigis-sig数字签名方案的FPGA软硬件协同实现方法。实验表明,在Xilinx Zynq-7000 SoC平台上,CPU频率和硬件频率分别设置为666.66 MHz和150 MHz时,该实现方案相较于纯软件实现,签名阶段和验签阶段分别取得约26%和17%的性能提升。  相似文献   

14.
15.
The aim was to develop a software tool for refractive surgeons using a standard user-friendly web-based interface, providing the user with a secure environment to protect large volumes of patient data. The software application was named "Internet-based refractive analysis" (IBRA), and was programmed with the computer languages PHP, HTML and JavaScript, attached to the opensource MySQL database. IBRA facilitated internationally accepted presentation methods including the stability chart, the predictability chart and the safety chart; it was able to perform vector analysis for the course of a single patient or for group data. With the integrated nomogram calculation, treatment could be customised to reduce the postoperative refractive error. Multicenter functions permitted quality-control comparisons between different surgeons and laser units.  相似文献   

16.
The RTAG (real-time asynchronous grammars) programming language is discussed. The language is based on an attribute grammar notation for specifying protocols. Its main design goals are: (1) to support concise and easily understood expression of complex real-world protocols; and (2) to serve as the basis of a portable software system for automated protocol implementation. The algorithms used in generating implementations from given specifications are sketched, and a Unix-based automated implementation system for RTAG is described  相似文献   

17.
Quantum control plays a key role in quantum technology, in particular for steering quantum systems. As problem size grows exponentially with the system size, it is necessary to deal with fast numerical algorithms and implementations. We improved an existing code for quantum control concerning two linear algebra tasks: the computation of the matrix exponential and efficient parallelisation of prefix matrix multiplication.For the matrix exponential we compare three methods: the eigendecomposition method, the Padé method and a polynomial expansion based on Chebyshev polynomials. We show that the Chebyshev method outperforms the other methods both in terms of computation time and accuracy. For the prefix problem we compare the tree-based parallel prefix scheme, which is based on a recursive approach, with a sequential multiplication scheme where only the individual matrix multiplications are parallelised. We show that this fine-grain approach outperforms the parallel prefix scheme by a factor of 2–3, depending on parallel hardware and problem size, and also leads to lesser memory requirements.Overall, the improved linear algebra implementations not only led to a considerable runtime reduction, but also allowed us to tackle problems of larger size on the same parallel compute cluster.  相似文献   

18.
Model driven engineering (MDE) of software product lines (SPLs) merges two increasing important paradigms that synthesize programs by transformation. MDE creates programs by transforming models, and SPLs elaborate programs by applying transformations called features. In this paper, we present the design and implementation of a transformational model of a product line of scalar vector graphics and JavaScript applications. We explain how we simplified our implementation by lifting selected features and their compositions from our original product line (whose implementations were complex) to features and their compositions of another product line (whose specifications were simple). We used operators to map higher-level features and their compositions to their lower-level counterparts. Doing so exposed commuting relationships among feature compositions in both product lines that helped validate our model and implementation.  相似文献   

19.
This study presents a platform for industrial, real-world simulation–optimization based on web techniques. The design of the platform is intended to be generic and thereby make it possible to apply the platform in various problem domains. In the implementation of the platform, modern web techniques, such as Ajax, JavaScript, GWT, and ProtoBuf, are used. The platform is tested and evaluated on a real industrial problem of production optimization at Volvo Aero Corporation, a company that develops and manufactures high-technology components for aircraft and gas turbine engines. The results of the evaluation show that while the platform has several benefits, implementing a web-based system is not completely straightforward. At the end of the paper, possible pitfalls are discussed and some recommendations for future implementations are outlined.  相似文献   

20.
张拥军  陈艇 《计算机应用》2015,35(4):1179-1184
针对3GPP-LTE协议中多输入多输出(MIMO)均衡算法的高复杂度和高吞吐率问题,提出了一种面向软件无线电的并行MIMO均衡处理器,该处理器采用单指令流多数据流(SIMD)和超长指令字(VLIW)技术同时开发子载波间MIMO均衡和子载波内矩阵运算的并行性,并且每一个SIMD功能单元能够支持16 bit定点和20 bit伪浮点复数向量运算和矩阵运算,满足不同天线配置的MIMO均衡算法对处理精度、延迟和功耗的要求。实验结果表明,MIMO均衡处理器的4×4矩阵逆运算吞吐率达到了95 MInversion/s,满足3GPP-LTE协议的要求,并且其灵活可编程性和可配置性能够支持不同的均衡算法。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号