Similar literature
 Found 20 similar articles; search took 980 ms
1.
The Secure Hash Algorithm is the most popular hash function currently used in many security protocols such as SSL and IPSec. As with other cryptographic algorithms, the hardware implementation of hash functions is of great importance for high-speed applications. Because of the iterative structure of hash functions, a single error in their hardware implementation could result in a large number of errors in the final hash value. In this paper, we propose a novel time-redundancy-based fault diagnostic scheme for the implementation of SHA-1 and SHA-512 round computations. This scheme can detect permanent as well as transient faults, as opposed to the traditional time redundancy technique, which is only capable of detecting transient errors. The proposed design does not impose significant timing overhead on the original implementation of the SHA-1 and SHA-512 round computations. We have implemented the proposed design for SHA-1 and SHA-512 on a Xilinx xc2p7 FPGA. It is shown that the proposed fault-detecting SHA-1 and SHA-512 round computations incur, respectively, a 3% and 10% reduction in throughput with 58% and 30% area overhead compared to the original schemes. Fault simulation of the implementation shows that almost 100% fault coverage can be achieved using the proposed scheme for transient and permanent faults.

2.
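An Implementation of the SHA-1 Algorithm Based on a Loop-Unrolled Structure   Total citations: 1, self-citations: 0, citations by others: 1
Hash algorithms are used in information security mainly to verify data integrity and to authenticate signatures. Based on an in-depth analysis of the SHA-1 algorithm, a hardware scheme for implementing it quickly is proposed. The scheme changes the iterative structure of the standard algorithm, reducing the number of clock cycles needed to process a message and thereby raising throughput. Compared with other IP cores, the design shows clear advantages in area, frequency, and throughput.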

3.
The growth of blockchain-based cryptocurrencies has attracted a lot of attention from a variety of fields, especially academic research. One of them is Bitcoin, the most popular and highest-valued cryptocurrency on the market. SHA256 is the main processing step in Bitcoin mining, whose difficulty is already extremely high and still increases relentlessly. Hence, it is essential to improve the speed of the SHA256 cores in a Bitcoin mining system. In this paper, we propose a two-level pipeline hardware architecture for SHA256 processing. The first-level pipeline reduces the number of operating cycles, and the second-level pipeline boosts the maximum frequency of the system. The proposed hardware is implemented on a Xilinx Virtex 7-VC707 FPGA (28 nm technology). The mining hash rate using the proposed pipelined SHA256 cores reaches 514.92 MH/s, a 2.4-fold improvement over the conventional FPGA-based technique. The throughput of the SHA core in the current study is 296.108 Gbps, 240 times higher than that of the standard technique. The proposed architecture is also implemented as an ASIC design using ROHM 180 nm CMOS technology, which resulted in a throughput of 69.28 Gbps, 18 times higher than that of conventional work implemented in an Intel 14 nm process.
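As a point of reference for item 3, the computation a Bitcoin mining core repeats is a double SHA-256 over an 80-byte block header in which a 32-bit nonce is varied. The following is a minimal software sketch of that search loop, not the two-level pipeline described in the abstract. The names `double_sha256` and `mine`, the all-zero 76-byte header prefix, and the toy target are assumptions chosen for illustration only.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_prefix: bytes, target: int, max_nonce: int = 1 << 20):
    """Search for a nonce whose double-SHA-256 digest is below the target.

    header_prefix is the first 76 bytes of the header (hypothetical here);
    the 4-byte little-endian nonce is appended to form the 80-byte header.
    Returns (nonce, digest), or None if no nonce below max_nonce works.
    """
    for nonce in range(max_nonce):
        header = header_prefix + struct.pack("<I", nonce)
        digest = double_sha256(header)
        # The digest is compared with the target as a little-endian integer.
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
    return None


# Toy parameters (assumptions): a dummy prefix and a very easy target,
# so that roughly one digest in 256 qualifies.
toy_prefix = b"\x00" * 76
toy_target = 1 << 248
result = mine(toy_prefix, toy_target)
```

Each candidate nonce costs two full SHA-256 evaluations in this loop, which is why the abstract treats the SHA256 core as the part of a miner worth pipelining.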

4.
A new technique for Boolean random masking of the logic AND operation in terms of NAND logic gates is proposed and applied for masking integer addition. The new technique can be used for masking arbitrary cryptographic functions and is more efficient than previously known techniques, recently applied to the Advanced Encryption Standard (AES). New techniques for the conversions from Boolean to arithmetic random masking and vice versa are also developed. They are hardware-oriented and do not require additional random bits. Unlike the previous, software-oriented techniques, which show a substantial difference in the complexity of the two conversions, they have comparable complexity, about the same as that of a single integer addition. All the techniques proposed are in theory secure against first-order differential power analysis at the logic gate level. They can be applied in hardware implementations of various cryptographic functions, including AES, (keyed) SHA-1, IDEA, and RC6.
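To make the two masking representations in item 4 concrete, the sketch below shows a value split into a masked share and a random mask under Boolean masking (XOR) and under arithmetic masking (subtraction modulo 2^32). It only illustrates why conversions between the two are needed: XOR is cheap on Boolean shares and addition is cheap on arithmetic shares. It does not implement the NAND-based technique or the conversions from the abstract, it is not a side-channel-secure implementation, and the 32-bit width and all function names are assumptions.

```python
import secrets

WIDTH = 32                      # assumed word size
MASK = (1 << WIDTH) - 1


def boolean_mask(x: int):
    """Return (x XOR r, r) for a fresh random mask r."""
    r = secrets.randbits(WIDTH)
    return (x ^ r) & MASK, r


def arithmetic_mask(x: int):
    """Return ((x - r) mod 2^WIDTH, r) for a fresh random mask r."""
    r = secrets.randbits(WIDTH)
    return (x - r) & MASK, r


def boolean_unmask(share: int, r: int) -> int:
    return share ^ r


def arithmetic_unmask(share: int, r: int) -> int:
    return (share + r) & MASK


def masked_xor(a, b):
    """XOR two Boolean-masked values share by share, never unmasking."""
    (a_share, a_r), (b_share, b_r) = a, b
    return a_share ^ b_share, a_r ^ b_r


def masked_add(a, b):
    """Add two arithmetically masked values share by share."""
    (a_share, a_r), (b_share, b_r) = a, b
    return (a_share + b_share) & MASK, (a_r + b_r) & MASK
```

An algorithm such as SHA-1 mixes XOR-style logic with modular addition, so a masked implementation has to either switch representations between steps or mask the addition directly at gate level, which is the problem the abstract addresses.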

5.
Adding a reconfigurable processing unit (RPU) to a general-purpose processor is a promising way to improve performance significantly. In this paper, a Reconfigurable Multi-Core (RMC) architecture combining general multi-core and reconfigurable logic is proposed. The reconfigurable logic is logically separated into RPUs, which are coupled with general-purpose cores as co-processors via a full crossbar switch. An RPU Manager (RPU-M) is also designed to manage the RPUs. To verify RMC, a simulation method based on Simics and a Virtex 5 FPGA is adopted, which simplifies the simulation and assures the evaluation accuracy of hardware function cores. Five workloads are selected to test RMC: 3-DES, AES, SHA2, IDCT, and JPEG_ENC. The experimental results show a 3.10 times average speedup over the software implementation on the original multi-core, and the data and control communication overhead on RMC is acceptable.

6.
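SHA is a secure hash algorithm designed by the U.S. National Security Agency (NSA). It is mainly used for communication integrity verification and digital signature authentication. With area optimization as the goal, and working from system-level design down to module-level design with a concrete design as an example, the SHA-1 algorithm is implemented in a smart card chip at a small area cost. The work is of general reference value for the design of similar hash algorithms.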

7.
While hardware/software partitioning has been shown to provide significant performance gains, most hardware/software partitioning approaches are limited to partitioning computational kernels that use integer or fixed point implementations. Software developers often initially develop an application using the floating point representations built into most programming languages and later convert the application to a fixed point representation, a potentially time-consuming process. In this paper, we present the Arizona Float Fixed Hardware Library (AFFHL), consisting of efficient, configurable floating point to fixed point and fixed point to floating point hardware converters. By utilizing these converters, a system’s hardware/software implementation can be separated into a floating point domain consisting of the microprocessor and memory subsystem and a fixed point domain consisting of one or more partitioned hardware coprocessors. This separation enables a rapid hardware/software partitioning approach in which floating point software kernels can be implemented using fixed point hardware coprocessors without the need for application developers to first rewrite software applications as fixed point implementations. We further present an overview of a basic hardware/software partitioning methodology for rapidly partitioning computational kernels within floating point software applications to either statically determined fixed point hardware coprocessors or dynamically adaptable fixed point hardware coprocessors in which the required fixed point representation can be dynamically determined and adjusted at runtime.
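The conversions that item 7 moves into hardware can be sketched in software as scaling by a power of two. The example below is a minimal illustration assuming a signed two's-complement format with a configurable number of fractional bits, round-to-nearest, and saturation on overflow. These choices and the function names are assumptions and are not taken from the AFFHL library itself.

```python
def float_to_fixed(x: float, frac_bits: int, total_bits: int = 32) -> int:
    """Convert a float to a signed fixed point integer with frac_bits
    fractional bits, saturating to the range of total_bits."""
    scaled = round(x * (1 << frac_bits))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def fixed_to_float(q: int, frac_bits: int) -> float:
    """Convert a signed fixed point integer back to a float."""
    return q / (1 << frac_bits)


# With 8 fractional bits, 1.5 is stored as 1.5 * 256 = 384.
q = float_to_fixed(1.5, frac_bits=8)
back = fixed_to_float(q, frac_bits=8)
```

Making `frac_bits` a parameter mirrors the configurable converters in the abstract: the same routine serves whichever fixed point representation a given coprocessor expects.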

8.
This paper presents a hardware efficient system-on-chip (SoC) sensor architecture for ultrasonic imaging applications that uses the split-spectrum processing (SSP) algorithm. The SSP design is realized using recursive subband decomposition techniques for achieving minimal hardware and power consumption. Recursive implementations of discrete Fourier transform (DFT) and discrete cosine transform (DCT) are presented for subband decomposition which result in sparse transform operations and significantly reduced hardware and power requirements. A comparative study and performance results present the advantages of the recursive hardware architecture compared to the conventional implementation of the SSP algorithm using IP cores for FFT.

9.
As new applications in embedded communications and control systems push the computational limits of digital signal processing (DSP) functions, there will be an increasing need for software applications to be migrated to hardware in the form of a hardware-software codesign system. In many cases, access to the high-level source code may not be available. It is thus desirable to have a technology to translate the software binaries intended for processors to hardware implementations. This paper provides details on the retargetable FREEDOM compiler. The compiler automatically translates DSP software binaries to register-transfer level (RTL) VHDL and Verilog for implementation on field-programmable gate arrays (FPGAs) as standalone or system-on-chip implementations. We describe the underlying optimizations and some novel algorithms for alias analysis, data dependency analysis, memory optimizations, procedure call recovery, and back-end code scheduling. Experimental results on resource usage and performance are shown for several program binaries intended for the Texas Instruments C6211 DSP (VLIW) and the ARM922T reduced instruction set computer (RISC) processors. Implementation results for four kernels from the Simulink demo library and others from commonly used DSP applications, such as MPEG-4, Viterbi, and JPEG, are also discussed. The compiler-generated RTL code is mapped to Xilinx Virtex II and Altera Stratix FPGAs. We record overall performance gains of 1.5-26.9 for the hardware implementations of the kernels. Comparisons with the power aware compiler techniques (PACT) high-level synthesis compiler are used to show that software binaries can be used as intermediate representations from any high-level language and generate efficient hardware implementations.

10.
Four implementations of fault-tolerant software techniques are evaluated with respect to hardware and design faults. Project participants were divided into four groups, each of which developed fault-tolerant software based on a common specification. Each group applied one of the following techniques: N-version programming, recovery block, concurrent error-detection, and algorithm-based fault tolerance. Independent testing and modeling groups analyzed the software. The testing group subjected it to simulated design and hardware faults. The data were then mapped into a discrete-time Markov model developed by the modeling group. The effectiveness of each technique with respect to availability, correctness, and time to failure given an error, as shown by the model, is contrasted with measured data. The model is analyzed with respect to additional figures of merit identified during the modeling process, and the techniques are ranked using an application taxonomy.

11.
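朱宁龙, 戴紫彬, 张立朝, 赵峰. 《微电子学》 (Microelectronics), 2015, 45(6): 777-780, 784
In view of the differing hash algorithm standards and application requirements at home and abroad, a data-flow reconfigurable design approach is adopted. Based on an analysis of the distinct characteristics of SM3 and the SHA-2 family of hash algorithms, a unified processing model is derived and a new hardware architecture is designed. With this architecture, SM3, SHA-256, SHA-384, and SHA-512 can each be implemented individually and flexibly, according to the hash security strength that different environments require. Experimental results show that the designed hardware circuit effectively reduces hardware resource consumption, raises system throughput, and can meet the requirements of domestic and international commercial hash algorithms.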
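The algorithms unified in item 11 differ mainly in word size, block size, and digest length, which is what makes a shared processing model plausible. As a software illustration only, the sketch below queries those parameters through Python's standard `hashlib` module. The helper names `describe` and `digest` are assumptions, and SM3 is only probed for, since its availability in `hashlib` depends on the underlying OpenSSL build.

```python
import hashlib

SHA2_VARIANTS = ("sha256", "sha384", "sha512")


def describe(name: str) -> dict:
    """Report digest and block sizes (in bits) for a hashlib algorithm."""
    h = hashlib.new(name)
    return {"digest_bits": h.digest_size * 8, "block_bits": h.block_size * 8}


def digest(name: str, message: bytes) -> str:
    """Hash a message with the named algorithm and return hex output."""
    return hashlib.new(name, message).hexdigest()


summary = {name: describe(name) for name in SHA2_VARIANTS}

# SM3 is not guaranteed to be present; check before using it.
sm3_available = "sm3" in hashlib.algorithms_available
if sm3_available:
    summary["sm3"] = describe("sm3")
```

SHA-256 works on 512-bit blocks, while SHA-384 and SHA-512 share a 1024-bit block and differ in initial values and output truncation, so a reconfigurable datapath has to cover two block widths rather than three unrelated designs.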

12.
Hardware/software co-design of the Stanford FLASH multiprocessor   Total citations: 1, self-citations: 0, citations by others: 1
Hardware/software co-design is a methodology for solving design problems in systems with processors or embedded controllers where the design requirements mandate a functionality and performance level for the system, independent of the hardware and software boundary. In addition to the challenges of functional correctness and total system performance, design time is often a critical factor. To design MAGIC, the programmable memory and communication controller for the Stanford FLASH multiprocessor, the authors employed a hardware/software co-design methodology. This methodology allowed them to design the hardware and software concurrently, thereby reducing design time while ensuring that the design would meet ambitious performance goals. Serializing the hardware and software design would have lengthened the design time and significantly increased the amount of redesign when the tradeoffs between the hardware and software implementations became clear late in the design process. The co-design approach led them to build a series of hierarchical simulators that allowed them to begin design verification early and to reduce the level of effort required to ensure a functional design.

13.
A successful hardware/software architecture that resolves performance bottlenecks at the workstation-to-network host interface and offers high end-to-end performance is described. The solution reported carefully splits protocol processing functions into hardware and software implementations. The interface hardware is highly parallel and performs all per-cell functions with dedicated logic to maximize performance. Software provides support for the transfer of data between the interface and application memory, as well as the state management necessary for virtual circuit setup and maintenance. In addition, all higher-level protocol processing is implemented in host software. The prototype connects a RISC System/6000 to a SONET-based asynchronous transfer mode (ATM) network carrying data at the OC-3c rate of 155 Mb/s. An experimental evaluation of the interface hardware and software has been performed. Several conclusions are drawn about this host interface architecture and the workstations to which it is connected.

14.
The technical analysis used in determining which of the potential Advanced Encryption Standard candidates was selected as the Advanced Encryption Algorithm includes efficiency testing of both hardware and software implementations of candidate algorithms. Reprogrammable devices such as field-programmable gate arrays (FPGAs) are highly attractive options for hardware implementations of encryption algorithms, as they provide cryptographic algorithm agility, physical security, and potentially much higher performance than software solutions. This contribution investigates the significance of FPGA implementations of the Advanced Encryption Standard candidate algorithms. Multiple architectural implementation options are explored for each algorithm. A strong focus is placed on high-throughput implementations, which are required to support security for current and future high bandwidth applications. Finally, the implementations of each algorithm will be compared in an effort to determine the most suitable candidate for hardware implementation within commercially available FPGAs.

15.
Embedded digital signal processing (DSP) systems are usually associated with real-time constraints and/or high data rates, such that purely software implementations are often not satisfactory. In that case, mixed hardware/software implementations are to be investigated. This paper presents the design of a HW/SW G.729 voice decoder dedicated to embedded systems. The decoder has been built around, on the one hand, a reconfigurable digital circuit (FPGA) that implements the so-called IP hardware part (the autocorrelation computation) using a linear systolic array and, on the other hand, a digital signal processor (DSP) for the remainder of the algorithm. Besides being driven by the use of reusable components (IP), such an implementation is of great interest for new G.729-based applications such as Voice over IP (VoIP). It results in an overall reduction of the execution time per frame. Another interesting point is the design of a parameterizable autocorrelation block, which can be useful for a wide range of applications such as GSM at 13 Kbit/s, APC at 9.6 Kbit/s, and G.723 at 6.3 Kbit/s and 5.3 Kbit/s. In the G.729 context and using a V50 Virtex FPGA, this function executes 10 times faster than a TMS320C6201 DSP implementation.

16.
An overview of configurable computing machines for software radio handsets   Total citations: 2, self-citations: 0, citations by others: 2
The advent of software radios has brought a paradigm shift to radio design. A multimode handset with dynamic reconfigurability has the promise of integrated services and global roaming capabilities. However, most of the work to date has been focused on software radio base stations, which do not have as tight constraints on area and power as handsets. Base station software radio technology has progressed dramatically with advances in system design, adaptive modulation and coding techniques, reconfigurable hardware, A/D converters, RF design, and rapid prototyping systems, and has helped bring software radio handsets a step closer to reality. However, supporting multimode radios on a small handset still remains a design challenge. A configurable computing machine (CCM), which is an optimized FPGA with application-specific capabilities, shows promise for software radio handsets in optimizing hardware implementations for heterogeneous systems. In this article, contemporary CCM architectures that allow dynamic hardware reconfiguration with maximum flexibility are reviewed and assessed. This is followed by design recommendations for CCM architectures for use in software radio handsets.

17.
Detailed techniques and experimental results are given on circuitry for clock recovery in a 2 Gbit/s digital communications system. The approach used can readily be extended to data rates in excess of 10 Gbit/s.

18.
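This paper proposes an implementation architecture for a fair queuing scheduler based on group units of discrete reserved rates and packet lengths. The architecture can conveniently and flexibly provide different reserved-bandwidth precision according to different reserved-rate requirements. The modular design of the group units, together with pipelining, allows hardware logic resources to be used more effectively. A fixed-point timestamp reconstruction technique suited to the architecture is also proposed, which effectively saves the external memory needed to store flow timestamps. Algorithm simulation and FPGA synthesis results show that the architecture supports a 1.2 Gbit/s output link; with effective integration, the design can be further applied to high-speed routers with OC-48 (2.4 Gbps) port rates.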

19.
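Capacity expansion techniques that add or replace a small number of 10 Gbit/s channels in an existing 8/16×2.5 Gbit/s wavelength-division multiplexing (WDM) system are introduced, including the retrofit scheme for WDM terminal equipment and the technologies and test items applied in the expansion project.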

20.
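The DS2432, produced by the U.S. company Maxim (美信), is an automatic encryption circuit that contains a SHA-1 encryption engine and makes hardware designs more secure and reliable. Based on the operating principle of the DS2432, a design method for a 1-Wire bus USB port adapter with a software dongle is proposed. The hardware circuit is introduced, analyzed, and explained, and the software encryption flow is given.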
