期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Automatic Synthesis of FPGA Processor Arrays from Loop Algorithms

Marcus Bednara Jürgen Teich 《The Journal of supercomputing》2003,26(2):149-165

We consider the problem of automatic mapping of computation-intensive loop nests onto FPGA hardware. The regular cell array structure of these chips reflects the parallelism in regular loop-like computations. Furthermore, the flexibility of FPGAs allows the cost-effective implementation of reconfigurable high performance processor arrays. So far, there exists no continuous design flow that allows automated generation of FPGA configuration data from a loop nest specified in a high level language. Here, we present a methodology for automatic generation of synthesizable VHDL code specifying a processor array and optimized for FPGA implementation. 相似文献

2.

基于多FPGA的NoC多核处理器验证平台设计 总被引：1，自引：0，他引：1

黄晓林潘红兵易伟杨虎凌梦黄辰何书专李丽《计算机工程与设计》2012,33(1):180-185

为了能够灵活地验证和实现自主设计的基于NoC的多核处理器,缩短NoC多核处理器的设计周期,提出了设计集成4片Virtex-6-550T FPGA的NoC多核处理器原型芯片设计/验证平台.分析和评估了NoC多核处理器的规模以及对FPGA硬件资源的需求,在此基础上给出了集成4片FPGA的开发板详细设计方案,并对各主要模块如互联架构、电源、板级时钟分布、接口技术、存储资源等关键设计要点进行阐述.描述了开发板各个主要模块的测试过程和结果,表明了该设计的可行性. 相似文献

3.

Soft error susceptibility analysis methodology of HLS designs in SRAM-based FPGAs

《Microprocessors and Microsystems》2017

SRAM-based FPGAs are attractive to critical applications due to their reconfiguration capability, which allows the design to be adapted on the field under different upset rate environments. High level Synthesis (HLS) is a powerful method to explore different design architectures in FPGAs. In this paper, the HLS tool from Xilinx is used to generate different design architectures and then analyze the probability of errors in those architectures. Two different case studies scenarios are investigated. First, it is evaluated the influence of control flow and pipeline architectures combined with the use of specialized DSP blocks in the FPGA. The number of errors classified as silent data corruption and timeout according to the architectures and DSP blocks usage is analyzed. Moreover, more possibilities of HLS designs are explored such as data organization, aggressive pipeline insertion and the implementation of the algorithm in a soft processor like the Microblaze from Xilinx. These architectures are strongly optimized in performance and the least susceptible design under soft errors is investigated. All case-study designs are evaluated in a 28 nm SRAM-based FPGA under fault injection. The dynamic cross section, soft error rate and mean work between failures are calculated based on the experimental results. The proposed characterization method can be used to guide designers to select better architectures concerning the susceptibility to upsets and performance efficiency. 相似文献

4.

FTT—1：一个基于硬件的故障注入器的设计与实现 总被引：3，自引：0，他引：3

王建莹孙峻朝《计算机工程与设计》1998,19(4):14-19

故障注入是评价计算机系统可信性的一种重要的试验方法。构造故障注入器是故障注入研究中的一人重要组成部分。此文介绍了ＦＴＴ－１，一个基于硬件的故障注入器的设计与实现。文章讨论了设计与实现硬件故障注入器的关键技术，并介绍了在ＦＴＴ－１的实现中解决这些关键技术的方法。试验结果证明了ＦＴＴ－１用于评价容错计算机系统可信性的有效性。相似文献

5.

Flexible VLIW processor based on FPGA for efficient embedded real-time image processing

Vincent Brost Fan Yang Charles Meunier 《Journal of Real-Time Image Processing》2014,9(1):47-59

Modern field programmable gate array (FPGA) chips, with their larger memory capacity and reconfigurability potential, are opening new frontiers in rapid prototyping of embedded systems. With the advent of high-density FPGAs, it is now possible to implement a high-performance VLIW (very long instruction word) processor core in an FPGA. With VLIW architecture, the processor effectiveness depends on the ability of compilers to provide sufficient ILP (instruction-level parallelism) from program code. This paper describes research result about enabling the VLIW processor model for real-time processing applications by exploiting FPGA technology. Our goals are to keep the flexibility of processors to shorten the development cycle, and to use the powerful FPGA resources to increase real-time performance. We present a flexible VLIW VHDL processor model with a variable instruction set and a customizable architecture which allows exploiting intrinsic parallelism of a target application using advanced compiler technology and implementing it in an optimal manner on FPGA. Some common algorithms of image processing were tested and validated using the proposed development cycle. We also realized the rapid prototyping of embedded contactless palmprint extraction on an FPGA Virtex-6 based board for a biometric application and obtained a processing time of 145.6 ms per image. Our approach applies some criteria for co-design tools: flexibility, modularity, performance, and reusability. 相似文献

6.

High-Level Modeling and FPGA Prototyping of Produced Order Parallel Queue Processor Core

Ben A. Abderazek Tsutomu Yoshinaga Masahiro Sowa 《The Journal of supercomputing》2006,38(1):3-15

相似文献

7.

High-Level synthesis assisted design and verification framework for automotive radar processors

《Microprocessors and Microsystems》2020

In radar-based advanced driver assistance systems, baseband processing is necessary to detect the speed, distance, and angle of elevation of the target (e.g., vehicle, pedestrian, traffic sign, etc.). The target and the source often move at high speeds; therefore, the computation rate must be sufficiently high to perform actions (e.g., braking) in real-time. Software-based implementations of such systems fall short of the required performance, which has led to an increase in the popularity of custom hardware implementations, e.g., on field-programmable gate arrays (FPGAs). FPGAs also serve as platforms to develop software concurrent with system-on-chip (SoC) development, thereby decreasing the time to market. High-level synthesis (HLS) tools are gaining considerable attention in the very-large-scale integration design community because of their flexibility. In this paper, we propose a novel design and verification framework for a RADAR processing SoC. The framework is assisted by an HLS-based design scheme for the processor and supports the application of a real-world stimulus to register transfer-level design implementation running on FPGAs. Customer use cases for the distance and velocity calculations are executed in a pre-silicon environment using range and Doppler processing on the Xilinx Kintex-7(XC 7K 480T) FPGA. Our findings show that the proposed framework, based on MATLAB HDL Coder and HDL Verifier, is superior to similar implementations from prior research in terms of speed and FPGA resources. This is owing to the usage of appropriate HLS directives and the usage of a novel design method based on application-specific bit width for intermediate data nodes. 相似文献

8.

A compact FPGA-based processor for the Secure Hash Algorithm SHA-256

Rommel García Ignacio Algredo-Badillo Miguel Morales-Sandoval Claudia Feregrino-Uribe René Cumplido 《Computers & Electrical Engineering》2014

This work reports an efficient and compact FPGA processor for the SHA-256 algorithm. The novel processor architecture is based on a custom datapath that exploits the reusing of modules, having as main component a 4-input Arithmetic-Logic Unit not previously reported. This ALU is designed as a result of studying the type of operations in the SHA algorithm, their execution sequence and the associated dataflow. The processor hardware architecture was modeled in VHDL and implemented in FPGAs. The results obtained from the implementation in a Virtex5 device demonstrate that the proposed design uses fewer resources achieving higher performance and efficiency, outperforming previous approaches in the literature focused on compact designs, saving around 60% FPGA slices with an increased throughput (Mbps) and efficiency (Mbps/Slice). The proposed SHA processor is well suited for applications like Wi-Fi, TMP (Trusted Mobile Platform), and MTM (Mobile Trusted Module), where the data transfer speed is around 50 Mbps. 相似文献

9.

An evaluation model for dependability of Internet-scale software on basis of Bayesian Networks and trustworthiness

《Journal of Systems and Software》2014

Internet-scale software becomes more and more important as a mode to construct software systems when Internet is developing rapidly. Internet-scale software comprises a set of widely distributed software entities which are running in open, dynamic and uncontrollable Internet environment. There are several aspects impacting dependability of Internet-scale software, such as technical, organizational, decisional and human aspects. It is very important to evaluate dependability of Internet-scale software by integrating all the aspects and analyzing system architecture from the most foundational elements. However, it is lack of such an evaluation model. An evaluation model of dependability for Internet-scale software on the basis of Bayesian Networks is proposed in this paper. The structure of Internet-scale software is analyzed. An evaluating system of dependability for Internet-scale software is established. It includes static metrics, dynamic metrics, prior metrics and correction metrics. A process of trust attenuation based on assessment is proposed to integrate subjective trust factors and objective dependability factors which impact on system quality. In this paper, a Bayesian Network is build according to the structure analysis. A bottom-up method that use Bayesian reasoning to analyses and calculate entity dependability and integration dependability layer by layer is described. A unified dependability of the whole system is worked out and is corrected by objective data. The analysis of experiment in a real system proves that the model in this paper is capable of evaluating the dependability of Internet-scale software clearly and objectively. Moreover, it offers effective help to the design, development, deployment and assessment of Internet-scale software. 相似文献

10.

An energy efficient FPGA partial reconfiguration based micro-architectural technique for IoT applications

《Microprocessors and Microsystems》2020

Low power consumption and high computational performance are two important processor design goals for IoT applications. Achieving both design goals in one processor architecture is challenging due to their conflicting requirements. This paper introduces a reconfigurable micro-architectural level technique that allows a Reduced Instruction Set Computing (RISC) processor to support IoT applications with different performance and energy trade-off requirements. The processor can be reconfigured into either multi-cycle execution mode (low computational speed with low dynamic power consumption) or pipeline execution mode (high computational speed at the expense of high dynamic power), based on dynamic workload characteristics in IoT applications. Switching between modes is accomplished by exploiting the partial reconfiguration (PR) feature offered by the recent advancements in modern FPGAs. A RISC processor was designed based on the proposed micro-architectural level technique and implemented on FPGA as IoT sensor node. Experimental results demonstrate that the proposed technique with reconfigurable micro-architecture is able to significantly reduce the dynamic energy consumption, compared to conventional multi-cycle and pipeline only micro-architectures, while allowing better performance-energy trade-off in IoT applications. 相似文献

11.

Reconfiguring one-time programmable FPGAs

《Micro, IEEE》1999,19(6):53-63

Field-programmable gate arrays can suffer from a variety of faults, ranging from wire anomalies and defects to inoperative programmable connections. The solution to these faults depends on whether or not we are dealing with a reprogrammable FPGA or a one time programmable (OTP) FPGA. To correct faults, developers can reconfigure FPGAs such as those made by Xilinx and Altera by reprogramming. These devices can be programmed many times, for different designs and applications. Correcting faults in OTP FPGAs, such as those made by Actel is more difficult. For one thing, OTP FPGAs are based on antifuses. With an antifuse, the FPGAs configuration information has an initial (default) value that can be changed, but once changed cannot be restored. Therefore, the procedures to bypass faulty cells or faulty routing in an OTP FPGA must meet more stringent requirements than for reprogrammable FPGAs. The “Reconfiguration Approaches” sidebar describes two methods other researchers have tried. This article describes our approach to reconfiguring OTP FPGAs. We explain how we determine if reconfiguration is feasible, the algorithms we used, and the results of our experiments on a generic OTP FPGA model and a generic detail router 相似文献

12.

Hardware–software optimizations of reconfigurable multi-core processors for floating-point computations of large sparse matrices

Xiaofang Wang 《Journal of Real-Time Image Processing》2014,9(1):187-204

State-of-the-art field-programmable gate array (FPGA) technologies have provided exciting opportunities to develop more flexible, less expensive, and better performance floating-point computing platforms for embedded systems. To better harness the full power of FPGAs and to bring FPGAs to more system designers, we investigate unique advantages and optimization opportunities in both software and hardware offered by multi-core processors on a programmable chip (MPoPCs). In this paper, we present our hardware customization and software dynamic scheduling solutions for LU factorization of large sparse matrices on in-house developed MPoPCs. Theoretical analysis is provided to guide the design. Implementation results on an Altera Stratix III FPGA for five benchmark matrices of size up to 7,917 × 7,917 are presented. Our hardware customization alone can reduce the execution time by up to 17.22 %. The integrated hardware–software optimization improves the speedup by an average of 60.30 %. 相似文献

13.

Efficient dynamic priority based soft error mitigation techniques for configuration memory of FPGA hardware

《Microprocessors and Microsystems》2017

Radiation-induced single bit upsets (SBUs) and multi-bit upsets (MBUs) are more prominent in Field Programmable Gate Arrays (FPGAs) due to the presence of a large number of latches in the configuration memory (CM) of FPGAs. At the same time, SBUs and MBUs in the CM can permanently or temporarily affect the hardware circuit implemented on FPGA. Hence, error mitigation and recovery techniques are necessary to protect the FPGA hardware from permanent faults arising due to such SBUs and MBUs. Different existing techniques used to mitigate the effect of soft errors in FPGA have high overhead and their implementations are also quite complex. In this paper, we have proposed efficient single bit as well as multi-bit error correcting methods to correct errors in the CM of FPGAs using simple parity equations and Erasure code. These codes are easy to implement, and the needed decoding circuits are also simple. Use of Dynamic Partial Reconfiguration (DPR) along with a simple hardware scheduling algorithm based download manager helps to perform the error correction in the CM without suspending the operations of the other hardware blocks. We propose a first of its kind methodology for novel transient fault correction using efficient error correcting codes with hardware scheduling for FPGAs. To validate the design we have tested the proposed methodology with Kintex FPGA. We have also measured different parameters like fault recovery time, power consumption, resource overhead and error correction efficiency to estimate the performance of our proposed methods. 相似文献

14.

Domain-Specific Modeling for Rapid Energy Estimation of Reconfigurable Architectures

Choi Seonil Jang Ju-wook Mohanty Sumit Prasanna Viktor K. 《The Journal of supercomputing》2003,26(3):259-281

Reconfigurable architectures such as FPGAs are flexible alternatives to DSPs or ASICs used in mobile devices for which energy is a key performance metric. Reconfigurable architectures offer several design parameters such as operating frequency, precision, amount of memory, degree of parallelism, etc. These parameters define a large design space that must be explored to find energy-efficient solutions. It is also challenging to predict the energy variation at the early design phases when a design is modified at algorithm level. Efficient traversal of such a large design space requires high-level modeling to facilitate rapid estimation of system-wide energy. However, FPGAs do not exhibit a high-level structure like, for example, a RISC processor for which high-level as well as low-level energy models are available. To address this scenario, we propose a domain-specific modeling technique for energy-efficient kernel design that exploits the knowledge of the algorithm and the target architecture family for a given kernel to develop a high-level model. This model captures architecture and algorithm features, parameters affecting energy performance, and power estimation functions based on these parameters. A system-wide energy function is derived based on the power functions and cycle specific power state of each building block of the architecture. This model is used to understand the impact of various parameters on system-wide energy and can be a basis for the design of energy-efficient algorithms. Our high-level model is used to quickly obtain fairly accurate estimate of the system-wide energy dissipation of data paths configured using FPGAs. We demonstrate our modeling methodology by applying it to four domains. 相似文献

15.

Sorting networks on FPGAs

Rene Mueller Jens Teubner Gustavo Alonso 《The VLDB Journal The International Journal on Very Large Data Bases》2012,21(1):1-23

Computer architectures are quickly changing toward heterogeneous many-core systems. Such a trend opens up interesting opportunities but also raises immense challenges since the efficient use of heterogeneous many-core systems is not a trivial problem. Software-configurable microprocessors and FPGAs add further diversity but also increase complexity. In this paper, we explore the use of sorting networks on field-programmable gate arrays (FPGAs). FPGAs are very versatile in terms of how they can be used and can also be added as additional processing units in standard CPU sockets. Our results indicate that efficient usage of FPGAs involves non-trivial aspects such as having the right computation model (a sorting network in this case); a careful implementation that balances all the design constraints in an FPGA; and the proper integration strategy to link the FPGA to the rest of the system. Once these issues are properly addressed, our experiments show that FPGAs exhibit performance figures competitive with those of modern general-purpose CPUs while offering significant advantages in terms of power consumption and parallel stream evaluation. 相似文献

16.

A dependability model for TMR system 总被引：1，自引：0，他引：1

Jun-Jie Peng Yan-Ping Liu Yuan-Yuan Chen 《国际自动化与计算杂志》2012,9(3):315-324

Much research has been done on the dependability evaluation of computer systems. However, much of this is gone no further than study of the fault coverage of such systems, with little focus on the relationship between fault coverage and overall system dependability. In this paper, a Markovian dependability model for triple-modular-redundancy (TMR) system is presented. Having fully considered the effects of fault coverage, working time, and constant failure rate of single module on the dependability of the target TMR system, the model is built based on the stepwise degradation strategy. Through the model, the relationship between the fault coverage and the dependability of the system is determined. What is more, the dependability of the system can be dynamically and precisely predicted at any given time with the fault coverage set. This will be of much benefit for the dependability evaluation and improvement, and be helpful for the system design and maintenance. 相似文献

17.

OFDM系统中傅里叶变换的硬件实现方法 总被引：1，自引：0，他引：1

汤晓峰戎蒙恬邓波林巍《计算机工程与应用》2005,41(25):106-108,111

在宽带OFDM系统中,FFT处理器是一个重要组成部分。文章介绍了一种适合OFDM系统的高效FFT处理器的VLSI设计方法,针对高效的特点采用了改进的Radix-4DIT算法,乒乓RAM的设计思想,以及流水线结构。根据Radix-4算法的特点,在基4运算单元CU(Computing Unit)设计,存取地址混序,每级迭代控制,数据对齐等方面也有一些特点。文章针对256点,36bit位长,浮点复数进行FFT运算。目前,此FFT处理器已经通过了FPGA验证,处理能力为100MSPS。相似文献

18.

Multiplierless and fully pipelined JPEG compression soft IP targeting FPGAs

Luciano Volcan Ivan Saraiva Sergio 《Microprocessors and Microsystems》2007,31(8):487-497

This paper presents the design of a soft IP for JPEG compression targeted for high performance in a FPGA device. The JPEG compressor architecture achieves high throughput with a deep and optimized pipeline and with a multiplierless datapath architecture. The JPEG compressor architecture was designed in a hierarchical and modular fashion and the details of the global architecture and of its modules are presented in this paper. A modular and strictly structural VHDL design is followed to develop the JPEG compressor soft IP. The VHDL codes were synthesized to Altera and Xilinx FPGAs. Synthesis results and relevant performance comparisons with related works are presented. Our high throughput compressor is able to compress 39.8 millions of pixels per second when mapped onto an Altera FLEX 10KE FPGA. Our JPEG soft IP mapped to FLEX 10KE low cost FPGA is able to compress 115 images per second in SDTV resolution (720 × 480 pixels). Considering this SDTV resolution our design is worthy as a core of an M-JPEG video compressor, reaching a real time processing rate of 30 fps, once mapped to the FLEX 10KE FPGA device. 相似文献

19.

Dependability analysis in the Ambient Assisted Living Domain: An exploratory case study

Genaína Nunes RodriguesAuthor Vitae Vander Alves^{Author Vitae} 《Journal of Systems and Software》2012,85(1):112-131

相似文献

20.

基于云模型的电力生产管理软件可信评估方法

王琦王永滨曹轶臻《计算机应用与软件》2012,29(7):33-36,45

随着电力生产管理软件的规模和复杂度不断增加,对此类软件可信性的评估越来越重要。通过对电力生产管理软件质量特性的分析,确定电力生产管理软件可信性的二级评估指标,然后,利用云模型定性与定量之间的转换关系,提出基于云模型理论的电力生产管理软件可信性评估方法,通过评估因素的云模型化和云合并算法,以云模型数字特征图的形式给出最终软件可信评估等级。经过实例证实,该方法的评价结果直观,更加符合实际情况。相似文献