
1.

Exploiting Thread‐Level Parallelism in Lockstep Execution by Partially Duplicating a Single Pipeline

Jaegeun Oh Seok Joong Hwang Huong Giang Nguyen Areum Kim Seon Wook Kim Chulwoo Kim Jong‐Kook Kim 《ETRI Journal》2008,30(4):576-586

In most parallel loops of embedded applications, every iteration executes the exact same sequence of instructions while manipulating different data. This fact motivates a new compiler-hardware orchestrated execution framework in which all parallel threads share one fetch unit and one decode unit but have their own execution, memory, and write-back units. This resource sharing enables parallel threads to execute in lockstep with minimal hardware extension and compiler support. Our proposed architecture, called multithreaded lockstep execution processor (MLEP), is a compromise between the single-instruction multiple-data (SIMD) and symmetric multithreading/chip multiprocessor (SMT/CMP) solutions. The proposed approach is more favorable than a typical SIMD execution in terms of degree of parallelism, range of applicability, and code generation, and can save more power and chip area than the SMT/CMP approach without significant performance degradation. To verify the architecture, we extend a commercial 32-bit embedded core, the AE32000C, and synthesize it on a Xilinx FPGA. Compared to the original architecture, our approach is 13.5% faster with a 2-way MLEP and 33.7% faster with a 4-way MLEP on EEMBC benchmarks that are automatically parallelized by the Intel compiler.
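The sharing scheme in this abstract can be pictured with a toy model. The sketch below is only an illustration under stated assumptions: the three-address instruction format, the `run_lockstep` function, and the register names are hypothetical and are not taken from the MLEP design. It shows one instruction stream being fetched and decoded once per instruction while each thread executes it on its own private register file.

```python
def run_lockstep(program, thread_regs):
    """Run one instruction stream over several private register files.

    Returns the number of shared fetch/decode steps, which depends only on
    the program length and not on the number of threads.
    """
    # Hypothetical two-operation instruction set for the illustration.
    operations = {"add": lambda x, y: x + y, "mul": lambda x, y: x * y}
    fetch_decode_count = 0
    for opcode, dst, src1, src2 in program:
        # Shared front end: fetch and decode happen once per instruction.
        operation = operations[opcode]
        fetch_decode_count += 1
        # Private back ends: every thread executes and writes back on its own data.
        for regs in thread_regs:
            regs[dst] = operation(regs[src1], regs[src2])
    return fetch_decode_count


# Loop body y = x * x + x, run by four threads on different data.
program = [("mul", "t", "x", "x"), ("add", "y", "t", "x")]
threads = [{"x": x} for x in (1, 2, 3, 4)]
fetches = run_lockstep(program, threads)
```

In this model the fetch/decode count equals the program length however many threads run, which is the resource saving the abstract attributes to sharing the front end.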

2.

Unifying simulation and execution in a design environment for FPGA systems

Hutchings B.L. Nelson B.E. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2001,9(1):201-205

Field programmable gate array (FPGA)-based systems provide advantages over conventional hardware including: (1) availability of the hardware during design and debug; (2) programmability; and (3) visibility. These three advantages can greatly shorten the design and verification cycle. This paper discusses a design environment that exploits these three FPGA-specific advantages to create a unified simulation/execution debug environment implemented in the JHDL design system. The described system provides a hardware debugging environment with the functionality of a simulator but up to 10000× faster. In addition, testbenches and other typical verification software used in simulators can be used to verify running hardware.

3.

BUSpec: A framework for generation of verification aids for standard bus protocol specifications

《Integration, the VLSI Journal》2007,40(3):285-304

A typical verification intellectual property (VIP) of a bus protocol such as ARM advanced micro-controller bus architecture (AMBA) or PCI consists of a set of assertions and associated verification aids such as test-benches, design-ware models and coverage metrics. While several languages have been formalized for specifying assertions (examples include Open-Vera Assertions, Sugar, ForSpec, System Verilog Assertions, etc.), it is widely accepted that writing protocol-compliant models and test-benches that produce protocol-compliant stimuli are tasks of equal importance. In this paper, we present a platform for high-level specification of a bus protocol in a hierarchical manner and an automated methodology for generating a variety of verification aids that supplement the set of assertions in a VIP. We also show that the verification aids can be efficiently used to determine the completeness of the set of assertions in a simulation-based verification environment.

4.

FPGA Prototype Verification of an Embedded SoC Based on the ARCA3 CPU

陈达燕王进祥孙俊杨奕《微电子学与计算机》2011,28(12)

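FPGA-based verification is an effective route to SoC functional verification, and building an FPGA-based prototype verification system has become an important SoC verification method. ARCA3 is a high-performance, low-power embedded microprocessor developed in China. IP cores such as a memory controller, together with peripherals, are integrated on the ARCA3 and the AMBA architecture to build an embedded SoC, and a prototype verification system and a hardware/software co-verification environment for the SoC are implemented on an FPGA. A bootloader and an operating system are run on the FPGA prototype to verify the operability of the whole system's hardware and the interaction between software and hardware. The FPGA-based prototype verification system makes it possible to quickly verify ARCA3-based IP cores at various levels of abstraction and to develop ARCA3-based software applications.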

5.

FPGA-Based Acceleration of Hardware/Software Co-Simulation

江霞林周剑扬杨银涛林晓立《中国集成电路》2010,19(8):30-33

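In system design, debugging and simulating complex hardware circuits is very difficult for designers. To reduce simulation complexity and speed up simulation, this paper proposes using FPGA acceleration to realize hardware/software co-accelerated simulation. Experiments show that, compared with pure software simulation, the hardware/software co-accelerated simulation technique increases simulation speed by nearly 30 times, greatly shortening simulation time.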

6.

SoC based floating point implementation of differential evolution algorithm using FPGA

Kiran Kumar Anumandla Rangababu Peesapati Samrat L. Sabat Siba K. Udgata 《Design Automation for Embedded Systems》2012,16(4):221-240

This paper presents the floating-point design and implementation of a System on Chip (SoC) based Differential Evolution (DE) algorithm on a Xilinx Virtex-5 Field Programmable Gate Array (FPGA). The hardware implementation is carried out to enhance the execution speed of embedded applications. An Intellectual Property (IP) core of the DE algorithm is developed and interfaced with the 32-bit PowerPC 440 processor using the processor local bus (PLB) of the Xilinx Virtex-5 FPGA. In the proposed architecture, the algorithmic parameters of DE are scalable. The software and hardware implementations of the DE algorithm are carried out on the PowerPC embedded processor and in the hardware IP, respectively. The optimization of numerical benchmark functions and system identification in control systems are implemented to verify the proposed hardware SoC platform. The performance of the IP is measured in terms of the acceleration gain of the DE algorithm. The optimization problems are solved using floating-point arithmetic in both the embedded processor and the hardware. The experimental results show that the hardware DE IP accelerates execution by approximately 200 times compared to the equivalent software implementation of the DE algorithm on the PowerPC 440 processor. Further, as a case study, an Infinite Impulse Response (IIR) based system identification task is implemented on the SoC using the developed hardware accelerator.
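For readers unfamiliar with DE, the following is a minimal software sketch of the widely used DE/rand/1/bin variant. It is an assumption that this variant resembles what the paper implements; the abstract does not say. The function name, parameter names, default values, and the `sphere` benchmark are all hypothetical choices for illustration, and nothing here models the hardware IP or the PLB interface.

```python
import random


def differential_evolution(cost_fn, bounds, pop_size=20, f_weight=0.8,
                           crossover=0.9, generations=200, seed=0):
    """Minimise cost_fn over a box using DE/rand/1/bin (illustrative sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost_fn(x) for x in pop]
    for _ in range(generations):
        next_pop, next_costs = [], []
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = list(pop[i])
            for j in range(dim):
                # Binomial crossover between the target and the mutant vector.
                if j == forced or rng.random() < crossover:
                    lo, hi = bounds[j]
                    mutant = pop[a][j] + f_weight * (pop[b][j] - pop[c][j])
                    trial[j] = min(max(mutant, lo), hi)  # clamp to the box
            # Greedy selection: keep the trial only if it is no worse.
            trial_cost = cost_fn(trial)
            if trial_cost <= costs[i]:
                next_pop.append(trial)
                next_costs.append(trial_cost)
            else:
                next_pop.append(pop[i])
                next_costs.append(costs[i])
        pop, costs = next_pop, next_costs
    best = min(range(pop_size), key=lambda k: costs[k])
    return pop[best], costs[best]


def sphere(x):
    """Simple numerical benchmark: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

The inner loop evaluates the cost function once per individual per generation, which is the kind of repeated floating-point work that a dedicated hardware core could plausibly speed up.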

7.

MetaCore: an application-specific programmable DSP development system

Jin-Hyuk Yang Byoung-Woon Kim Sang-Joon Nam Young-Su Kwon Dae-Hyun Lee Jong-Yeol Lee Chan-Soo Hwang Yong-Hoon Lee Seung-Ho Hwang In-Cheol Park Chong-Min Kyung 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(2):173-183


8.

SOPC-Based Embedded Systems

张珍李雷《信息技术》2007,31(12):109-112

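With SOPC Builder, a Nios II CPU, the Avalon bus, peripherals, an on-chip debug module, and other components can be integrated in a short time to generate the Nios II processor the system requires. Quartus II is then used to combine the Nios II processor with the other external device interfaces, compile the design, and download it to the FPGA, which completes the hardware design of the system. The software is usually written in C/C++, compiled with the Nios II IDE, and downloaded to the FPGA to realize an SOPC system.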

9.

Porting PPCBoot to the MPC8250

冯俊平黄建忠王新梅《国外电子元器件》2006,(2):4-7

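The bootloader is the first code that runs after the CPU of an embedded system powers up. It links the Linux kernel to the hardware platform and is very important for subsequent software development on the embedded system. PPCBoot is a very powerful bootloader. This paper studies the working mechanism of PPCBoot in depth and analyzes in detail the method, process, and key points of porting PPCBoot to an embedded system board based on the MPC8250 processor.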

10.

Design of a Simulation and Verification Platform Based on FPGA and Cortex-M3

李效白周强左捷郭忠元王孟《中国集成电路》2013,(12):43-47

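This paper introduces an integrated platform for simulation and chip verification, designed around a Xilinx K7-series FPGA and an ARM Cortex-M3 processor. Drawing on the technical characteristics of the FPGA and the ARM Cortex-M3 processor, it focuses on the platform's hardware and software design and its application scenarios, and reports on the platform's use in practice.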

11.

A State Language for the Sequencing in a Hybrid Electric Vehicle

Sutherland Hunt A. Bose Bimal K. Somuah Clement B. 《Industrial Electronics, IEEE Transactions on》1983,(4):318-322

The application of a state language to the real-time control of a hybrid electric vehicle is explained. The state language has been developed both as a specification aid to the system designer and as a means for the programmer to produce microcomputer software. A translator program, which was developed on a VAX minicomputer, preprocesses the state language into a software module to be compiled by the standard Intel PL/M 86 compiler.

12.

Design Methodology for Offloading Software Executions to FPGA

Tomasz Patyk Perttu Salmela Teemu Pitkänen Pekka Jääskeläinen Jarmo Takala 《Journal of Signal Processing Systems》2011,65(2):245-259

A field programmable gate array (FPGA) is a flexible solution for offloading part of the computations from a processor. In particular, it can be used to accelerate the execution of a computationally heavy part of a software application, e.g., in DSP, where small kernels are repeated often. Since application code for a processor is software, a design methodology is needed to convert the code into a hardware implementation applicable to the FPGA. In this paper, we propose a design method that uses the Transport Triggered Architecture (TTA) processor template and the TTA-based Co-design Environment toolset to automate the design process. With software as a starting point, we generate an RTL implementation of an application-specific TTA processor together with the hardware/software interfaces required to offload computations from the system main processor. To exemplify what the integration of the customized TTA with a new platform could look like, we describe the process of developing the required interfaces from scratch. Finally, we present how to take advantage of the scalability of the TTA processor to target platform- and application-specific requirements.

13.

FPGA prototyping of a RISC processor core for embedded applications

Gschwind M. Salapura V. Maurer D. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2001,9(2):241-250

Application-specific processors offer an attractive option in the design of embedded systems by providing high performance for a specific application domain. In this work, we describe the use of a reconfigurable processor core based on a RISC architecture as a starting point for application-specific processor design. By using a common base instruction set, development cost can be reduced and design space exploration is focused on the application-specific aspects of performance. An important aspect of deploying any new architecture is verification, which usually requires lengthy software simulation of a design model. We show how hardware emulation based on programmable logic can be integrated into the hardware/software codesign flow. While previously hardware emulation required massive investment in design effort and special purpose emulators, an emulation approach based on high-density field-programmable gate array (FPGA) devices now makes hardware emulation practical and cost effective for embedded processor designs. To reduce development cost and avoid duplication of design effort, FPGA prototypes and ASIC implementations are derived from a common source. We show how to perform targeted optimizations to fully exploit the capabilities of the target technology while maintaining a common source base.

14.

A Hardware/Software Co-Verification Methodology for SoC Design

赵刚侯立刚刘源朱修殿吴武臣《微电子学与计算机》2006,23(6):24-26

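This article introduces a hardware/software co-verification methodology and its verification flow. On the software side, a complete toolchain for compiling, debugging, and simulating software is used, comprising a simulated virtual prototype of the processor and a basic assembler, linker, and debugger. On the hardware side, the application debugged in software undergoes RTL simulation and synthesis, and is finally implemented and verified on the hardware image accelerator (FPGA) of the SoC design.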

15.

Porting Linux to Xilinx FPGAs

买培培邵东晖苏涛《火控雷达技术》2009,38(4):67-72

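FPGAs developed by Xilinx, such as the Virtex-II Pro, embed PowerPC hard processor cores in combination with system-on-a-programmable-chip (SOPC) technology. Combining the advantages of the Linux operating system with the PowerPC embedded hard processor core, this paper studies and implements the porting of Linux to the PowerPC405 processor on a Virtex-II Pro development platform, including customizing the hardware platform, setting up the cross-compilation environment, configuring the kernel, and building the root file system. The stability and reliability of the system are then verified through a concrete application. The work integrates the processor, the operating system, and the FPGA to carry out the given signal processing tasks, which both offers operating-system advantages such as multitasking and real-time behavior and makes full use of the FPGA's strengths, giving it good application prospects.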

16.

Specification and Verification of Switching Software

Kajiwara M. Ichikawa H. Itoh M. Yoshida Y. 《Communications, IEEE Transactions on》1985,33(3):193-198


17.

An Implementation Method for a Common Multi-DSP Hardware Module

沈会敏《无线电工程》2007,37(6):57-59

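The application background of multi-DSP hardware modules is briefly introduced. The paper mainly describes an implementation of an 8-DSP hardware module based on the TMS320VC5416 DSP chip produced by Texas Instruments (TI). The module consists mainly of submodules for the multiple DSPs, FLASH program loading, JTAG hardware emulation, and an FPGA. The connections between the DSPs and the FPGA, the connections of the FLASH memory to the DSPs and the FPGA, and the JTAG daisy chain used for hardware emulation are discussed in detail. Verification confirms that the hardware module operates correctly.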

18.

System Architecture and Implementation of MIMO Sphere Decoders on FPGA

Xinming Huang Cao Liang Jing Ma 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(2):188-197

Multiple-input-multiple-output (MIMO) systems use multiple antennas in both transmitter and receiver ends for higher spectrum efficiency. The hardware implementation of MIMO detection becomes a challenging task as the computational complexity increases. This paper presents the architectures and implementations of two typical sphere decoding algorithms, including the Viterbo-Boutros (VB) algorithm and the Schnorr-Euchner (SE) algorithm. Hardware/software codesign technique is applied to partition the decoding algorithm on a single field-programmable gate array (FPGA) device. Three levels of parallelism are explored to improve the decoding rate: the concurrent execution of the channel matrix preprocessing on an embedded processor and the decoding functions on customized hardware modules, the parallel decoding of real/imaginary parts for complex constellation, and the concurrent execution of multiple steps during the closest lattice point search. The decoders for a 4×4 MIMO system with 16-QAM modulation are prototyped on a Xilinx XC2VP30 FPGA device with a MicroBlaze soft core processor. The hardware prototypes of the SE and VB algorithms show that they support up to 81.5 and 36.1 Mb/s data rates at 20 dB signal-to-noise ratio, which are about 22 and 97 times faster than their respective implementations in a digital signal processor.

19.

Hardware-Assisted Run-Time Monitoring for Secure Program Execution on Embedded Processors

Arora D. Ravi S. Raghunathan A. Jha N. K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(12):1295-1308

Embedded system security is often compromised when "trusted" software is subverted to result in unintended behavior, such as leakage of sensitive data or execution of malicious code. Several countermeasures have been proposed in the literature to counteract these intrusions. A common underlying theme in most of them is to define security policies at the system level in an application-independent manner and check for security violations either statically or at run time. In this paper, we present a methodology that addresses this issue from a different perspective. It defines correct execution as synonymous with the way the program was intended to run and employs a dedicated hardware monitor to detect and prevent unintended program behavior. Specifically, we extract properties of an embedded program through static program analysis and use them as the bases for enforcing permissible program behavior at run time. The processor architecture is augmented with a hardware monitor that observes the program's dynamic execution trace, checks whether it falls within the allowed program behavior, and flags any deviations from expected behavior to trigger appropriate response mechanisms. We present properties that capture permissible program behavior at different levels of granularity, namely inter-procedural control flow, intra-procedural control flow, and instruction-stream integrity. We outline a systematic methodology to design application-specific hardware monitors for any given embedded program. Hardware implementations using a commercial design flow, and cycle-accurate performance simulations indicate that the proposed technique can thwart several common software and physical attacks, facilitating secure program execution with minimal overheads.
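The monitoring idea can be illustrated in software. The sketch below is a loose model under several assumptions: the `Monitor` class, the event tuple format, and the use of SHA-256 as the block checksum are all hypothetical stand-ins and do not come from the paper. It covers two of the three property levels named in the abstract, checking calls against a statically derived set of allowed caller/callee pairs (inter-procedural control flow) and checking each executed basic block against an expected hash (instruction-stream integrity).

```python
import hashlib


def block_hash(instructions):
    """Hash a basic block's 32-bit instruction words (stand-in checksum)."""
    digest = hashlib.sha256()
    for word in instructions:
        digest.update(word.to_bytes(4, "little"))
    return digest.hexdigest()


class Monitor:
    """Checks a dynamic trace against properties extracted ahead of time."""

    def __init__(self, allowed_calls, expected_hashes):
        self.allowed_calls = allowed_calls      # set of (caller, callee) pairs
        self.expected_hashes = expected_hashes  # block id -> expected hash

    def check_trace(self, trace):
        """Return the index of the first violating event, or None if clean."""
        for index, event in enumerate(trace):
            kind = event[0]
            if kind == "call":
                ok = (event[1], event[2]) in self.allowed_calls
            elif kind == "block":
                ok = self.expected_hashes.get(event[1]) == block_hash(event[2])
            else:
                ok = False  # unknown event kinds are treated as violations
            if not ok:
                return index
        return None


# Properties that static analysis would supply (made-up values).
blocks = {"b0": [0x1, 0x2], "b1": [0x3]}
monitor = Monitor({("main", "f")}, {k: block_hash(v) for k, v in blocks.items()})

clean = [("block", "b0", [0x1, 0x2]), ("call", "main", "f"), ("block", "b1", [0x3])]
tampered = [("block", "b0", [0x1, 0x2]), ("block", "b1", [0x4])]
```

Here `monitor.check_trace(clean)` finds no violation, while the modified instruction word in `tampered` is flagged at the second event. A real monitor would trigger a response mechanism at that point rather than return an index.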

20.

GCC2Verilog Compiler Toolset for Complete Translation of C Programming Language into Verilog HDL

Giang Nguyen Thi Huong Seon Wook Kim 《ETRI Journal》2011,33(5):731-740
