首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Application-specific processors offer an attractive option in the design of embedded systems by providing high performance for a specific application domain. In this work, we describe the use of a reconfigurable processor core based on an RISC architecture as starting point for application-specific processor design. By using a common base instruction set, development cost can be reduced and design space exploration is focused on the application-specific aspects of performance. An important aspect of deploying any new architecture is verification which usually requires lengthy software simulation of a design model. We show how hardware emulation based on programmable logic can be integrated into the hardware/software codesign flow. While previously hardware emulation required massive investment in design effort and special purpose emulators, an emulation approach based on high-density field-programmable gate array (FPGA) devices now makes hardware emulation practical and cost effective for embedded processor designs. To reduce development cost and avoid duplication of design effort, FPGA prototypes and ASIC implementations are derived from a common source: We show how to perform targeted optimizations to fully exploit the capabilities of the target technology while maintaining a common source base  相似文献   

2.
针对片上系统(SoC)开发周期较长和现场可编程门阵列(FPGA)可重用的特点,设计了基于ARM7TDMI处理器核的SoC的百万门级FPGA验证平台。介绍了怎样设计平台并利用该平台进行IP核验证、底层硬件驱动和实时操作系统及高层应用软件的验证。使用该平台能够基本验证SoC系统的设计,并加快SoC系统的开发。整个系统原理清晰,结构简单,扩展灵活、方便。  相似文献   

3.
This paper presents floating point design and implementation of System on Chip (SoC) based Differential Evolution (DE) algorithm using Xilinx Virtex-5 Field Programmable Gate Array (FPGA). The hardware implementation is carried out to enhance the execution speed of the embedded applications. Intellectual Property (IP) of DE algorithm is developed and interfaced with the 32-bit PowerPC 440 processor using processor local bus (PLB) of Xilinx Virtex-5 FPGA. In the proposed architecture the algorithmic parameters of DE are scalable. The software and hardware implementation of the DE algorithm is carried out in PowerPC embedded processor and hardware IP respectively. The optimization of numerical benchmark functions and system identification in control systems are implemented to verify the proposed hardware SoC platform. The performance of the IP is measured in terms of acceleration gain of the DE algorithm. The optimization problems are solved by using floating point arithmetic in both embedded processor and hardware. The experimental result concludes that the hardware DE IP accelerates the execution speed approximately by 200 times compared to equivalent software implementation of DE algorithm on PowerPC 440 processor. Further, as a case study an Infinite Impulse Response (IIR) based system identification task on SoC using the developed hardware accelerator is implemented.  相似文献   

4.
5.
An increasing number of standards in wireless communications have encouraged to study programmable processors as platforms for flexible receivers. A multiple-input multiple-output (MIMO) antenna system combined with orthogonal frequency division multiplexing (OFDM) technique has been introduced in many wireless communications standards, such as in the third generation long term evolution (3G LTE). The MIMO-OFDM system requires an efficient detector and a platform support for parallel processing of multiple subcarriers. A K-best list sphere detector (LSD) provides for near optimal decoding performance and a fixed throughput making it an interesting algorithm from the point of view of practical implementations.In this paper, we compare the implementations of the K-best LSD on four processor platforms: a digital signal processor (DSP), software defined radio (SDR), application-specific processor (ASP) and application-specific instruction-set processor (ASIP). The DSP is a popular very long instruction word (VLIW) device (TMS320C6455), the SDR processor employs multithreading and multiple cores (SB3500 core processor), the ASP is based on transport triggered architecture (TTA), while the ASIP is the SDR processor enhanced with a special instruction-set extension for sorting.A 2×2 MIMO antenna system with 64-quadrature amplitude modulation (64-QAM) is assumed. The chosen list sizes K=8 and 16 are based on simulation results carried out in MATLAB environment with the third generation long term evolution (3G LTE) parameters. The proposed ASIP achieved a promising throughput of 32.0 Mbps, where the software defined radio (SDR) implementation on the SB3500 core processor suffers from an inefficient software sorter. The ASP, in which the minimized hardware complexity has been the goal, achieves a throughput of 7.6 Mbps. However, more essential examination is related to the symbol time, which sets strict parallel processing requirements to the programmable processors.  相似文献   

6.
张珍  李雷 《信息技术》2007,31(12):109-112
利用SOPC Builder可以在短时间内把Nios Ⅱ CPU、Avalon总线、外围设备、片内调试模块等集成在一起生成系统需要的NiosⅡ处理器,然后用QuartusⅡ软件把NIOSⅡ处理器其它外部设备接口结合在一起编译下载到FPGA芯片中,即完成系统的硬件设计;软件设计通常采用C/C++语言编写并用NoisⅡIDE编译后下载到FPGA中来实现一个SOPC系统。  相似文献   

7.
针对电路设计中经常碰到数据的噪声干扰现象,提出了一种Kalman滤波的FPGA实现方法。该方法采用了Ⅱ公司的高精度模数转换器ADS1251以及Altera公司的EPIC12,首先用卡尔曼滤波算法设计了一个滤波器,然后将该滤波器分解成简单的加、减、乘、除运算。通过基于FPGA平台的硬件与软件的合理设计,成功地实现了数据噪声的滤除设计,并通过实践仿真计算,验证了所实现滤波的有效性。  相似文献   

8.
基于ARM7TDMI的SoC芯片的FPGA验证平台设计   总被引:4,自引:0,他引:4  
针对片上系统(SoC)开发周期较长和现场可编程门阵列(FPGA)可重用的特点,设计了基于ARM7TDMI处理器核的SoC的FPGA验证平台,介绍了怎样利用该平台进行软硬件协同设计、IP核验证、底层硬件驱动和实时操作系统设计验证.使用该平台通过软硬件协同设计,能够加快SoC系统的开发.整个系统原理清晰,结构简单,扩展灵活、方便.  相似文献   

9.
信号处理机是雷达侦察设备的重要组成部分,提高其工作性能、处理能力、可靠性,应用灵活性、通用性、标准化,以及减小体积、降低功耗都具有极其重要的意义。首先介绍信号处理硬件平台的现状及需求,然后提出了一种基于CPCI标准由TMS320C6455双DSP加FPGA通用信号处理平台的设计方案,并详细讨论了该平台的硬件组成、工作原理、特点以及实现方法。  相似文献   

10.
提出了基于DSP FPGA混合平台的H.264/AVC编码器设计思路与实现方法.以DSP为主处理器,FPGA为协处理器实现算法的硬件加速,针对编码器中最复杂耗时的模块,设计相应的硬件加速引擎.并针对硬件加速引擎制定出便于控制和数据传输的软/硬件通信协议,实现了H.264/AVC D1编码器所需的实时性能.  相似文献   

11.
12.
本文主要介绍基于xillaxK7系列FPGA与ARMCortexM3处理器设计的可用于仿真与芯片验证的综合平台。本文结合FPGA与ARMCortexM3处理器的技术特点,着重描述了该平台的软硬件设计与应用场合,并介绍了平台实际使用情况。  相似文献   

13.
This paper introduces hthreads, a unifying programming model for specifying application threads running within a hybrid computer processing unit (CPU)/field-programmable gate-array (FPGA) system. Presently accepted hybrid CPU/FPGA computational models-and access to these computational models via high level languages-focus on programming language extensions to increase accessibility and portability. However, this paper argues that new high-level programming models built on common software abstractions better address these goals. The hthreads system, in general, is unique within the reconfigurable computing community as it includes operating system and middleware layer abstractions that extend across the CPU/FPGA boundary. This enables all platform components to be abstracted into a unified multiprocessor architecture platform. Application programmers can then express their computations using threads specified from a single POSIX threads (pthreads) multithreaded application program and can then compile the threads to either run on the CPU or synthesize them to run within an FPGA. To enable this seamless framework, we have created the hardware thread interface (HWTI) component to provide an abstract, platform-independent compilation target for hardware-resident computations. The HWTI enables the use of standard thread communication and synchronization operations across the software/hardware boundary. Key operating system primitives have been mapped into hardware to provide threads running in both hardware and software uniform access to a set of sub-microsecond, minimal-jitter services. Migrating the operating system into hardware removes the potential bottleneck of routing all system service requests through a central CPU.  相似文献   

14.
随着数字信息技术和网络技术的高速发展,嵌入式系统设计已经成为目前蓬勃发展的行业之一。由于FPGA嵌入式处理器具有体积小、功耗低、集成度高、能与FPGA内部逻辑紧密结合的优势,它必将引领嵌入式系统设计的新潮流。基于Xilinx公司Virtex-IV系列FPGA平台,在其内嵌PowerPC405处理器上构建Vxworks操作系统平台,通过与传统嵌入式处理平台的比较说明该平台的特点和优势。详细阐述了系统软硬件协同设计方法和平台构建过程。  相似文献   

15.
This paper describes the acceleration of an infrared automatic target recognition (IR ATR) application with a co-processor board that contains multiple field programmable gate array (FPGA) chips. Template and pixel level parallelism is exploited in an FPGA design for the bottleneck portion of the application. The implementation of this design achieved a speedup of 21 compared to running on the host processor. The paper then describes an FPGA resource manager (RM) developed to support concurrent applications sharing the FPGA board. With the RM, the system is dynamically reconfigurable. That is, while part of the co-processor board is busy computing, another part can be reconfigured for other purposes. The IR ATR application was ported to work with the RM and has been shown to adapt to the amount of reconfigurable hardware that is available at the time the application is executed.  相似文献   

16.
In order to make software applications simpler to write and easier to maintain, a software digital signal-processing library that performs essential signal- and image-processing functions is an important part of every digital signal processor (DSP) developer's toolset. In general, such a library provides high-level interface and mechanisms, therefore, developers only need to know how to use algorithms, not the details of how they work. Complex signal transformations then become function calls, e.g., C-callable functions. Considering the two-dimensional (2-D) convolver function as an example of great significance for DSP's, this paper proposes to replace this software function by an emulation on a field-programmable gate array (FPGA) initially configured by software programming. Therefore, the exploration of the 2-D convolver's design space will provide guidelines for the development of a library of DSP-oriented hardware configurations intended to significantly speed up the performance of general DSP processors. Based on the specific convolver, and considering operators supported in the library as hardware accelerators, a series of tradeoffs for efficiently exploiting the bandwidth between the general-purpose DSP and accelerators are proposed. In terms of implementation, this paper explores the performance and architectural tradeoffs involved in the design of an FPGA-based 2-D convolution coprocessor for the TMS320C40 DSP microprocessor available from Texas Instruments Incorporated. However, the proposed concept is not limited to a particular processor  相似文献   

17.
In many embedded systems, the computational power of an instruction set processor is combined with hardware accelerators. Building such combined systems implies co-design of the software that runs on the processor and the hardware that accelerates the embedded application. During the co-design process, the application is partitioned into a software part (running on the processor) and a hardware part (running on the accelerator). In order to ease the iterative process of partitioning, we introduce a novel design methodology. In our methodology, the interface between hardware and software is transparent to the software designer, and is based on dynamic method interception. Because the interface is transparent and generated automatically, the initial all-software prototype of the system can more easily be refined and partitioned. We show that method interception is inexpensive, and we demonstrate method interception in a real-life application. Using our methodology, embedded systems can be designed fast, reducing time-to-market, while still achieving a high run-time performance.  相似文献   

18.
Viterbi decoding is widely used in many radio systems. Because of the large computation complexity, it is usually implemented with ASIC chips, FPGA chips, or optimized hardware accelerators. With the rapid development of the multicore technology, multicore platforms become a reasonable choice for software radio (SR) systems. The Cell Broadband Engine processor is a state-of-art multi-core processor designed by Sony, Toshiba, and IBM. In this paper, we present a 64-state soft input Viterbi decoder for WiMAX SR Baseband system based on the Cell processor. With one Synergistic Processor Element (SPE) of a Cell Processor running at 3.2GHz, our Viterbi decoder can achieve the throughput up to 30Mb/s to decode the tail-biting convolutional code. The performance demonstrates that the proposed Viterbi decoding implementation is very efficient. Moreover, the Viterbi decoder can be easily integrated to the SR system and can provide a highly integrated SR solution. The optimization methodology in this module design can be extended to other modules on Cell platform.  相似文献   

19.
研究了基于FPGA的基-2 FFT算法的设计与实现。为减小硬件资源开销,论文采用蝶形运算单元和控制器单元构成的反馈结构对基-2 FFT处理器的硬件j结构进行了总体设计,采用时序控制方法完成蝶形运算电路设计,采用同步有限状态机(FSM,finite state machine)方法实现了旋转因子系数的产生与控制。并基于Quartus II软件平台,完成了整个FFT处理器电路的FPGA实现,最后通过仿真验证了设计方案的正确性。  相似文献   

20.
介绍了一种基于嵌入式Vxworks的智能化以太网测试仪的设计开发和实现方案.该测试仪采用ARM9处理器,简化了硬件平台的设计,配合嵌入式vxworks操作系统和应用软件,对以太网设备进行基于RFC2544网络性能的测试.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号