1.

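甘子平 张立毅 鲁峰 辛建芳 《山西电子技术》2008,(1):22-23

The hardware implementation of an FPGA-based floating-point multiplier is studied and completed. The principle is described in detail, with emphasis on the structure of the multiplier, and the design is verified with test data. Synthesis and simulation tests were completed in MAX+plus II.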
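
As an illustration of the kind of datapath such a multiplier typically contains, the following is a minimal software sketch of IEEE 754 single-precision multiplication: XOR of the signs, addition of the biased exponents, a 24x24-bit significand product, normalization, and round-to-nearest-even. The abstract does not state the number format or rounding mode used in the paper, so both are assumptions here, and the function names (`f32_to_bits`, `bits_to_f32`, `fp32_multiply`) are hypothetical. Zero, subnormal, infinite, and NaN cases are deliberately left out.

```python
import struct


def f32_to_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_to_f32(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def fp32_multiply(a_bits: int, b_bits: int) -> int:
    """Multiply two normal single-precision numbers given as bit patterns.

    Assumption: operands and result are normal numbers (no zero,
    subnormal, infinity, or NaN); anything else raises ValueError.
    """
    sign = ((a_bits ^ b_bits) >> 31) & 1          # sign: XOR of operand signs
    exp_a = (a_bits >> 23) & 0xFF
    exp_b = (b_bits >> 23) & 0xFF
    if exp_a in (0, 255) or exp_b in (0, 255):
        raise ValueError("only normal operands are handled in this sketch")

    # Restore the hidden leading 1 to get 24-bit significands.
    man_a = (a_bits & 0x7FFFFF) | 0x800000
    man_b = (b_bits & 0x7FFFFF) | 0x800000

    exp = exp_a + exp_b - 127                     # remove one copy of the bias
    prod = man_a * man_b                          # 48-bit product in [2^46, 2^48)

    # Normalize: if the product is >= 2.0, shift one extra bit and bump the exponent.
    if prod >> 47:
        shift = 24
        exp += 1
    else:
        shift = 23

    mant = prod >> shift                          # 24-bit significand candidate
    rem = prod & ((1 << shift) - 1)               # discarded low bits
    half = 1 << (shift - 1)

    # Round to nearest, ties to even.
    if rem > half or (rem == half and (mant & 1)):
        mant += 1
    if mant >> 24:                                # rounding overflowed to 2.0
        mant >>= 1
        exp += 1

    if not 0 < exp < 255:
        raise ValueError("result is outside the normal range")
    return (sign << 31) | (exp << 23) | (mant & 0x7FFFFF)


print(bits_to_f32(fp32_multiply(f32_to_bits(1.5), f32_to_bits(2.5))))  # 3.75
```

In hardware, the three fields are usually processed in parallel and the 24x24-bit product dominates area and delay, which is why multiplier structure tends to be the focus of such designs.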

2.

Configurable Floating-Point FFT Accelerator on FPGA Based Multiple-Rotation CORDIC

《电子学报:英文版》2016,(6):1063-1070

Fast Fourier transform (FFT) accelerators and the coordinate rotation digital computer (CORDIC) algorithm play important roles in signal processing. We propose a configurable floating-point FFT accelerator based on CORDIC rotation, in which twiddle direction prediction is presented to reduce hardware cost and twiddle angles are generated in real time to save memory. To finish CORDIC rotation efficiently, a novel approach is presented that uses segmented-parallel iteration and compress iteration based on CSA, and redundant CORDIC is used to reduce the latency of each iteration. To prove the efficiency of our FFT accelerator, four FFT accelerators are prototyped on an FPGA chip to perform a batch FFT. Experimental results show that our structure, which is composed of four butterfly units and performs FFTs with sizes ranging from 64 to 8192 points, occupies 33230 (3%) REGs and 143006 (30%) LUTs. The clock frequency can reach 122 MHz. The resource usage of the double-precision FFT is only about 2.5 times that of the single-precision FFT, while the theoretical value is 4. Moreover, only 13331 cycles are required to perform an 8192-point double-precision FFT with four butterfly units in parallel.
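
For readers unfamiliar with CORDIC, the sketch below shows the plain textbook rotation mode, in which a vector is rotated by a target angle using only shift-and-add style micro-rotations plus a final gain correction. It is not the segmented-parallel, CSA-based, or redundant variant described in the abstract; the function name `cordic_rotate` and the iteration count are assumptions chosen for illustration, and floating-point arithmetic stands in for the fixed-point shifts a hardware design would use.

```python
import math


def cordic_rotate(x: float, y: float, angle: float, iterations: int = 32):
    """Rotate (x, y) by `angle` radians using rotation-mode CORDIC.

    Assumption: |angle| <= pi/2, which lies inside the convergence
    range of the basic algorithm (about 1.74 rad).
    """
    # Accumulated gain of all micro-rotations; constant for a fixed
    # iteration count, so hardware would store it as a precomputed value.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0   # rotation direction for this step
        shift = 2.0 ** (-i)           # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)     # atan values come from a small table

    return x / gain, y / gain


# Rotating the unit vector (1, 0) yields approximately (cos(angle), sin(angle)).
c, s = cordic_rotate(1.0, 0.0, math.pi / 6)
print(c, s)
```

In an FFT butterfly, the same rotation replaces the complex multiplication by a twiddle factor, which is why the rotation direction sequence and angle generation matter for hardware cost.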

3.

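Low-Power Architecture Design of a Floating-Point Adder

高海霞 杨银堂 《微电子学》2002,32(2):128-130,135

Floating-point adders are important units in the datapaths of integrated circuits, and their performance and power consumption strongly affect the performance of processors and digital signal processors. This paper analyzes several floating-point adder architectures, focusing on a triple-datapath architecture that achieves low power. Finally, the practicality of the floating-point adder architectures is analyzed.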

4.

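An FPGA Architecture Suited to Datapath Circuits

黄志军 张鹏 徐健 童家榕 《微电子学》1999,29(5):305-310

An FPGA architecture suited to implementing datapath logic, FDP, is proposed. Its main innovations are two general-purpose feedback logic paths, a general-purpose logic cell based on a full adder, an asymmetric routing structure based on signal flow, and parallel test scan chains. SPICE simulation results show that, in a 0.8 μm process, the delay within an FDP block is 2.7 ns and the average carry-chain delay is 0.1 ns.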

5.

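Xilinx Introduces an Application-Oriented FPGA Architecture

《世界电子元器件》2004,(1):69-69

Xilinx, a supplier of programmable logic solutions, recently announced what it calls a revolutionary new architecture, ASMBL (an application-oriented modular block architecture). According to Xilinx, the core of ASMBL is a modular framework of silicon hardware subsystems. The architecture supports a new FPGA development methodology that can deliver FPGA platforms for different application domains quickly and economically. Xilinx also states that, for devices at the same price point, the new architecture will raise device capability by up to 10 times.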

6.

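Analysis of FPGA Chip Clock Architecture

《电子与封装》2016,(6):28-30

In FPGA design, the design and handling of clock signals is an important part of ensuring stable system operation. As FPGA devices grow in scale and integration density, multi-clock-domain management, clock delay, clock signal integrity, and phase skew have become key factors affecting FPGA design. Drawing on microelectronic circuit knowledge, the paper analyzes in detail the clock architecture and clock-resource characteristics of the Xilinx Virtex-4 family. For typical applications of FPGA clock design, it offers tips and recommendations on clock design and use from the chip perspective.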

7.

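Monte Carlo Evaluation of FPGA Routing Architectures

高海霞 杨银堂 王平 《固体电子学研究与进展》2006,26(2):260-263

A new method for studying FPGA architectures based on the Monte Carlo technique is proposed. The method randomly generates uniformly distributed open-circuit faults in the routing resources and routes interconnections around these obstacles, without depending on CAD algorithms or benchmark circuits. A switch-block topology analysis example shows that the method reaches the same conclusions as the CAD approach, while evaluation time is reduced from 15 hours to 15 minutes.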

8.

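FPGA Architecture Design Method and EDA Tool

张峰 李艳 陈亮 李明 于芳 《微电子学与计算机》2013,30(5)

To address the low flexibility, error-proneness, and insufficient automation of current FPGA architecture design methods, an FPGA architecture design method is proposed, and an EDA tool, VA, is implemented based on it. VA uses a GUI to edit architecture description files, can automatically generate a detailed FPGA architecture from an architecture description file, and supports the design of heterogeneous routing architectures through local adjustment of the FPGA architecture in the GUI. VA integrates FPGA architecture design and architecture evaluation and provides a fully automatic evaluation flow. With VA, an independently developed FPGA chip, VS1000, was successfully designed. The design process and results demonstrate the efficiency and correctness of VA.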

9.

An FPGA and ASIC Implementation of Cubing Architecture

Reddy B. Naresh Kumar Seetharamulu B. Krishna G. Siva Vani B. Veena 《Wireless Personal Communications》2022,125(4):3379-3391

The optimization of VLSI design is playing an important role in the development of technological applications. The optimization of VLSI technology helps to...

10.

A Highly Efficient Multicore Floating-Point FFT Architecture Based on Hybrid Linear Algebra/FFT Cores

Ardavan Pedram John D. McCalpin Andreas Gerstlauer 《Journal of Signal Processing Systems》2014,77(1-2):169-190

FFT algorithms have memory access patterns that prevent many architectures from achieving high computational utilization, particularly when parallel processing is required to achieve the desired levels of performance. Starting with a highly efficient hybrid linear algebra/FFT core, we co-design the on-chip memory hierarchy, on-chip interconnect, and FFT algorithms for a multicore FFT processor. We show that it is possible to achieve excellent parallel scaling while maintaining power and area efficiency comparable to that of the single-core solution. The result is an architecture that can effectively use up to 16 hybrid cores for transform sizes that can be contained in on-chip SRAM. When configured with 12 MiB of on-chip SRAM, our technology evaluation shows that the proposed 16-core FFT accelerator should sustain 388 GFLOPS of nominal double-precision performance, with power and area efficiencies of 30 GFLOPS/W and 2.66 GFLOPS/mm², respectively.

11.

Designing a 3-D FPGA: Switch Box Architecture and Thermal Issues

Gayasen A. Narayanan V. Kandemir M. Rahman A. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(7):882-893

Three-dimensional (3-D) integration is an attractive technology to reduce wirelengths in a field-programmable gate array (FPGA). However, it suffers from two problems: first, the inter-layer vias are limited in number, and second, the increased power density leads to high junction temperatures. In this paper, we tackle the first problem by designing switch boxes that maximize the use of the vias. Compared to the previously used subset switch box, our best switch box reduces the number of vias by about 49% and the area-delay product by about 9%. For the second problem, we utilize the difference in power densities between CLBs and some of the hard blocks in modern FPGAs to distribute the power more uniformly across the FPGA. The peak temperature in a two-layer FPGA is reduced by about 16 °C after our change.

12.

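Hardware Circuit Design of a Kalman Filter with a Maximally Parallel Structure

陈刚 郭立 史洪生 杨毅 《微电子学与计算机》2006,23(6):34-37,41

This paper proposes a hardware implementation structure for an FPGA-based Kalman filter. Because the design uses high-performance arithmetic processing units and a pipelined structure, it achieves efficient results: each processing cycle takes about 57.98 ns, which is 3 to 4 orders of magnitude faster than the software implementation. Resource sharing is also used, reducing the area by about 45% compared with a design without resource sharing.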

13.

An FIR Filter Architecture Implemented in FPGA  Total citations: 1, self-citations: 0, citations by others: 1

金昕 黄捷 刘韬 《微电子学》1999,29(1)

A digital FIR filter architecture implemented in FPGA is described. The FIR architecture is based on a pipelined multiply-add-accumulator (MAC) which employs a carry-save array. To save delay time and hardware resources, the multiplier uses the partial products generated by the modified Booth algorithm. The FIR architecture is written in VHDL and synthesized into an FPGA. The synthesis result shows that the proposed FIR architecture can run at a 50 MHz clock rate in the FPGA XC4025e-2.
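
The modified Booth algorithm mentioned above can be sketched in software as radix-4 recoding: the multiplier is scanned in overlapping 3-bit groups, each group becomes a digit in {-2, -1, 0, 1, 2}, and each digit selects one partial product, which halves the number of partial products compared with bit-by-bit multiplication. The code below is a minimal behavioral model, not the paper's VHDL; the function names and the 8-bit default width are assumptions for illustration.

```python
def booth_radix4_digits(multiplier: int, bits: int = 8) -> list:
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Returns bits/2 digits, least significant first, each in {-2, ..., 2}.
    Assumption: `bits` is even and the multiplier fits in `bits` bits.
    """
    if bits % 2:
        raise ValueError("bits must be even for radix-4 recoding")
    m = multiplier & ((1 << bits) - 1)   # two's-complement bit pattern
    digits = []
    prev = 0                             # implicit bit to the right of the LSB
    for i in range(0, bits, 2):
        b0 = (m >> i) & 1
        b1 = (m >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1                        # overlap bit for the next group
    return digits


def booth_multiply(a: int, b: int, bits: int = 8) -> int:
    """Multiply a by b (b signed, `bits` wide) by summing Booth partial products."""
    total = 0
    for k, d in enumerate(booth_radix4_digits(b, bits)):
        partial = d * a                  # one of 0, +/-a, +/-2a: shift/negate in hardware
        total += partial * 4 ** k        # weight 4^k, i.e. a left shift by 2k bits
    return total


print(booth_radix4_digits(7))   # [-1, 2, 0, 0]
print(booth_multiply(13, -7))   # -91
```

In a MAC such as the one described, the partial products would be summed by the carry-save array rather than by sequential addition as in this model.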

14.

Architecture and FPGA Design of Dichotomous Coordinate Descent Algorithms

《IEEE transactions on circuits and systems. I, Regular papers》2009,56(11):2425-2438

In the areas of signal processing and communications, such as antenna-array beamforming, adaptive filtering, multiuser and multiple-input–multiple-output (MIMO) detection, channel estimation and equalization, echo and interference cancellation, and others, solving linear systems of equations often provides an optimal performance. However, this is also a very complicated operation that designers try to avoid by proposing different suboptimal techniques. The dichotomous coordinate descent (DCD) algorithm allows linear systems of equations to be solved with high computational efficiency. In this paper, we present architectures and field-programmable gate-array (FPGA) designs of two variants of the DCD algorithm, which are known as cyclic and leading DCD algorithms. For each of these techniques, we present serial designs, group-2 and group-4 designs, as well as a design with parallel update of the residual vector for the cyclic DCD algorithm. These designs have different degrees of parallelism, thus enabling a tradeoff between FPGA resources and computation time. The serial designs require the smallest FPGA resources; they are well suited for applications where many parallel solvers are required, e.g., for detection in MIMO–orthogonal-frequency-division-multiplexing communication systems. The parallelism introduced in the proposed group-2 and group-4 designs allows faster convergence to the true solution at the expense of an increase in FPGA resources. The design with parallel update of the residual vector provides the fastest convergence speed; however, if the system size is high, it may result in a significant increase in FPGA resources. The proposed fixed-point designs provide an accuracy performance that is very close to the performance of floating-point counterparts and require significantly lower FPGA resources than techniques based on QR decomposition.

15.

High-throughput Block Turbo Decoding: From Full-parallel Architecture to FPGA Prototyping  Total citations: 1, self-citations: 0, citations by others: 1

Camille Leroux Christophe Jégo Patrick Adde Michel Jézéquel 《Journal of Signal Processing Systems》2009,57(3):349-361

Ultra high-speed block turbo decoder architectures meet the demand for even higher data rates and open up new opportunities for the next generations of communication systems such as fiber optic transmissions. This paper presents the implementation, on an FPGA device, of an ultra-high-throughput block turbo code decoder. An innovative architecture of a block turbo decoder which enables the memory blocks between all half-iterations to be removed is presented. A complexity analysis of the elementary decoder leads to a low complexity decoder architecture for a negligible performance degradation. The resulting turbo decoder is implemented on a Xilinx Virtex II-Pro FPGA in a communication experimental setup which also includes an innovative parallel product encoder. The implemented block turbo decoder processes input data at 600 Mb/s. The component code is an extended Bose, Ray-Chaudhuri, Hocquenghem (eBCH(16,11)) code. Some solutions to reach even higher data rates are finally presented.

16.

Co-Synthesis to a Hybrid RISC/FPGA Architecture

Maya B. Gokhale Janice M. Stone Edson Gomersall 《The Journal of VLSI Signal Processing》2000,24(2-3):165-180

Hybrid architectures combining conventional processors with configurable logic resources enable efficient coordination of control with datapath computation. With integration of the two components on a single device, housekeeping tasks and, optionally, loop control and data-dependent branching, can be handled by the conventional processor, while regular datapath computation occurs on the configurable hardware. This paper describes a novel approach to programming such hybrid devices that gives the programmer control over mapping of data and computation between conventional processor and configurable logic. With a simple set of pragma and intrinsic function directives, the NAPA C language provides for manual control over perhaps the most important aspect of programming such hybrid devices. Alternatively, as experience is gained about tradeoffs between the two computational resources, mapping directives may eventually be generated by an external tool. The paper further describes a research prototype compiler that targets the hybrid processor model, with a concrete implementation for the National Semiconductor NAPA1000 chip. The NAPA C compiler parses the mapping directives, performs semantic analysis, and co-synthesizes a conventional processor executable combined with a configuration bit stream for the configurable logic. Two major compiler phases, the synthesis of pipelined loops and the datapath synthesis, are described in detail.

17.

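System Modeling Based on FPGA and the LMS Algorithm  Total citations: 1, self-citations: 1, citations by others: 1

刘艳 《现代电子技术》2010,33(2):76-79

An adaptive filter is used to model an unknown system. The implementation of the LMS algorithm is simulated with Simulink in Matlab, and the LMS algorithm and its system modeling are implemented in an FPGA. The system modeling results of the FPGA design are then simulated in Matlab to augment the simulation capabilities of Quartus, yielding complete and intuitive simulation results. The simulation, implementation, and verification methods used for this system modeling also apply to removing narrowband interference from wideband signals, adaptive spectral line enhancement, and adaptive equalization, and are therefore fairly general.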
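
To make the system-modeling idea concrete, the sketch below runs the standard LMS update, w ← w + μ·e·x, on a noise-free system identification problem: the adaptive filter and the unknown FIR system see the same input, and the error between their outputs drives the weight update. The unknown coefficients, step size, sample count, and the function name `lms_identify` are all assumptions chosen for illustration and are not taken from the paper.

```python
import random


def lms_identify(unknown, mu=0.05, n_samples=2000, seed=0):
    """Estimate the taps of an unknown FIR system with the LMS algorithm.

    unknown   -- coefficients of the system to model (hypothetical example data)
    mu        -- step size; must be small enough for stability
    n_samples -- number of adaptation steps
    """
    rng = random.Random(seed)
    n_taps = len(unknown)
    w = [0.0] * n_taps           # adaptive filter weights
    x_hist = [0.0] * n_taps      # most recent input samples, newest first

    for _ in range(n_samples):
        # Shift a new white-noise input sample into the delay line.
        x_hist = [rng.uniform(-1.0, 1.0)] + x_hist[:-1]

        d = sum(h * x for h, x in zip(unknown, x_hist))   # unknown system output
        y = sum(wi * x for wi, x in zip(w, x_hist))       # adaptive filter output
        e = d - y                                         # modeling error

        # LMS weight update: step along the negative instantaneous gradient.
        w = [wi + mu * e * x for wi, x in zip(w, x_hist)]

    return w


weights = lms_identify([0.5, -0.3, 0.1])
print(weights)  # approaches the unknown coefficients as adaptation proceeds
```

An FPGA version would typically use fixed-point arithmetic, so comparing its weights against a floating-point reference like this one is a natural verification step.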

18.

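A Parallel Pipelined FIR Filter Architecture Based on FPGA  Total citations: 5, self-citations: 0, citations by others: 5

王黎明 刘贵忠 刘龙 刘洁瑜 《微电子学》2004,34(5):582-585,588

A pipelined parallel FIR filter architecture implemented on FPGA devices is proposed. The resources used by three hardware implementations of FIR filters are first compared. The implementation method and feasibility of the pipelined parallel filter are then derived theoretically, and the hardware implementation modules are given. Experimental results show that the algorithm implemented with this improved filter structure can flexibly handle the trade-off between area and speed constraints in synthesis, allowing the design to reach an optimum.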

19.

Architecture level optimization of 3-dimensional tree-based FPGA

Vinod Pangracious Emna Amouri Zied Marakchi Habib Mehrez 《Microelectronics Journal》2014

We describe a methodology to design and optimize a three-dimensional (3D) Tree-based FPGA by introducing a break-point at a particular tree-level interconnect to optimize the speed, area, and power consumption. The ability of the design flow to decide a horizontal or vertical network break-point based on design specifications is a defining feature of our design methodology. The vertical partitioning is organized in such a way as to balance the placement of logic blocks and switch blocks into multiple tiers, while the horizontal partitioning optimizes the interconnect delay by segregating the logic blocks and programmable interconnect resources into multiple tiers to build a 3D stacked Tree-based FPGA. We finally evaluate the effect of Look-Up-Table (LUT) size, cluster size, speed, area and power consumption of the proposed 3D Tree-based FPGA using our home-grown experimental flow and show that the horizontally partitioned 3D stacked Tree-based FPGA with LUT and cluster sizes equal to 4 has the best area-delay product to design and manufacture 3D Tree-based FPGA.

20.

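Dual-Context-Window Parallel Coding for EBCOT and Its FPGA Implementation  Total citations: 1, self-citations: 0, citations by others: 1

严青 郭炜 《微电子学与计算机》2007,24(5):179-183

In a JPEG2000 coding system, the coding speed of EBCOT has become the bottleneck of the whole system's coding efficiency. By studying the EBCOT coding principle and the coding process of the pass-parallel algorithm, a bit-parallel EBCOT coefficient bit modeling method with dual context windows is proposed. The hardware structure of a coefficient bit modeling system that uses this algorithm is described in detail. The coefficient bit coding system effectively reduces the number of coding clock cycles and was functionally verified on an FPGA.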