1.

An architecture of high-performance frequency and phase synthesis

Mair H. Liming Xiu 《Solid-State Circuits, IEEE Journal of》2000,35(6):835-846

Frequency synthesis has many applications in today's commercial electronic and telecommunication system design. Some techniques exist which can be used to generate a frequency that is an integer or fractional multiple of a reference frequency. This architecture is used to generate a signal of any desired frequency in a certain range from multiple reference signals with the same frequency but different phases. These reference signals may come from a voltage-controlled oscillator (VCO) that is locked to a reference clock in a phase-locked loop (PLL). This architecture provides some unique features, superior quality, and ease of implementation. In some cases, the synthesized frequency is a time-average frequency. The signal can be treated as a carrier signal that is frequency modulated by another signal. Various phase-shifted versions and duty-cycle versions of this signal can also be generated from this architecture. This architecture also has direct application to spread-spectrum clock generation.
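The notion of a time-average frequency mentioned in this abstract can be illustrated with a small model. The sketch below is not taken from the paper: it assumes, for illustration only, that each output cycle spans an integer number of equally spaced reference phase steps, and that a fractional control word is handled by an accumulator that occasionally lengthens a cycle by one step. The names `cycle_lengths`, `average_frequency`, and the example control word are hypothetical.

```python
import math
from fractions import Fraction


def cycle_lengths(freq_word, n_cycles):
    """Return the length of each output cycle, counted in phase steps.

    Assumed model: an accumulator adds the fractional control word every
    cycle; the integer part sets the cycle length, the remainder carries over.
    """
    acc = Fraction(0)  # fractional remainder carried between cycles
    lengths = []
    for _ in range(n_cycles):
        acc += freq_word
        whole = math.floor(acc)  # phase steps consumed by this cycle
        lengths.append(whole)
        acc -= whole
    return lengths


def average_frequency(freq_word, phase_step_s):
    """Long-run average output frequency for a given phase step (seconds)."""
    return 1.0 / (float(freq_word) * phase_step_s)


word = Fraction(21, 4)  # hypothetical control word: 5.25 phase steps per cycle
lengths = cycle_lengths(word, 4)
print(lengths)                      # individual cycle lengths in phase steps
print(sum(lengths) / len(lengths))  # mean cycle length in phase steps
```

In this model no single cycle has the target length, yet the mean over a full pattern equals the control word, which is one way to read "time-average frequency".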

2.

Power Management of Voltage/Frequency Island-Based Systems Using Hardware-Based Methods

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2009,17(3):427-438

Shrinking technology nodes combined with the need for higher clock speeds have made it increasingly difficult to distribute a single global clock across a chip while meeting the power requirements of the design. The globally asynchronous, locally synchronous (GALS) design style can help achieve low power consumption and modularity of a design while greatly reducing the number of global interconnects. Such multiple-clock-domain architectures can benefit from having frequency/voltage values assigned to each domain based on workload requirements. The work presented in this paper proposes a new hardware-based approach to dynamically change the frequencies and potentially voltages of a voltage-frequency island (VFI) system driven by a dynamic workload. This technique tries to change the frequency of a synchronous island such that it will have efficient power utilization while satisfying performance constraints. In recent years, there have been major developments, both in industry and academia, in the field of multiprocessor systems. Such multiprocessor systems are very good candidates for VFI design style implementation, where one or more processors can be part of a single VFI. To demonstrate the feasibility of our proposed method, we have implemented a multiprocessor system for a field-programmable gate array (FPGA) platform that uses independently generated clocks for each processor. The results from the FPGA platform confirm the claim that the power consumption of a system can potentially be reduced while maintaining the performance of many applications. Our work concentrates primarily on embedded systems, but the idea can be explored for general-purpose computing as well.

3.

Hardware Elliptic Curve Cryptographic Processor Over $\mathrm{GF}(p)$

《IEEE transactions on circuits and systems. I, Regular papers》2006,53(9):1946-1957

A novel hardware architecture for elliptic curve cryptography (ECC) over $\mathrm{GF}(p)$ is introduced. This can perform the main prime field arithmetic functions needed in these cryptosystems including modular inversion and multiplication. This is based on a new unified modular inversion algorithm that offers considerable improvement over previous ECC techniques that use Fermat's Little Theorem for this operation. The processor described uses a full-word multiplier which requires far fewer clock cycles than previous methods, while still maintaining a competitive critical path delay. The benefits of the approach have been demonstrated by utilizing these techniques to create a field-programmable gate array (FPGA) design. This can perform a 256-bit prime field scalar point multiplication in 3.86 ms, the fastest FPGA time reported to date. The ECC architecture described can also perform four different types of modular inversion, making it suitable for use in many different ECC applications.

4.

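A High-Speed VLSI Implementation Method for an RS Decoder Based on the ME Algorithm

马健 王卫民 《电子科技》2011,24(4):17-19

The VLSI structure of the ME algorithm is analyzed, and a pipelined, minimized VLSI structure for the ME algorithm is proposed to meet the demand for ever-higher data processing rates. Using this structure, a high-speed RS decoder with low resource usage and low cost is designed. Logic synthesis and simulation results show that an RS(255,239) decoder on an Altera Cyclone II series FPGA runs at a clock frequency of 210 MHz, which meets the encoding and decoding requirements of a 1.68 Gb/s data rate.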
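The two figures quoted for this decoder (210 MHz and 1.68 Gb/s) are consistent with one 8-bit symbol being handled per clock cycle. RS(255,239) uses 8-bit symbols; the one-symbol-per-cycle rate is an assumption made here to reproduce the arithmetic, not a detail stated in the abstract.

```python
def throughput_bps(clock_hz, bits_per_cycle):
    """Raw data rate of a datapath that moves a fixed number of bits per clock."""
    return clock_hz * bits_per_cycle


CLOCK_HZ = 210e6   # clock frequency quoted in the abstract
SYMBOL_BITS = 8    # RS(255,239) symbols are 8 bits wide
# Assumption: one symbol enters the decoder on every clock cycle.
rate = throughput_bps(CLOCK_HZ, SYMBOL_BITS)
print(f"{rate / 1e9:.2f} Gb/s")  # prints "1.68 Gb/s"
```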

5.

A versatile linear insertion sorter based on an FIFO scheme

Roberto Perez-Andrade Rene Cumplido Claudia Feregrino-Uribe Fernando Martin Del Campo 《Microelectronics Journal》2009,(12):1705-1713

A linear sorter based on a first-in first-out (FIFO) scheme is presented. In a single clock cycle, it can discard the oldest stored datum and insert the incoming datum while keeping the rest of the stored data sorted. This type of sorter can be used as a co-processor or as a module in specialized architectures that continuously need to process data for non-linear filters based on order statistics. This FIFO sorting process is described by four different parallel functions that exploit the natural hardware parallelism. The architecture is composed of identical processing elements; thus it can be easily adapted to any data length, according to the specific application needs. The use of compact identical processing elements results in a high-performance yet small architecture. Some examples are presented to illustrate the functionality and initialization of the proposed sorter. The results of synthesizing the proposed architecture targeting a field-programmable gate array (FPGA) are presented and compared against other reported hardware-based sorters. The scalability results for various numbers of sorted elements with different bit widths are also presented.
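The behavior described here (drop the oldest datum, insert the new one, keep everything sorted) can be modeled in software as follows. This is only a sequential behavioral sketch under assumed names (`FifoSorter`, `push`, `median`); it does not reproduce the paper's four parallel functions or its single-cycle hardware timing.

```python
from bisect import bisect_left, insort
from collections import deque


class FifoSorter:
    """Behavioral model of a fixed-depth sorter with FIFO replacement."""

    def __init__(self, depth):
        self.depth = depth
        self.arrival = deque()  # stored data in arrival order (oldest first)
        self.ordered = []       # the same data in ascending order

    def push(self, value):
        """Insert a datum; when full, first discard the oldest stored datum."""
        if len(self.arrival) == self.depth:
            oldest = self.arrival.popleft()
            # Remove one copy of the oldest value from the sorted view.
            del self.ordered[bisect_left(self.ordered, oldest)]
        self.arrival.append(value)
        insort(self.ordered, value)
        return list(self.ordered)

    def median(self):
        """Order statistic used by a median filter (upper median if even)."""
        return self.ordered[len(self.ordered) // 2]


sorter = FifoSorter(depth=3)
for sample in (5, 1, 4, 2, 9):
    state = sorter.push(sample)
print(state)            # sorted window after the last sample
print(sorter.median())  # median of the current window
```

Feeding a sample stream through `push` and reading `median` after each step gives a sliding-window median filter, which is one of the order-statistics filters the abstract has in mind.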

6.

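Design of an Asynchronous Communication Interface Between a Core and a Bus System

薛乐平 付宇卓 谢凯年 《微电子学与计算机》2006,23(7):111-115

Based on the GALS (Globally Asynchronous Locally Synchronous) design concept, this paper proposes an asynchronous interface design model for a core. It consists of an asynchronous control-signal processing model, built from a gated-clock core-stall mechanism, a handshake mechanism, and level-to-pulse conversion logic, and an asynchronous data processing model, built from an asynchronous FIFO and a dual-FIFO structure. This structure allows the core and the bus system to operate in completely asynchronous clock domains. FPGA verification results show that the model correctly synchronizes signals between the two and meets the bandwidth requirements of the specific application.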

7.

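An All-Digital Synchronous Frequency Multiplier Based on a Delay-Locked Loop and Frequency-Locked Loop Structure

曹玉梅 梁珍珍 赵海军 《电子器件》2018,41(1)

To address the shortcomings of existing all-digital synchronous frequency multiplier structures based on PLLs/DLLs, this paper proposes an all-digital synchronous frequency multiplier based on a dual-loop structure. It consists of a delay-locked loop and a frequency-locked loop that share a common reference clock signal (FREF), and it requires no analog components. It can be designed in Verilog HDL and synthesized on an Altera DE2-70 development board, can be readily adapted to different FPGA families or implemented as an integrated circuit, and can also provide a clock reference for distributed digital processing systems and for intra-chip and inter-chip communication in systems-on-chip. Experimental results show that, compared with existing structures, the proposed structure achieves a higher-frequency output clock signal, better frequency resolution, better jitter performance, and a high multiplication factor.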

8.

FPGA implementation of high-speed neural network for power amplifier behavioral modeling

Mohammed Bahoura 《Analog Integrated Circuits and Signal Processing》2014,79(3):507-527

In this paper, a high-speed pipelined architecture of a dynamic neural network is proposed for power amplifier behavioral modeling. This architecture is implemented on a field-programmable gate array (FPGA) using Xilinx System Generator and the Virtex-6 FPGA ML605 Evaluation Kit. The novelty of the proposed architecture is that it provides a higher operating frequency, lower output latency, and fewer required resources. These improvements are obtained by reducing the data bit width and by efficiently redistributing the inserted pipelining delays. The new pipelined architecture is evaluated and compared to the conventional and pseudo-conventional ones in terms of resource utilization, maximum operating frequency, and modeling performance using a 16-QAM modulated test signal. This architecture is verified using JTAG hardware co-simulation in both single-step and free-running clock modes.

9.

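Research and Implementation of an Embedded-Flash CISC/DSP Microprocessor (total citations: 1; self-citations: 0; citations by others: 1)

卢结成 丁丁 丁晓兵 朱少华 《电子学报》2003,31(8):1252-1254

This paper studies an implementation architecture for a new high-performance microprocessor that offers both microcontroller functions and enhanced DSP functions. Under a unified, enhanced CISC instruction set, a microprocessor module based on Harvard and register-register architectures is integrated with DSP function modules (a single-cycle multiplier/accumulator, a barrel shifter, hardware support for zero-overhead loops and jumps, and a hardware address generator) as well as embedded Flash memory and an instruction queue buffer. A CISC/DSP microprocessor is thus realized with a single core under a unified architecture, which effectively improves processor performance. The microprocessor is implemented in a 0.35 μm CMOS process with a chip area of 25 mm². At an operating frequency of 80 MHz, the dynamic power consumption is 425 mW and the peak data processing capability reaches 80 MIPS. The processor core can meet the demand of systems-on-chip (SoC) for high-performance processors.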

10.

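Design and Implementation of a Digital Clock Manager in an FPGA Chip

李文昌 李平 杨志明 李威 王鲁豫 《半导体技术》2011,36(11):848-852

The digital clock manager (DCM) is indispensable in an FPGA chip. The DCM mainly performs clock deskew, frequency synthesis, and phase adjustment, which are implemented by three modules: the delay-locked loop (DLL), the digital frequency synthesizer (DFS), and the digital phase shifter (DPS), respectively. The principles and design of these three modules are described in detail, and simulation results are given. The DCM circuit was taped out in a 0.13 μm process. Test results show that the DCM operates from 24 to 230 MHz in low-frequency mode and from 48 to 450 MHz in high-frequency mode. Its input and output jitter tolerance reaches 300 ps in low-frequency mode and 150 ps in high-frequency mode.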

11.

Speed and area tradeoffs in cluster-based FPGA architectures

Marquardt A. Betz V. Rose J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(1):84-93

One way to reduce the delay and area of field-programmable gate arrays (FPGAs) is to employ logic-cluster-based architectures, where a logic cluster is a group of logic elements connected with high-speed local interconnections. In this paper, we empirically evaluate FPGA architectures with logic clusters ranging in size from 1 to 20, and show that compared to architectures with size 1 clusters, architectures with size 8 clusters have 23% less delay (30% faster clock speed) and require 14% less area. We also show that FPGA architectures with large cluster sizes can significantly reduce design compile time, an increasingly important concern as the logic capacity of FPGAs rises. For example, an architecture that uses size 20 clusters requires seven times less compile time than an architecture with size 1 clusters.
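The two figures in parentheses describe the same improvement: clock speed is the reciprocal of critical-path delay, so a 23% shorter delay corresponds to roughly a 30% higher clock rate. A short check, with `clock_speedup` as an illustrative helper name:

```python
def clock_speedup(delay_reduction):
    """Fractional gain in clock frequency for a fractional cut in delay.

    Frequency is 1/delay, so the new frequency is 1 / (1 - reduction)
    times the old one.
    """
    return 1.0 / (1.0 - delay_reduction) - 1.0


gain = clock_speedup(0.23)  # 23% less delay, as quoted in the abstract
print(f"{gain:.1%}")        # prints "29.9%"
```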

12.

From synchronous to GALS: A new architecture for FPGAs

René Gagné Jean Belzile 《Microelectronics Journal》2009,40(11):1657-1666

The conflicting demands for faster and larger designs are increasingly difficult to meet through advances in solid-state technology alone. At some point, it is expected that designers and manufacturers will have to give up the traditional synchronous design methodology for a Globally Asynchronous Locally Synchronous (GALS) one. Such changes imply more synchronization constraints, but also more flexibility. Consequently, this paper proposes a novel field-programmable gate array (FPGA) architecture that is compatible with existing devices and that can also support GALS designs. The main objective is simple: the proposed architecture must appear unchanged for synchronous design, but it must also include a minimal set of basic components that prevent metastability and so enable efficient asynchronous communications. Thus, the paper presents the constraint equations required to implement such a circuit. It also presents a pausible clock generator application and simulation results for the proposed architecture. All results demonstrate that with a few additional customized circuits, a standard FPGA cell can become appropriate for GALS methodologies.

13.

A New FPGA for DSP Applications Integrating BIST Capabilities

Alex Gonsales Marcelo Lubaszewski Luigi Carro Michel Renovell 《Journal of Electronic Testing》2004,20(4):423-431

This work proposes a new FPGA architecture to meet the signal processing and testing requirements of current system-on-chip designs. The proposed architecture provides the hardware reuse and reconfigurability advantages of an FPGA, not only for the system functionality but also for the system testing, while keeping the performance level required by current signal processing applications. This paper presents the new FPGA model, along with preliminary experimental results that clearly show the possible system-level advantages of merging design and test in a reconfigurable device.

14.

Design of a high speed parallel encoder for convolutional codes

A. Msir A. Dandache B. Lepley 《Microelectronics Journal》2004,35(2):151-166

This paper presents the design of high-speed parallel architectures for convolutional encoders and their implementation on FPGA devices. Convolutional codes are widely used in telecommunication applications to improve the reliability of data transmission over noisy channels. The architecture proposed here combines parallel and pipelining techniques. A purely parallel approach can increase the number of processed bits per clock cycle. Unfortunately, the critical path propagation delay increases with the parallelism level. Consequently, the operating clock frequency decreases, which in turn can dramatically limit the benefit of parallelization. This drawback can be significantly reduced using pipelining techniques. As a result, the critical path no longer depends on the parallelism level. The encoder architectures have been implemented on FPGA devices of the Altera Flex10KE family. Bit rates up to 6.61 Gbits/s have been achieved on 32-bit parallel implementations.
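The idea of processing several input bits per clock cycle can be sketched in software. The example below is an illustration under assumptions, not the paper's architecture: it uses a rate-1/2 code with constraint length 3 and generators 7 and 5 (octal), chosen here only because they are a common textbook pair, and it models parallelism by computing every output of a block from a shared window of input bits. Pipelining is not modeled.

```python
K = 3                  # assumed constraint length
G = (0b111, 0b101)     # assumed generator polynomials (7 and 5 octal)


def parity(x):
    """XOR of all bits of x."""
    return bin(x).count("1") & 1


def encode_serial(bits):
    """Reference encoder: one input bit per step, newest bit in the MSB."""
    state = 0  # the previous K-1 input bits
    out = []
    for b in bits:
        reg = (b << (K - 1)) | state
        out.append(tuple(parity(reg & g) for g in G))
        state = reg >> 1
    return out


def encode_parallel(bits, p):
    """Encode p input bits per step from a window of K-1 old bits plus p new."""
    history = [0] * (K - 1)  # oldest first
    out = []
    for i in range(0, len(bits), p):
        block = bits[i:i + p]
        window = history + block
        # Each output depends only on the window, never on another output of
        # the same block, so hardware could evaluate all of them concurrently.
        for j in range(len(block)):
            taps = window[j:j + K]  # oldest .. newest
            reg = sum(bit << idx for idx, bit in enumerate(taps))
            out.append(tuple(parity(reg & g) for g in G))
        history = window[-(K - 1):]
    return out


message = [1, 0, 1, 1, 0, 0, 1, 0, 1]
print(encode_serial(message) == encode_parallel(message, 4))  # prints "True"
```

The XOR depth per output is the same here for every block size, but in hardware the wiring and state feedback grow with the parallelism level, which is the critical-path issue the abstract addresses with pipelining.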

15.

A Case Study for NoC-Based Homogeneous MPSoC Architectures (total citations: 1; self-citations: 0; citations by others: 1)

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2009,17(3):384-388

The many-core design paradigm requires flexible and modular hardware and software components to provide the required scalability to next-generation on-chip multiprocessor architectures. A multidisciplinary approach is necessary to consider all the interactions between the different components of the design. In this paper, a complete design methodology that tackles at once the aspects of system-level modeling, hardware architecture, and programming model has been successfully used for the implementation of a multiprocessor network-on-chip (NoC)-based system, the NoCRay graphic accelerator. The design, based on 16 processors, has been prototyped on a field-programmable gate array (FPGA) and then laid out in 90-nm technology. Post-layout results show very low power and area, as well as a clock frequency of 500 MHz. Results show that an array of small and simple processors outperforms a single high-end general-purpose processor.

16.

An efficient multiplier-less architecture for 2-D convolution with quadrant symmetric kernels

Ming Z. Vijayan K. 《Integration, the VLSI Journal》2007,40(4):490-502

Design of a high-performance digital architecture for computing 2-D convolution utilizing the quadrant symmetry of the kernels is proposed in this paper. Pixels in the four quadrants of the kernel region with respect to an image pixel are considered simultaneously for computing the partial results of the convolution sum. The new architecture performs computations in the logarithmic domain by utilizing novel multiplier-less log₂ and inverse-log₂ modules. An effective data-handling strategy is developed in conjunction with the logarithmic modules to eliminate the necessity of multipliers in the architecture. The systolic architecture employs parallel and pipelined processing and is able to produce one output every clock cycle. The new design resulted in an approximately 40% reduction in hardware resources when compared to the multiplier-based quadrant-symmetric architecture. The proposed architecture is capable of performing convolution operations on 63.3 frames of 1024×1024 pixels per second, or 66.4 million outputs per second, with a 22×22 kernel in a Xilinx Virtex 2v2000ff896-4 FPGA at a maximum clock frequency of 66.4 MHz. The error analysis performed in two image-processing applications, edge detection and noise filtering, shows that the hardware implementation of the proposed design provides accurate results similar to the software implementation.
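The throughput figures in this abstract follow from the one-output-per-clock property: at 66.4 MHz the architecture produces 66.4 million output pixels per second, and dividing by the pixels in a 1024×1024 frame gives the frame rate. A quick check, with `frames_per_second` as an illustrative helper name:

```python
def frames_per_second(clock_hz, width, height, outputs_per_cycle=1):
    """Frame rate of a pipeline that emits a fixed number of pixels per clock."""
    return clock_hz * outputs_per_cycle / (width * height)


fps = frames_per_second(66.4e6, 1024, 1024)  # figures quoted in the abstract
print(f"{fps:.1f}")                          # prints "63.3"
```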

17.

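Design of a Frequency Multiplier Circuit for an Integrated Quartz Resonant Accelerometer

刘迪 田文杰 金鑫 陈福彬 《压电与声光》2021,43(5):636-639

A resonant accelerometer converts acceleration into a frequency signal. In accelerometer applications such as navigation and attitude control, signal acquisition must be completed within a short time. To meet this requirement, a frequency multiplier circuit scheme for an integrated quartz resonant accelerometer, based on a single-substrate integrated quartz resonator, is implemented on a field-programmable gate array (FPGA). The scheme comprises a clock-adaptive module and a phase-locked loop. The clock-adaptive module generates the reference clock for the phase-locked loop from the current input signal and multiplies the frequency of the input signal. Centrifuge acceleration tests show that when the measurement time is shortened from 1 s to 0.125 s, the sensor scale factor is 3173 Hz/g (g = 9.8 m/s²) and the linear correlation coefficient is R² = 0.99932. Compared with the case without frequency multiplication, the scale factor and linearity remain essentially unchanged. The designed frequency multiplier circuit can be applied to signal processing and data acquisition systems for quartz resonant accelerometers.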

18.

Automatic Design of Reconfigurable Domain-Specific Flexible Cores

Compton K. Hauck S. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(5):493-503

Reconfigurable hardware is ideal for use in systems-on-a-chip (SoC), as it provides both hardware-level performance and post-fabrication flexibility. However, any one architecture is rarely equally optimized for all applications. SoCs targeting a specific set of applications can greatly benefit from incorporating customized reconfigurable logic instead of generic field-programmable gate array (FPGA) logic. Unfortunately, manually designing a domain-specific architecture for every SoC would require significant design time. Instead, this paper discusses our initial efforts towards creating a reconfigurable hardware generator capable of automatically creating flexible, yet domain-specific, designs. Our tests indicate that our generated architectures are more than 5× smaller than equivalent FPGA implementations and nearly as area-efficient as standard cell designs. We also use a novel technique employing synthetic circuit generation to demonstrate the flexibility of our architecture generation techniques.

19.

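All-Digital Delay-Locked Loop and Its Applications (total citations: 4; self-citations: 0; citations by others: 4)

罗翔鲲 《电子工程师》2004,30(6):22-24,43

This paper introduces a delay-locked loop (DLL) circuit implemented entirely with digital circuits, as distinct from the phase-locked loop (PLL) and from the DLL based on a voltage-controlled delay line (VCDL). The circuit is used to eliminate clock delay. Its all-digital structure makes it unconditionally stable and free of accumulated phase error, and gives it good noise sensitivity, low power consumption, and good jitter performance. These properties give it an advantage in delay compensation and clock adjustment applications, and it can be fully embedded in a single chip. The paper analyzes the operating principle and structure of the all-digital DLL and presents its application in a field-programmable gate array (FPGA).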

20.

Overcurrent relay on a FPGA chip

Mahmoud A. Manzoul Prasad Modali 《Microelectronics Reliability》1995,35(7)

A new hardware approach for implementing overcurrent relays is presented in this paper. An overcurrent relay is implemented on a field programmable gate array (FPGA) chip (Xilinx's XC3020-50-PC84C). The hardware design of the overcurrent relay is based on a three-stage pipelined architecture. A relationship that describes the time-current characteristics of the relay in terms of the clock frequency of the chip is developed.