
1.

徐立升徐根倩马正欣宋早迪蒋秀波周冬冬秦智超《电讯技术》2013,53(12)

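To handle the high-speed serial data interface of the eight-channel sampler AD9252, a high-speed deserialization method based on FPGA timing constraints is proposed. A Xilinx FPGA receives the high-speed serial data, and the FPGA's internal clock management module (DCM), placement constraints, and the low-level tool PlanAhead are used to meet the data setup-time and hold-time requirements of high-speed serial-to-parallel conversion, so that the parallel data are output correctly. Finally, functional and timing tests verify the correctness of the design. The method is applicable to both high-end and low-end FPGAs, which improves the flexibility of system design and reduces system cost.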

2.

Delay-insensitive gate-level pipelining

S. C. R. F. J. S. M. D. 《Integration, the VLSI Journal》2001,30(2)

Gate-level pipelining (GLP) techniques are developed to design throughput-optimal delay-insensitive digital systems using NULL convention logic (NCL). Pipelined NCL systems consist of combinational, registration, and completion circuits implemented using threshold gates equipped with hysteresis behavior. NCL combinational circuits provide the desired processing behavior between asynchronous registers that regulate wavefront propagation. NCL completion logic detects completed DATA or NULL output sets from each register stage. GLP techniques cascade registration and completion elements to systematically partition a combinational circuit and allow controlled overlapping of input wavefronts. Both full-word and bit-wise completion strategies are applied progressively to select the optimal size grouping of operand and output data bits. To illustrate the methodology, GLP is applied to a case study of a 4-bit × 4-bit unsigned multiplier, yielding a speedup of 2.25 over the non-pipelined version, while maintaining delay insensitivity.

3.

Partitioning and pipelining for performance-constrained hardware/software systems

Bakshi S. Gajski D.D. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(4):419-432

In order to satisfy cost and performance requirements, digital signal processing and telecommunication systems are generally implemented with a combination of different components, from custom-designed chips to off-the-shelf processors. These components vary in their area, performance, programmability and so on, and the system functionality is partitioned amongst the components to best utilize this tradeoff. However, for performance critical designs, it is not sufficient to only implement the critical sections as custom-designed high-performance hardware, but it is also necessary to pipeline the system at several levels of granularity. We present a design flow and an algorithm to first allocate software and hardware components, and then partition and pipeline a throughput-constrained specification amongst the selected components. This is performed to best satisfy the throughput constraint at minimal application-specific integrated-circuit cost. Our ability to incorporate partitioning with pipelining at several levels of granularity enables us to attain high throughput designs, and also distinguishes this paper from previously proposed hardware/software partitioning algorithms.

4.

A mesochronous pipelining scheme for high-performance digital systems

Tatapudi S.B. Delgado-Frias J.G. 《IEEE transactions on circuits and systems. I, Regular papers》2006,53(5):1078-1088

A novel mesochronous pipelining scheme is described in this paper. In this scheme, data and clock travel together. At any given time a pipeline stage could be operating on more than one data wave. The clock period in the proposed pipeline scheme is determined by the pipeline stage with the largest difference between its minimum and maximum delays. This is a significant performance gain compared to a conventional pipeline scheme, where the clock period is determined by the stage with the largest delay. A detailed analysis of the clock period constraints is provided to show the performance gains and speedup of mesochronous pipelining over other pipelining schemes. Also, the number of pipeline stages and pipeline registers is small. The clock distribution scheme is simple in the mesochronous pipeline architecture. An 8 × 8-bit carry-save adder multiplier has been implemented in mesochronous pipeline architecture using modest TSMC 180-nm (drawn length 200 nm) CMOS technology. The multiplier architecture and simulation results are described in detail in this paper. The pipelined multiplier is able to operate on a clock period of 350 ps (2.86 GHz). This is a speedup of 1.7 times over a conventional pipeline scheme, with fewer pipeline stages and pipeline registers.
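The clock-period rule stated in this abstract can be illustrated with a small calculation. The following is only a sketch of that first-order rule: the stage delays are made-up numbers, and the single `overhead` term is an assumed simplification standing in for register setup time, skew and similar margins that the paper analyses in detail.

```python
# First-order comparison of clock-period bounds (illustrative numbers only).
# Each stage is a (d_min, d_max) pair of propagation delays in picoseconds.

def conventional_period(stages, overhead=0.0):
    """Conventional pipelining: the slowest stage sets the clock period."""
    return max(d_max for _, d_max in stages) + overhead

def mesochronous_period(stages, overhead=0.0):
    """Mesochronous pipelining: the largest delay spread (d_max - d_min) sets the period."""
    return max(d_max - d_min for d_min, d_max in stages) + overhead

# Hypothetical stage delays, not taken from the paper.
stages_ps = [(400, 550), (300, 600), (450, 580)]

t_conv = conventional_period(stages_ps)
t_meso = mesochronous_period(stages_ps)
print(f"conventional: {t_conv} ps, mesochronous: {t_meso} ps, ratio: {t_conv / t_meso:.2f}")
```

The sketch shows why balancing the minimum and maximum delays within each stage matters more under this scheme than shortening the longest path.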

5.

Cut-through switching, pipelining, and scheduling for network evacuation

Tassiulas L. 《Networking, IEEE/ACM Transactions on》1999,7(1):88-97

A general model of a virtual circuit network consisting of a number of servers and a number of traffic classes is considered. A traffic class is identified by the sequence of servers that should be visited and the corresponding service rates before a message (customer) of the class leaves the network. The following cases are distinguished: (1) the messages need nonpreemptive service; (2) the service of a message can be preempted at any time; (3) pipelining of the service in a sequence of servers is allowed; and (4) pipelining is not allowed. All of these cases arise in different transmission switching techniques and scheduling schemes. A fluid model that emerges when both preemption and pipelining are allowed is considered. Scheduling schemes in the fluid model are compared with corresponding ones in the network with nonpreemptive service and no pipelining. The problem of evacuating the network from an initial backlog without further arrival is identified in the fluid model. Based on that, a policy with nearly optimal evacuation time is identified for the store-and-forward case. Finally, scheduling with deadlines is considered and it is shown that in the fluid model, the evacuation problem is equivalent to a linear programming problem. The evacuation times under different work-conserving policies are considered in specific examples.

6.

A technique for the design of systolic arrays with bit-level pipelining

B. B. Madan S. R. Parker M. Zubair 《Circuits, Systems, and Signal Processing》1987,6(2):139-151

This paper describes a method for designing systolic structures with bit-level pipelining. The proposed technique starts with the signal flow graph representation of a given algorithm. A new signal flow graph rule, called the gain transfer rule, is introduced to achieve bit-level pipelining. Using this approach, systolic arrays with bit-level pipelining are derived for a general recursive digital filter and a convolver. The proposed technique is quite general and has also been applied to obtain systolic structures for other problems such as vector transformation. In comparison with some previously reported designs, the new architectures are characterized by simpler basic processing cells and faster data throughput rate or smaller chip area requirements. The work of the first two authors was supported by an NRC Resident Research Associateship.

7.

Fast multiplication in VLSI using wave pipelining techniques

Fabian Klass Michael J. Flynn Ad J. Van De Goor 《The Journal of VLSI Signal Processing》1994,7(3):233-248

Wave pipelining is a design methodology that can increase the clock frequency of digital systems. Also known as maximum-rate pipelining, it has long been considered a technique for approaching the physical speed limit of a digital circuit. Unlike conventional pipelining, wave pipelining does not require internal clocked elements to increase throughput. The synchronization of internal computations is achieved by balancing inherent RC delays of combinational logic elements, thus allowing circuits to be pipelined at a very fine-grain level. In this article, we describe the design of a 16×16 wave-pipelined multiplier using a 1.0 μm CMOS process. The multiplier is designed using a conventional static CMOS technology. Simulation results show a speedup of about 7× over a nonpipelined implementation.

8.

Some experiments about wave pipelining on FPGA's

Boemo E.I. Lopez-Buedo S. Meneses J.M. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1998,6(2):232-237

Wave pipelining offers a unique combination of high speed, low latency, and moderate power consumption. The construction of wave pipelines benefits from gates and buffers with data-independent delays and from knowledge of the interconnection delays. These two features are present in several SRAM-based field programmable gate arrays (FPGA's): look-up tables (LUT's) allow the designer to mask the delay of different gates and combinational functions, and the timing characteristics of each wire segment are known a priori. This work describes a set of experiments about wave pipelining on FPGA's. The results show that a 13-LUT logic depth circuit mapped on an XC4005PC84-6 runs as high as 85 MHz (single phase clocking) or 80 MHz (intentionally skewed clocking), exhibiting a latency of 95 ns. This high throughput/latency ratio is unattainable using classic pipelining.

9.

Self-timed pipelining using latest arriving signal detection

Jin-Ku Kang 《Electronics letters》2001,37(10):615-617

A self-timed pipelining methodology using latest arriving signal detection is presented. The self-timing control block in the algorithm consists of a self-timing signal generator and pipelining latches. The computation completion of a logic block can be detected and the data latched by the pulse-type self-timing signal for further processing. Using the algorithm, a 32-bit carry look-ahead adder is implemented. Simulation results show that the adder can operate at 800 MHz in 0.25 μm CMOS technology.

10.

A discussion of high-speed-rail LTE coverage optimization based on correlating timing advance with network structure

季安平《电信工程技术与标准化》2017,30(4)

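Combining test results from the LTE network along the high-speed rail line with network-management indicators, the coverage distance of the high-speed-rail base stations along the Anhui section of the Hefei-Nanjing high-speed railway is estimated from the timing advance (TA). Drive-test data, base-station coverage radius, and the site layout along the line are considered together to identify base stations with abnormal coverage and to quickly produce RF and power optimization recommendations. This makes network optimization more targeted and improves the efficiency of LTE network optimization.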
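A TA-based coverage-distance estimate of the kind this abstract describes can be sketched as follows. It assumes the TA values are reported in the standard LTE granularity of 16 Ts, with Ts = 1/(15000 × 2048) s, so one step corresponds to roughly 78 m of one-way distance. The cell names, planned radii, and the 1.5× margin are hypothetical and are not taken from the article.

```python
# Sketch: estimate cell coverage distance from LTE timing advance (TA).
# Assumes TA is reported in units of 16 * Ts, the standard LTE TA granularity.

C_M_PER_S = 299_792_458.0          # speed of light
TS_S = 1.0 / (15_000 * 2048)       # LTE basic time unit Ts
TA_STEP_S = 16 * TS_S              # one TA step, a round-trip time increment

def ta_to_distance_m(ta_index):
    """One-way distance for a TA index (round-trip delay halved)."""
    return ta_index * TA_STEP_S * C_M_PER_S / 2.0

def flag_abnormal_coverage(cells, margin=1.5):
    """Return cells whose TA-estimated reach exceeds margin * planned radius.

    cells: iterable of (name, ta_index, planned_radius_m); margin is a
    hypothetical threshold chosen here only for illustration.
    """
    flagged = []
    for name, ta_index, planned_radius_m in cells:
        estimate_m = ta_to_distance_m(ta_index)
        if estimate_m > margin * planned_radius_m:
            flagged.append((name, round(estimate_m)))
    return flagged

# Hypothetical sites along a rail line.
cells = [("site_A", 10, 800.0), ("site_B", 30, 900.0)]
print(flag_abnormal_coverage(cells))
```

In practice the TA index per cell would come from network-management statistics, and flagged cells would then be checked against drive-test data before any RF or power change is proposed.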

11.

A wireless MAC protocol using implicit pipelining

Xue Yang Vaidya N.H. 《Mobile Computing, IEEE Transactions on》2006,5(3):258-273

In distributed multiple access control protocols, two categories of overhead are usually associated with contention resolution. One is channel idle overhead, where all contending stations are waiting to transmit. Another is collision overhead, which occurs when multiple contending stations attempt to transmit simultaneously. If either the idle overhead or the collision overhead is large, the contention resolution algorithm is inefficient. Prior research work tries to minimize both the idle and the collision overheads using various methods. In this paper, we propose to apply "pipelining" techniques to the design of multiple access control protocols so that channel idle overhead can be (partially) hidden and the collision overhead can be reduced. While the concept of pipelined scheduling can be applied to various MAC protocol designs in general, in this paper, we focus on its application to IEEE 802.11 DCF. In particular, an implicitly pipelined dual-stage contention resolution MAC protocol (named DSCR) is proposed. With IEEE 802.11, the efficiency of contention resolution degrades dramatically with increasing load due to the high probability of collision. Using the implicit pipelining technique, DSCR hides the majority of channel idle time and reduces the collision probability, and hence significantly improves channel utilization, average access delay, and access energy cost over 802.11, both in wireless LANs and in multihop networks. The simulation results, as well as some analysis, are presented to demonstrate the effectiveness of DSCR.

12.

Improved clustered look-ahead pipelining algorithm with minimum order augmentation

Kyung Hi Chang 《Signal Processing, IEEE Transactions on》1997,45(10):2575-2579

The author compares the overall performance of clustered look-ahead (CLA) filters with minimum order augmentation and scattered look-ahead (SLA) filters. To optimize the domain searching procedure and to improve the numerical performance, an improved CLA algorithm with minimum order augmentation, which is especially beneficial for high-Q high-speed digital filters, is proposed. This algorithm simultaneously minimizes both the augmented pipelining order and undesirable quantization effects. The proposed CLA structure with minimum order augmentation turns out to be very suitable for applications that require high speed and numerical stability.

13.

Circuit power optimization using pipelining and dual-supply voltage assignment

《Integration, the VLSI Journal》2019

Power is one of the most important metrics in modern integrated circuit design. We optimize the circuit power using two major approaches, pipelining and dual-supply voltage (dual-V_dd) assignment. To improve power efficiency, we have designed a new pipelining scheme that reduces the number of gates that need to be assigned to the high supply voltage when combined with the dual-V_dd assignment. Our overall design is tested on a set of standard ISCAS-85 benchmark circuits using an industrial cell library. An average power saving of more than 10% under a specified target delay is observed.

14.

Determining the timing of chemical cleaning

辛小燕罗钫《洗净技术》2004,2(5):31-34

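To address the difficulty factories often face in deciding when to chemically clean equipment, methods for determining the cleaning time, together with their rationale, are proposed from the standpoints of equipment safety, normal production, and energy saving. They cover cleaning of newly installed equipment, periodic cleaning, timing based on the amount of scale, timing based on operating parameters, and economic considerations. The relevant calculation formulas are derived. On this basis, factories can obtain greater benefit from chemical cleaning.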

15.

Optimizing throughput and resource utilization using pipelining: Transformation based approach

Miodrag Potkonjak Jan Rabaey 《The Journal of VLSI Signal Processing》1994,8(2):117-130

A simple formulation of pipelining, "pipelining with N stages is equivalent to retiming where the number of delays on all inputs or all outputs, but not both, is increased by N," is used as the basis for a convenient and efficient treatment of pipelining in the design of application specific computers. Pipelining according to the objective function (throughput or resource utilization) and the latency is introduced. For two polynomial complexity pipelining classes, optimal algorithms are presented. For two other classes both proofs of NP-completeness and efficient probabilistic algorithms are presented. Both theoretical and experimental properties of pipelining are discussed and a relationship with other transformations is explored. Due to similar formulations for both software pipelining and the pipelining presented here, all results can be easily modified for use in compilers for general purpose computers. We have also developed a polynomial complexity algorithm for determining the iteration bound. This work was done while the first author was at the University of California, Berkeley.

16.

Efficient operator pipelining in a bit serial genetic algorithm engine

Bland I.M. Megson G.M. 《Electronics letters》1997,33(12):1026-1028

The authors propose a bit serial pipeline used to perform the genetic operators in a hardware genetic algorithm. The bit-serial nature of the dataflow allows the operators to be pipelined, resulting in an architecture which is area efficient, easily scaled and is independent of the lengths of the chromosomes. An FPGA implementation of the device achieves a throughput of >25 million genes per second.

17.

Macro pipelining based scheduling on high performance heterogeneous multiprocessor systems

Banerjee S. Hamada T. Chau P.M. Fellman R.D. 《Signal Processing, IEEE Transactions on》1995,43(6):1468-1484

Presents a technique for pipelining heterogeneous multiprocessor systems, macro pipelining based scheduling. The problem can be identified as a combination of optimal task/processor assignment to pipeline stages as well as a scheduling problem. The authors propose a new technique based on iterative applications of partitioning and scheduling schemes whereby the number of pipeline stages is identified and the scheduling problem is solved. The pipeline cycle is optimized in two steps. The first step finds a global coarse solution using the ratio cut partitioning technique. This is subsequently improved by the iterative architecture driven partitioning and the repartitioning and time axis relabeling techniques of the second step. The authors have considered a linear interprocessor communication cost model in scheduling. The proposed technique is applied to several examples. They find that for these examples, the proposed macro pipelining based scheduling can improve the throughput rate to several times that of the conventional homogeneous multiprocessor scheduling algorithms.

18.

Design and study of a pipelined DDS based on the CORDIC algorithm

江金浓谢扩军《现代电子技术》2012,35(22):104-106

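Based on an analysis of the principle of the CORDIC algorithm, a pipelined DDS architecture based on CORDIC is proposed to replace the traditional ROM look-up-table method. The input angle is pre-processed and the iteration result is post-processed, so that trigonometric functions are computed over the full period. The design is described in Verilog, compiled and synthesized in Quartus II 9.0, and simulated in ModelSim-Altera 6.4. The results show that, compared with the traditional method, the algorithm covers a wider range of angles, runs faster, and uses fewer resources.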
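The rotation-mode CORDIC iteration with angle pre-processing and sign post-processing can be modelled in software as below. This is a floating-point sketch under assumed choices (16 iterations, folding the angle into [-π/2, π/2]); it is not the authors' Verilog, which would use fixed-point arithmetic with one iteration per pipeline stage.

```python
import math

# Floating-point model of rotation-mode CORDIC; the iteration count is an assumption.
N_ITER = 16
ANGLES = [math.atan(2.0 ** -i) for i in range(N_ITER)]  # elementary rotation angles

# Scale factor that compensates for the gain of the N_ITER micro-rotations.
K = 1.0
for i in range(N_ITER):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_sin_cos(theta):
    """Return (sin, cos) of theta for any real angle in radians."""
    # Pre-processing: wrap into [-pi, pi), then fold into [-pi/2, pi/2],
    # which lies inside the CORDIC convergence range.
    theta = (theta + math.pi) % (2.0 * math.pi) - math.pi
    sign = 1.0
    if theta > math.pi / 2:
        theta -= math.pi
        sign = -1.0
    elif theta < -math.pi / 2:
        theta += math.pi
        sign = -1.0

    # Shift-and-add iterations; each would be one pipeline stage in hardware.
    x, y, z = K, 0.0, theta
    for i in range(N_ITER):
        d = 1.0 if z >= 0.0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]

    # Post-processing: undo the half-turn applied during folding.
    return sign * y, sign * x

s, c = cordic_sin_cos(2.0)
print(f"sin(2.0) ~ {s:.4f}, cos(2.0) ~ {c:.4f}")
```

In a DDS, the phase accumulator output would drive `theta`, and the sine output replaces the ROM look-up.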

19.

Design and implementation of a parallel FFT with an FPGA pipeline architecture

王英喆杜蓉《电子设计工程》2015,23(4)

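To meet the needs of real-time signal processing, a design for a 512-point pipelined fast Fourier transform (FFT) on an FPGA is proposed. Four butterfly units operate in parallel, and the design is implemented on a Xilinx Virtex-7 series FPGA. The processor combines the radix-2 and radix-4 algorithms; during butterfly computation the twiddle-factor input of the multiplier IP core is fixed to a constant, and intermediate results are buffered in FIFOs. The design is written in the hardware description language Verilog and then synthesized, placed, and routed. The test results agree with MATLAB simulation results.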
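Because 512 = 2 × 4^4, a 512-point transform cannot be built from radix-4 stages alone, which is one reason to combine radix-2 and radix-4 butterflies. The recursive software model below is only a sketch of that combination: it uses radix-4 decimation-in-time splits wherever the length is divisible by 4 and a radix-2 split otherwise. The function name and the recursive structure are illustrative assumptions and say nothing about how the paper's pipeline orders its stages.

```python
import cmath

def fft_mixed(x):
    """Mixed radix-4 / radix-2 decimation-in-time FFT for power-of-two lengths."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4 == 0:
        q = n // 4
        # Four interleaved sub-sequences, each transformed recursively.
        f0, f1, f2, f3 = (fft_mixed(x[r::4]) for r in range(4))
        out = [0j] * n
        for k in range(q):
            w = cmath.exp(-2j * cmath.pi * k / n)      # twiddle factor W_n^k
            a, b, c, d = f0[k], w * f1[k], w ** 2 * f2[k], w ** 3 * f3[k]
            # Radix-4 butterfly: only additions and multiplications by +/-1, +/-j.
            out[k] = a + b + c + d
            out[k + q] = a - 1j * b - c + 1j * d
            out[k + 2 * q] = a - b + c - d
            out[k + 3 * q] = a + 1j * b - c - 1j * d
        return out
    if n % 2 == 0:
        h = n // 2
        even, odd = fft_mixed(x[0::2]), fft_mixed(x[1::2])
        out = [0j] * n
        for k in range(h):
            t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
            # Radix-2 butterfly.
            out[k] = even[k] + t
            out[k + h] = even[k] - t
        return out
    raise ValueError("length must be a power of two")

# A 512-point input passes through four radix-4 levels and one radix-2 level.
spectrum = fft_mixed([1.0] + [0.0] * 511)
print(len(spectrum))
```

A hardware design would replace the recursion with a fixed chain of butterfly stages and would store the twiddle factors as constants.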

20.

Timing Recovery for Backplane Ethernet

Wei Zhang Spencer R.R. 《IEEE transactions on circuits and systems. I, Regular papers》2007,54(8):1711-1723

The dominant solutions for single-chip multi-port backplane Ethernet transceivers utilize a dual-loop design: a combination of a single master phase-locked loop (PLL) and multiple slave delay-locked loops (DLL). Each transmitter or receiver port has its own DLL, which delays or advances a copy of the master clock from the master PLL to generate its own clock signal for synchronization. The DLLs are typically implemented using current-mode logic phase interpolators. This paper presents an alternative solution to this synchronization problem. Instead of moving the sampling phase, timing recovery is done by changing the group delay of the receiver-side forward equalizer by rotating its tap coefficients. The standard least-mean-square algorithm is used for coefficient rotation. This solution is equivalent to a first-order PLL/DLL, which suffers from steady-state timing offset when there is a frequency offset between the transmitter and the receiver. However, the degradation in performance caused by a frequency offset is significantly reduced by using a coefficient-rotation digital-signal processor capable of detecting and reducing the offset. With a practical frequency accuracy specification of ±100 ppm, the improved performance can approach that of the PLL/DLL dual-loop solution.