Similar literature
 20 similar documents found (search time: 31 ms)
1.
Improvements in gallium arsenide materials technology have led to the rapid development of GaAs MIC, CCD, and digital IC technologies in the last several years. In this paper we consider the additional capabilities afforded by the inherent piezoelectric properties of GaAs. The primary emphasis of the work is on surface acoustic wave (SAW) device configurations using MESFET and Schottky-barrier diode fabrication techniques which are compatible with the eventual monolithic integration of electronic devices on the same substrate. The GaAs SAW technology described here provides a means for achieving electronically variable delay, high-Q resonator structures for VHF/UHF oscillator frequency control, and real-time signal processing operations such as convolution and correlation. Prototype device designs and performance are described, including two-port GaAs SAW resonators with Q's as large as 13 000 at 118 MHz and a programmable GaAs SAW PSK correlator capable of signal correlation at 10-MHz chip rates. Further GaAs SAW device development required for increasing the operating frequency range to 500 MHz and processing bandwidth to 100 MHz is indicated.

2.
An earlier study by this author has shown that traffic can be captured in multicore architectures with negligible overhead, thus introducing the topic of multicore parallelization into packet traffic analysis. Designs that involve locking or message passing are considered the default in parallel processing, but they are incompatible with packet traffic, with its high throughput and small grain. This paper proposes a new design for shared memory which can be used by any number of processes completely lock-free. An analysis of the processing overhead and a C/C++ implementation of the design are presented. Copyright © 2014 John Wiley & Sons, Ltd.

3.
4.
A detailed investigation of the switching characteristics for high frequency power devices based on different technologies has been provided: BJT, MOSFET, and MESFET/HEMT structures are considered. Advantages of GaAs power switches over silicon ones have been established and illustrated. Hybrid technology prototypes of DC-to-DC power converters operating above 10 MHz and exclusively using GaAs components have been realized: for a nonoptimized boost converter operating at 100 MHz, a power efficiency of 54% has been achieved with a 6 V/12 V conversion ratio and an output power of 1.5 W. For optimized prototypes, using high frequency assembly techniques, efficiencies of 80% at 50 MHz, 74% at 100 MHz and 60% at 250 MHz have been obtained with 6 V/12 V and 3 W

5.
New applications demand very high processing power when run on embedded systems. Very Long Instruction Word (VLIW) architectures have emerged as a promising alternative to provide such processing capabilities under the given energy budget. However, in these new VLIW-based architectures, the register file is a very critical contributor to the overall power consumption and new approaches have to be proposed to reduce its power while preserving system performance. In this paper, we propose a novel joint hardware-software approach that reduces the leakage energy in the register files of these embedded VLIW architectures. This approach relies upon an energy-aware register assignment method and a hardware support that creates sub-banks in the global register file that can be switched on/off at run time. Our results indicate energy savings in the register file, after considering the overhead of the added extra hardware, of up to 50% for modern multimedia embedded applications without performance degradation. We illustrate this approach using real-life applications running on these processors. We also illustrate the tradeoff between the area overhead and the gains in leakage energy for the different strategies.

6.
This survey paper reviews numerous high-level transformation techniques which can be applied at the algorithm or the architecture level to improve the performance of digital signal and image processing architectures and circuits implemented using VLSI technology. Successful design of VLSI signal and image processors requires careful selection of algorithms, architectures, implementation styles, and synthesis techniques. High-level transformations can play an important role in reducing silicon area or power at the same speed or in increasing the speed for the same area. These transformations can also increase the suitability of an algorithm for a particular architectural style. The transformation techniques reviewed in this paper include pipelining, parallel processing, retiming, unfolding, folding, look-ahead, relaxed look-ahead, associativity, distributivity, and reduction in strength.

7.
Modular division has important applications in public-key cryptosystems. It is the most complex and time-consuming operation in RSA and ECC, and its secure and efficient implementation greatly affects the security and performance of these cryptosystems. In this paper, a modular division algorithm with embedded error detection is proposed. Four computing types of ASIC implementation architectures (Type-8, Type-16, Type-32, Type-64) are explored to seek the optimal tradeoff among error detection ratio, time overhead and hardware overhead. These implementation architectures are modeled in Verilog and synthesized using Synopsys Design Compiler with the OSU 90 nm CMOS standard cell library. Experimental results show that the proposed Type-64 can achieve almost 100% error detection probability with an average of 24.71% extra area overhead and 0.52% time overhead. In addition, for the implementation of a single modular division module, the proposed Type-64 architecture saves 60.74% area overhead on average, with a slight decrease in throughput rate, compared with state-of-the-art research. This implementation not only greatly reduces the area overhead of modular division but also improves the security of the modular division implementation.

8.
This paper examines the design of a 32-b GaAs Fast RISC microprocessor (F-RISC/I). F-RISC/I is a single chip GaAs Heterojunction MESFET (HMESFET) processor targeted for implementation on a multichip module (MCM) together with cache memories. The CPU architecture, circuit design, implementation, and testing are optimized for a seven-stage instruction pipeline implemented with GaAs super-buffered FET logic (SBFL). We have been able to verify novel GaAs SBFL standard cells and compare measured CPU performance with performance estimates based on circuit and device models. The prototype 32-b microprocessor has been implemented using an automated standard cell approach because of time constraints and fabricated using an experimental process by Rockwell International. The CPU chip integrates 92340 transistors on a 7×7 mm² die and dissipates 6.13 W at 180 MHz. Test results from a prototype fabrication run have demonstrated the operation of the ALU, the program counter, and the register file with delays below 6, 5, and 3.4 ns, respectively. The successful modeling and verification indicate that a 0.5 μm HMESFET implementation of F-RISC/I could achieve a peak performance of 350 MHz. The wiring delays account for 42% of the critical path delay

9.
The evolution of modern communications satellites from the older cable-in-the-sky concept toward more intelligent architectures, exploiting onboard processing (OBP) techniques involving various technologies to improve the system performance and flexibility, is addressed. The key components in advanced communications satellite repeaters compatible with integrated optical device implementation are presented. The relevant device technologies and fabrication techniques are examined, and integrated optical circuit configurations that can be applied to OBP are described with reference to recent experimental data. Particular attention is given to optical beamforming networks. Areas for further research and development are suggested

10.
We present parallel algorithms and array architectures for pyramid vector quantization (PVQ) [1] for use in image coding in low-power wireless systems. PVQ presents an alternative to other quantization methods which is especially suitable for symmetric peer-to-peer communications like video-conferencing. But, both the encoding and decoding algorithms have data-dependent iteration bounds and data-dependent dependencies which prevent efficient parallelization of the algorithms for either hardware or software implementations. We perform an algorithmic transformation [2] to convert the data-dependent regular algorithms to equivalent data-independent algorithms. The resulting regular algorithms exhibit modular and regular structures with minimal control overhead; hence, they are well suited for VLSI array implementation in ASIC or FPGA technologies. Based on our parallel algorithms and systematic design methodologies [3], we develop linear array architectures. Both encoder and decoder architectures consist of L identical processors with local interconnections and provide O(L) speed-up over a sequential implementation, where L is the dimension of a vector. The architectures achieve 100% processor utilization and permit power savings through early completion. A combined encoder-decoder architecture is also presented.

11.
This article presents PAPRICA-3, a VLSI-oriented architecture for real-time processing of images and its implementation on HACRE, a high-speed, cascadable, 32-processor VLSI slice. The architecture is based on an array of programmable processing elements with the instruction set tailored to image processing, mathematical morphology, and neural network emulation. Dedicated hardware features allow simultaneous image acquisition, processing, neural network emulation, and a straightforward interface with a hosting PC. HACRE has been fabricated and successfully tested at a clock frequency of 50 MHz. A board hosting up to four chips and providing a 33 MHz PCI interface has been manufactured and used to build BEATRIX, a system for the recognition of handwritten check amounts, by integrating image processing and neural network algorithms (on the board) with context analysis techniques (on the hosting PC).

12.
The High Efficiency Video Coding (HEVC) video codec applies different techniques in order to achieve high compression ratios and video quality that supports real-time applications. One of the critical techniques in HEVC is Context Adaptive Binary Arithmetic Coding (CABAC), which is a type of entropy coding. CABAC comes at the cost of increased computational complexity, especially for the parallelization and pipelining of these blocks: binarization, context modeling and binary arithmetic encoding. The Binarization (BZ) and de-Binarization (DBZ) methods are considered important techniques in the HEVC CABAC encoder and decoder, respectively. Indeed, an important goal is to get high throughput in hardware architectures of CABAC BZ and DBZ in order to support high-resolution applications. This work is the only one found in the recent literature which focuses on the design and implementation of full BZ and full DBZ compatible with H.265 and H.264. Consequently, hardware architectures for BZ and DBZ are designed and implemented in VHDL, targeting an FPGA Virtex-4 xc4vsx25-12ff668 board, and emulated with ModelSim. As a result, the implementations of BZ and DBZ can process 2 bins/cycle for each syntax element when operated at 697.83 MHz and 789.26 MHz, respectively. The proposed designs exhibit an improved throughput of 1395.66 Mbins/s for BZ and 1578.52 Mbins/s for DBZ. The area efficiencies obtained for the proposed BZ and DBZ are about 0.544 Mbins/s/slice and 0.606 Mbins/s/slice, respectively, which is better than many recent works.
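
The throughput figures quoted in this abstract are consistent with multiplying the bins processed per cycle by the clock frequency. The short check below is only a sketch of that arithmetic; the function name `throughput_mbins_per_s` is hypothetical and does not come from the paper.

```python
def throughput_mbins_per_s(bins_per_cycle: int, clock_mhz: float) -> float:
    """Throughput in Mbins/s, assuming a fixed number of bins every clock cycle."""
    return bins_per_cycle * clock_mhz


# Figures taken from the abstract: 2 bins/cycle at the stated clock rates.
bz_throughput = throughput_mbins_per_s(2, 697.83)
dbz_throughput = throughput_mbins_per_s(2, 789.26)

print(f"BZ:  {bz_throughput:.2f} Mbins/s")   # BZ:  1395.66 Mbins/s
print(f"DBZ: {dbz_throughput:.2f} Mbins/s")  # DBZ: 1578.52 Mbins/s
```

This assumes the designs sustain 2 bins on every cycle, which is a peak figure rather than a measured average.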

13.
Smart antennas in software radio base stations   Total citations: 1 (self-citations: 0, citations by others: 1)
The application of adaptive antenna techniques to fixed-architecture base stations has been shown to offer wide-ranging benefits, including interference rejection capabilities or increased coverage and spectral efficiency. Unfortunately, the actual implementation of these techniques to mobile communication scenarios has traditionally been set back by two fundamental reasons. On one hand, the lack of flexibility of current transceiver architectures does not allow for the introduction of advanced add-on functionalities. On the other hand, the often oversimplified models for the spatiotemporal characteristics of the radio communications channel generally give rise to performance predictions that are, in practice, too optimistic. The advent of software radio architectures represents a big step toward the introduction of advanced receive/transmit capabilities. Thanks to their inherent flexibility and robustness, software radio architectures are the appropriate enabling technology for the implementation of array processing techniques. Moreover, given the exponential progression of communication standards in coexistence and their constant evolution, software reconfigurability will probably soon become the only cost-efficient alternative for the transceiver upgrade. This article analyzes the requirements for the introduction of software radio techniques and array processing architectures in multistandard scenarios. It basically summarizes the conclusions and results obtained within the ACTS project SUNBEAM, proposing algorithms and analyzing the feasibility of implementation of innovative and software-reconfigurable array processing architectures in multistandard settings

14.
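Chip implementation of a 2.5 Gb/s SDH/SONET transport overhead processor   Total citations: 1 (self-citations: 1, citations by others: 0)
A transport overhead processor chip for 2.5 Gb/s SDH/SONET synchronous optical networks is designed. Using a bidirectional four-lane bus pipeline structure and a 77.76 MHz system clock, the chip processes 2.5 Gb/s SDH/SONET data in real time. It supports regenerator section overhead and multiplex section overhead processing for STM-16, four-channel STM-4, and STM-1, as well as section overhead and line overhead processing for STS-48, four-channel STS-12, and STS-3. The chip was taped out in a TSMC 0.13 μm process, has a circuit scale of about 480,000 gates, and its technical specifications conform to the ITU-T standards.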
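
The 77.76 MHz system clock quoted in this abstract can be related to the 2.5 Gb/s line rate with a little arithmetic. The sketch below uses the standard STM-1 rate of 155.52 Mb/s; the assumption that the four bus lanes are of equal width and together carry the full line rate is ours, not a statement from the paper, and all names are hypothetical.

```python
STM1_KBPS = 155_520   # standard STM-1 line rate, in kb/s
CLOCK_KHZ = 77_760    # system clock from the abstract, in kHz
LANES = 4             # four-lane bus from the abstract


def stm_rate_kbps(n: int) -> int:
    """Line rate of STM-n in kb/s (n times the STM-1 rate)."""
    return n * STM1_KBPS


# Bits that must be handled per clock cycle to keep up with STM-16.
bits_per_clock = stm_rate_kbps(16) // CLOCK_KHZ
# Assumption: equal-width lanes that together carry the whole line rate.
bits_per_lane = bits_per_clock // LANES

print(stm_rate_kbps(16))  # 2488320
print(bits_per_clock)     # 32
print(bits_per_lane)      # 8
```

Under these assumptions the data path would be 32 bits wide, or one byte per lane per clock cycle.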

15.
Emerging applications for portable wireless voice and data communications systems are requiring increased data rates and functionality. Meeting cost and performance goals requires careful attention to system level design and partitioning such that appropriate technologies are employed in cost-effective solutions. New circuit designs and techniques are required to meet size, power, and regulatory restrictions. This provides an exciting opportunity for GaAs, silicon, and passive component technologies. This review paper will discuss factors influencing the choice of which technology is best suited to a particular application and present several system level architectures of radio-based communication systems. The paper will illustrate appropriate applications of GaAs, silicon, and passive integrated circuit technologies. A summary is given that highlights the relative strengths and weaknesses of each technology to date

16.
This paper describes an application in high-performance signal processing using reconfigurable computing engines: a 250 MHz cross correlator for radio astronomy. Experimental results indicate that complementary metal-oxide-semiconductor (CMOS) field programmable gate arrays (FPGAs) can perform useful computation at 250 MHz. The notion of an “event horizon” for FPGAs leads to clear design constraints for high-speed application developers, and can be applied to a variety of real-time signal processing algorithms. Recent estimates indicate that higher performance FPGAs available early in 1998 can attain speeds of over 300 MHz using 20% fewer logic elements than current designs. The results of this design work provide important clues on how to improve FPGA architectures for signal processing at hundreds of MHz. Direct routing channels between logic elements can significantly increase performance. Routing architectures with four-way symmetry allow for rotations and reflections of subcircuits needed for optimal packing density. Experimental results indicate that clock buffering often limits the top speed of the FPGA. Wave pipelining of the clock distribution network may improve FPGA performance

17.
This article describes a method for increasing the sampling rate of efficient polyphase arbitrary resampling FIR filters. An FPGA proof of concept prototype of this architecture has been implemented in a Xilinx Kintex-7 FPGA which is able to convert the sampling rate of a signal from 500 MHz to 600 MHz. This article compares this new architecture with other best known efficient resampling architectures implemented on the same FPGA. The area usage on the FPGA shows that our proposed implementation is very proficient in high bandwidth applications without requiring significantly more resources on the FPGA. A theoretical calculation of the resampling error introduced on a modulated data stream is provided to evaluate the new architecture against other existing resampling architectures.
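
For orientation, the 500 MHz to 600 MHz conversion mentioned in this abstract corresponds to a rate ratio of 6/5. The sketch below shows the index and phase bookkeeping of a generic textbook rational polyphase resampler; it is not the architecture proposed in the article, and the function name `phase_schedule` is hypothetical.

```python
from fractions import Fraction


def phase_schedule(f_in_mhz: int, f_out_mhz: int, n_outputs: int):
    """For each output sample, return (input sample index, polyphase branch).

    Generic rational resampling by L/M: output n draws on input sample
    floor(n*M/L) and filter phase (n*M) mod L.
    """
    ratio = Fraction(f_out_mhz, f_in_mhz)      # reduced to lowest terms
    up, down = ratio.numerator, ratio.denominator
    return [((n * down) // up, (n * down) % up) for n in range(n_outputs)]


schedule = phase_schedule(500, 600, 7)
print(schedule)
# [(0, 0), (0, 5), (1, 4), (2, 3), (3, 2), (4, 1), (5, 0)]
```

The schedule repeats every 6 outputs, during which 5 input samples are consumed, matching the 6/5 ratio.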

18.
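Liu Yongpeng, Wang Feng, Lu Kai, Liu Yongyan. Acta Electronica Sinica (电子学报), 2012, 40(2): 223-229
In large-scale parallel computing systems, parallel checkpointing triggers a large number of nodes to save their computation state at the same time, causing a huge file storage overhead as well as heavy access pressure on the communication and storage systems. Data compression can shrink the checkpoint files, thereby reducing the storage overhead and the access pressure on the communication and storage systems. However, it also introduces additional computation overhead for the compression itself. For heterogeneous parallel computing systems, this paper proposes a pipelined parallel compressed checkpointing technique that uses a series of optimizations to reduce the computation latency introduced by compression, including a pipelined dual write-buffer queue, merging of file write operations, a GPU-accelerated pipelined compression algorithm, and multi-process scheduling of GPU resources. The paper describes the implementation of this technique on the Tianhe-1 system and presents a comprehensive evaluation of the resulting checkpoint system. Experimental data show that the method is feasible, efficient, and practical in large-scale heterogeneous parallel computing systems.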

19.
In the context of motion estimation for video sequence processing, variable block size algorithms, like the Adaptive Block Matching Algorithm (ABMA), have been proposed to better match “objects in motion” compared to the classical BMA. However, the variable block size grid derivation and the related motion estimation rely on a regularization process which implies heavy iterative and inter-dependent computations. Though the parallelization of the BMA is straightforward, the ABMA needs a deeper analysis before its implementation in a distributed environment: this is the goal of this paper. We first design a model of the ABMA motion estimation. This model can lead to several different distributed versions. A specific distributed model, with one master and several slaves, is then described. An implementation of this model has been realized, and experiments demonstrate a linear speedup with respect to the number of processors.

20.
This paper describes the design and implementation of a massively parallel processor based on the matrix architecture, which is suitable for portable multimedia applications. The proposed architecture achieves a high performance of 40 GOPS for consecutive fixed-point 16-bit additions at a 200 MHz clock frequency and a small power dissipation of 250 mW. In addition, 1 Mbit of SRAM for data registers and 2048 2-bit-grained processing elements connected by a flexible switching network are integrated in a small area of 3.1 mm² in 90 nm CMOS low standby technology. The design techniques and architectures described in this paper are attractive for realizing area-efficient, energy-efficient, and high-performance multimedia processors
