Similar literature
Found 20 similar articles (search time: 15 ms)
1.
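An efficient hardware architecture design and implementation for the AVS interpolation algorithm   Cited by: 2 (self-citations: 0; cited by others: 2)
This paper proposes an FPGA/ASIC-oriented hardware architecture for the inter-frame motion-compensation interpolation algorithm in an AVS decoding system. The structure of each functional unit in the interpolation process is described, and simulation results and hardware size are given. The results show that the proposed architecture supports real-time decoding of 720×576, 4:2:0, 30 fps video at a minimum operating frequency of 54 MHz, making it an efficient parallel VLSI architecture suitable for integration.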
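The real-time claim above can be turned into a rough clock-cycle budget. The sketch below is only arithmetic on the figures quoted in the abstract (720×576, 4:2:0, 30 fps, 54 MHz); the 16×16 macroblock size and the function name `cycle_budget` are assumptions made for illustration and say nothing about how the paper's architecture actually spends those cycles.

```python
def cycle_budget(width, height, fps, clock_hz, mb_size=16):
    """Return clock-cycle budgets per macroblock and per sample for real-time decoding.

    mb_size=16 is an assumed macroblock size, not a figure from the abstract.
    """
    mbs_per_frame = (width // mb_size) * (height // mb_size)
    mbs_per_second = mbs_per_frame * fps
    luma_per_second = width * height * fps
    # 4:2:0 sampling: two chroma planes, each a quarter of the luma plane,
    # so the total sample count is 1.5x the luma count.
    samples_per_second = luma_per_second * 3 // 2
    return {
        "macroblocks_per_frame": mbs_per_frame,
        "cycles_per_macroblock": clock_hz / mbs_per_second,
        "cycles_per_sample": clock_hz / samples_per_second,
    }


# Figures taken from the abstract: 720x576, 30 fps, 54 MHz.
budget = cycle_budget(720, 576, 30, 54_000_000)
print(budget)
```

Under these assumptions the interpolator has roughly 1,100 cycles per macroblock, which is the kind of constraint that motivates a parallel structure.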

2.
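A [...] for offline handwritten Chinese character recognition   Cited by: 1 (self-citations: 0; cited by others: 1)
For offline handwritten Chinese character recognition, speed has always been an important issue. Matching the feature vector of the input character against the massive set of templates in the sample library takes up much of the recognition time, about 4/5 of the system's recognition time, and is the bottleneck in improving recognition speed. This paper studies a parallel processing architecture based on a multi-stage pipeline that uses multiple groups of computing units, multi-bank memory, and multiple execution units, making full use of hardware characteristics to perform template matching in parallel. On this basis, a dedicated chip for handwritten Chinese character recognition was designed, logic simulation was carried out, and the design was implemented on an FPGA. The results show that this approach greatly increases recognition speed.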

3.
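Gao Yanmin. 《微电子学》 (Microelectronics), 1992, 22(4): 31-34
This paper introduces the software and hardware support environment of an FPGA development system, the latest tool for ASIC design automation; gives an overview of FPGAs, their features and basic structure, and the FPGA device families and their operating frequencies; explains how to design ASIC circuits on a microcomputer-based FPGA development system; and finally presents the flow of a design example.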

4.
This paper proposes a new quantitative and systematic design methodology for high-speed interpolation/averaging ADCs. The methodology consists of a new mathematical BW/gain model derived from the pre-amp array static model and a new small-signal model, a new offset modeling mechanism that can accurately estimate and help to reduce the offset, and hence a new systematic design flow. The methodology enables a quantitative and systematic analysis, conducts iterative and accurate calculation realized in MATLAB, and finally leads to an optimized ADC design that reaches a guaranteed optimum full-chip performance with given specs (i.e., resolution, speed, input range, power, input CM, etc.). The methodology shows a significant advantage over the traditional trial-and-error ADC design approach with respect to performance and design efficiency, and is much more reliable in nanometer technologies. The proposed methodology was fully validated in silicon: a 4-bit 5 GSps interpolation/averaging ADC was fabricated in a 65 nm CMOS technology, and the measurement results show that the ADC achieves an outstanding overall performance, comparable to that of reported state-of-the-art ADCs.

5.
Wireless Networks - In this paper, an extended continuous class-F power amplifier (PA) is investigated, designed, and fabricated. The new auxiliary parameter $$left(beta +alpha...

6.
The fabrication cost of application-specific integrated circuits (ASICs) is rising exponentially in the deep submicron region due to rapidly rising non-recurring engineering cost. Field programmable gate arrays (FPGAs) provide an attractive alternative to ASICs but consume an order of magnitude more power. There is a need to explore ways of reducing FPGA power consumption so that FPGAs can also be employed in ultra low power (ULP) applications instead of ASICs. The subthreshold region of operation is an ideal choice for ULP low-throughput FPGAs. The routing of an FPGA consumes most of the chip area and primarily determines the circuit delay and power consumption. There is a need to design moderate-speed ULP routing switches for subthreshold FPGAs. This article proposes a novel subthreshold FPGA routing switch box (SB) that utilises the leakage voltage through a transistor as the biasing voltage and shows 69%, 61.2% and 30% improvement in delay, power delay product and delay variation, respectively, over a conventional routing SB.

7.
Recent studies have shown that the On-Chip Interconnect (OCI) architecture is one of the most important components determining the overall performance of future Systems-on-Chip (SoCs). In order to improve the performance of a specific SoC application domain, the OCI architecture must be optimized at design/run time. Different OCI-based architectures have been proposed recently; the most recent ones are fractal-based or self-similar topologies. In this paper, we present a customization approach that adds strategic links targeted to match large application workloads. Simulation results show that this method achieves better performance than the basic OCI architectures. Furthermore, fractal-based OCIs perform well in almost all traffic patterns because of their attractive properties.

8.
9.
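A system scheme for a digital wireless endoscope is introduced, and an FPGA verification system and verification flow for its mixed-signal application-specific chip are proposed and implemented. To support system-level low-power design of the chip, the verification system was used to measure the energy consumption of the in-body hardware.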

10.
CMOS for mixed-mode applications has gained much interest recently. While the International Technology Roadmap for Semiconductors provides two different scaling guidelines for analog and digital circuit operation using the bulk MOSFET, there are no well-defined scaling guidelines for improving the analog performance of silicon-on-insulator (SOI) MOSFETs. This paper presents a systematic and quantitative comparison between the analog characteristics of bulk and SOI technology. The intrinsic gain, f_T, and the g_m/I_ds ratio are used as metrics for this comparison. It is shown that, even for operating frequencies in the gigahertz range (where the ac kink effect is suppressed), the analog performance of SOI devices is inferior to that of bulk devices due to capacitive drain-to-body coupling. Based on our study, we show that gate-workfunction engineering (close to mid-gap workfunction) is essential in fully depleted SOI (FDSOI) devices for improving analog performance. The analog performance of partially depleted SOI (PDSOI) devices can be improved by using body-tied structures. Increased gate control in double-gate MOSFETs can provide very high output resistance for short-channel devices.

11.
12.
This paper presents a methodology that can be used to implement any decimator symmetric/antisymmetric (S/A) finite impulse response (FIR) filter. Two variants are developed: one based on classic distributed arithmetic (CDA) and one based on modified distributed arithmetic (MDA). Both exploit the polyphase structure and the symmetry/antisymmetry of the filter and are evaluated in terms of area efficiency, speed and power consumption. The choice of algorithm depends on the performance metrics targeted. The methodology has been applied to implement the CDF9/7 filter bank, which constitutes a one-dimensional (1D), one-level discrete wavelet transform (DWT). The filter bank, also known as the bior4.4 biorthogonal wavelet, is recommended by the JPEG2000 standard for lossy compression of images and video. The architecture has been implemented on an Altera field programmable gate array (FPGA), and the simulations were run in Matlab, Modelsim and Altera Quartus II. The results demonstrate the efficiency of the algorithms and show the tradeoff between area occupied, throughput and power consumption.
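As a small illustration of the two properties the abstract says are exploited, the sketch below computes only the outputs a decimator keeps and folds mirrored taps so that each coefficient pair costs one multiplication. It is a plain multiply-accumulate model, not distributed arithmetic and not the paper's architecture; the function names, the zero initial state, and the `sign` parameter (+1 for symmetric, -1 for antisymmetric taps) are assumptions made for this example.

```python
def fir_direct(x, h):
    """Reference FIR: y[n] = sum_k h[k] * x[n-k], with zero initial state."""
    out = []
    for n in range(len(x)):
        acc = 0
        for k, coeff in enumerate(h):
            if 0 <= n - k < len(x):
                acc += coeff * x[n - k]
        out.append(acc)
    return out


def decimate_folded(x, h, m=2, sign=1):
    """Decimate-by-m FIR that evaluates only the retained outputs.

    Mirrored taps share one multiplication: with h[L-1-k] == sign * h[k],
    the pair contributes h[k] * (x[n-k] + sign * x[n-(L-1-k)]).
    """
    length = len(h)
    half = length // 2

    def sample(i):
        # Samples outside the input are treated as zero.
        return x[i] if 0 <= i < len(x) else 0

    out = []
    for n in range(0, len(x), m):  # skip the outputs the decimator discards
        acc = 0
        for k in range(half):
            acc += h[k] * (sample(n - k) + sign * sample(n - (length - 1 - k)))
        if length % 2:  # odd length: the middle tap has no mirror partner
            acc += h[half] * sample(n - half)
        out.append(acc)
    return out
```

For example, `decimate_folded(x, h, 2)` should match `fir_direct(x, h)[::2]` for any symmetric `h`, while using about half the multiplications per retained output.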

13.
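Ma Xiaojun, Tong Jiarong. 《微电子学》 (Microelectronics), 2004, 34(3): 326-329, 333
Based on the application characteristics of FPGA chips, a boundary-scan circuit was designed and applied in a self-developed new FPGA architecture. The circuit focuses on board-level test functionality while also supporting testing of chip functions; a device programming function is included as well. A single-flip-flop chain register technique is used in the circuit design to save chip area. The layout uses a 0.6 μm standard CMOS process, and the circuit was embedded in an FPGA chip and taped out. The circuit provides test and programming functions and complies with the IEEE 1149.1 boundary-scan standard; test results meet the design requirements.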

14.
15.
Exploiting specific properties of the algorithm, a high-throughput pipelined architecture is introduced to implement the H.264/AVC deblocking filter. The architecture was synthesized in 0.18 μm technology; the clock frequency and area are 400 MHz and 16.8 Kgates, respectively. It can filter 217 and 55 frames per second (fps) for Full-HD and Ultra-HD video, respectively. The introduced architecture outperforms similar ones in terms of frequency (1.8× up to 4×), throughput (1.5× up to 3.8×), and fps. Moreover, extensions to support different sample bit-depths and chroma formats are included, and experimental results for different FPGA families are reported.
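The two frame rates quoted above can be cross-checked against the 400 MHz clock. The sketch below assumes Full HD means 1920×1080 and Ultra HD means 3840×2160 (the abstract does not give resolutions), 16×16 macroblocks, and frame heights padded up to a whole number of macroblocks; the helper names are hypothetical.

```python
import math


def macroblocks_per_frame(width, height, mb_size=16):
    """Number of 16x16 macroblocks, rounding partial rows/columns up."""
    return math.ceil(width / mb_size) * math.ceil(height / mb_size)


def cycles_per_macroblock(clock_hz, fps, width, height):
    """Clock cycles available per macroblock at the given frame rate."""
    return clock_hz / (fps * macroblocks_per_frame(width, height))


# Resolutions are assumptions; clock and frame rates come from the abstract.
full_hd = cycles_per_macroblock(400_000_000, 217, 1920, 1080)
ultra_hd = cycles_per_macroblock(400_000_000, 55, 3840, 2160)
print(round(full_hd, 1), round(ultra_hd, 1))
```

Under these assumptions both figures land at roughly 225 cycles per macroblock, so the two quoted frame rates are consistent with a fixed per-macroblock processing time.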

16.
This paper describes an architecture for a high-performance switching fabric that can accommodate circuit-switched and packet-switched traffic in a unified manner. The switch fabric is self-routeing and uses fixed-length minipackets within the switching fabric for all types of connections. Its kernel architecture is based on a routeing topology with individual connection paths from all inputs to all outputs and with FIFO queueing at each output. Owing to the disjoint connection paths, there is no internal blocking, and because of output queueing, output port blocking is prevented to a great extent. The uniformity in architecture allows construction of a fabric of any size from a single basic module, which could be realized on a single chip. Larger configurations can be realized as either single-stage or multistage configurations. The second part of this paper discusses performance aspects and gives results and dimensioning guidelines for both circuit-switched and packet-switched traffic.

17.
In this paper, we propose a methodology for accelerating application segments by partitioning them between reconfigurable hardware blocks of different granularity. Critical parts are sped up on the coarse-grain reconfigurable hardware to meet the timing requirements of application code mapped on the reconfigurable logic. The reconfigurable processing units are embedded in a generic hybrid system architecture which can model a large number of existing heterogeneous reconfigurable platforms. The fine-grain reconfigurable logic is realized by an FPGA unit, while the coarse-grain reconfigurable hardware is realized by our high-performance data-path. The methodology consists of three main stages: the analysis, the mapping of the application parts onto fine- and coarse-grain reconfigurable hardware, and the partitioning engine. A prototype software framework realizes the partitioning flow. In this work, the methodology is validated using five real-life applications. Analytical partitioning experiments show that the speedup relative to the all-FPGA mapping solution ranges from 1.5 to 4.0, while the specified timing constraints are satisfied for all the applications.

18.
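This paper describes a design for an FPGA-based digital phase detection circuit. The circuit also provides digital frequency measurement; because the frequency measurement circuit uses automatic range switching, it offers a wide measurement range, high accuracy, and good stability, and is highly practical.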

19.
Many useful DSP algorithms have high dimensions and complex logic. Consequently, an efficient implementation of these algorithms on parallel processor arrays must involve a structured design methodology. Full-search block-matching motion estimation is one of those algorithms that can be developed using parallel processor arrays. In this paper, we present a hierarchical design methodology for full-search block-matching motion estimation. Our proposed methodology reduces the complexity of the algorithm into simpler steps and then explores the different possible design options at each step. Input data timing restrictions are taken into consideration, as well as buffering requirements. A designer is able to modify system performance by selecting some of the algorithm variables for pipelining or broadcasting. Our proposed design strategy also allows the designer to study the time and hardware complexities of computations at each level of the hierarchy. The resultant architecture allows easy modifications to the organization of data buffers and processing elements (their number, datapath pipelining, and complexity) to produce a system whose performance matches the video data sample rate requirements.
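For readers unfamiliar with the algorithm being mapped, a software reference for full-search block matching is sketched below. It uses the sum of absolute differences (SAD) as the matching cost, which is a common choice but an assumption here, since the abstract does not name the cost function; the function names and the frame representation (lists of rows) are likewise illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement in [-search_range, search_range]^2 and return
    (best_cost, dx, dy). Candidates falling outside `ref` are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = by + dy, bx + dx
            if ry < 0 or rx < 0 or ry + n > height or rx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The four nested loops (two over displacements, two over block pixels) are the dimensions that a processor-array design can choose to pipeline or broadcast.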

20.
A novel VLSI (Very Large Scale Integration) methodology based on the hierarchical design of computational and system blocks is presented. The underlying algorithms are shown to optimise the area-time complexity (AT^2) both of the computational units and at the system design level. The technique is illustrated for matrix-matrix multiplication using an image processing window convolver. This paper describes the performance of the recursive design technique, compares it to a typical systolic array, and demonstrates how data word size and convolution size may be expanded by moving up the architectural hierarchy. A prototype CAD (Computer Aided Design) autolayout program is described which maps directly into the hierarchical design environment. Using such design aids, flexible and correct designs may be generated which offer very simple data flow and highly local interconnection, with high performance.
