Similar literature
1.
《Real》2001,7(2):203-217
This paper presents a VLSI architecture to implement the forward and inverse two-dimensional Discrete Wavelet Transform (DWT), to compress medical images for storage and retrieval. Lossless compression is usually required in the medical image field. The word length required for lossless compression makes the area cost of the architectures that appear in the literature too expensive. Thus, there is a clear need for designing a cost-effective architecture to implement the lossless compression of medical images using DWT. The data path word length has been selected to ensure the lossless accuracy criteria, leading to a high-speed implementation with small chip area. The pyramid algorithm is reorganized and the algorithm locality is improved in order to obtain an efficient hardware implementation. The result is a pipelined architecture that supports single-chip implementation in VLSI technology. The implementation employs only one multiplier and 352 memory elements to compute all scales, which results in a considerably smaller chip area (45 mm²) than former implementations. The hardware design has been captured by means of the VHDL language and simulated on data taken from random images. Implemented in a 0.7 μm technology, it can compute both the forward and inverse DWT at a rate of 3.5 512×512 12-bit images/s, corresponding to a clock speed of 33 MHz. This chip is the core of a PCI board that will speed up the DWT computation on desktop computers.

2.
Corner detection is a low-level feature detection operator that is of great use in image processing applications, for example, optical flow and structure from motion by image correspondence. The detection of corners is a computationally intensive operation. Past implementations of corner detection techniques have been restricted to software. In this paper we propose an efficient very large-scale integration (VLSI) architecture for detection of corners in images. The corner detection technique is based on the half-edge concept and the first directional derivative of Gaussian. Apart from the location of the corner points, the algorithm also computes the corner orientation and the corner angle and outputs the edge map of the image. The symmetrical properties of the masks are utilized to reduce the number of convolutions effectively, from eight to two. Therefore, the number of multiplications required per pixel is reduced from 1800 to 392. Thus, the proposed architecture yields a speed-up factor of 4.6 over conventional convolution architectures. The architecture uses the principles of pipelining and parallelism and can be implemented in VLSI.
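
The speed-up factor quoted in this abstract is consistent with the ratio of the two per-pixel multiplication counts. A minimal arithmetic check, assuming the speed-up is simply that ratio (the variable names are illustrative and not taken from the paper):

```python
# Multiplications per pixel, as quoted in the abstract.
mults_conventional = 1800  # conventional convolution architecture
mults_proposed = 392       # after exploiting the symmetry of the masks

# Ratio of the two counts; the abstract rounds this to 4.6.
speedup = mults_conventional / mults_proposed
print(f"speed-up = {speedup:.2f}")
```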

3.
This contribution focuses on different topics that are covered by the special issue titled “Real-Time Motion Estimation for image and video processing applications” and which incorporate GPUs, FPGAs, VLSI systems, DSPs, and Multicores, among other platforms. The guest editors have solicited original contributions, which address a wide range of theoretical and practical issues related to high-performance motion estimation image processing including, but not limited to: real-time matching motion estimation systems, real-time energy-based motion estimation systems, gradient-based motion estimation systems, optical flow estimation systems, color motion estimation systems, multi-scale motion estimation systems, optical flow and motion estimation systems, analysis or comparison of specialized architectures for motion estimation systems and real-world applications.

4.
To reinforce the growth of computer graphics, hardware architectures must be developed to support the rapid growth in display resolution that will occur in the next few years. The architectures must allow a high degree of application independence without requiring significant changes in the overhead software interface. The hardware architecture must be expandable, again avoiding significant software changes. Finally, the architecture must be implemented in VLSI for reasons of cost and speed. This article will describe a graphics hardware architecture that meets these criteria.

5.
The major concerns of VLSI designers in the past were performance, area, reliability and cost. Power was only a secondary issue. In recent years, however, power, area, and speed have become equally important. There are many reasons for this new trend. Primarily, rapid advancement in semiconductor technology over the past decade has enabled designers to integrate many digital CMOS circuits on a single chip. However, the desirability of using these circuits in portable operations has necessitated the development of low-power technology. Portable applications range from desktop computers and audio-video based multimedia products to personal digital assistants and personal communicators. These systems demand both complex functionality and low power, which make their design challenging. The hierarchical energy analysis tool lets designers quickly estimate power consumption of various data-path architectures, enabling a power consumption comparison at a high level before the layout design is carried out.

6.
Most Western Governments (USA, Japan, EEC, etc.) have now launched national programmes to develop computer systems for use in the 1990s. These so-called Fifth Generation computers are viewed as “knowledge” processing systems which support the symbolic computation underlying Artificial Intelligence applications. The major driving force in Fifth Generation computer design is to efficiently support very high level programming languages (i.e. VHLL architecture).

Historically, however, commercial VHLL architectures have been largely unsuccessful. The driving force in computer designs has principally been advances in hardware, which at the present time means architectures to exploit very large scale integration (i.e. VLSI architecture).

This paper examines VHLL architectures and VLSI architectures and their probable influences on Fifth Generation computers. Interestingly, the major problem for both architecture classes is parallelism: how to orchestrate a single parallel computation so that it can be distributed across an ensemble of processors.


7.
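Bit-Plane-Parallel EBCOT Coding and Its VLSI Architecture   Total citations: 10, self-citations: 1, citations by others: 10
This paper proposes an EBCOT coding algorithm with bit-plane-parallel processing. By analyzing the EBCOT coding structure in JPEG2000, it shows that the coding information of every bit plane can be obtained simultaneously, which leads to a bit-plane-parallel block coding method, and it describes the VLSI architecture that implements the method in detail. Compared with existing architectures, the proposed one offers a high degree of parallelism and avoids wasting clock cycles at coding positions. Experimental results show that the bit-plane-parallel approach requires the fewest clock cycles, that the FPGA prototype system reaches a maximum clock frequency of 52 MHz, and that the image quality matches that of the published JPEG2000 standard.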

8.
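The high complexity and low throughput of the MQ coding algorithm severely restrict its application. Based on an analysis of how coding updates proceed over a continuous code stream, this paper uses a sliding-window mechanism and probability statistics to predict interval changes, reducing the dependencies among parallel data, and designs three MQ coder VLSI architectures with different degrees of parallelism. The designs are optimized and implemented on an FPGA chip. Experimental results show that, compared with a single-input MQ coder, the three architectures raise the system processing rate to varying degrees without affecting the device operating frequency or the coding efficiency, offering a broad range of choices for large-scale application of MQ coding.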

9.
In VLSI technology, redundancy is a commonly adopted technique to provide reconfiguration capabilities to regular architectures. This paper proves upper and lower bounds on the number of minimal fault patterns (minimal set of faulty processors) which affect a link-redundant linear array in an unrepairable way, for both the cases of bidirectional and unidirectional links.

10.
New reconfigurable computing architectures are introduced to overcome some of the limitations of conventional microprocessors and fine-grained reconfigurable devices (e.g., FPGAs). Among the promising new architectures are Configurable System-on-Chip (CSoC) solutions. They were designed to offer high computational performance for real-time signal processing and for a wide range of applications exhibiting high degrees of parallelism. The programming of such systems is an inherently challenging problem due to the lack of a programming model. This paper describes a novel heterogeneous system architecture for signal processing and data streaming applications. It offers high computational performance and a high degree of flexibility and adaptability by employing a micro Task Controller (mTC) unit in conjunction with programmable and configurable hardware. The hierarchically organized architecture provides a programming model, allows an efficient mapping of applications and is shown to be easily scalable to future VLSI technologies. Several mappings of commonly used digital signal processing algorithms for future telecommunication and multimedia systems and implementation results are given for a standard-cell ASIC design realization in 0.18 micron 6-layer UMC CMOS technology.

11.
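A VLSI architecture for the wavelet transform is implemented on Altera's low-cost, high-density Cyclone II series FPGA, minimizing the algorithm's demand for on-chip memory and reducing power consumption. Because the design can perform the row and column transforms of an image simultaneously, the system processes data quickly, providing a basis for real-time image processing.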

12.
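This paper proposes a tool-wear image processing method based on cellular neural networks. By designing the cellular neural network parameters, the network is used to smooth the binary image of the tool and to extract its edges. Simulation shows that the method is effective. Because cellular neural networks are easy to implement in VLSI and offer fast parallel processing, the method is useful for image processing in machine-vision inspection of tool wear.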

13.
14.
Farrens  M.K. Pleszkun  A.R. 《Computer》1991,24(1):65-70
The PIPE (parallel instruction with pipelined execution) processor, which is the result of a research project initiated to investigate high-performance computer architectures for VLSI implementation, is described. The lessons learned from the implementation are discussed. The most important result was the discovery that supporting architectural queues does not complicate the instruction issue logic and frees the processor clock rate from external memory speed influences. It was also found that the decision to support an instruction set with two instruction sizes and to allow consecutive two-parcel instruction issues profoundly affected the instruction fetch logic design. Other significant results concerned the issue logic, barrel shifter, cache control logic, and branch count

15.
At early design space exploration phases of architectures for Systems On a Chip (SOC), chip size estimation is of high interest. An accurate chip size estimation needs detailed knowledge of the transistor densities of a semiconductor process. This paper introduces a novel and simplified chip size estimator, which is independent of manufacturer-specific process data. CMOS processes are characterized by only three parameters. These are the drawn gate length and the numbers of metal layers used for logic and for memories. The chip size estimator has been derived from a comprehensive analysis of realized VLSI chips. It has been investigated and confirmed both for published VLSIs and for the latest SOC designs with 221 million transistors and 333 million transistors. The proposed model has been implemented as a web-based tool and contributes to analytical modeling of cost and performance tradeoffs of SOC concepts.

16.
Two novel systolic architectures are presented in this paper for polynomial basis finite field multipliers. Using the cut-set systolization technique and modified Booth’s recoding, we have derived here an efficient realization of multiplexer-based bit-parallel systolic multipliers over GF(2^m). Our multipliers save about 19% space complexity as compared to traditional multipliers, and involve nearly half of the time complexity of the corresponding existing design. It is shown that the proposed systolic architectures have significantly lower time-area product than existing systolic multipliers. For cryptographic applications, our proposed architectures can achieve better time and space complexity. Moreover, these new multipliers are highly regular, modular, and therefore well-suited for VLSI implementation.
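
For readers unfamiliar with polynomial basis multiplication over GF(2^m), the operation these multipliers compute can be sketched in software. The following is a plain bit-serial shift-and-add reference, not the bit-parallel systolic architecture of the paper; the function name `gf2m_mul` and the example field (GF(2^8) with the polynomial x^8 + x^4 + x^3 + x + 1) are assumptions chosen for illustration.

```python
def gf2m_mul(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m) using the polynomial basis.

    Elements are integers below 2**m whose bit i is the coefficient of x^i.
    `poly` is the irreducible field polynomial, including its x^m term.
    """
    result = 0
    for _ in range(m):
        if b & 1:           # current bit of b set: add (XOR) the multiple of a
            result ^= a
        b >>= 1
        a <<= 1             # multiply a by x
        if a & (1 << m):    # degree reached m: reduce modulo the field polynomial
            a ^= poly
    return result


# Example in GF(2^8) with x^8 + x^4 + x^3 + x + 1 (0x11B), an assumed example field.
product = gf2m_mul(0x57, 0x83, 0x11B, 8)
print(hex(product))
```

Addition in GF(2^m) is a bitwise XOR, so no carries propagate between bit positions. That carry-free structure is what makes regular, modular hardware arrays natural for this operation.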

17.
We describe new architectures for the efficient computation of redundant manipulator kinematics (direct and inverse). By calculating the core of the problem in hardware, we can make full use of the redundancy by implementing more complex self-motion algorithms. A key component of our architecture is the calculation in the VLSI hardware of the Singular Value Decomposition of the manipulator Jacobian. Recent advances in VLSI have allowed the mapping of complex algorithms to hardware using systolic arrays with advanced computer arithmetic algorithms, such as the coordinate rotation (CORDIC) algorithms. We use CORDIC arithmetic in the novel design of our special-purpose VLSI array, which is used in computation of the Direct Kinematics Solution (DKS), the manipulator Jacobian, as well as the Jacobian Pseudoinverse. Application-specific (subtask-dependent) portions of the inverse kinematics are handled in parallel by a DSP processor which interfaces with the custom hardware and the host machine. The architecture and algorithm development is valid for general redundant manipulators and a wide range of processors currently available and under development commercially.
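
The CORDIC arithmetic mentioned in this abstract rotates a vector through a sequence of fixed elementary angles, so that each step needs only shifts and additions in hardware. A minimal floating-point sketch of the rotation mode follows; it illustrates the general CORDIC idea, not the paper's VLSI array, and the function name `cordic_rotate` and the iteration count are assumptions.

```python
import math


def cordic_rotate(x, y, angle, iterations=32):
    """Rotate (x, y) by `angle` radians with rotation-mode CORDIC.

    Converges for |angle| up to about 1.74 rad (the sum of all atan(2^-i)).
    """
    # Elementary angles atan(2^-i) and the accumulated gain of the micro-rotations.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate toward zero residual
        # In hardware the multiplications by 2^-i are right shifts.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x / gain, y / gain  # compensate for the CORDIC gain


# Rotating the unit vector (1, 0) by 30 degrees yields (cos 30°, sin 30°).
c, s = cordic_rotate(1.0, 0.0, math.pi / 6)
print(c, s)
```

A hardware version would use fixed-point registers and a small table of the elementary angles, and would typically fold the gain compensation into a constant pre-scaling of the input.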

18.
VLSI technology has recently received increasing attention due to its high performance and high reliability. Designing a VLSI structure systematically for a given task becomes a very important problem to many computer engineers. In this paper, we present a method to transform a recursive computation task into a VLSI structure systematically. The main advantages of this approach are its simplicity and completeness. Several examples, such as vector inner product, matrix multiplication, convolution, comparison operations in relational databases and fast Fourier transformation (FFT), are given to demonstrate the transformation procedure. Finally, we apply the proposed method to hierarchical scene matching. Scene matching refers to the process of locating or matching a region of an image with a corresponding region of another view of the same image taken from a different viewing angle or at a different time. We first present a constant threshold estimation for hierarchical scene matching. The VLSI implementation of the hierarchical scene matching is then described in detail.

19.
In this paper, we present a technique for measuring the amount of blur of an edge and using this information to determine the distance of a micromanipulator probe from a wafer surface in very large scale integration (VLSI) wafer probing. In this application, a soft and reliable touch of the probe with a metal pad in the wafer is a sensitive operation. The wafer is focused and several images of the probe while approaching the wafer are analyzed. In our theory, the amount of blur is calculated from the height of the step edge and the slope of the intensity profile at the zero crossing. Hence, our formula is simple and easy to use. We estimate the distance of the probe from the surface of the wafer and obtain a robust measure, i.e., one which is valid even in the presence of significant noise in the images. In order to validate our methods, we have experimented with various VLSI patterns as backgrounds.
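
The abstract does not state its formula, so the following is only an illustration of how blur could be recovered from the edge height and the slope at the zero crossing. It assumes a Gaussian blur model, which is an assumption made here and not a claim about the paper: an ideal step of height h blurred by a Gaussian of standard deviation σ has slope h / (σ√(2π)) at its center, giving σ = h / (√(2π) · slope). The function names are hypothetical.

```python
import math


def blur_sigma(edge_height, slope_at_zero_crossing):
    """Estimate Gaussian blur sigma from the step height and center slope.

    Assumes an ideal step edge blurred by a Gaussian (an illustrative model).
    """
    return edge_height / (math.sqrt(2.0 * math.pi) * slope_at_zero_crossing)


def blurred_step(x, height, sigma):
    """Intensity profile of a step edge of given height blurred by a Gaussian."""
    return height * 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


# Synthetic check: blur a step with a known sigma, measure the slope at the
# edge center by a central difference, then recover sigma from it.
true_sigma = 2.5
height = 100.0
dx = 1e-3
slope = (blurred_step(dx, height, true_sigma)
         - blurred_step(-dx, height, true_sigma)) / (2.0 * dx)
estimated_sigma = blur_sigma(height, slope)
print(estimated_sigma)
```

With real images the slope would be measured from noisy pixel samples, so the estimate would be less exact than in this noise-free synthetic case.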

20.
The dictionary machine is an important VLSI system performing high-speed data archival operations. In this paper, we present a design which can efficiently implement dictionary machines in VLSI processor arrays. In order to effectively process the operations of the dictionary machine, a hexagonal mesh is selected as the host topology, in which two different networks for update and query operations are embedded. The proposed design is simple to implement and allows high throughput.
