
1.

Focal-Plane and Multiple Chip VLSI Approaches to CNNs

M. Anguita F. J. Pelayo E. Ros D. Palomar A. Prieto 《Analog Integrated Circuits and Signal Processing》1998,15(3):263-275

In this paper, three alternative VLSI analog implementations of CNNs are described, which have been devised to perform image processing and vision tasks: a programmable low-power CNN with embedded photo-sensors, a compact fixed-template CNN based on unipolar current-mode signals, and basic CMOS circuits to implement an extended CNN model using spikes. The first two VLSI approaches are intended for focal-plane image processing applications. The third one allows, since its dynamics is defined by process-independent local ratios and its input/outputs can be efficiently multiplexed in time, the construction of very large multiple chip CNNs for more complex vision tasks.

2.

A new shape-from-shading method based on cellular neural networks (cited 2 times: 0 self-citations, 2 by others)


王怀颖 于盛林 冯强 《电子学报》2006,34(11):2120-2124

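The cellular neural network (CNN) is a large-scale nonlinear analog circuit that processes signals in real time. Its continuous-time dynamics and local interconnections allow parallel computation and make it well suited to very-large-scale integration (VLSI). For the shape-from-shading (SFS) problem, this paper proposes an energy-function optimization method based on a hardware-annealed CNN, analyzes the method in detail, and presents simulation results for examples that verify its effectiveness. The method is a parallel algorithm with a low computational load, is easy to implement in large-scale VLSI, and can escape local minima, so the SFS problem can be processed in real time.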

3.

Modular Cellular Neural Network Structure for Wave‐Computing‐Based Image Processing

Mojtaba Karami Reza Safabakhsh Mohammad Rahmati 《ETRI Journal》2013,35(2):207-217

This paper introduces the modular cellular neural network (CNN), which is a new CNN structure constructed from nine one‐layer modules with intercellular interactions between different modules. The new network is suitable for implementing many image processing operations. Inputting an image into the modules results in nine outputs. The topographic characteristic of the cell interactions allows the outputs to introduce new properties for image processing tasks. The stability of the system is proven and the performance is evaluated in several image processing applications. Experimental results on texture segmentation show the power of the proposed structure. The performance of the structure in a real edge detection application using the Berkeley dataset BSDS300 is also evaluated.

4.

A CMOS Implementation of a 1-Bit Multi-Cell Encoded—Cellular Neural Network

S.S. Villareal J.P. De Gyvez M.H. Weichold 《Analog Integrated Circuits and Signal Processing》2004,39(1):95-108

A new Cellular Neural Network (CNN)-based system has been implemented and tested to demonstrate the ability of this novel system to process large digital images more rapidly than its conventional CNN counterpart. The multi-cell encoded CNN processes the data of multiple single-data CNN cells within each multi-cell encoded cell, which gives the new architecture its advantages in loading, processing, and unloading speed and in layout. A one-bit (1B) 4-cell-encoded CNN was implemented to illustrate this new system. In this CMOS implementation, data of four neighboring conventional 1B CNN cells are encoded for processing within encoded cells. The CMOS circuits and circuit networks of one encoded cell are presented. Due to area limitations, each test chip includes the hardware for only one one-dimensional encoded cell. Experimental results of one-chip and two-chip 1B, multi-cell encoded systems are presented for connected-component detection and edge detection test cases. These results demonstrate the correct response of this implementation to variations in template values, boundary values, and initial conditions. Interactions between encoded cell components and encoded cells are demonstrated for both dynamic and static responses. This CMOS implementation validates this new CNN architecture and provides a template for implementations using more advanced device technologies and circuits.

5.

A Dedicated Multi-Chip Programmable System for Cellular Neural Networks

Mario Salerno Fausto Sargeni Vincenzo Bonaiuto 《Analog Integrated Circuits and Signal Processing》1999,18(2-3):277-288

Cellular Neural Networks (CNNs) represent a remarkable improvement in the hardware implementation of Artificial Neural Networks (ANNs). In fact, their regular structure and their local connectivity make this class of neural networks especially appealing for VLSI implementations. CNNs are widely applied in several fields, including image processing and pattern recognition. In earlier work, the authors presented two fully digitally programmable CNN chips with 3×3 cells (3×3DPCNN chip) and 6×6 cells (6×6DPCNN chip) respectively. In this paper, a system with twenty of the latter chips is presented. The main features of this electronic system are the full digital programmability of the templates, the digital input/output for logic operations, the analog outputs for dynamic analysis and the implementation of space-variant as well as space-invariant CNNs.

6.

Hardware-accelerated design space exploration framework for communication systems

Markus Kock Sebastian Hesselbarth Martin Pfitzner Holger Blume 《Analog Integrated Circuits and Signal Processing》2014,78(3):557-571

The efficient hardware implementation of signal processing algorithms requires a rigid characterization of the interdependencies between system parameters and hardware costs. Pure software simulation of bit-true implementations of algorithms with high computational complexity is prohibitive because of the excessive runtime. Therefore, we present a field-programmable gate array (FPGA) based hybrid hardware-in-the-loop design space exploration (DSE) framework combining high-level tools (e.g. MATLAB, C++) with a System-on-Chip (SoC) template mapped on FPGA-based emulation systems. This combination significantly accelerates the design process and characterization of highly optimized hardware modules. Furthermore, the approach helps to quantify the interdependencies between system parameters and hardware costs. The achievable emulation speedup using bit-true hardware modules is key to enabling the optimization of complex signal processing systems using Monte Carlo approaches, which are infeasible for pure software simulation due to the large required stimuli sets. The framework supports a divide-and-conquer approach through a flexible partitioning of complex algorithms across the system resources on different layers of abstraction. This makes it easier to split the design process efficiently among different teams. The presented framework comprises a generic state-of-the-art SoC infrastructure template, a transparent communication layer including MATLAB and hardware interfaces, module wrappers and DSE facilities. The hardware template is synthesizable for a variety of FPGA-based platforms. Implementation and DSE results for two case studies from the different application fields of synthetic aperture radar image processing and interference alignment in communication systems are presented.

7.

A Programmable Imager for Very High Speed Cellular Signal Processing

Elisenda Roca Servando Espejo Rafael Domínguez-Castro Gustavo Liñán Ángel Rodríguez-Vázquez 《The Journal of VLSI Signal Processing》1999,23(2-3):305-318

This paper describes a programmable imager that averages groups of pixels formed by n × n kernels, n × m kernels, or independent pixels of the array. The imager is a 64 × 64 array of passive pixels that can be randomly accessed. The read-out stage includes a single charge amplifier with programmable gain, a sample-and-hold structure and an analog buffer. This read-out structure differs from other existing imagers with variable resolution in that it uses a single charge amplifier, whereas the usual structure is an operational amplifier per column plus a global operational amplifier. The structure is described in detail, indicating its advantages and disadvantages with respect to other imagers with averaging capabilities. This programmable resolution architecture can be more appropriate, and eventually more efficient, when implementing very high speed Cellular Neural Network (CNN) processors in a CNN chipset, a mixed-signal hardware platform for CNN-based image processing. A significant reduction in processing time can be obtained by decreasing the image resolution, and therefore the amount of information to be transferred to the CNN processor. The programmable resolution can also be used for fast image recognition followed by windowing at full resolution in a reduced area of the image, permitting more accurate processing of the region of interest. In addition, full resolution images can still be obtained, as in the commercial imagers that are usually included in CNN chipsets.
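
The kernel averaging described in this abstract can be pictured in software as block averaging (binning) of a frame. The sketch below is only an illustration under stated assumptions: the function name `bin_average` and the list-of-lists frame layout are hypothetical, and the chip itself performs the averaging in its analog read-out stage rather than in code.

```python
def bin_average(image, n, m=None):
    """Average non-overlapping n x m blocks of a 2-D list of pixel values.

    Hypothetical helper for illustration; not taken from the paper.
    """
    m = n if m is None else m
    rows, cols = len(image), len(image[0])
    if rows % n or cols % m:
        raise ValueError("image size must be a multiple of the kernel size")
    out = []
    for r in range(0, rows, n):
        out_row = []
        for c in range(0, cols, m):
            # Sum the n x m block whose top-left corner is (r, c).
            total = sum(image[r + i][c + j] for i in range(n) for j in range(m))
            out_row.append(total / (n * m))
        out.append(out_row)
    return out


# A 4 x 4 ramp image binned with 2 x 2 kernels gives a 2 x 2 result.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
low_res = bin_average(frame, 2)
```

Applied to a 64 × 64 frame with 4 × 4 kernels, the same operation would reduce 4096 values to 256, which is the kind of data reduction the abstract associates with shorter processing time.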

8.

Research on image signal processing applications based on single-electron transistors

贾贵 李建东 《微电子学与计算机》2006,23(1):166-168,173

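The characteristics of the single-electron transistor are studied. The paper proposes a method for implementing a CNN based on the transfer characteristics of single-electron transistor arrays, and designs a CNN accordingly. Simulation results show that the designed hardware circuit has a simple structure, low power consumption, and good frequency characteristics. Applied to image processing, it offers a degree of flexibility and generality.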

9.

Parallel pipeline implementation of wavelet transforms

Sava H. Fleury M. Downton A.C. Clark A.F. 《Vision, Image and Signal Processing, IEE Proceedings -》1997,144(6):355-360

Wavelet transforms have been one of the important signal processing developments in the last decade, especially for applications such as time-frequency analysis, data compression, segmentation and vision. Although several efficient implementations of wavelet transforms have been derived, their computational burden is still considerable. The paper describes two generic parallel implementations of wavelet transforms, based on the pipeline processor farming methodology, which have the potential to achieve real-time performance. Results show that the parallel implementation of the oversampled wavelet transform achieves virtually linear speedup, while the parallel implementation of the discrete wavelet transform (DWT) also outperforms the sequential version, provided that the filter order is large. The DWT parallelisation performance improves with increasing data length and filter order, while the frequency-domain implementation performance is independent of wavelet filter order. Parallel pipeline implementations are currently suitable for processing multidimensional images with a data length of at least 512 pixels.

10.

Hardware implementation and application of programmable cellular neural networks (cited 2 times: 0 self-citations, 2 by others)


刘常澍 刘峰 谢学智 庞维珍 《电子学报》2000,28(4):91-94

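This paper proposes a hardware implementation method for template-programmable cellular neural networks. The cell-body circuit, the A-template circuit, and the B-template circuit of the CNN are designed, assembled into a CNN, and studied in image processing applications. Simulation results show that the designed hardware circuits feature a simple structure, low power consumption, good frequency characteristics, and programmable template parameters. They can readily be assembled into CNNs of various sizes and offer a degree of flexibility and generality in image processing applications.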
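
To make the roles of the A and B templates concrete, here is a minimal software sketch of the standard Chua-Yang cell equation, dx/dt = -x + sum(A*y) + sum(B*u) + z with output y = 0.5(|x+1| - |x-1|), integrated with forward Euler. It is not the circuit from the paper: the function name `cnn_run`, the step size, the fixed boundary value, and the edge-extraction template values are all assumptions chosen for illustration (+1 stands for black, -1 for white).

```python
def cnn_run(u, A, B, z, dt=0.1, steps=200, boundary=-1.0):
    """Integrate the Chua-Yang CNN state equation with forward Euler.

    u: 2-D list of inputs in [-1, 1]; A, B: 3x3 templates; z: bias.
    Cells outside the array are held at `boundary` for input and output.
    All names and default values here are illustrative assumptions.
    """
    rows, cols = len(u), len(u[0])
    x = [[0.0] * cols for _ in range(rows)]      # initial state x(0) = 0

    def out(v):                                  # piecewise-linear output
        return 0.5 * (abs(v + 1.0) - abs(v - 1.0))

    def at(grid, r, c):                          # fixed-value boundary cells
        return grid[r][c] if 0 <= r < rows and 0 <= c < cols else boundary

    for _ in range(steps):
        y = [[out(v) for v in row] for row in x]
        new_x = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                acc = z
                for i in (-1, 0, 1):
                    for j in (-1, 0, 1):
                        acc += A[i + 1][j + 1] * at(y, r + i, c + j)  # feedback
                        acc += B[i + 1][j + 1] * at(u, r + i, c + j)  # control
                new_x[r][c] = x[r][c] + dt * (-x[r][c] + acc)
        x = new_x
    return [[1 if out(v) > 0 else -1 for v in row] for row in x]


# Illustrative edge-extraction templates (assumed values, not from the paper).
A = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
B = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
image = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else -1 for c in range(5)]
         for r in range(5)]
edges = cnn_run(image, A, B, z=-1.0)
```

With these templates a black pixel stays black only if at least one of its eight neighbours is white, so the 3 × 3 black square in `image` is reduced to its outline. Changing A, B, and z changes the operation without changing the cell equation, which is the sense in which a template-programmable CNN is flexible.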

11.

Design of high parallel CNN accelerator based on FPGA for AIoT

林志坚 高学伟 陈小培 祝志鹏 杜小勇 陈平平 《中国邮电高校学报(英文版)》2022,29(5):1-9


12.

Area-efficient 2-D shift-variant convolvers for FPGA-based digital image processing

Cardells-Tormo F. Molinet P.-L. 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2006,53(2):105-109

Two-dimensional convolutions are local by nature; hence every pixel in the output image is computed using surrounding information, i.e., a moving window of pixels. Although the operation is simple, the hardware is conditioned by the fact that, for bandwidth efficiency, full raster rows must be read from the external memory, and that a row-major image scan should be performed to support shift-variant convolutions. When extending the architectures developed in prior art to support shift-variant convolutions, we realize that they require large amounts of on-chip memory. While this may not cause a large cost increase in ASIC implementations, it makes field-programmable gate array (FPGA) implementations expensive or infeasible. In this paper, we propose several novel FPGA-efficient architectures for generating a moving window over a row-wise print path. Because the proposed concepts have different throughput and resource utilization, we provide criteria to choose the optimum one for any design point.
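
The moving window over a row-major scan can be modelled in software with a small number of buffered rows. The sketch below is a generic line-buffer model, not one of the architectures proposed in the paper; the names `windows_from_raster`, `shift_variant_filter`, and `kernel_at` are hypothetical, and the kernel is applied without flipping (strictly a correlation).

```python
from collections import deque


def windows_from_raster(pixels, width, k=3):
    """Yield (row, col, window) for every full k x k window of a row-major stream.

    Only the most recent k rows are kept, as a software stand-in for line buffers.
    (row, col) is the top-left corner of the window.
    """
    lines = deque(maxlen=k)      # oldest row is dropped automatically
    current = []
    row = 0
    for p in pixels:
        current.append(p)
        if len(current) == 1:    # first pixel of a new row: start buffering it
            lines.append(current)
        col = len(current) - 1
        if len(lines) == k and col >= k - 1:
            window = [line[col - k + 1: col + 1] for line in lines]
            yield row - k + 1, col - k + 1, window
        if len(current) == width:
            current = []
            row += 1


def shift_variant_filter(pixels, width, kernel_at, k=3):
    """Apply a position-dependent k x k kernel (no flip) to each window."""
    result = {}
    for r, c, win in windows_from_raster(pixels, width, k):
        kern = kernel_at(r, c)   # the kernel may differ at every position
        result[(r, c)] = sum(kern[i][j] * win[i][j]
                             for i in range(k) for j in range(k))
    return result


def scaled_centre(r, c):
    # Example kernel: picks the window centre, with a gain that grows per row.
    return [[0, 0, 0], [0, r + 1, 0], [0, 0, 0]]


filtered = shift_variant_filter(range(16), width=4, kernel_at=scaled_centre)
```

The memory cost of this model is k rows of the image, which illustrates why on-chip memory becomes the limiting resource the abstract refers to when rows are wide.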

13.

Computing local minima and maxima of digital images in pipeline image processing systems equipped with hardware comparators

Dinstein I. Fong-Lochovsky A.C. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1988,76(3):286-287

An efficient technique is presented for computing minima (maxima) values of gray levels of pixels within rectangular windows. It is suitable for pipeline image processing systems having processors equipped with hardware comparators. The complexity of computing the minima and/or maxima of gray levels within M×N windows is determined.
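
One common way to compute window minima or maxima, shown here as a sketch and not necessarily the technique of the paper, is to exploit separability: take the extremum along each row first, then along each column of the intermediate result. The function names are hypothetical, and M is taken as the window height and N as its width.

```python
def sliding_1d(values, n, pick):
    """Extremum (pick = min or max) of every length-n run in a 1-D sequence."""
    return [pick(values[i:i + n]) for i in range(len(values) - n + 1)]


def window_extremum(image, m, n, pick=min):
    """Extremum over every m x n (rows x cols) window, valid positions only."""
    row_pass = [sliding_1d(row, n, pick) for row in image]      # along rows
    col_pass = [sliding_1d(list(col), m, pick)                  # along columns
                for col in zip(*row_pass)]
    return [list(r) for r in zip(*col_pass)]                    # transpose back


gray = [[5, 3, 8, 1],
        [4, 7, 2, 9],
        [6, 0, 3, 5]]
local_min = window_extremum(gray, 2, 2, min)
local_max = window_extremum(gray, 2, 2, max)
```

Done this way, each output needs (N - 1) + (M - 1) two-input comparisons instead of the M·N - 1 required when every window is scanned independently, which is the kind of saving that matters when comparators are hardware resources.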

14.

Fast block implementation of two-dimensional FIR digital filters by systolic arrays

BASIL G. MERTZIOS 《International Journal of Electronics》2013,100(6):1233-1246

A systolic block implementation is described of two-dimensional (2D) FIR and quarter-plane digital filters. Initially, a general 2D block realization model is presented, which does not assume any restricted relation with respect to the block lengths. A high degree of concurrency is achieved by exploiting the pipelining of the array processors in conjunction with the inherent parallelism of the block realization structures. The resulting systolic implementation is characterized by a high degree of modularity, regularity, repetitiveness and local communications and permits very high sampling rates. The increase of the block lengths of the implementation is analogous to the attained throughput rate, with respect to the cost of supporting hardware. The proposed systolic implementation is suitable for real-time image processing applications.

15.

A new blind-pixel processing algorithm for infrared focal plane arrays

代少升 张天骐 《压电与声光》2008,30(3):376-378

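To detect and compensate for blind pixels in infrared focal plane arrays (IRFPAs) effectively, a new algorithm for on-the-fly IRFPA blind-pixel processing is proposed. The algorithm exploits the significant difference in window responsivity between blind pixels and valid pixels during real-time imaging to detect blind pixels quickly, and it uses the spatio-temporal correlation of image information to compensate for them online. Finally, the hardware implementation of the blind-pixel processing is described. Experimental results show that the algorithm has a simple flow and broad applicability, can detect and compensate for blind pixels that appear randomly during system operation in real time, and has considerable value in practical engineering.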
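
The abstract does not give the detection rule, so the following is only a generic sketch of window-based blind-pixel handling under explicit assumptions: a pixel is flagged when it deviates from the median of its 3 × 3 neighbourhood by more than a relative tolerance, and it is replaced by the median of its non-blind neighbours. The function names and the `rel_tol` threshold are hypothetical, and only spatial (not temporal) correlation is used.

```python
from statistics import median


def neighbour_coords(rows, cols, r, c):
    """Coordinates of the up-to-8 neighbours of (r, c) inside the array."""
    return [(i, j)
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))
            if (i, j) != (r, c)]


def detect_blind(image, rel_tol=0.5):
    """Flag pixels that differ strongly from their neighbourhood median."""
    rows, cols = len(image), len(image[0])
    blind = set()
    for r in range(rows):
        for c in range(cols):
            ref = median(image[i][j]
                         for i, j in neighbour_coords(rows, cols, r, c))
            if abs(image[r][c] - ref) > rel_tol * abs(ref):
                blind.add((r, c))
    return blind


def compensate(image, blind):
    """Replace each blind pixel by the median of its non-blind neighbours."""
    rows, cols = len(image), len(image[0])
    fixed = [row[:] for row in image]
    for r, c in blind:
        good = [image[i][j]
                for i, j in neighbour_coords(rows, cols, r, c)
                if (i, j) not in blind]
        if good:                 # leave the pixel untouched if nothing usable
            fixed[r][c] = median(good)
    return fixed


frame = [[100, 100, 100],
         [100, 0, 100],
         [100, 100, 100]]
bad = detect_blind(frame)
repaired = compensate(frame, bad)
```

A real system would calibrate the threshold against measured responsivity and would also use earlier frames, as the abstract indicates.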

16.

An FPGA-based accelerator for the level set image segmentation algorithm

刘野 肖剑彪 吴飞 常亮 周军 《电子与信息学报》2021,43(6):1525-1532

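The level set algorithm is widely used in image segmentation because of its excellent performance. Moreover, compared with deep-learning-based segmentation algorithms, it requires no training data, which greatly reduces the data-labeling workload. However, current level set algorithms are mainly developed in software and involve a large amount of complex computation with many iterations, leading to high processing latency and power consumption. To speed up the level set algorithm and reduce power consumption, this paper proposes an FPGA-based level set image segmentation accelerator with four design innovations: task-level parallel processing, pixel-level parallel processing of image blocks, a fully pipelined processing architecture, and time-multiplexed gradient and divergence operators. Experimental results show that, compared with the level set algorithm running on a CPU, the proposed hardware accelerator is 10.7 times faster, with a power consumption of only 2.2 W.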

17.

Potential Anomaly Separation and Archeological Site Localization Using Genetically Trained Multi‐level Cellular Neural Networks

Erdem Bilgili I. Cem Göknar Ali Muhittin Albora Osman Nuri Uçan 《ETRI Journal》2005,27(3):294-303

In this paper, a supervised algorithm for the evaluation of geophysical sites using a multi‐level cellular neural network (ML‐CNN) is introduced, developed, and applied to real data. ML‐CNN is a stochastic image processing technique based on template optimization using neighborhood relationships of the pixels. The separation/enhancement and border detection performance of the proposed method is evaluated by various interesting real applications. A genetic algorithm is used in the optimization of CNN templates. The first application is concerned with the separation of potential field data of the Dumluca chromite region, which is one of the rich reserves of Turkey; in this context, the classical approach to the gravity anomaly separation method is one of the main problems in geophysics. The other application is the border detection of archeological ruins of the Hittite Empire in Turkey. The Hittite civilization sites located at the Sivas‐Altinyayla region of Turkey are among the most important archeological sites in history, one reason among others being that written documentation was first produced by this civilization.

18.

VLSI implementation of receptive fields with current-mode signal processing for smart vision sensors

Vivian Ward Marek Syrzycki 《Analog Integrated Circuits and Signal Processing》1995,7(2):167-179

Most of the early vision processes in vertebrate vision systems can be modelled by receptive fields in the retina. Building silicon retina ICs has been attempted in the past, but they have not reached a satisfactory conclusion due to technology constraints. Targeting a wafer-size smart vision sensor, we focus in this paper on researching the VLSI implementation of different receptive fields with dedicated functions. The microelectronic receptive field (MERF) is defined as a functional block of the larger system, performing a preprogrammed operation on visual input signals. The main components of MERFs are analog processors operating in the current domain that use current signals from photodetectors to produce the desired image processing function and to convert their outputs into frequency mode signals. Results from VLSI chips with various integrated implementations of receptive fields are presented.

19.

Advanced graphics behind medical virtual reality: evolution of algorithms, hardware, and software interfaces

Soferman Z. Blythe D. John N.W. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1998,86(3):531-554

Applications of virtual reality (VR) and augmented reality (AR) in medicine require real-time visualization and modeling of large three-dimensional data sets. Consequently, these applications require powerful computation, extensive high-bandwidth memory, and fast communication links. In the past, the manufacturers of medical imaging equipment produced their own special-purpose proprietary hardware for image processing and solid graphics. Due to the developments in computer hardware in general and in graphics accelerators in particular, there is a trend toward replacing the proprietary hardware with off-the-shelf (OTS) equipment. Computer graphics itself has advanced in its quest for realism. Generic algorithms such as shading, texture mapping, and volume rendering have been developed to meet the resultant ever increasing requirements. Advances in both the OTS CPU and graphics hardware have enabled real-time implementations of these algorithms, thereby facilitating many of the medical VR/AR applications used today. The development of graphics libraries such as OpenGL has also been an important factor. These libraries provide an underlying portable software platform that optimizes the utilization of the available graphics hardware. OpenGL has become a standard graphics application programming interface, particularly for graphics-intensive applications, and more and more OTS systems provide hardware implementations of OpenGL commands. The review paper follows the evolution of these technologies, examines their crucial role in enabling the appearance of the current VR/AR applications in medicine, and provides a look at current trends and future possibilities.

20.

CMOS realization of two-dimensional mixed analog-digital Hamming distance discriminator circuits for real-time imaging applications

Stéphane Badel Yusuf Leblebici 《Microelectronics Journal》2008,39(12):1817-1828

The architecture of an integrated Hamming artificial neural network, and its use as a versatile signal/image processing circuit, is presented. The circuit operation relies on the charge-based processing of sum-of-products terms, complemented with digital post-processing. The synthesis of complex functions such as winner-(loser)-take-all, k-winner-(loser)-take-all, and rank ordering is demonstrated with a minimal hardware overhead. Different operation modes and corresponding hardware configurations are presented. The VLSI realization of the core two-dimensional Hamming distance discriminator and the chip measurements are discussed. As such, the presented Hamming discriminator is uniquely suitable for real-time image processing and alignment applications.
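
The functions named in this abstract (winner-take-all, loser-take-all, k-winner-take-all, rank ordering) can be stated compactly in software terms. The sketch below only illustrates the logic, assuming patterns are packed into Python integers; the function names are hypothetical, and the chip computes the distances with charge-based analog circuits rather than bit operations.

```python
def hamming(a, b):
    """Number of bit positions in which two integer-coded patterns differ."""
    return bin(a ^ b).count("1")


def rank_templates(pattern, templates):
    """Template indices ordered from smallest to largest Hamming distance."""
    return sorted(range(len(templates)),
                  key=lambda i: hamming(pattern, templates[i]))


def k_winners(pattern, templates, k=1):
    """Indices of the k closest templates (k = 1 is winner-take-all)."""
    return rank_templates(pattern, templates)[:k]


def loser(pattern, templates):
    """Index of the most distant template (loser-take-all)."""
    return rank_templates(pattern, templates)[-1]


stored = [0b11110000, 0b00001111, 0b10101010]
probe = 0b11110001
order = rank_templates(probe, stored)
```

Here the distances from `probe` to the three stored patterns are 1, 7, and 5, so the rank order is template 0, then 2, then 1.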