Similar literature
1.
We report the implementation of a prototype three-dimensional (3D) optoelectronic neural network that combines free-space optical interconnects with silicon-VLSI-based optoelectronic circuits. The prototype system consists of a 16-node input layer, a 4-neuron hidden layer, and a single-neuron output layer, where the denser input-to-hidden-layer connections are optical. The input layer uses PLZT light modulators to generate optical outputs, which are distributed over an optoelectronic neural network chip through space-invariant holographic optical interconnects. Optical interconnections provide negligible fan-out delay and allow a compact, purely on-chip electronic H-tree fan-in structure. The small prototype system achieves a measured 8-bit electronic fan-in precision and a calculated maximum speed of 640 million interconnections per second. The system was tested using synaptic weights learned off-system and was shown to distinguish any vertical line from any horizontal one in an image of 4×4 pixels. New, more efficient light detector and small-area analog synapse circuits and denser optoelectronic neuron layouts are proposed to scale up the system. A high-speed, feed-forward optoelectronic synapse implementation density of up to 10⁴/cm² seems feasible using the new synapse design. A scaling analysis of the system shows that the optically interconnected neural network implementation can provide higher fan-in speed and lower power consumption than a purely electronic, crossbar-based neural network implementation.
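
The 16-4-1 topology and the vertical-versus-horizontal line task described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights and thresholds below are hand-picked (each hidden neuron sums one image row), whereas the paper's weights were learned off-system and are not given in the abstract. The names `classify` and `line_image` are hypothetical.

```python
def step(x):
    """Hard-threshold activation: 1 if the net input is positive, else 0."""
    return 1 if x > 0 else 0


def classify(image):
    """Return 1 for a horizontal line and 0 for a vertical line in a 4x4 image.

    Assumed weights (not from the paper): hidden neuron i has weight 1 on the
    four pixels of row i and weight 0 elsewhere, with threshold 3.5, so it
    fires only when row i is completely lit. The output neuron fires when any
    hidden neuron fires.
    """
    pixels = [p for row in image for p in row]  # 16 input nodes
    hidden = []
    for i in range(4):  # 4 hidden neurons
        weights = [1 if k // 4 == i else 0 for k in range(16)]
        net = sum(w * p for w, p in zip(weights, pixels))
        hidden.append(step(net - 3.5))
    return step(sum(hidden) - 0.5)  # single output neuron


def line_image(orientation, index):
    """Build a 4x4 binary image with one full line ('h' or 'v') at the given index."""
    if orientation == "h":
        return [[1 if r == index else 0 for _ in range(4)] for r in range(4)]
    return [[1 if c == index else 0 for c in range(4)] for _ in range(4)]
```

With these assumed weights the 64 input-to-hidden products correspond to the connections that the prototype implements optically; the sketch says nothing about the hardware's precision or timing.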

2.
A self-learning neural network chip based on the branch-neuron-unit (BNU) architecture, which expands the scale of a neural network by interconnecting multiple chips without reducing performance, is described. The chip integrates 336 neurons and 28,224 synapses in a 1.0-μm double-poly-Si double-metal CMOS technology. The operation speed is higher than 1×10¹² connections per second per chip. It is estimated that the network scale can be expanded to several hundred chips. In the case of 200-chip interconnections, the network will consist of 3360 neurons and 5,644,800 synapses.

3.
A charge-coupled device (CCD)-based image processor that performs 2D filtering of a gray-level image with 20 programmable 8-b 7×7 spatial filters is described. The processor consists of an analog input buffer, 49 multipliers, and 49 8-b 20-stage local memories in a 29-mm² chip area. Better than 99.999% charge transfer efficiency and greater than 42-dB dynamic range have been achieved by the processor, which performs one billion arithmetic operations per second and dissipates less than 1 W when clocked at 10 MHz. The device is also suited for neural networks with local connections and replicated weights. Implementation of a specific neural network, the neocognitron, based on this CCD processor has been simulated. The effect of weight quantization imposed by use of this CCD device on the performance of the neocognitron is presented.

4.
A new dynamical system and computational circuit is described and analyzed. The dynamics permit the construction of a Lyapunov function that ensures global convergence to a unique stable equilibrium. The analog circuit realization is of the neural network type, with N cells represented by high-gain amplifiers, global feedback, and at most 2N interconnections, where N is the number of inputs. A specific application (called "the K-selector"), which signals the ranks of the K largest elements of an input list and, in parallel, the rank of the (K+1)th element, is designed and numerically tested. For a given density of the input elements, one obtains feasible separation intervals of output signals, i.e., good processing performance. The circuit requires an appropriate control source and suitable scaling of the input data.

5.
Using a combination of architecture optimization techniques and unconventional circuit designs, a 60 MHz decision-feedback equalizer (DFE) chip is presented for high-bit-rate digital modem applications. The equalizer can accommodate quaternary phase-shift keying (QPSK) and 16-, 64-, and 256-point quadrature amplitude modulation (QAM), and achieves a peak throughput rate of 480 Mb/s. The chip contains four complex-valued programmable filter taps and incorporates coefficient updating circuitry for implementing the LMS adaptive algorithm with user-selectable adaptation step sizes. Cut-set retiming architecture techniques were used so that the chips could be cascaded to implement longer equalizer lengths without any speed degradation, and a circuit design technique called adaptively biased pseudo-NMOS logic (APNL) was adopted to reduce on-chip critical-path delays. The fully parallel chip architecture achieves a computational throughput of 1.44 giga-operations per second (GOPS).
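
The LMS coefficient update mentioned above can be illustrated with a minimal Python sketch. It is not a model of the chip or of a full decision-feedback equalizer: it only shows the standard complex LMS rule w ← w + μ·e·x* applied to four complex taps, here used to identify an unknown four-tap response from QPSK symbols. The function names, the step size `mu`, and the example response are assumptions made for illustration.

```python
import random


def lms_step(taps, window, desired, mu):
    """One complex LMS update: compute the filter output, the error, and new taps."""
    output = sum(w * x for w, x in zip(taps, window))
    error = desired - output
    # Standard complex LMS: move each tap along error * conj(input).
    updated = [w + mu * error * x.conjugate() for w, x in zip(taps, window)]
    return updated, error


def identify_response(response, mu=0.05, steps=2000, seed=0):
    """Adapt taps so the filter matches a hypothetical noise-free reference response."""
    rng = random.Random(seed)
    qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
    taps = [0j] * len(response)
    window = [0j] * len(response)  # most recent symbol first
    for _ in range(steps):
        window = [rng.choice(qpsk)] + window[:-1]
        desired = sum(h * x for h, x in zip(response, window))
        taps, _error = lms_step(taps, window, desired, mu)
    return taps
```

The step size trades convergence speed against stability, which is presumably why the chip exposes it as a user-selectable parameter.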

6.
A prototype vision chip has been designed that incorporates a 20 × 64 array of processing elements on a 31 μm pitch. Each processing element includes 14 bits of digital memory in addition to seven analogue registers. Digital operations include NOR and NOT, with diffusion, subtraction, inversion and squaring available in the analogue domain. The cells of the array can be configured as an asynchronous propagation network, allowing operations such as flood filling to complete in ~1 μs across the array. Exploiting this feature allows the chip to recognise the difference between closed and open shapes at 30,000 frames per second. The chip is fabricated in 0.18 μm CMOS technology.
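
One way flood filling can separate closed from open shapes is sketched below in Python. This is a sequential software illustration under an assumed criterion, not the chip's asynchronous propagation network: the background is flooded from the image border, and a shape counts as closed if some background pixel is never reached. The function name `is_closed_shape` and the 4-connectivity choice are assumptions.

```python
from collections import deque


def is_closed_shape(grid):
    """Return True if the shape (pixels equal to 1) encloses some background (0)."""
    rows, cols = len(grid), len(grid[0])
    reached = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the fill with every background pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and grid[r][c] == 0:
                reached[r][c] = True
                queue.append((r, c))

    # Propagate through 4-connected background pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                if grid[nr][nc] == 0 and not reached[nr][nc]:
                    reached[nr][nc] = True
                    queue.append((nr, nc))

    # Any background pixel the fill could not reach lies inside a closed contour.
    return any(
        grid[r][c] == 0 and not reached[r][c]
        for r in range(rows)
        for c in range(cols)
    )
```

On the chip the propagation runs in parallel across the whole array, which is what makes the ~1 μs fill time possible; the queue here only mimics the result.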

7.
An analog computing-based systolic architecture which employs multiple neuroprocessors for high-speed early vision processing is presented. For a two-dimensional image, parallel processing is performed in the row direction and pipelined processing is performed in the column direction. The mixed analog/digital design approach is suitable for implementation of electronic neural systems. Local data computation is executed by analog circuitry to achieve full parallelism and to minimize power dissipation. Inter-processor communication is carried out in the digital format to maintain strong signal strength across the chip boundary and to achieve direct scalability in neural network size. For demonstration purposes, a compact and efficient VLSI neural chip that includes multiple neuroprocessors for high-speed digital image restoration is designed. Measured results of the programmable synapse and the statistical distribution of measured synapse conductances are presented. Based on these results, system-level analyses at 8-bit resolution are conducted. An 8.0×6.0-mm² chip in a 1.2-µm CMOS technology can accommodate 5 neuroprocessors, and the speed-up factor over the Sun-4/75 SPARC workstation is around 450. This chip achieves 18 giga-connections per second. This research was partially supported by DARPA under Contract MDA 972-90-C-0037 and by TRW Inc., Samsung Electronics Co., Ltd., and NKK Corp.

8.
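Research on the fuzzy Hamming neural network and its implementation   Cited by: 1 (self-citations: 0, citations by others: 1)
华强 (Hua Qiang), 郑启伦 (Zheng Qilun), 《电子学报》 (Acta Electronica Sinica), 2002, 30(2): 177-179
The traditional Hamming neural network does not resolve pattern overlap or the question of whether the recognition algorithm is guaranteed to converge, and it does not make full use of the information on how close the input pattern is to the other neurons. This paper therefore proposes a fuzzy Hamming neural network. The fuzzy Hamming neural network accepts both binary and non-binary inputs; it uses a fuzzy class-membership subnetwork to resolve pattern overlap and to make full use of the closeness information, and a comparison subnetwork to guarantee convergence of the algorithm and to reduce interconnections. Its modular circuit design also facilitates VLSI implementation and expansion of the network.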

9.
A monolithic 64-tap digitally programmable analog transversal filter is described that uses an acoustic charge transport (ACT) tapped delay line and integrated GaAs MESFET circuits for coefficient storage and tap weight circuitry. The device has 6-b tap weights, an input sampling rate of 360 MHz, and an output tap spacing corresponding to an output sampling rate of 130 MHz. This results in the effective execution of 8×10⁹ multiply and sum operations per second in a 38-mm² chip that dissipates less than 2 W. This effective computational rate is limited in the present design by the spacing of the ACT delay line taps, which is dictated by the geometry of the tap weight circuits. The chip uses fully random-access tap weight memory, which is easier to interface to typical digital controllers than the usual shift-register storage approach. Tap address and tap weight data are applied as parallel 6-b words, and the data word is clocked into the address location by the application of an enable pulse. The tap weight circuits use monolithic capacitors and GaAs MESFET analog switches to realize a multiplying converter based on a C/2C ladder configuration with a sign-and-magnitude tap weight word format. A ladder accuracy of 7 b is achieved by compensating the ladder component values for parasitics.
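
The quoted computational rate and the sign-and-magnitude weight format can be checked with a short Python sketch. Two points are assumptions rather than facts from the abstract: the effective rate is taken to be the number of taps times the output sampling rate (64 × 130 MHz ≈ 8.3×10⁹, consistent with the quoted 8×10⁹), and the 6-b word is assumed to place the sign in the most significant bit with a 5-bit magnitude below it. All names are hypothetical.

```python
TAPS = 64
OUTPUT_RATE_HZ = 130_000_000


def decode_sign_magnitude(word, bits=6):
    """Decode a sign-and-magnitude word (assumed layout: MSB = sign, rest = magnitude)."""
    if not 0 <= word < (1 << bits):
        raise ValueError("word does not fit in the given number of bits")
    magnitude = word & ((1 << (bits - 1)) - 1)
    negative = (word >> (bits - 1)) & 1
    return -magnitude if negative else magnitude


def transversal_output(samples, tap_words):
    """Weighted sum of delay-line samples, one decoded tap weight per sample."""
    weights = [decode_sign_magnitude(w) for w in tap_words]
    return sum(w * s for w, s in zip(weights, samples))


def multiply_accumulate_rate(taps=TAPS, output_rate_hz=OUTPUT_RATE_HZ):
    """Multiply-and-sum operations per second, assuming one per tap per output sample."""
    return taps * output_rate_hz
```

One property of sign-and-magnitude worth noting is that it has two encodings of zero (`0b000000` and `0b100000`), so a 6-b word covers the integers from -31 to +31 rather than 64 distinct values.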

10.
A parallel digital optical cellular image processor (DOCIP) functionally comprises an array of identical 1-bit processing elements or cells, a fixed interconnection network, and a control unit. Four interconnection network topologies are described: two variants of a mesh-connected array and two variants of a cellular hypercube network. The instruction sets of these single-instruction multiple-data (SIMD) machines are based on a mathematical morphological theory, binary image algebra (BIA), which provides an inherently parallel programming structure for their control. Physically, a DOCIP architecture uses a holographic optical element in a 3D free-space optical system to implement off-chip interconnections, and an optoelectronic spatial light modulator to implement a 2D array of nonlinear processing elements and (optionally) local on-chip interconnections. Two examples are given. The first, an experimental implementation of a single 54-gate cell of the DOCIP, uses an optically recorded hologram for within-cell optical interconnections, and a spatial light modulator for a 2D array of optically accessible gates. The second, a design for an efficient and more manufacturable architecture, uses a computer-generated diffractive optical element for cell-to-cell interconnections, and a 2D smart-pixel array of DOCIP cells, each cell having electronic logic and optical input/output.

11.
Artificial neural networks are massively parallel interconnections of simple processing elements. Computing times for the simulation of these parallel systems on today's von Neumann computers increase with the square of the number of processing elements, so there is a need for application-specific hardware. This paper describes various investigations of analog as well as digital hardware for neural networks. Possible solutions for the connection problem and different circuit designs are explained. Then our cascadable digital circuit for the emulation of a biology-oriented, dynamic neural network is presented.

12.
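Artificial neural networks are an important method in modern information processing. Compared with software implementations, hardware implementations can fully exploit the parallel processing of neural networks. Implementing neural networks with analog circuits gives simple circuit forms, low power consumption, high speed, and small chip area, which raises the integration density of neurons on a neural network chip; neuron circuits are therefore well suited to analog implementation. This paper reviews current results and new techniques in the analog VLSI implementation of neural network units, together with the authors' own work. For the most widely used linear and quadratic synapse neurons, it gives a detailed account of three aspects: weight storage units, synapse circuits, and threshold function circuits. The advantages and disadvantages of the various implementations are compared, and the factors to be considered in neural network circuit implementation are pointed out. Finally, the outlook for self-learning neural networks implemented in integrated circuit technology is discussed.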

13.
Although neural network paradigms have intrinsic potential for parallel operation, a traditional computer cannot fully exploit it because of its serial hardware configuration. By using the analog circuit design approach, a large number of parallel functional units can be realized in a small silicon area. In addition, appropriate accuracy requirements for neural operation can be satisfied. Components for a general-purpose neural chip have been designed and fabricated. Dynamically adjusted weight value storage provides programmable capability. Possible reconfigurable schemes for a general-purpose neural chip are also presented. Tests of the prototype neural chip were conducted successfully and produced the expected results.

14.
A high-speed analog VLSI image acquisition and pre-processing system has been designed and fabricated in a 0.35 μm standard CMOS process. The chip features a massively parallel architecture enabling the computation of programmable low-level image processing in each pixel. Extraction of spatial gradients and convolutions such as Sobel or Laplacian filters are implemented on the circuit. For this purpose, each 35 μm × 35 μm pixel includes a photodiode, an amplifier, two storage capacitors, and an analog arithmetic unit based on a four-quadrant multiplier architecture. The retina provides address-event coded output on three asynchronous buses: one output dedicated to the gradient and the other two to the pixel values. A 64 × 64 pixel proof-of-concept chip was fabricated. A dedicated embedded platform including FPGA and ADCs has also been designed to evaluate the vision chip. Measured results show that the proposed sensor successfully captures raw images up to 10,000 frames per second and runs low-level image processing at a frame rate of 2000 to 5000 frames per second.
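
The Sobel gradient extraction named above can be sketched in plain Python. This is a sequential software reference using the standard 3×3 Sobel kernels, not a model of the chip's per-pixel analog arithmetic unit; the function name `sobel_gradients` and the choice to return only the interior (border-free) region are assumptions.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # responds to horizontal intensity changes
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # responds to vertical intensity changes


def apply_kernel(image, kernel, r, c):
    """Weighted sum of the 3x3 neighbourhood centred on pixel (r, c)."""
    return sum(
        kernel[i][j] * image[r - 1 + i][c - 1 + j]
        for i in range(3)
        for j in range(3)
    )


def sobel_gradients(image):
    """Return (gx, gy) gradient maps for the interior pixels of a 2D image."""
    rows, cols = len(image), len(image[0])
    gx = [
        [apply_kernel(image, SOBEL_X, r, c) for c in range(1, cols - 1)]
        for r in range(1, rows - 1)
    ]
    gy = [
        [apply_kernel(image, SOBEL_Y, r, c) for c in range(1, cols - 1)]
        for r in range(1, rows - 1)
    ]
    return gx, gy
```

Each output value depends only on a pixel's 3×3 neighbourhood, which is the locality that lets such filters be computed inside every pixel in parallel.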

15.
The authors present three VLSI chips for a large-scale parallel inference machine: a processor (PU) chip, a cache memory (CU) chip, and a network control (NU) chip. The PU chip has been designed to be adapted to logic programming languages such as PROLOG. The CU chip implements a hardware support called a 'trail buffer', which is suitable for the execution of PROLOG-like languages. The NU chip makes it possible to connect 256 processing elements in a mesh network. The parallel inference machine (PIM/m) runs a PROLOG-like network-based operating system called PIMOS as well as many applications and has a peak performance of 128 mega logical inferences per second (MLIPS). The PU chip, containing 384,000 transistors, is fabricated in a 0.8 μm double-metal CMOS technology. The CU chip and the NU chip contain 610,000 and 329,000 transistors, respectively, and are fabricated in a 1.0 μm double-metal CMOS technology. A cell-based design method is used to reduce the layout design time.

16.
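A hardware implementation method for neural network circuits based on multi-state memristors is proposed. A 28 bit HP memristor model is used to build a stable dual-memristor structure for storing weights. Combining low-power rail-to-rail operational amplifier techniques with register techniques, an absolute-value circuit that separates magnitude from polarity is designed, together with a memristor-based weight network matrix circuit that can perform operations on positive and negative floating-point numbers. The activation unit is written in Verilog-A, realizing a multilayer memristive neural network. The circuit uses parallel input and analog signal processing, is simple to control, and needs no intermediate data buffering. Experimental results show that the method effectively improves the stability and operating efficiency of memristor-based artificial neural networks.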

17.
Because a variety of architectures need to be considered when attempting to solve different problems with neural networks, the implementation of a neural network with programmable topology and programmable weights has been undertaken. A new circuit block, the distributed neuron-synapse, has been used to implement a 1024-synapse reconfigurable network on a VLSI chip. To evaluate the performance of the VLSI chip, a complete test setup has been built, consisting of hardware for configuring the chip, programming the synaptic weights, presenting analog input vectors to the chip, and recording the outputs of the chip. Following the performance verification of each circuit block on the chip, various sample problems were solved. In each problem the synaptic weights were determined by training the neural network with a gradient-based learning algorithm incorporated in the experimental test setup. The results of this work indicate that reconfigurable neural networks built using distributed neuron-synapses can be used to solve various problems efficiently.

18.
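High-performance photonic analog processing chips are the core components of microwave photonic processing systems. By optimizing the optical waveguide network structure, this paper realizes an ultra-wideband reconfigurable photonic analog computing chip; configuring the topology of the network allows arbitrary switching among multiple computing functions and tuning of the operation order within a single function. Optical matrix computing chips with self-configuration capability and an on-chip photonic convolution accelerator for image processing are also studied. Finally, the outlook for the convergence of microwave photonic systems and artificial intelligence is discussed.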

19.
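A microcontroller-based programmable-gain amplifier is designed. The design comprises three main parts: a core formed by the microcontroller, the analog switches, and the attached resistor network; keypad input; and a liquid-crystal display module. A CD4053 analog switch and an STC89C52 microcontroller are used, and an LCD128*64 displays the gain in real time, thereby achieving program-controlled amplification.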

20.
IEEE Spectrum, 1992, 29(7): 26-31
A 64-bit architecture called Alpha, which is the basis of the fastest available commercial reduced-instruction-set computer (RISC) chip, is discussed. The development process is detailed. Alpha's 25-year life expectancy envisions a 1000-fold performance increase to 400 billion instructions per second.
