Similar literature
Found 20 similar records (search time: 15 ms)
1.
The general-purpose, highly parallel cellular array processor (CAP) we developed features multiple-instruction stream, multiple-data stream (MIMD) processing and image display. Processor elements can number in the several hundreds; the present system uses 256 processors. Each processor element consists of a general-purpose microprocessor, memory, and a special VLSI chip that performs parallel-processing-specific functions such as processor communication and synchronization. The VLSI chip has two independent 2-Mbyte/s common bus interfaces for data broadcasting and six 15-Mbit/s serial communication ports for local data communication. The chip can also process image data in real time for multiple processors. Use of the communication interfaces enables a variety of processor networks to be configured. One CAP application has been computer graphics, in which ray tracing is used to generate quality images.

2.
The current art of digital electronic implementation of neural networks is reviewed. Most of this work has taken place as digital simulations on general-purpose serial or parallel digital computers. Specialized neural network emulation systems have also been developed for more efficient learning and use. Dedicated digital VLSI integrated circuits offer the highest near-term future potential for this technology.

3.
For real-time image-processing applications, a highly parallel system is desirable. A content addressable memory (CAM), or associative processor, that can perform various types of parallel processing with words as the basic unit is a promising component for creating such a system because of its suitability for LSI implementation. Conventional CAM LSIs, however, have neither sufficiently efficient functions nor enough capacity for pixel-parallel processing. This paper describes a fully parallel 1-Mb CAM LSI. It has advanced functions for processing various pixel-parallel algorithms, such as mathematical morphology and discrete-time cellular neural networks. Moreover, since it has 16-K words, or processing elements (PEs), which can process 128×128 pixels in parallel, a board-sized pixel-parallel image-processing system can be implemented using several chips. A chip capable of operating at 56 MHz and 2.5 V was fabricated using 0.25-μm full-custom CMOS technology with five aluminum layers. A total of 15.5 million transistors have been integrated into a 16.1×17.0 mm chip. Typical power dissipation is 0.25 W. Processing performance of various update and data transfer operations is 3-640 GOPS. This CAM LSI will make a significant contribution to the development of compact, high-performance image-processing systems.

4.
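Design of a radar pulse-compression parallel processor based on the ADSP-21160   Cited by: 9 (self-citations: 0, citations by others: 9)
贺知明, 黄巍, 张剑, 向敬成. 《信号处理》 (Signal Processing), 2002, 18(5): 473-476
This paper builds a parallel processor platform around multiple general-purpose ADSP-21160 DSP chips and efficiently implements frequency-domain digital pulse compression by running FFT and IFFT operations in parallel across the chips. Based on a study of parallel algorithms, a radar-signal digital pulse-compression system with high parallel efficiency is designed and optimized, and the corresponding experimental results are given.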
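
As a rough illustration of the frequency-domain pulse compression this abstract refers to, the sketch below multiplies the FFT of the received echo by the conjugate FFT of the reference pulse and transforms back. It is a single-threaded, pure-Python sketch under stated assumptions, not the authors' multi-DSP implementation: the function names (`fft`, `ifft`, `pulse_compress`, `chirp`), the linear-FM test pulse, and its parameters are all hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def ifft(spectrum):
    """Inverse FFT with the 1/N normalization applied."""
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]


def pulse_compress(echo, reference):
    """Matched filter in the frequency domain:
    IFFT(FFT(echo) * conj(FFT(reference))), zero-padded to avoid wrap-around."""
    n = 1
    while n < len(echo) + len(reference) - 1:
        n *= 2
    e = fft(list(echo) + [0j] * (n - len(echo)))
    r = fft(list(reference) + [0j] * (n - len(reference)))
    return ifft([ev * rv.conjugate() for ev, rv in zip(e, r)])


def chirp(n_samples, bandwidth_fraction=0.5):
    """Unit-amplitude linear-FM pulse (assumed test waveform)."""
    k = bandwidth_fraction / n_samples
    return [cmath.exp(1j * cmath.pi * k * (t - n_samples / 2) ** 2)
            for t in range(n_samples)]
```

Feeding `pulse_compress` an echo that is the reference pulse delayed by some number of samples produces an output whose magnitude peaks at that delay. In a multi-chip system the independent FFTs (or blocks of range cells) are the natural units to distribute, which this sketch does not attempt.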

5.
A Kalman filter for tracking moving objects has been implemented on a TMS32010 digital signal processor. Tracking accuracy and quantization effects of the implementation have been measured by comparing the filter to one implemented on a general-purpose computer with a 32-bit word length. The filter design has been optimized to minimize program memory requirements and execution time. Although the filter has been implemented on a specific signal processing chip, the design is general enough to be applicable to any other digital signal processor. The filter can be used for tracking objects in industrial or other applications where range and bearing measurements are available. For motion on a plane, the filter can be used to track objects where the maximum system bandwidth is 1680 Hz; for three-dimensional motion the system bandwidth is 1120 Hz. Using the approach presented, higher system bandwidth can be accommodated through higher-speed digital signal processors.

6.
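A discussion of the basic mathematical model of the neuron in general-purpose neural network hardware   Cited by: 26 (self-citations: 8, citations by others: 26)
After introducing the general computation formula used in the authors' general-purpose neural network hardware, the paper proposes a new general mathematical model that can simultaneously emulate many kinds of neurons, including those of RBF networks and conventional BP networks. The model has been incorporated into the architecture design of the CASSANDRA-Ⅱ neurocomputer and implemented in hardware. The flexibility of the neural networks it can emulate is also discussed.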

7.
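Artificial neural networks are an important method in modern information processing. Compared with software implementations, hardware implementations can fully exploit the parallel-processing nature of neural networks. Analog-circuit implementations of neural networks are simple in form, low in power, fast, and small in chip area, which raises the number of neurons that can be integrated on a neural network chip; neuron circuits are therefore well suited to analog implementation. This paper surveys current results and new techniques in the analog VLSI implementation of neural network units, along with the authors' own work. For the most widely used linear-synapse and quadratic-synapse neurons, it gives a detailed account in three parts: weight storage cells, synapse circuits, and threshold function circuits. The advantages and drawbacks of the various implementations are compared, and the factors to consider in neural network circuit implementation are pointed out. Finally, the paper looks ahead to the development of self-learning neural networks implemented with integrated-circuit technology.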

8.
Due to the variety of architectures that need to be considered when attempting solutions to various problems using neural networks, the implementation of a neural network with programmable topology and programmable weights has been undertaken. A new circuit block, the distributed neuron-synapse, has been used to implement a 1024-synapse reconfigurable network on a VLSI chip. In order to evaluate the performance of the VLSI chip, a complete test setup has been built, consisting of hardware for configuring the chip, programming the synaptic weights, presenting analog input vectors to the chip, and recording the outputs of the chip. Following the performance verification of each circuit block on the chip, various sample problems were solved. In each of the problems the synaptic weights were determined by training the neural network using a gradient-based learning algorithm which is incorporated in the experimental test setup. The results of this work indicate that reconfigurable neural networks built using distributed neuron-synapses can be used to solve various problems efficiently.

9.
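The AD73322L is a low-cost, low-power, general-purpose dual-channel CMOS analog front end from Analog Devices that can perform A/D and D/A conversion simultaneously. This paper introduces the chip's performance characteristics, functions, and basic operating principles, and the design of its interface to the DSP5416.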

10.
The architecture, the design, and the analog very large scale integration (VLSI) implementation of a feature extractor chip for optical character recognition (OCR) systems are described. The chip extracts a set of 112 feature values, coded by current signals, from a 32×24 digital pixel matrix representing the input character. These features are applied to a classifier (for example, a neural classifier) that performs the recognition task. The measurements performed on the chip confirm its functionality. The chip can be used with segmented and nonsegmented strings of characters. A throughput of about 140 kChar/s is achieved for the segmented case, while a throughput of about 450 kChar/s is achieved for the nonsegmented case. The OCR architecture has been functionally validated. A set of handwritten numerical characters has been processed by the chip, and the measured output features (after a normalization operation) have been used as input to a neural network classifier, implemented by a software simulator, which performs the recognition task. The resulting classification error rate (4.3%) compares well with those obtained from a high-level model of the chip, and the results validate the entire architecture.

11.
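宋荣方. 《微电子学》 (Microelectronics), 1996, 26(6): 378-381
Monolithic integration of artificial neural networks is a problem that has attracted wide attention. This paper studies it using switched-capacitor techniques. Two switched-capacitor implementations of the continuous-time Hopfield neuron are discussed, and a new method of realizing continuous Hopfield neural networks with switched-capacitor networks is proposed. The method lends itself to monolithic integration, and the resulting network structure has the advantage of being insensitive to parasitic parameters.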

12.
A general-purpose gain/loss circuit is described. Its function is controlled by an 8-b digital word. It provides up to 256 0.1-dB steps in gain or loss. The circuit has two modes of incremental gain/loss steps (two sets of gain/loss values for bits in the control word). A ninth bit selects between gain and loss. The IC has three digital interfaces: serial, parallel clocked input, and parallel fixed input. The chip is fabricated in a 3-μm CMOS n-well process. It requires a ±5-V power supply, and for maximum gain of 25.5 dB, the 0.1-dB large-signal bandwidth is 260 kHz.
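
The control-word arithmetic in this abstract can be made concrete with a small sketch: an 8-bit magnitude in 0.1-dB steps gives 0 to 25.5 dB, and a ninth bit chooses gain or loss. The bit layout used here (bit 8 set means loss) and the function names are assumptions for illustration only; the abstract does not give the encoding, and only one of the circuit's two step modes is modeled.

```python
def decode_gain_db(control_word, step_db=0.1):
    """Decode a 9-bit control word into a signed gain in dB.

    Assumed layout (hypothetical): bits 0-7 hold the step count,
    bit 8 set means loss (negative dB), clear means gain.
    """
    if not 0 <= control_word < 512:
        raise ValueError("control word must fit in 9 bits")
    steps = control_word & 0xFF          # 0..255 steps of step_db each
    is_loss = bool(control_word & 0x100)
    magnitude_db = steps * step_db
    return -magnitude_db if is_loss else magnitude_db


def db_to_voltage_ratio(db):
    """Convert a gain in dB to a voltage ratio (20*log10 convention)."""
    return 10 ** (db / 20)
```

With this layout, the all-ones magnitude `0xFF` decodes to the 25.5-dB maximum gain quoted in the abstract, and the same magnitude with bit 8 set decodes to a 25.5-dB loss.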

13.
The growing interest in pulse-mode processing by neural networks is encouraging the development of hardware implementations of massively parallel networks of integrate-and-fire neurons distributed over multiple chips. Address-event representation (AER) has long been considered a convenient transmission protocol for spike-based neuromorphic devices. One missing, long-needed feature of AER-based systems is the ability to acquire data from complex neuromorphic systems and to stimulate them using suitable data. We have implemented a general-purpose solution in the form of a peripheral component interconnect (PCI) board (the PCI-AER board) supported by software. We describe the main characteristics of the PCI-AER board and of the related supporting software. To show the functionality of the PCI-AER infrastructure, we demonstrate a reconfigurable multichip neuromorphic system for feature selectivity which models orientation tuning properties of cortical neurons.

14.
A chip implementing the coordinate rotation digital computer (CORDIC) algorithm is described. It contains a 10-MHz 16-b fixed-point CORDIC arithmetic unit, 2-kb RAM, a controller, and input/output (I/O) registers. A modified data-path architecture allows cross-wire free data flow. The chip design involved development of optimized carry-select adders and a modified programmable-logic-array (PLA) cell layout, which allows speed increase in single-layer metal technology. The authors designed, fabricated, and tested a general-purpose fully parallel programmable CORDIC chip in CMOS technology and developed optimal iteration sequences.
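
For readers unfamiliar with CORDIC, the following is a minimal sketch of the textbook rotation-mode algorithm in fixed-point integer arithmetic, using only shifts and adds per iteration. It is not the chip's data path or its optimized iteration sequence: the Q-format (14 fractional bits), the 16-iteration count, and all names are assumptions chosen for illustration.

```python
import math

ITERATIONS = 16          # assumed iteration count
FRAC_BITS = 14           # assumed fixed-point format: 14 fractional bits
ONE = 1 << FRAC_BITS

# Arctangent lookup table, atan(2^-i), in fixed point.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# The micro-rotations scale the vector by a constant gain; pre-divide by it.
_gain = 1.0
for _i in range(ITERATIONS):
    _gain *= math.sqrt(1.0 + 2.0 ** (-2 * _i))
INV_GAIN = round(ONE / _gain)


def cordic_cos_sin(angle_rad):
    """Return (cos, sin) of angle_rad for |angle_rad| <= pi/2.

    Rotation mode: drive the residual angle z toward zero with
    shift-and-add micro-rotations.
    """
    x, y = INV_GAIN, 0
    z = round(angle_rad * ONE)
    for i in range(ITERATIONS):
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return x / ONE, y / ONE
```

Calling `cordic_cos_sin(math.pi / 6)` returns values close to `math.cos` and `math.sin` of the same angle, with an error on the order of the fixed-point resolution. A hardware unit would also handle angles outside the convergence range by quadrant folding, which this sketch omits.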

15.
This paper describes a 51.2-GOPS video recognition processor, which achieves real-time multiple processing of in-vehicle video recognition applications in software, while at the same time satisfying the power efficiency requirements of an in-vehicle device. The chip integrates 128 RISC microprocessors, each operating at 100 MHz. Hardware configurations of the chip are enhanced to support efficient execution of extended C language code for algorithms based on four basic parallel methods. The results of a benchmark test using a weather-robust lane mark and vehicle detection application show that the processor achieves four times better performance while consuming less than 1/20 of the peak power of a 2.4-GHz general-purpose processor.

16.
The design of a six-neuron chip using 1.3-μm CMOS gate-array technology is described. With these neuro-chips, the authors developed a general-purpose neural-network system that can simulate a wide range of neural networks, including Hopfield-type networks, back propagation networks, and many others. The system consists of several neuro-boards and a host computer. Each neuro-board contains 72 neuro-chips, which constitute a network of 54 neurons with 2916 excitatory and 2916 inhibitory synapses. The computer can read and write various registers in the neuro-board, learning algorithms can be executed, and synaptic strength can be easily updated. A hierarchical bus structure of time-sharing buses connects each of the neurons on the wafer. As fabricated, the neuro-WSI uses 0.8-μm, three-level-metal CMOS gate-array technology.

17.
The authors present the architecture of a general-purpose broadband-ISDN (B-ISDN) switch chip and, in particular, its novel feature: the weighted round-robin cell (packet) multiplexing algorithm and its implementation in hardware. The flow control and buffer management strategies that allow the chip to operate at top performance under congestion are given, and the reason why this multiplexing scheme should be used under those circumstances is explained. The chip architecture and how the key choices were made are discussed. The statistical performance of the switch is analyzed. The critical parts of the chip have been laid out and simulated, thus proving the feasibility of the architecture. Chip sizes of four to ten links with link throughput of 0.5 to 1 Gb/s and with about 1000 virtual circuits per switch have been realized. The results of simulations of the chip are presented.
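
Weighted round-robin multiplexing, the feature highlighted in this abstract, can be sketched in a few lines: each virtual circuit has its own queue and an integer weight, and in every round the multiplexer forwards up to that many cells from each non-empty queue. This is the generic textbook scheme written under assumed names (`weighted_round_robin`, `queues`, `weights`); the chip's actual flow control and buffer management are not described in the abstract and are not modeled here.

```python
from collections import deque


def weighted_round_robin(queues, weights):
    """Yield (vc, cell) pairs in the order cells leave the multiplexer.

    queues:  dict mapping virtual-circuit id -> iterable of waiting cells
    weights: dict mapping virtual-circuit id -> cells served per round (>= 1)
    """
    pending = {vc: deque(cells) for vc, cells in queues.items()}
    while any(pending.values()):
        for vc, q in pending.items():
            # Serve at most weights[vc] cells from this circuit per round.
            for _ in range(weights[vc]):
                if not q:
                    break
                yield vc, q.popleft()
```

With two circuits weighted 2 and 1, the heavier circuit gets two cell slots for every one given to the lighter circuit while both have traffic, and an idle circuit's slots are skipped rather than wasted.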

18.
Recent advances in general-purpose graphics processing units (GPGPUs) have resulted in massively parallel hardware that is widely available to achieve high performance in desktop, notebook, and even mobile computer systems. While multicore technology has become the norm in modern computers, programming such systems requires an understanding of the underlying hardware architecture and hence poses a great challenge for average programmers, who might be professionals in specific domains but not experts in parallel programming. This paper presents a GUI tool called GPUBlocks that can facilitate parallel programming on multicore computer systems. GPUBlocks is developed on the OpenBlocks framework, an extendable tool for graphical programming, to construct a GUI-based programming environment for the CUDA and OpenCL parallel computing platforms. Programmers simply need to drag and drop blocks, fill in the fields of the blocks, and connect them according to the array or matrix computations specified by their algorithms. GPUBlocks can then translate the block-based code to CUDA or OpenCL programs. Furthermore, a couple of optimization constructs are offered for rapid program optimization. Experimental results have shown that the generated CUDA and OpenCL programs can achieve reasonable speedups on GPUs. Consequently, GPUBlocks can be used as a tool for fast prototyping of GPU applications or as a platform for teaching parallel programming.

19.
A flexible and reconfigurable signal processing ASIC architecture has been developed, simulated, and synthesized. The proposed architecture compares favorably to classical DSP and FPGA solutions. It differs from general-purpose reconfigurable computing (RC) platforms by emphasizing high-speed application-specific computations over general-purpose flexibility. The proposed architecture can be used to realize any one of several functional blocks needed for the physical layer implementation of data communication systems operating at symbol rates in excess of 125 Msymbols/s. Multiple instances of a chip based on this architecture, each operating in a different mode, can be used to realize the entire physical layer of high-speed data communication systems. The architecture features the following modes (functions): real and complex FIR/IIR filtering, least mean square (LMS)-based adaptive filtering, discrete Fourier transforms (DFT), and direct digital frequency synthesis (DDFS) at up to 125 Msamples/s. All of the modes are mapped onto a common, regular data path with minimal configuration logic and routing. Multiple chips operating in the same mode can be cascaded to allow for larger blocks.

20.
In this article, recent research activities on the development of electronic neural networks in Japan are reviewed. Most of the largest Japanese electronics companies have developed VLSI neural chips using analog, digital, or optoelectronic circuits, and have run various neural networks on them. Recently, the digital approach has become active in Japan. Several fully digital VLSI chips for on-chip BP learning have been developed, and a learning speed of 2.3 GCUPS (Giga Connection Updates per Second) has already been attained. Although the numbers of neurons and synapses that fit on a single digital chip are small, a large neural network can be built by cascading chips. By cascading 72 chips, a fully interconnected PDM (Pulse Density Modulating) digital neural network system has been developed. The behavior of the system follows simultaneous nonlinear differential equations, and the processing speed amounts to 12 GCPS (Giga Connections per Second). Intensive research on analog and optoelectronic approaches has also been carried out in Japan. An analog VLSI neural chip attains 28 GCUPS on-chip learning speed and 1 TCPS (Tera Connections per Second) processing speed for a Boltzmann machine with 1-bit digital output. For the optoelectronic approach, although the network size is small, a BP learning speed of 640 MCUPS has been attained.
