The design and construction of an array processor that computes autocorrelation functions are presented. The architecture offers speed while avoiding complexity. It uses random access memory with a shifting-across-zero technique and a high-speed address generator. Performance is measured for arrays of different sizes and compared with the time required to process the same arrays in software.
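For reference, the software baseline that such an array processor is compared against can be sketched as a direct evaluation of the autocorrelation sum. This is a minimal illustration only: the function name `autocorrelation`, the unnormalized definition r[k] = sum of x[n] * x[n + k], and the `max_lag` parameter are assumptions, not details taken from the abstract.

```python
def autocorrelation(x, max_lag):
    """Return the unnormalized autocorrelation r[k] for k = 0..max_lag.

    r[k] is the sum over n of x[n] * x[n + k], taken over the samples
    for which both indices are valid (hypothetical baseline definition).
    """
    n = len(x)
    if max_lag < 0 or max_lag >= n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(x)")
    result = []
    for k in range(max_lag + 1):
        # One multiply-accumulate pass over the overlapping samples for lag k.
        result.append(sum(x[i] * x[i + k] for i in range(n - k)))
    return result


if __name__ == "__main__":
    print(autocorrelation([1, 2, 3], 2))
```

The nested multiply-accumulate loops are the part a dedicated address generator and shifting memory would be expected to speed up; the sketch makes no claim about how the hardware in the abstract organizes them.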

The multiprocessor system ATOMS has been used to solve 3D Navier-Stokes problems. ATOMS was originally designed at AT&T Bell Laboratories for Molecular Dynamics calculations. However, certain hardware features were included in the design to permit data transfer between processor boards, so that the system can operate as a closely coupled linear array of processors. In this mode we refer to the multiprocessor system as DNSP (AT&T's/Delft Navier-Stokes Processor).

An algorithm for calculating the buoyancy-driven laminar/turbulent flow in a 3D cavity has been implemented on the DNSP. For this algorithm an efficiency of 35% (which amounts to 14 Mflops) is obtained. This high efficiency can be reached thanks to the strong coupling between the algorithm and the architecture of the multiprocessor system. The speed is obtained at very low cost, resulting in a cost/performance ratio for the DNSP that is at least an order of magnitude lower than that of (mini-)supercomputers.

Increasingly, 3D graphics is becoming the rule rather than the exception in applications such as games, CAD/CAM, and video production. Some LSIs provide rendering capabilities but require an additional CPU to perform essential geometry transformations. Fujitsu's chip set solves that problem by using two processors to render 300,000 polygons per second (for flat-shaded triangles with texture), a performance comparable to that of advanced game machines.

A new type of high-performance array processor system is presented in this paper. Unlike conventional host-peripheral array processor systems, this system is designed with a functionally distributed approach. The design philosophy is described first. Then the hardware organizations of two concrete systems, namely 150-AP and GF-10/12, including the communication between processors, are shown. Some attractive system performance results for user programs are also given.

Interconnection has become one of the main concerns in current and future microprocessor designs, in terms of both performance and power consumption. Three-dimensional integration technology, with its capability to shorten wire length, is a promising way to mitigate interconnection-related issues. In this paper we implement a novel high-performance processor architecture based on a 3D on-chip cache to show the potential performance and power benefits achievable through 3D integration technology. We separate the cache module from the other logic modules and stack the 3D cache on the processor, which reduces global interconnection and power consumption and improves access speed. The performance of the 3D processor and 3D cache at different nodes is simulated using 3D Cacti tools and theoretical algorithms. The results show that, compared with 2D, the power consumption of the storage system is reduced by about 50%, and the access time and cycle time of the processor increase by 18.57% and 21.41%, respectively. The critical path delay is reduced by up to 81.17%.

A processor architecture for 3D graphics
The DLX/3DCP architecture is described. It uses parallel processing on 3-D vectors to overcome the large number of floating-point operations required in 3-D graphics, which limits the performance of graphics systems. The design offers general-purpose programmability from the high-level object-oriented language C++ while delivering performance expected only from dedicated special-purpose hardware. Results showing the architecture's performance on graphics operations are presented and compared with the performance of other RISC processors.

This paper describes the architecture and performance of a real-time vision system (RVS) and its use of an integrated memory array processor (IMAP) prototype. This prototype integrates eight 8-bit processors and a 144-kbit SRAM on a single chip. The RVS was developed with 64 IMAP prototypes connected in series in a 512-processor system configuration. A host workstation can access the memory on the IMAP prototypes directly through a random access port. Images are input and output at high speed through serial access ports. The RVS performance is shown in real-time road-image processing and in a neural network simulation, as well as in low-level image processing algorithms such as filtering, histograms, discrete cosine transform (DCT), and rotation. The RVS image processing is shown to be much faster than the video rate.

A new architecture is presented to support the general class of real-time large-vocabulary speaker-independent continuous speech recognizers incorporating language models. Many such recognizers require multiple high-performance central processing units (CPUs) as well as high interprocessor communication bandwidth. This array processor provides a peak CPU performance of 2.56 giga-floating point operations per second (GFLOPS) as well as a high-speed communication network. To utilize these resources efficiently, algorithms were devised for partitioning speech models for mapping onto the array processor. A novel scheme is also presented for a functional partitioning of the speech recognizer computations. The recognizer is functionally partitioned into six stages, namely, the linear predictive coding (LPC) based feature extractor, mixture probability computer, (phone) state probability computer, word probability computer, phrase probability computer, and traceback computer. Each of these stages is further subdivided as many times as necessary to fit the individual processing elements (PEs). The functional stages are pipelined and synchronized with the frame rate of the incoming speech signal. This partitioning also allows a multistage stack decoder to be implemented to reduce computation.

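To recover 3D models from automobile exterior design patent images, a 3D reconstruction method is proposed that exploits the characteristics of such images and fuses shape from shading (SFS) with the contour method. The contours of the right view and the front view are used to correct the depth of the model obtained by SFS, yielding a fairly accurate 3D model for each single image. The contour method is used to recover the overall model of the object. The position coordinates of each single-image model and of the overall model are then computed and the models are aligned, so that the models obtained by the two methods merge into a single fused model. Experimental results show that, compared with other methods, the 3D model obtained by this method is considerably improved in both overall shape and detail.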

Characteristics of positron emission tomography (PET) images that limit the human ability to accurately perceive the information the images contain are discussed. These are relatively low spatial resolution, a lack of apparent anatomical information, and the expression of metabolic activity in terms of brightness levels (gray levels), which are not efficiently determined by the human visual system. These characteristics affect how clearly the 3-D structures contained in the reconstructed 3-D images can be seen. The use of pseudocolor to visualize different levels of activity expressed by brightness, and of shading to accentuate depth and shape information, is described. To further enhance the brightness contrast of a surface with its neighboring areas, stereo and motion were used as depth cues.

Yun Zhu, Jiang Lin, Wang Shuai, Huang Xingjie, Song Hui, Li Xueting. Multimedia Tools and Applications, 2018, 77(3): 3639-3657

With the rapid growth in the amount of computation and in power consumption, there is a pressing need for a highly power-efficient architecture that takes both computational efficiency and application flexibility into account. This paper proposes an array-processor architecture for multimedia applications that is programmable and self-reconfigurable and consists of 1024 thin-core processing elements (PEs). Its performance and power dissipation are demonstrated with different multimedia application algorithms, such as hash and fractional motion estimation (FME). The results show that the proposed architecture can provide high performance with less energy consumption by using parallel computation.


Three-dimensional medical images reconstructed from a series of two-dimensional images produced by computerized tomography, magnetic resonance imaging, etc., are a valuable tool for modern medicine. Usually, the interresolution between two cross sections is lower than the intraresolution within each cross section. Therefore, interpolation is required to create a 3D visualization. Many techniques, including voxel-based and patch tiling methods, apply linear interpolation between two cross sections. Although techniques using linear interpolation are computationally economical, they need a large amount of cross-sectional data and cannot enlarge the result because of aliasing. Hence, techniques that apply two-dimensional nonlinear interpolation functions among cross sections were proposed. In this paper, we introduce curvature sampling of the contour of a medical object in a CT (computerized tomography) image. The sampled contour points are the candidates for the control points of Hermite surfaces between each pair of cross sections. Then, a nearest-neighbor mapping of control points between every two cross sections is used for surface formation. The time complexity of our mapping algorithm is O(m + n), where m and n are the numbers of control points of the two cross sections. It is much faster than Kehtarnavaz and De Figueiredo's merge method, whose time complexity is O(n^3 m^2).
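The abstract does not spell out how the O(m + n) mapping works, so the following is only a hedged sketch of one way a linear-time nearest-neighbor mapping can be obtained. It assumes each control point has been reduced to a scalar position along its contour (for example a normalized arc length) and that both lists are already sorted; the function name `nearest_neighbor_map` and this parameterization are hypothetical, and the wrap-around of closed contours is ignored.

```python
def nearest_neighbor_map(a, b):
    """Map each position in sorted list a to the index of its nearest
    position in sorted list b, using a single forward sweep.

    Hypothetical simplification: control points are scalar positions
    along a contour, and wrap-around of closed contours is ignored.
    """
    if not b:
        raise ValueError("b must contain at least one control point")
    mapping = []
    j = 0
    for value in a:
        # Because both lists are sorted, the best index never moves
        # backwards, so j only advances: m + n steps in total.
        while j + 1 < len(b) and abs(b[j + 1] - value) <= abs(b[j] - value):
            j += 1
        mapping.append(j)
    return mapping


if __name__ == "__main__":
    print(nearest_neighbor_map([1, 5, 9], [0, 4, 7, 10]))
```

The pointer `j` is never reset, which is what keeps the total work proportional to m + n rather than m times n.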

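A new algorithm for extracting fractures from 3D rock images is proposed. First, surface reconstruction, Laplacian mesh smoothing, and mesh simplification are performed on each connected component of the 3D rock pore model. The triangles of the mesh are then divided into classes according to their area and the direction of their unit normal vector. A shape factor is used to decide whether the 3D spatial structure formed by each class of triangles has fracture characteristics. Morphological dilation is applied to the voxel set contained in each 3D structure that has fracture characteristics, and the result is combined by a logical AND with the voxel set of the corresponding connected component of the original 3D rock pore model; the output of the AND operation is the rock fracture. Experimental results show that the method extracts fractures well.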

Parallel 2-D convolution on a mesh connected array processor
In this correspondence, a parallel 2-D convolution scheme is presented. The processing structure is a mesh connected array processor consisting of the same number of simple processing elements as the number of pixels in the image. For most windows considered, the number of computation steps required is the same as that of the coefficients of a convolution window. The proposed scheme can be easily extended to convolution windows of arbitrary size and shape. The basic idea of the proposed scheme is to apply the 1-D systolic concept to 2-D convolution on a mesh structure. The computation is carried out along a path called a convolution path in a systolic manner. The efficiency of the scheme is analyzed for windows of various shapes. The ideal convolution path is a Hamiltonian path ending at the center of the window, the length of which is equal to the number of window coefficients. The simple architecture and control strategy make the proposed scheme suitable for VLSI implementation.
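The claim that the step count equals the number of window coefficients can be illustrated with a serial shift-and-accumulate sketch in which every pixel position plays the role of one processing element. This is an assumption-laden illustration, not the scheme from the correspondence: the function name `convolve2d_steps`, the dictionary representation of the window, and the zero padding at the borders are all hypothetical, and the systolic data movement along a convolution path is not modeled.

```python
def convolve2d_steps(image, window):
    """2-D convolution by one accumulate step per window coefficient.

    image  : list of equal-length rows of numbers.
    window : dict mapping an offset (dy, dx) to its coefficient
             (hypothetical representation, any size or shape).
    Returns (output image, number of steps). Pixels outside the image
    are treated as zero.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    steps = 0
    for (dy, dx), coeff in window.items():
        # On a mesh array all (y, x) updates of this step would happen
        # in parallel; here they are simulated with two loops.
        for y in range(height):
            for x in range(width):
                sy, sx = y - dy, x - dx
                if 0 <= sy < height and 0 <= sx < width:
                    out[y][x] += coeff * image[sy][sx]
        steps += 1
    return out, steps


if __name__ == "__main__":
    print(convolve2d_steps([[1, 2], [3, 4]], {(0, 0): 1, (0, 1): 1}))
```

Because the window is just a set of offsets, a non-rectangular window needs no special handling, which mirrors the abstract's remark about arbitrary window size and shape.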

This paper presents a parallel natural language processing system implemented on a marker-passing parallel AI computer, the Semantic Network Array Processor (SNAP). Our system uses a memory-based parsing approach in which parsing is viewed as a memory search process. Linguistic information is stored as phrasal patterns in a semantic network knowledge base distributed over the memory of the parallel computer. Parsing is performed by recognizing and linking phrasal patterns that reflect a sentence interpretation. This is achieved by propagating markers over the distributed network. We have developed a system capable of processing newswire articles from a particular domain. The paper presents the structure of the system, the memory-based parsing method used, and the performance results obtained.

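Convolutional neural networks have large numbers of parameters and a heavy computational load. When they are deployed on mobile devices, power consumption and chip area should be minimized while the required frame rate (speed) is still met. Taking into account compatibility with existing mobile networks, performance, area, and other factors, a CNN accelerator based on a 3D scalable PE array is designed. The accelerator supports 3×3 convolution, 3×3 depthwise separable convolution, 1×1 convolution, and fully connected layers, and its PE array can, according to the network of the specific application...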

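A three-dimensional biological stimulation microelectrode array based on polyimide film is designed for implantable artificial retina applications. Using non-silicon MEMS technology, a biocompatible and chemically stable stimulation electrode array with an electrode height of 80 μm is fabricated on a flexible substrate, and a PDMS sacrificial layer allows the device to be released intact from the substrate. In the experiments the device is encapsulated with polyimide and PDMS, and the electrode posts and bonding pads are gold-plated to improve the biocompatibility of the electrodes. The electrochemical performance of the microelectrodes was tested with the three-electrode method; over the frequency range from 10^-1 to 10^5 Hz, the impedance ranges from 1.5 to 0.3 kΩ. The fabricated device is small, lightweight, highly reliable, and mechanically flexible, and it meets the requirements of bioelectrical stimulation.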
