
1.

The use of vector instructions of a processor architecture for emulating the vector instructions of another processor architecture

K. A. Batuzov 《Programming and Computer Software》2017,43(6):366-372

The complexity of software is ever increasing, and its execution requires more and more computational resources. One way to satisfy these requirements is to use vector instructions, which operate on fixed-length vectors of data of the same type. A method is proposed for representing vector instructions of one processor architecture in terms of the vector instructions of another architecture during dynamic binary translation. An implementation of this method that covers the translation of vector addition and memory access increased the performance of the QEMU emulator by a factor of more than three on an artificial example and by 12% on a real-life application.
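To make the idea in this abstract concrete, here is a minimal sketch of expressing one guest vector addition through narrower host vector additions. It is an illustration only, not the paper's method or QEMU's internal API: the function names `host_vadd32` and `translate_guest_vadd`, the 32-bit lane width, and the two-lane host register are all assumptions.

```python
MASK32 = 0xFFFFFFFF  # assumed lane width: 32-bit unsigned lanes


def host_vadd32(a, b):
    """Hypothetical host instruction: lane-wise wrapping add of 32-bit lanes."""
    assert len(a) == len(b)
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def translate_guest_vadd(guest_a, guest_b, host_lanes=2):
    """Emulate one guest vector add with several narrower host vector adds.

    The guest register is split into chunks of `host_lanes` lanes, and each
    chunk is handled by a single (hypothetical) host vector instruction.
    """
    assert len(guest_a) == len(guest_b)
    result = []
    for start in range(0, len(guest_a), host_lanes):
        chunk_a = guest_a[start:start + host_lanes]
        chunk_b = guest_b[start:start + host_lanes]
        result.extend(host_vadd32(chunk_a, chunk_b))
    return result


# A four-lane guest add carried out as two two-lane host adds.
guest_sum = translate_guest_vadd([1, 2, 3, 0xFFFFFFFF], [10, 20, 30, 1])
```

The point of the sketch is that each host operation still processes several lanes at once, in contrast to emulating the guest instruction one scalar lane at a time.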

2.

Design of a multitask neurovision processor

George K. Knopf Madan M. Gupta 《Journal of Mathematical Imaging and Vision》1992,2(2-3):233-250

The architecture of a biologically motivated visual-information processor that can perform a variety of tasks associated with the early stages of machine vision is described. The computational operations performed by the processor emulate the spatiotemporal information-processing capabilities of certain neural-activity fields found along the human visual pathway. The state-space model of the neurovision processor is a two-dimensional neural network of densely interconnected nonlinear processing elements (PEs). An individual PE represents the dynamic activity exhibited by a spatially localized population of excitatory and inhibitory nerve cells. Each PE may receive inputs from an external signal space as well as from the neighboring PEs within the network. The information embedded within the signal space is extracted by the feedforward subnet. The feedback subnet of the neurovision processor generates useful steady-state and temporal-response characteristics that can be used for spatiotemporal filtering, short-term visual memory, spatiotemporal stabilization, competitive feedback interaction, and content-addressable memory. To illustrate the versatility of the multitask processor design for machine-vision applications, a computer simulation of a simplified vision system for filtering, storing, and classifying noisy gray-level images is presented.

3.

Matrix multiplication by diagonals on a vector/parallel processor

Niel K. Madsen Garry H. Rodrigue Jack I. Karush 《Information Processing Letters》1976,5(2):41-45


4.

Design of a Lukasiewicz rule-driven fuzzy processor

E. Frías-Martínez 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2002,7(1):65-71


5.

Parallel-loop-execution technology for implementation on vector processor

O. V. Lukinova 《Cybernetics and Systems Analysis》1993,29(2):247-249


6.

Block Jacobi preconditioning of the conjugate gradient method on a vector processor

《国际计算机数学杂志》2012,89(1-4):71-89

The preconditioned conjugate gradient method is well established for solving linear systems of equations that arise from the discretization of partial differential equations. Point and block Jacobi preconditioning are both common preconditioning techniques. Although it is reasonable to expect that block Jacobi preconditioning is more effective, block preconditioning requires the solution of triangular systems of equations that are difficult to vectorize. We present an implementation of block Jacobi for vector computers, especially for the Cray Y-MP/264, and discuss several techniques to improve vectorization. We present these in a progression to show the effect on performance. For the model problem, resulting from a self-adjoint operator, the final implementation of one block Jacobi step uses almost the same amount of time as one point Jacobi step on the Cray Y-MP/264 despite the solution of triangular systems.
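For readers unfamiliar with the technique, the following is a minimal, unvectorized sketch of conjugate gradients with a block Jacobi preconditioner. It is not the Cray implementation discussed in the abstract: the dense list-of-lists storage, the helper names, and the use of Gaussian elimination on each diagonal block (rather than factorized blocks with triangular solves) are simplifying assumptions.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_dense(B, r):
    """Solve the small system B y = r by Gaussian elimination with pivoting."""
    n = len(B)
    M = [row[:] + [ri] for row, ri in zip(B, r)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * y[j] for j in range(i + 1, n))
        y[i] = s / M[i][i]
    return y


def block_jacobi_apply(A, r, bs):
    """Return z = M^{-1} r, where M is the block diagonal of A (block size bs)."""
    z = []
    for s in range(0, len(r), bs):
        e = min(s + bs, len(r))
        block = [row[s:e] for row in A[s:e]]
        z.extend(solve_dense(block, r[s:e]))
    return z


def pcg(A, b, bs, tol=1e-10, max_iter=200):
    """Preconditioned CG for a symmetric positive definite matrix A."""
    x = [0.0] * len(b)
    r = b[:]
    z = block_jacobi_apply(A, r, bs)
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = block_jacobi_apply(A, r, bs)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Model problem (an assumption): 1D Laplacian tridiag(-1, 2, -1), exact solution all ones.
n = 6
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = matvec(A, [1.0] * n)
x = pcg(A, b, bs=2)
```

Setting `bs=1` reduces the preconditioner to point Jacobi, which makes it easy to compare the two variants the abstract contrasts.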

7.

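Design of a pipelined DMA for a floating-point DSP

宣志斌 夏杰 张树丹 于宗光 薛忠杰 《微计算机信息》2008,24(32)

This paper proposes an improved DMA architecture for a 32-bit floating-point DSP processor. With a two-stage data pipeline, the data transfer rate between peripherals and internal memory is double that of the original design. The design was coded and simulated in Verilog HDL; the simulation results show an operating frequency above 250 MHz, which meets the design requirements.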

8.

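Design of a thread manager for a polymorphic parallel processor Total citations: 2, self-citations: 2, citations by others: 2

钱博文 李涛 韩俊刚 杨婷 刘玉荣 《电子技术应用》2014,(2):30-32

A hardware thread manager is proposed for a polymorphic parallel processor. It supports two operating modes, management operations for eight threads in MIMD mode and unified management by the SC controller in SIMD mode, and thereby realizes thread-level parallel computing. It can monitor the working status of each thread as well as the state of the neighbor-communication registers and the router. It can stop, switch, and start threads during communication and records the working state of each thread, which avoids waiting caused by data blocking and maximizes the execution efficiency of a single processor.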

9.

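Design of a processor power supply

《微型机与应用》2017,(10)

The design of a DC/DC switching regulated power supply system is presented. The supply uses a full-bridge topology with current-doubler synchronous rectification. An on-board power supply product for industrial processors was designed: the power devices were selected, the main power losses affecting efficiency were analyzed, and the PCB design was completed. The final analysis shows that the electrical parameters of the product meet the customer's expectations, and the product has been successfully applied in power supply equipment for industrial processors.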

10.

Vorton dynamics: a case study of developing a fluid dynamics model for a vector processor

M.J. Kascic 《Parallel Computing》1984,1(1):35-44

The raw performance of vector processors such as the CDC CYBER-205 has been well documented. The ability to apply this raw power to ever more complex algebraic algorithms has been reported in [9]. The final step in making computers of this class truly the revolutionary tools they are claimed to be is to develop whole applications that perform at a significant fraction of the raw power. This involves two distinct subclasses of problems. On the one hand, there are those pre-existing applications that must be mapped onto vector processors in such a way that not only is performance maintained, but also a (sometimes vague) set of computational boundary conditions of the user community is satisfied. On the other hand, there are those models which are developed ab initio with machines such as the CYBER-205 in mind. The development of solutions to problems in the former class involves psychology and politics as well as mathematics and computer science. We limit ourselves here to reporting on an example of the latter class, viz. a model to study a particular fluid-dynamic phenomenon, that was specifically designed with the CYBER-205 in mind.

11.

Design and verification of a stack processor virtual component

Stadler M. Thalmann M. Rower T. Kaeslin H. Felber N. Fichtner W. 《Micro, IEEE》2001,21(2):69-80

Hardware and software codesign and flexibility requirements often necessitate embedded application-specific instruction-set processors in system-on-chip designs. Spaceman, a reusable stack-processor virtual component, offers a customer-configurable instruction set; parameterizable bus widths, stack depths, and stack access ranges; and selectable bus interfaces.

12.

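Design of a thread scheduler for a multi-core simultaneous multithreading processor

《电子技术应用》2016,(1):19-21

The multi-core simultaneous multithreading processor (SMT_PAAG) is a multi-core processor for graphics, image, and digital signal processing. A hardware thread scheduler is proposed for this processor. The scheduler uses simultaneous multithreading, can execute up to four threads at once, and supports fast context switching among eight threads when threads block. This avoids waiting caused by blocking and effectively improves the processor's efficiency and resource utilization. Performance was evaluated by running graphics processing algorithms on the processor. The results show that, by exploiting instruction-level and thread-level parallelism, the SMT-PAAG processor improves performance by 69.25%.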

13.

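Design of data communication and a router for a polymorphic parallel processor Total citations: 2, self-citations: 1, citations by others: 2

海虎 李涛 韩俊刚 杨婷 《电子技术应用》2014,(8)

With the development of multi-core technology, inter-core communication faces new challenges, and its performance determines the performance of the whole multi-core processor. Based on an analysis of the data communication requirements of multi-core processors, a data communication structure suited to a polymorphic parallel processor is proposed. The structure provides two communication modes: neighbor communication between cores through adjacent shared registers, and remote communication through a hardware-accelerated router. The router uses input buffering and computes routes with the classic deterministic XY routing algorithm; multicast and fault tolerance are added, and a dedicated arbitration mechanism reduces design complexity. These improvements reduce inter-core communication latency and power consumption and improve the performance of the polymorphic parallel processor.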
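The XY routing algorithm named in this abstract is simple enough to sketch: on a 2D mesh, a packet first travels along the X dimension until its column matches the destination, then along the Y dimension. The sketch below is a generic illustration; the function name `xy_route` and the `(x, y)` coordinate convention are assumptions, and it models neither the paper's input buffering, multicast, nor fault tolerance.

```python
def xy_route(src, dst):
    """Return the mesh nodes visited from src to dst, X dimension first."""
    x, y = src
    path = [(x, y)]
    # Phase 1: move along X until the destination column is reached.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Phase 2: move along Y until the destination row is reached.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 0), (2, 1))
```

Because the path depends only on source and destination, the routing is deterministic, and forbidding Y-to-X turns is what makes it deadlock-free on a mesh.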

14.

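Design of a novel FPGA-based floating-point FFT processor

范展 梁国龙 刘洋 《电子技术应用》2008,34(5):23-26

To address the complex structure and poor parallel scalability of existing FFT algorithms, an improved FFT algorithm is proposed. Based on it, a floating-point FFT processor was designed and verified by simulation. The results show that the new algorithm greatly simplifies the system structure, reduces hardware overhead, is easy to parallelize, and significantly improves computational efficiency: an N-point FFT takes only N/2 clock cycles, which fully meets the requirements of real-time signal processing.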

15.

Computations with symmetric, positive definite and band matrices on a parallel vector processor

Zahari Zlatev 《Parallel Computing》1988,8(1-3):301-312

Computations involving symmetric, positive definite and band matrices are kernel operations in the numerical treatment of many models arising in science and engineering. It is desirable to achieve a high level of performance when such operations are to be carried out on a vector processor. If the operations are performed by rows or columns (as in the EXTENDED BLAS subroutines), then the loops are vectorized but the speed of computations, measured in Mflops, is not very high, because the arrays involved are normally short. Therefore the computations should be organized by diagonals. Furthermore, some special devices are to be applied in order to unroll the loops. Finally, one should be careful with the storage scheme. It is demonstrated that if (i) the computations are organized by diagonals, (ii) the main loops are unrolled and (iii) the storage scheme is such that the work with some zero-elements is avoided, then the speed of computations is nearly the same as that obtained in the computations with dense matrices. If a particular vector machine is in use (in our case a CRAY X-MP computer), then the speed can be increased further by (iv) coding some basic operations in machine language and (v) using the different processors of the vector computer in parallel. The efficiency of the exploitation of the special features of the particular computer that is to be used is also illustrated by numerical examples.

Kernel subroutines performing matrix-vector multiplications are described. Representative tests are used to demonstrate the efficiency of these kernels.
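The idea of organizing the computation by diagonals can be sketched as follows for a symmetric band matrix-vector product. This is an illustration, not the paper's kernels: the storage layout (`diags[k][i]` holds the entry in row `i`, column `i + k`, for the main diagonal and the superdiagonals only) and the function name are assumptions, and no loop unrolling is shown.

```python
def sym_band_matvec(diags, x):
    """Compute y = A x for a symmetric band matrix stored by diagonals.

    diags[0] is the main diagonal (length n); diags[k] is the k-th
    superdiagonal (length n - k). The subdiagonals are implied by symmetry.
    """
    n = len(x)
    # Main diagonal: one long element-wise product.
    y = [d * xi for d, xi in zip(diags[0], x)]
    for k in range(1, len(diags)):
        d = diags[k]
        # Each pass runs over a whole diagonal, so the inner loop has
        # length n - k regardless of how narrow the band is.
        for i in range(n - k):
            y[i] += d[i] * x[i + k]      # contribution of A[i][i+k]
            y[i + k] += d[i] * x[i]      # contribution of A[i+k][i] by symmetry
    return y


# Tridiagonal example: tridiag(-1, 2, -1) of order 4.
y = sym_band_matvec([[2, 2, 2, 2], [-1, -1, -1]], [1, 2, 3, 4])
```

The inner loops here have length close to the matrix order, whereas a row-oriented loop over a narrow band would only be as long as the bandwidth, which is the short-vector problem the abstract describes.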

16.

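Design of a digital signal processor for a pulse Doppler radar Total citations: 1, self-citations: 0, citations by others: 1

王旭 何佩琨 毛二可 《电子技术应用》2008,34(2):39-41

An all-digital signal processor was designed to meet the signal processing requirements of a particular pulse-pair Doppler radar. The processor uses an "ADC+FPGA+DSP+memory" structure and offers small size, light weight, low power consumption, and high reliability. The methods and implementation of data acquisition, pulse accumulation, and target detection are discussed in detail.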

17.

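Hardware implementation of process management in a multithreaded light-core machine Total citations: 2, self-citations: 0, citations by others: 2

王维 李涛 韩俊刚 《电子技术应用》2013,39(3):40-43

A hardware design of a process manager for a multithreaded light-core processor is proposed. For better performance, the process manager has a built-in event manager that monitors the trigger conditions of waiting processes, and process scheduling is also implemented in hardware. The task manager of the parallel light-core processor consists of an ALU, a memory system, and a built-in router, and is used to handle processes.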

18.

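Design and implementation of the communication mechanism of a clock-sharing multithreaded processor

《电子技术应用》2016,(3):42-46

Multi-core multithreaded processors [1] are one direction in the development of parallel technology. Based on a multi-core multithreaded processor, a clock-sharing multithreaded processor is proposed. The processor has two communication mechanisms, neighbor communication and inter-thread communication. Neighbor communication passes information through FIFOs shared between neighbors, and inter-thread communication passes information through memory shared between threads, which improves the processor's resource utilization and parallel execution capability.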

19.

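Design and implementation of a SIMD controller for a clock-sharing multithreaded processor

《电子技术应用》2016,(11):29-32

To address instruction and data loading and data collection in graphics and image processors, a SIMD controller for a clock-sharing multithreaded processor was designed and implemented. It handles the issuing of SIMD instructions, the loading of data, and the collection of data. Aiming at efficient data-level parallel computing, the controller implements a forward processing unit, a row controller, and a column controller using finite state machines. Experimental results show that the dedicated hardware circuit effectively improves the ability of the graphics and image processor to process parallel data.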

20.

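Design and application of a removable memory device based on a microcontroller system

梁西银 马小倩 兰建平 《微型机与应用》2012,31(1):30-32,36

A removable data-acquisition memory device based on a microcontroller system was designed. Its capacity can reach several megabits and its transfer rate can reach 1 Mb/s. It offers a new approach to the acquisition and storage of non-real-time data in intelligent instruments and to data exchange with a computer.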