1.

朱文 付宇卓 谢憬 《微电子学与计算机》2009,26(6)

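Describes an improved reconfigurable processor, GRCC (General Reconfigurable Coprocessor). The processor attaches through the coprocessor interface of an ordinary general-purpose RISC processor and communicates via the host's coprocessor instructions, so that it can assist the main processor with large-scale, compute-intensive workloads. The paper focuses on the mapping and implementation of the DCT algorithm on GRCC. Simulation results show that GRCC achieves more than 6 times the performance of a general-purpose processor while striking a balance among implementation complexity, run-time efficiency, and generality.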

2.

DReAC: A Novel Dynamically Reconfigurable Coprocessor  Total citations: 1, self-citations: 1, citations by others: 0

宋宇鲲 高明伦 邓红辉 王锐 胡永华 《电子学报》2007,35(5):833-837

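This paper proposes DReAC, a novel dynamically reconfigurable coprocessor for data-parallel and computation-intensive tasks. DReAC can independently reconfigure its internal data path in either parallel or pipelined mode to carry out tasks assigned by the main processor. It consists of three parts: a global controller, a computing array, and an array data buffer. The paper briefly introduces the DReAC system model and uses it to simulate the implementation of several typical algorithms on DReAC. Simulation results show that, in typical multimedia and signal processing applications, DReAC runs more than 10 times faster than a general-purpose processor and, in some applications, far outperforms other reconfigurable processors.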

3.

Research on a Dynamically Reconfigurable Coprocessor for the Onboard Data Processing System of the Space Solar Telescope

蔡洪波 金声震 《电子学报》2005,33(9):1717-1719

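This paper proposes a dynamically reconfigurable coprocessor scheme designed for the onboard data processing system of the Space Solar Telescope. Using a reconfigurable array with 4-bit granularity, the scheme replaces traditional instruction-stream-based computation with computation based on data streams and configuration streams, and uses instruction pipelining to let the dynamically reconfigurable unit work cooperatively with the main processor. The paper concludes with an implementation of the scheme on a Xilinx XC2V3000 and its performance for multiplication and for a 1024-point complex fast Fourier transform.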

4.

Design of a Reconfigurable Data Bus Based on an Embedded Microprocessor SIMD Core

王光 《电子产品世界》2012,(11):32-34

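Proposes a data-parallel architecture based on a reconfigurable bus. First, to address problems in modern multimedia processing, a one-dimensional processing-element array architecture based on a reconfigurable bus is proposed. Second, the communication modules between processing elements and the way data is transferred between them are designed; together these constitute the reconfigurable data bus. Finally, verification with several commonly used image processing algorithms shows that the reconfigurable-bus-based one-dimensional SIMD architecture is logically feasible.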

5.

Hardware Implementation of an Order Morphology Image Processor

张波 张焕春 经亚枝 《电子与信息学报》2004,26(12):1856-1862

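After describing the order morphological transformation of grayscale images, this paper introduces an image processing system that implements the transformation in hardware. The system uses a DSP+FPGA framework and exploits the reconfigurability of FPGAs: one of the FPGAs serves as a coprocessor that can implement different image processing functions. Images processed by the software and hardware implementations are compared in terms of both processing quality and speed. The high-speed implementation of the algorithm on an FPGA chip makes it possible to apply mathematical morphology to real-time image processing.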

6.

A Reconfigurable Core for Full-Search Block Matching and Spatial-Domain Image Filtering

田辉 杨华中 汪蕙 《微电子学》2002,32(6):401-404

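Studies a reconfigurable micro-core architecture for imaging applications. It supports the full-search block-matching algorithm used in motion estimation as well as most spatial-domain template filtering algorithms in image processing, and it retains a degree of reconfigurability within each of these two algorithms. The structure is a parallel processing architecture with very high processing speed. The data storage structure proposed in the paper solves the problem of exchanging parallel-processed data with external memory.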

7.

Design of an Address Generator for an Embedded SIMD Coprocessor

周国昌 沈绪榜 王忠 车德亮 《微电子学与计算机》2006,23(11):4-7

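This paper presents a new address generator for an embedded SIMD coprocessor. The address generator mainly performs address calculation and field extraction from coprocessor instructions. To improve coprocessor performance, a new transfer path is designed into the address generator. This path moves data into the registers without passing through the address generator's ALU, which removes one delay cycle from the ldN instruction. With the SMIC 0.18 μm standard-cell library, the delay of the address generator meets the requirement of a coprocessor with a 10 ns clock period.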

8.

A Reconfigurable Coprocessor for Accelerating Stream Applications

曹姗 李兆麟 《微电子学》2016,46(1):86-89

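Stream applications, typified by graphics processing and digital signal processing, demand high parallelism, high performance, and high bandwidth from microprocessors. Stream processor architectures aimed at accelerating such applications have been widely studied. Most stream architectures integrate a large number of functional units and exploit multiple levels of parallelism and storage to accelerate stream applications, but at the cost of increased system power consumption and chip area. This paper analyzes and compares the mainstream stream processor architectures of recent years and proposes a reconfigurable coprocessor for accelerating stream applications. Tailored to the characteristics of stream applications, the coprocessor implements data-level and instruction-level parallelism and integrates multiple arithmetic units whose operation type and data type can be configured dynamically, which improves system flexibility and reduces chip area. For typical algorithms the processor achieves a higher speedup; after synthesis, its delay is 9.74 ns and its power consumption is 63.69 mW.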

9.

Design of a Reconfigurable Array Based on Sub-word Parallelism

陶文卿 《信息技术》2010,(4):55-58

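Designs an 8×8 reconfigurable processing array for multimedia processing and, by refining the granularity of that array, proposes an improved reconfigurable array design based on sub-word parallelism. Exploiting the bit-width characteristics of image processing algorithms, the design implements a reconfigurable array cell in which the high and low parts of a data word can be operated on simultaneously. This effectively increases data parallelism and markedly raises the array's processing speed. In typical image processing, the improved array has twice the processing capability of the original.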

10.

Research on a New Image Processing System  Total citations: 1, self-citations: 1, citations by others: 0

李长乐 刘玉斌 赵杰 《半导体光电》2010,31(2)

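To address the performance and flexibility problems of current image processing systems, an image processing system built on reconfigurable technology and parallel image processing is proposed. The paper studies the theory of dynamic reconfiguration and the characteristics of reconfigurable systems, examines the design of parallel image processing systems and methods for implementing algorithms on them, and analyzes the problems of existing image processing systems. Exploiting the fact that an FPGA (field-programmable gate array) can be reconfigured repeatedly, a reconfigurable parallel image processing system is designed. Image processing algorithms are implemented on the basis of a study of distributed algorithms. The parallel image processing system is built from multiple IP cores. Depending on the computing task, and taking the load balance of the parallel system into account, the number of compute nodes can be set differently, which meets system requirements while saving hardware cost. Experiments verify the feasibility of the system.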

11.

Image processing on a memory array architecture

P. T. Balsara M. J. Irwin 《The Journal of VLSI Signal Processing》1991,2(4):313-324

In this paper we examine the usefulness of a simple memory array architecture for several image processing tasks. This architecture, called the Access Constrained Memory Array Architecture (ACMAA), has a linear array of processors which concurrently access distinct rows or columns of an array of memory modules. We have developed several parallel image processing algorithms for this architecture. All the algorithms presented in this paper achieve a linear speed-up over the corresponding fast sequential algorithms. This was made possible by exploiting the efficient local as well as global communication capabilities of the ACMAA.

12.

Parallel image processing with the block data parallel architecture 总被引：2，自引：0，他引：2

Alexander W.E. Reeves D.S. Gloster C.S. Jr. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1996,84(7):947-968

Many digital signal and image processing algorithms can be sped up by executing them in parallel on multiple processors. The speed of parallel execution is limited by the need for communication and synchronization between processors. In this paper, we present a paradigm for parallel processing that we call the block data flow paradigm (BDFP). The goal of this paradigm is to reduce interprocessor communication and relax the synchronization requirements for such applications. We present the block data parallel architecture which implements this paradigm, and we present methods for mapping algorithms onto this architecture. We illustrate this methodology for several applications including two-dimensional (2-D) digital filters, the 2-D discrete cosine transform, QR decomposition of a matrix, and Cholesky factorization of a matrix. We analyze the resulting system performance for these applications with regard to speedup and efficiency as the number of processors increases. Our results demonstrate that the block data parallel architecture is a flexible, high-performance solution for numerous digital signal and image processing algorithms.

13.

A VLSI Architecture for Image Registration in Real Time

Gupta N. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(9):981-989

Image registration is a ubiquitous task occurring in countless image analysis applications. A dedicated implementation of image registration algorithms is the best approach to meet the intensive computation requirements of implementing image registration schemes in real time. This paper presents an efficient VLSI architecture for real-time implementation of image registration algorithms using an exhaustive search method. Normalized cross correlation function, mean square error, and blue screen technique algorithms are implemented for image registration. The architecture is based on a data flow design that allows sequential inputs but performs parallel processing. Based on the architecture, a programmable chip can be designed for image registration. Chips can be cascaded to achieve better performance, and the sizes of both the search and the reference image can vary with time from a small to a very large value.

14.

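An FPGA-Based Real-Time Parallel Histogram Equalization Algorithm and Its Implementation on a New Architecture  Total citations: 2, self-citations: 1, citations by others: 1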

刘延 任永杰 李群伟 王琳 《红外技术》2010,32(3)

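Histogram equalization plays an important role in image enhancement and is also an important basis for other algorithms. The paper shows how the serial algorithm is decomposed into a parallel one and presents a solution for building a real-time image histogram equalization architecture on an FPGA platform.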

15.

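Implementing an Infrared Imaging Tracking System with COTS Multiprocessors  Total citations: 5, self-citations: 2, citations by others: 5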

崔春明 王天冠 et al. 《红外与毫米波学报》2002,21(4):261-265

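Studies a real-time infrared multi-target imaging and tracking system based on COTS multiprocessors. The tracking algorithms implemented on the COTS multiprocessors are described in detail, and the hardware and software framework of the whole system is given. Because it is programmable, this image processing system is efficient, flexible, and very easy to modify. Since its development, the system has undergone many trials with good results.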

16.

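A Real-Time Registration and Fusion System for Infrared and Visible Images  Total citations: 13, self-citations: 3, citations by others: 10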

刘卫光 郭师红 周利华 《红外技术》2004,26(5):66-71

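Describes the design and implementation of an in-house image registration and fusion system based on real-time distributed multiprocessors. It is a general-purpose, high-speed, real-time image fusion system with a parallel computer architecture. The system is built on the VxWorks real-time operating system and a VME64x bus, and uses the new TS101 DSP from Analog Devices (AD) as its core. Multiple DSPs work in a distributed, parallel fashion to carry out the large amount of computation needed for high-speed real-time registration and fusion of multi-source images, while a CPLD handles acquisition control and multi-sensor video synchronization. Because it combines a real-time embedded design based on high-performance DSPs with a general-purpose standardized bus structure, the system can flexibly apply a variety of registration and fusion algorithms for high-speed real-time fusion of visible and infrared dual-channel digital images. It largely resolves the conflict between the heavy computational load of multi-scale image registration and fusion algorithms and the system's real-time requirements, laying a sound technical foundation for developing multi-sensor real-time image registration and fusion systems.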

17.

The PANOPTIC Camera: A Plenoptic Sensor with Real-Time Omnidirectional Capability

Hossein Afshari Laurent Jacques Luigi Bagnato Alexandre Schmid Pierre Vandergheynst Yusuf Leblebici 《Journal of Signal Processing Systems》2013,70(3):305-328

A new biologically-inspired vision sensor made of one hundred “eyes” is presented, which is suitable for real-time acquisition and processing of 3-D image sequences. This device, named the Panoptic camera, consists of a layered arrangement of approximately 100 classical CMOS imagers, distributed over a hemisphere of 13 cm in diameter. The Panoptic camera is a polydioptric system where all imagers have their own vision of the world, each with a distinct focal point, which is a specific feature of the Panoptic system. This enables 3-D information recording such as omnidirectional stereoscopy or depth estimation, applying specific signal processing. The algorithms dictating the image reconstruction of an omnidirectional observer located at any point inside the hemisphere are presented. A hardware architecture which has the capability of handling these algorithms, and the flexibility to support additional image processing in real time, has been developed as a two-layer system based on FPGAs. The details of the hardware architecture, its internal blocks, the mapping of the algorithms onto the latter elements, and the device calibration procedure are presented, along with imaging results.

18.

Top down structured parallelisation of embedded image processing applications

Downton A.C. Tregidgo R.W.S. Cuhadar A. 《Vision, Image and Signal Processing, IEE Proceedings -》1994,141(6):431-437

The authors present a general system design method which is intended to support parallelisation of complete image processing applications using MIMD processors. The approach is based upon the utilisation of a generic system level parallel processor architecture, the 'pipeline processor farm' (PPF), and is applicable to any embedded application with continuous input/output. The design method is illustrated using applications from the fields of computer vision and image coding. The design model accommodates several commonly exploited parallel processing paradigms, maps conveniently to the software structure of most image processing algorithms, provides incrementally scalable performance, and enables upper-bound speedups to be easily estimated from profiling data generated by the original sequential implementation of the application. It is believed that the approach has significant application in parallel embedded systems design, in the development environment, and in simulation work for computationally intensive image coding algorithms.

19.

A real-time multitarget tracking system with robust multichannel CNN-UM algorithms 总被引：1，自引：0，他引：1

Timar G. Rekeczky C. 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(7):1358-1371

This paper introduces a tightly coupled topographic sensor-processor and digital signal processor (DSP) architecture for real-time visual multitarget tracking (MTT) applications. We define real-time visual MTT as the task of tracking targets contained in an input image flow at a sampling rate that is higher than the speed of the fastest maneuvers that the targets make. We utilize a sensor-processor based on the cellular neural network universal machine architecture that permits the offloading of the main image processing tasks from the DSP and introduces opportunities for sensor adaptation based on the tracking performance feedback from the DSP. To achieve robustness, the image processing algorithms running on the sensor borrow ideas from biological systems: the input is processed in different parallel channels (spatial, spatio-temporal and temporal) and the interaction of these channels generates the measurements for the digital tracking algorithms. These algorithms (running on the DSP) are responsible for distance calculation, state estimation, data association and track maintenance. The performance of the proposed system is studied using actual hardware for different video flows containing rapidly moving maneuvering targets.

20.

A High Speed VLSI Architecture for Handwriting Recognition

Francesco Gregoretti Roberto Passerone Leonardo Maria Reyneri Claudio Sansoé 《The Journal of VLSI Signal Processing》2001,28(3):259-278

This article presents PAPRICA-3, a VLSI-oriented architecture for real-time processing of images and its implementation on HACRE, a high-speed, cascadable, 32-processor VLSI slice. The architecture is based on an array of programmable processing elements with the instruction set tailored to image processing, mathematical morphology, and neural networks emulation. Dedicated hardware features allow simultaneous image acquisition, processing, neural network emulation, and a straightforward interface with a hosting PC. HACRE has been fabricated and successfully tested at a clock frequency of 50 MHz. A board hosting up to four chips and providing a 33 MHz PCI interface has been manufactured and used to build BEATRIX, a system for the recognition of handwritten check amounts, by integrating image processing and neural network algorithms (on the board) with context analysis techniques (on the hosting PC).