
1.

Orthogonal multiprocessor sharing memory with an enhanced mesh for integrated image understanding

《CVGIP: Image Understanding》1991,53(1):31-45

This paper proposes a new parallel architecture, which has the potential to support low-level image processing as well as intermediate and high-level vision analysis tasks efficiently. The integrated architecture consists of an SIMD mesh of processors enhanced with multiple broadcast buses, an MIMD multiprocessor with orthogonal access buses, and a two-dimensional shared memory array. Low-level image processing is performed on the mesh processor, while intermediate and high-level vision analysis is performed on the orthogonal multiprocessor. The interaction between the two levels is supported by a common shared memory. Concurrent computations and I/O are made possible by partitioning the memory into disjoint spaces so that each processor system can access a different memory space. To illustrate the power of such a two-level system, we present efficient parallel algorithms for a variety of problems from low-level image processing to high-level vision. Representative problems include matrix-based computations, histogramming and key counting operations, image component labeling, pyramid computations, Hough transform, pattern clustering, and scene labeling. Through computational complexity analysis, we show that the integrated architecture meets the processing requirements of most image understanding tasks.

2.

Optimal Use of Mixed Task and Data Parallelism for Pipelined Computations

《Journal of Parallel and Distributed Computing》2000,60(3):297-319

This paper addresses optimal mapping of parallel programs composed of a chain of data parallel tasks onto the processors of a parallel system. The input to the programs is a stream of data sets, each of which is processed in order by the chain of tasks. This computation structure, also referred to as a data parallel pipeline, is common in several application domains, including digital signal processing, image processing, and computer vision. The performance parameters for such stream processing are latency (the time to process an individual data set) and throughput (the aggregate rate at which data sets are processed). These two criteria are distinct since multiple data sets can be pipelined or processed in parallel. The central contribution of this research is a new algorithm to determine a processor mapping for a chain of tasks that optimizes latency in the presence of a throughput constraint. We also discuss how this algorithm can be applied to solve the converse problem of optimizing throughput with a latency constraint. The problem formulation uses a general and realistic model of intertask communication and addresses the entire problem of mapping, which includes clustering tasks into modules, assigning processors to modules, and possibly replicating modules. The main algorithms are based on dynamic programming and their execution time complexity is polynomial in the number of processors and tasks. The entire framework is implemented as an automatic mapping tool in the Fx parallelizing compiler for a dialect of High Performance Fortran.
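
The distinction between latency and throughput in the abstract above can be illustrated with a minimal sketch. This is not the paper's mapping algorithm: the `Module` class, its fields, and the timing numbers are hypothetical, and communication cost between modules is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """A cluster of tasks mapped to a group of processors (hypothetical model)."""
    exec_time: float   # time for one replica to process one data set
    replicas: int = 1  # identical copies that work on alternate data sets


def latency(modules):
    # A single data set visits every module once, so replication
    # does not shorten the time it spends in the pipeline.
    return sum(m.exec_time for m in modules)


def throughput(modules):
    # Each module finishes replicas / exec_time data sets per time unit;
    # the slowest module bounds the whole pipeline.
    return min(m.replicas / m.exec_time for m in modules)


plain = [Module(2.0), Module(4.0), Module(1.0)]
replicated = [Module(2.0), Module(4.0, replicas=2), Module(1.0)]

print(latency(plain), throughput(plain))            # 7.0 0.25
print(latency(replicated), throughput(replicated))  # 7.0 0.5
```

Under these assumptions, replicating the slowest module doubles throughput while leaving latency unchanged, which is why the two criteria have to be optimized separately.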

3.

Research progress of the Transformer based on computer vision  Total citations: 1, self-citations: 0, citations by others: 1

Liu Wenting, Lu Xinming 《Computer Engineering and Applications》2022,58(6):1-16

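The Transformer is a deep neural network that is based on the self-attention mechanism and processes data in parallel. In recent years, Transformer-based models have become an important research direction for computer vision tasks. To fill the current gap in domestic survey articles on the Transformer, this paper gives an overview of its applications in computer vision. It reviews the basic principles of the Transformer and focuses on its applications in image classification, object detection, image segmentation...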
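
For readers unfamiliar with the self-attention mechanism mentioned above, the following is a minimal sketch of standard scaled dot-product attention in plain Python. It is a generic illustration, not code from the surveyed paper; the learned query/key/value projections and multi-head structure of a full Transformer are omitted, and the input matrices are made-up values.

```python
import math


def softmax(row):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out


# Two tokens with 2-dimensional embeddings; Q = K = V gives self-attention.
X = [[1.0, 0.0], [0.0, 1.0]]
result = attention(X, X, X)
```

Each output row is computed independently of the others, which is what allows all positions to be processed in parallel.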

4.

A template polynomial approach for image processing and visual recognition

Kai Qian, Prabir Bhattacharya 《Pattern Recognition》1992,25(12):1505-1515

A polynomial approach to the representation of gray images for machine vision is described. An algebraic system is developed in which a polynomial in two variables with real coefficients represents a gray image, and it is shown that most of the standard image processing tasks, such as smoothing, edge detection, rotation, and magnification, can be done by operating with certain polynomials called template polynomials. This method is also applied to connected component labelling, shape decomposition, template matching, and the skeletonization of a gray image without a priori thresholding. A technique is developed to decompose a template and do parallel processing.
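
The idea of representing an image as a two-variable polynomial can be sketched as follows. The abstract does not specify the paper's algebra, so this example assumes that "operating" with a template polynomial means ordinary polynomial multiplication, which corresponds to 2D convolution. The helper names and the sample image are hypothetical.

```python
def image_to_poly(img):
    # Pixel img[r][c] becomes the coefficient of x^r * y^c.
    return {(r, c): v for r, row in enumerate(img) for c, v in enumerate(row)}


def poly_mul(p, q):
    # Multiply two polynomials stored as {(x_exp, y_exp): coefficient}.
    out = {}
    for (i1, j1), a in p.items():
        for (i2, j2), b in q.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + a * b
    return out


img = [[1, 1, 5, 5],
       [1, 1, 5, 5]]

# Assumed template polynomial 1 - y: a horizontal difference operator.
template = {(0, 0): 1, (0, 1): -1}

edges = poly_mul(image_to_poly(img), template)
# Coefficient at (r, c) equals img[r][c] - img[r][c-1] for interior columns,
# so it is nonzero only where the gray level changes.
print(edges[(0, 1)], edges[(0, 2)], edges[(0, 3)])  # 0 4 0
```

Because polynomial multiplication distributes over addition, a template written as a sum of simpler templates can be applied term by term, which is one plausible reading of the decomposition for parallel processing mentioned in the abstract.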

5.

ParaC: A programming framework for the image processing domain on GPU platforms

Lu Xingjing, Liu Lei, Jia Haipeng, Feng Xiaobing, Wu Chenggang 《Journal of Software》2017,28(7):1655-1675

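GPGPU accelerators are currently the mainstream platform for speeding up image processing algorithms. On a GPGPU platform, however, a version of a program that is optimized to fully exploit hardware architecture and software characteristics can differ in performance by orders of magnitude from a naive implementation. GPGPU accelerators have a large number of execution threads organized in multiple dimensions and levels, and a hierarchical memory architecture whose levels differ in capacity, bandwidth, latency, and access permissions. Meanwhile, image processing applications have complex computational operations, boundary-handling rules, and data access patterns. Consequently, the concurrent execution mode of tasks, the organization of threads, and the mapping of concurrent tasks to devices affect not only the program's degree of concurrency, scheduling, communication, and synchronization, but also memory access bandwidth and latency. Program optimization on GPGPU platforms is therefore a difficult, complex, and inefficient process. This paper proposes ParaC, a domain-specific programming model based on language extensions. The ParaC programming environment uses the program semantic information described by high-level language extensions to automatically analyze and extract program characteristics such as the application's operation information, data reuse among concurrent tasks, and memory access information. Combining these with the characteristics of the hardware platform, it uses a compilation optimization model driven by domain prior knowledge to automatically generate optimized code for the GPGPU platform, and finally uses a source-to-source compiler to generate standard OpenCL programs. Experimental results on the test cases show that the optimized versions generated automatically by ParaC on the GPGPU platform achieve a speedup of up to 3.22 times over the hand-optimized versions, while the number of lines of code is only 1.2% to 39.68% of the latter.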

6.

Distributed versus parallel computing

A. Ramsay 《Artificial Intelligence Review》1986,1(1):11-25

The elegant but simple von Neumann single-processor design of computers has been challenged by new applications in databases, computer vision, and speech, where a multi-processing system seems better suited to such tasks. We look at three ways in which parallel machines may be used: for general purpose computing, for algorithms which are not well suited to von Neumann machines, and for exploring forms of computation which cannot reasonably be dealt with on von Neumann machines.

7.

Global Optimization for Mapping Parallel Image Processing Tasks on Distributed Memory Machines

《Journal of Parallel and Distributed Computing》1997,45(1):29-45

Many parallel algorithms and library routines for computer vision and image processing (CVIP) tasks on distributed-memory multiprocessors are available. The typical image distribution may use column, row, and block based mapping. Integrating a set of library routines for a CVIP application requires a global optimization to determine the data mapping of individual tasks by considering inter-task communication. The main difficulty in deriving the optimal image data distribution for each task is that CVIP task computation may involve loops, and the number of processors available and the size of the input image may vary at run time. In this paper, a CVIP application is modeled using a task chain with imperfectly nested loops, specified by conventional visual languages such as Khoros and Explorer. A mapping algorithm is proposed that optimizes the average run-time performance for CVIP applications with nested loops by considering the data redistribution overheads and possible run-time parameter variations. A taxonomy of CVIP operations is provided and used for further reducing the complexity of the algorithm. Experimental results on both low-level image processing and high-level computer vision applications are presented to validate this approach.

8.

A vision-taste interference model and the EEG measurement

Hisaya Tanaka, Yuichi Sato 《Artificial Life and Robotics》2011,16(3):393-397

Taste cognition can be interfered with by visual information, but the mechanism by which this happens has not been clarified. We assumed an interference model in the processing of taste and vision information. The model was tested with frequency analysis of EEG and using the switch response time. The tasks were matched or mismatched between taste and vision information about orange juice and apple juice. There were changes in the α waves that originated in the visual processing of a juice package, and changes in the β waves that originated in the taste processing. There is the possibility of a parallel processing mechanism in the vision-taste interference.

9.

A concept of dynamically reconfigurable real-time vision system for autonomous mobile robotics  Total citations: 1, self-citations: 0, citations by others: 1

Aymeric De Cabrol, Thibault Garcia, Patrick Bonnin, Maryline Chetto 《International Journal of Automation and Computing》2008,5(2):174-184

This paper describes specific constraints of vision systems that are intended to be embedded in mobile robots. Although PC-based hardware architecture is convenient in this field because of its versatility, flexibility, performance, and cost, current real-time operating systems are not well adapted to long processing of varying duration, and it is often necessary to oversize the system to guarantee fail-safe operation. In addition, interactions with other, higher-priority robotic tasks are difficult to handle. To address this problem, we have developed a dynamically reconfigurable vision processing system based on the innovative scheduling and fault-tolerance features of the Cleopatre real-time applicative layer. This framework allows emergency and optional tasks to be defined to ensure a minimal quality of service for the other subsystems of the robot, while allowing the vision processing chain to adapt dynamically to an exceptionally long-running vision process or a processor overload. It thus allows several subsystems to coexist better on a single hardware platform and makes it possible to develop less expensive but safe systems, since they are designed for the regular case and not for rare exceptional ones. Finally, it offers a new way to think about and develop vision systems, with pairs of complementary operators.

10.

A robust eigenspace method for obtaining feature values in high-speed massively parallel vision systems

Toshiharu Mukai, Noboru Ohnishi 《Machine Vision and Applications》2000,12(4):197-202

Image-processing systems, each consisting of massively parallel photodetectors and digital processing elements on a monolithic circuit, are currently being developed by several researchers. Some early-vision-like processing algorithms are installed in the vision systems. However, they are not sufficient for applications because their output is in the form of pattern information, so that, in order to respond to input, some feature values must be extracted from the pattern. In the present paper, we propose a robust method for extracting feature values associated with images in a massively parallel vision system.