20 similar documents found.
1.
2.
O. L. Perevozchikova, V. G. Tulchinsky, R. A. Yushchenko 《Cybernetics and Systems Analysis》2006,42(4):559-569
Problems of high-performance processing of mass cluster data are considered. Estimates of the execution times of parallel data processing programs and a heuristic algorithm for optimizing cluster architectures for such problems are proposed.
__________
Translated from Kibernetika i Sistemnyi Analiz, No. 4, pp. 117–129, July–August 2006.
3.
An increasing awareness of the need for high-speed parallel processing systems for image analysis has stimulated a great deal of interest in the study of such systems. These studies have focussed primarily on specific algorithms and, while they demonstrated the utility of such an approach, few general principles have evolved. As a result, it is still uncertain how one may go about addressing a given application. This paper first presents techniques for formulating parallel image processing tasks by focussing on one or more components of an image processing environment. Then a parallel processing model is proposed which specifies the interaction among tasks formulated in this manner. The techniques and model enable one to determine constraints on the architectural features required to achieve predefined performance levels, and to compare and contrast different formulations.
4.
Efficient parallel processing of image contours
Chen, L.T., Davis, L.S., Kruskal, C.P. 《IEEE Transactions on Pattern Analysis and Machine Intelligence》1993,15(1):69-81
Describes two parallel algorithms for ranking the pixels on a curve in O(log N) time using either an EREW or a CREW PRAM model. The algorithms accomplish this with N processors for a √N × √N image. After applying such an algorithm to an image, it is possible to move the pixels from a curve into processors having consecutive addresses. This is important because one can subsequently apply many algorithms to the curve (such as piecewise linear approximation algorithms or point-in-polygon tests) using segmented scan operations (i.e., parallel prefix operations). Scan operations can be executed in logarithmic time on many interconnection networks, such as hypercube, tree, butterfly, and shuffle-exchange machines, as well as on the EREW PRAM. The algorithms were implemented on the hypercube-structured Connection Machine, and various performance tests were conducted.
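As a rough illustration of the segmented scan primitive this abstract relies on, here is a sequential Python sketch of what such a scan computes (the function name and the sequential loop are ours; on a PRAM or hypercube the same result is obtained in O(log N) parallel steps):

```python
# Hedged sketch: a segmented inclusive prefix sum, the scan operation used
# to rank curve pixels. Sequential simulation for clarity only.

def segmented_prefix_sum(values, segment_starts):
    """Inclusive prefix sum that restarts at every segment boundary."""
    out = []
    running = 0
    for v, start in zip(values, segment_starts):
        running = v if start else running + v
        out.append(running)
    return out

# Rank pixels on two curves stored back to back: each pixel contributes 1,
# so the scan yields consecutive addresses within each curve.
flags = [True, False, False, True, False]   # True marks the start of a curve
ranks = segmented_prefix_sum([1] * 5, flags)  # -> [1, 2, 3, 1, 2]
```

Once each pixel knows its rank within its curve, a single data movement places the curve's pixels in consecutively addressed processors, which is what enables the subsequent piecewise-linear approximation and point-in-polygon steps.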
5.
This paper presents several static and dynamic data decomposition techniques for parallel implementation of common computer vision algorithms. These techniques use the distribution of features in the input data as a measure of load for data decomposition. Experimental results are presented by implementing algorithms from a motion estimation system using these techniques on a hypercube multiprocessor. Normally in a vision system a sequence of algorithms is employed in which the output of one algorithm is the input to the next algorithm in the sequence. The distribution of features computed as a by-product of the current task is used to repartition the data for the next task in the system. This allows parallel computation of the feature distribution, and therefore the overhead of estimating the load is kept small. It is observed that the communication overhead to repartition data using these run-time decomposition techniques is very small. It is shown that significant performance improvements over uniform-block-oriented partitioning schemes are obtained.
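The core idea, partitioning by feature count rather than by pixel count, can be sketched as follows (an illustrative simplification, not the authors' exact scheme: rows stand in for arbitrary data blocks, and the per-row feature counts are assumed given):

```python
# Hedged sketch: split image rows into contiguous blocks so each processor
# receives roughly the same number of detected features, rather than the
# same number of rows.

def feature_balanced_partition(features_per_row, num_procs):
    """Return row-index boundaries giving near-equal feature counts."""
    total = sum(features_per_row)
    target = total / num_procs
    bounds, acc, cut = [0], 0, 1
    for i, f in enumerate(features_per_row):
        acc += f
        if acc >= cut * target and len(bounds) < num_procs:
            bounds.append(i + 1)
            cut += 1
    bounds.append(len(features_per_row))
    return bounds

# Features clustered in rows 2-3 and 6-7: the split lands between them,
# giving each of the two processors 20 features.
boundaries = feature_balanced_partition([0, 0, 10, 10, 0, 0, 10, 10], 2)  # -> [0, 4, 8]
```

A uniform-block scheme would hand both feature clusters to whichever processors own those rows; weighting by the feature histogram equalizes the actual work instead.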
6.
7.
E. V. Rusin 《Pattern Recognition and Image Analysis》2009,19(3):559-561
SSCC_PIPL, an experimental image processing library for multiprocessor computers, is described in this paper. The design principles, the adopted architectural solutions, and the results of test experiments are presented.
8.
Prieto, M., Llorente, I.M., Tirado, F. 《IEEE Transactions on Parallel and Distributed Systems》2000,11(11):1141-1150
The aim of this paper is to study the effect of local memory hierarchy and communication network exploitation on message sending, and the influence of this effect on the decomposition of regular applications. In particular, we have considered two different parallel computers, a Cray T3E-900 and an SGI Origin 2000. In both systems, the bandwidth reduction due to non-unit-stride memory access is quite significant and can be more important than the reduction due to contention in the network. These conclusions affect the choice of optimal decompositions for regular domain problems. Thus, although traditional 3D decompositions lead to lower inherent communication-to-computation ratios and could exploit the interconnection network more efficiently, lower-dimensional decompositions are found to be more efficient due to the effects of data decomposition on the spatial locality of the messages to be communicated. This increasing importance of local optimisations has also been shown using a well-known communication-computation overlapping technique which increases execution time, instead of reducing it as one might expect, due to poor cache memory exploitation.
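The trade-off the abstract describes can be made concrete with back-of-the-envelope halo-exchange counts (our illustrative numbers, not the paper's): for an n×n×n grid on p processes, a 1D slab decomposition exchanges more data per process, but every message is a contiguous, unit-stride face.

```python
# Hedged sketch: halo (ghost-layer) element counts per interior process
# for 1D slab vs. 3D block decompositions of an n*n*n regular domain.

def halo_elements_1d(n):
    """Interior slab: two full n*n faces, sent with unit stride."""
    return 2 * n * n

def halo_elements_3d(n, p):
    """Cubic block of side n/p**(1/3): six faces, mostly strided in memory."""
    side = n / p ** (1 / 3)
    return round(6 * side * side)

# n = 512, p = 64: the 3D blocks move ~5x fewer elements per process,
# yet the paper finds the 1D layout can still win on real machines
# because its messages are unit-stride.
volume_1d = halo_elements_1d(512)        # -> 524288
volume_3d = halo_elements_3d(512, 64)    # -> 98304
```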
9.
In this paper, we introduce a novel framework for low-level image processing and analysis. First, we process images with very simple, difference-based filter functions. Second, we fit the 2-parameter Weibull distribution to the filtered output. This maps each image to the 2D Weibull manifold. Third, we exploit the information geometry of this manifold and solve low-level image processing tasks as minimisation problems on point sets. For a proof-of-concept example, we examine the image autofocusing task. We propose appropriate cost functions together with a simple implicitly-constrained manifold optimisation algorithm and show that our framework compares very favourably against common autofocus methods from literature. In particular, our approach exhibits the best overall performance in terms of combined speed and accuracy.
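The second step, fitting a two-parameter Weibull to the filtered output, can be sketched with the classic "Weibull plot" least-squares estimator (our choice of estimator, not necessarily the authors'); each image then maps to a point (shape k, scale λ) on the 2D manifold:

```python
import math

# Hedged sketch: estimate Weibull (shape k, scale lam) by least squares on
# the Weibull plot, where ln(-ln(1 - F)) = k*ln(x) - k*ln(lam) and F is the
# empirical CDF. In the framework above, `samples` would be the magnitudes
# of a difference-based filter's responses over an image.

def weibull_fit(samples):
    xs = sorted(samples)
    n = len(xs)
    pts = [(math.log(x), math.log(-math.log(1 - (i + 0.5) / n)))
           for i, x in enumerate(xs)]
    mx = sum(px for px, _ in pts) / n
    my = sum(py for _, py in pts) / n
    k = (sum((px - mx) * (py - my) for px, py in pts)
         / sum((px - mx) ** 2 for px, _ in pts))
    lam = math.exp(mx - my / k)   # intercept -k*ln(lam) => ln(lam) = mx - my/k
    return k, lam

# Synthetic check: samples generated from the inverse CDF of Weibull(1.5, 2.0).
samples = [2.0 * (-math.log(1 - (i + 0.5) / 100)) ** (1 / 1.5) for i in range(100)]
k, lam = weibull_fit(samples)
```

With each image reduced to its (k, λ) pair, autofocusing becomes a search over the focus parameter for the point-set configuration that minimizes the chosen cost on the manifold.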
10.
User transparency: a fully sequential programming model for efficient data parallel image processing
Although many image processing applications are ideally suited for parallel implementation, most researchers in imaging do not benefit from high-performance computing on a daily basis. Essentially, this is because no parallelization tools exist that truly match the image processing researcher's frame of reference. As it is unrealistic to expect imaging researchers to become experts in parallel computing, tools must be provided that allow them to develop high-performance applications in a highly familiar manner. In an attempt to provide such a tool, we have designed a software architecture that allows transparent (i.e. sequential) implementation of data parallel imaging applications for execution on homogeneous distributed-memory MIMD-style multicomputers. This paper presents an extensive overview of the design rationale behind the software architecture and gives an assessment of the architecture's effectiveness in providing significant performance gains. In particular, we describe the implementation and automatic parallelization of three well-known example applications that contain many fundamental imaging operations: (1) template matching; (2) multi-baseline stereo vision; and (3) line detection. Based on experimental results, we conclude that our software architecture constitutes a powerful and user-friendly tool for obtaining high performance in many important image processing research areas. Copyright © 2004 John Wiley & Sons, Ltd.
11.
R. M. Sotnezov 《Pattern Recognition and Image Analysis》2009,19(3):469-477
Genetic algorithms for finding a minimal covering of a Boolean matrix are developed and studied. This problem arises in image recognition when methods of combinatorial (logical) analysis of information are used to synthesize recognition procedures.
12.
An increasing awareness of the need for high-speed parallel processing systems for image analysis has stimulated a great deal of interest in the design and development of such systems. Efficient processing schemes for several specific problems have been developed, providing some insight into the general problems encountered in designing efficient image processing algorithms for parallel architectures. However, it is still not clear what architecture or architectures are best suited for image processing in general, or how one may go about determining those which are. An approach that would allow application requirements to specify architectural features would be useful in this context. Working towards this goal, general principles are outlined for formulating parallel image processing tasks by exploiting parallelism in the algorithms and data structures employed. A synchronous parallel processing model is proposed which governs the communication and interaction between these tasks. This model presents a uniform framework for comparing and contrasting different formulation strategies. In addition, techniques are developed for analyzing instances of this model to determine a high-level specification of a parallel architecture that best 'matches' the requirements of the corresponding application. It is also possible to derive initial estimates of the component capabilities required to achieve predefined performance levels. Such analysis tools are useful in the design stage, in the selection of a specific parallel architecture, and in efficiently utilizing an existing one. In addition, the architecture-independent specification of application requirements makes this a useful tool for benchmarking applications.
13.
Domain decomposition for parallel processing of spatial problems
Spatial models often are not used to their fullest potential because they have massive computational requirements. Existing workstations and microcomputers often must solve these models in batch mode; consequently, decision makers are unable to explore and resolve complex spatial problems in an interactive and graphical environment similar to that provided by general-purpose business software. Parallel processing, however, can solve spatial models at high speed, greatly decreasing turnaround times and enabling decision makers to see quickly the results of revising parameters and criteria. To reap these benefits in a parallel processing environment, researchers must recast modelling procedures from their existing sequentially-oriented form to one in which parallelism can be exploited. This process, referred to as domain decomposition, is a fundamental enterprise in parallel spatial modelling. Domain decomposition for spatial problems can be structured by a set of general principles, which are described and illustrated using an example from location-allocation modelling.
14.
Two-dimensional data obtained from a histological cross-section of a tissue can be utilized to obtain three-dimensional information by the methods of quantitative stereology. The resulting quantitative information is useful both in experimental studies and in whole-animal investigations for regulatory and safety purposes. Quantitative stereologic analysis requires considerable data collection and calculation and is thus practical only through the use of computer hardware and software. We have previously reported the development of a program, STEREO, which compiles data from carcinogenesis experiments, recording information from tissue sections for the estimation of the number of altered hepatic foci (AHF) per liver and the volume fraction of AHF in liver on a three-dimensional basis. The data file itself was built by measuring tissue and focal transections through a slide-reading process that involved the manual use of a digitizer. In order to increase the speed and efficiency of the analytical process, we have integrated the STEREO program with a public-domain software package, Scion Image. This integration involves two parts: the macros and the interface. Macros for quantitative stereology used in Scion Image were written to customize and simplify the measurement and to generate the data needed for building each of the data files. An interface program, BuildFi.exe, was developed to receive data generated by Scion Image and to align sequential tissue plots from up to four serial sections stained with different markers. As a result, the user can store data on disk in the format of the STEREO data files. By combining STEREO with Scion Image, the slide-reading process is simplified and can be performed automatically. It has proven to be more objective, time-saving, and efficient than all earlier versions.
15.
This research compares the performance of various heuristics and one metaheuristic for unrelated parallel machine scheduling problems. The objective functions to be minimized are makespan, total weighted completion time, and total weighted tardiness. We use the least significant difference (LSD) test to identify robust heuristics that perform significantly better than others for a variety of parallel machine environments with these three performance measures. Computational results show that the proposed metaheuristic outperforms other existing heuristics for each of the three objectives when run with a parameter setting appropriate for the objective.
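For readers unfamiliar with this problem class, one classic constructive heuristic of the kind such studies compare is earliest completion time: assign each job to whichever machine would finish it soonest (our illustration; the abstract does not identify the specific heuristics tested). In the unrelated-machines model, job j takes p[j][m] time units on machine m, with no relation assumed between machines.

```python
# Hedged sketch: the greedy earliest-completion-time (ECT) rule for
# unrelated parallel machines, minimizing makespan heuristically.
# p[j][m] is the processing time of job j on machine m.

def ect_schedule(p, num_machines):
    """Assign jobs in order, each to the machine that finishes it earliest."""
    loads = [0] * num_machines
    assignment = []
    for times in p:
        m = min(range(num_machines), key=lambda k: loads[k] + times[k])
        loads[m] += times[m]
        assignment.append(m)
    return assignment, max(loads)  # assignment per job, and the makespan
```

For instance, with p = [[2, 10], [10, 2], [3, 3]] on two machines, jobs 0 and 2 go to machine 0 and job 1 to machine 1, for a makespan of 5.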
16.
Models of parallel computations are considered for a wide class of data processing programs. Properties of programs are investigated, and approaches to parallelizing sequential data processing programs and designing parallel programs are proposed. Computation optimization problems are formulated.
__________
Translated from Kibernetika, No. 4, pp. 1–8, 42, July–August, 1989.
17.
An efficient parallel architecture is proposed for high-performance multimedia data processing using multiple multimedia video processors (MVP; TMS320C80), which are fully programmable general-purpose digital signal processors (DSPs). This paper describes several requirements for a multimedia data processing system and the system architecture of an image computing system called the KAIST Image Computing System (KICS). The performance of the KICS is evaluated in terms of its I/O bandwidth and its execution time for some image processing functions. An application of the KICS to a real-time Moving Picture Experts Group 2 (MPEG-2) encoder is introduced. The programmability and high-speed data-access capability of the KICS are its most important features as a high-performance system for real-time multimedia data processing.
18.
The main objective of this paper is to describe a realistic framework for understanding the parallel performance of high-dimensional image processing algorithms in the context of heterogeneous networks of workstations (NOWs). As a case study, this paper explores techniques for mapping hyperspectral image analysis techniques onto fully heterogeneous NOWs. Hyperspectral imaging is a new technique in remote sensing that has gained tremendous popularity in many research areas, including satellite imaging and aerial reconnaissance. The automation of techniques able to transform massive amounts of hyperspectral data into scientific understanding in valid response times is critical for space-based Earth science and planetary exploration. Using an evaluation strategy based on comparing the efficiency achieved by a heterogeneous algorithm on a fully heterogeneous NOW with that evidenced by its homogeneous version on a homogeneous NOW with the same aggregate performance as the heterogeneous one, we develop a detailed analysis of parallel algorithms that integrate the spatial and spectral information in the image data through mathematical morphology concepts. For comparative purposes, performance data for the tested algorithms on Thunderhead (a large-scale Beowulf cluster at NASA's Goddard Space Flight Center) are also provided. Our detailed investigation of the parallel properties of the proposed morphological algorithms provides several intriguing findings that may help image analysts in selecting parallel techniques and strategies for specific applications.
19.
To address the slow serial solution of large-scale nonlinear optimization problems, a relaxed asynchronous parallel algorithm is proposed for solving unconstrained optimization problems. Starting from the serial BFGS algorithm for unconstrained optimization, the method is parallelized in a PC cluster environment. The Cholesky method is used to factor the linear systems whose coefficient matrices are symmetric positive definite, an unordered relaxation asynchronous parallel method is applied to solve for the solution vector and the Wolfe-Powell nonlinear line search step size, and the BFGS update formula is evaluated in parallel, yielding a relaxed asynchronous parallel BFGS algorithm. The time complexity and speedup of the algorithm are analyzed. Experimental results on a PC cluster show that the algorithm accelerates the solution of unconstrained optimization problems, balances the load, and achieves linear speedup.
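For reference, the BFGS inverse-Hessian update at the heart of the algorithm can be sketched serially as follows (pure Python, two variables; our test quadratic and an exact line search stand in for the Wolfe-Powell search, and the Cholesky and asynchronous-parallel machinery are omitted):

```python
# Hedged sketch of the (serial) BFGS method the paper parallelizes.
# Update: H' = (I - rho*s*y^T) H (I - rho*y*s^T) + rho*s*s^T, rho = 1/(y^T s),
# where s is the step taken and y the change in gradient.

def bfgs_update(H, s, y):
    rho = 1.0 / (y[0] * s[0] + y[1] * s[1])
    I_sy = [[(1.0 if i == j else 0.0) - rho * s[i] * y[j] for j in range(2)]
            for i in range(2)]
    A = [[sum(I_sy[i][k] * H[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]                      # A = (I - rho*s*y^T) H
    return [[sum(A[i][k] * I_sy[j][k] for k in range(2)) + rho * s[i] * s[j]
             for j in range(2)] for i in range(2)]

def grad(x):                                     # f(x) = x0^2 + 2*x1^2
    return [2.0 * x[0], 4.0 * x[1]]

x, H = [1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]]      # start; H approximates inv Hessian
for _ in range(3):
    g = grad(x)
    if abs(g[0]) + abs(g[1]) < 1e-10:
        break
    d = [-(H[0][0] * g[0] + H[0][1] * g[1]),     # quasi-Newton direction d = -H g
         -(H[1][0] * g[0] + H[1][1] * g[1])]
    # Exact line search for this quadratic (stands in for Wolfe-Powell).
    alpha = -(g[0] * d[0] + g[1] * d[1]) / (2.0 * d[0] ** 2 + 4.0 * d[1] ** 2)
    x_new = [x[0] + alpha * d[0], x[1] + alpha * d[1]]
    s = [x_new[0] - x[0], x_new[1] - x[1]]
    g_new = grad(x_new)
    y = [g_new[0] - g[0], g_new[1] - g[1]]
    H = bfgs_update(H, s, y)
    x = x_new
```

On this two-variable quadratic, BFGS with exact line searches terminates at the minimizer (0, 0) in two iterations; the paper's contribution is distributing the update, the linear solves, and the step-size search across a PC cluster asynchronously.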