Found 20 similar documents (search time: 15 ms)
1.
Suresh S. Omkar S.N. Mani V. 《Parallel and Distributed Systems, IEEE Transactions on》2005,16(1):24-34
This work presents an efficient mapping scheme for the multilayer perceptron (MLP) network trained using the back-propagation (BP) algorithm on a network of workstations (NOWs). A hybrid partitioning (HP) scheme is used to partition the network, and each partition is mapped onto processors in the NOWs. We derive the processing time and memory space required to implement the parallel BP algorithm in NOWs. Performance parameters such as speed-up and space reduction factor are evaluated for the HP scheme, and it is compared with earlier work involving a vertical partitioning (VP) scheme for mapping the MLP on NOWs. The performance of the HP scheme is evaluated by solving an optical character recognition (OCR) problem in a network of ALPHA machines. The analytical and experimental performance shows that the proposed parallel algorithm has better speed-up, less communication time, and a better space reduction factor than the earlier algorithm. This work also presents a simple and efficient static mapping scheme on a heterogeneous system. Using divisible load scheduling theory, a closed-form expression for the number of neurons assigned to each processor in the NOW is obtained. Analytical and experimental results for the static mapping problem on NOWs are also presented.
2.
R. Andonie A. T. Chronopoulos D. Grosu H. Galmeanu 《Concurrency and Computation》2006,18(12):1559-1573
The focus of this study is how we can efficiently implement the neural network backpropagation algorithm on a network of computers (NOC) for concurrent execution. We assume a distributed system with heterogeneous computers and that the neural network is replicated on each computer. We propose an architecture model with efficient pattern allocation that takes into account the speed of processors and overlaps the communication with computation. The training pattern set is distributed among the heterogeneous processors with the mapping being fixed during the learning process. We provide a heuristic pattern allocation algorithm minimizing the execution time of backpropagation learning. The computations are overlapped with communications. Under the condition that each processor has to perform a task directly proportional to its speed, this allocation algorithm has polynomial‐time complexity. We have implemented our model on a dedicated network of heterogeneous computers using Sejnowski's NetTalk benchmark for testing. Copyright © 2005 John Wiley & Sons, Ltd.
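The idea of giving each processor a share of the training patterns proportional to its speed can be illustrated with a minimal sketch. This is not the authors' heuristic, which also accounts for overlapping communication with computation; the function name `allocate_patterns` and the largest-remainder rounding are assumptions made only for illustration.

```python
def allocate_patterns(num_patterns, speeds):
    """Split num_patterns among processors in proportion to their speeds.

    Hypothetical helper: 'speeds' are relative processor speeds (any
    positive numbers). Rounding uses the largest-remainder method.
    """
    total = sum(speeds)
    # Ideal (fractional) share of patterns for each processor.
    ideal = [num_patterns * s / total for s in speeds]
    # Start from the integer part of each share.
    counts = [int(x) for x in ideal]
    # Hand out the patterns lost to rounding, largest fractional part first.
    leftover = num_patterns - sum(counts)
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Three processors, the last two twice as fast as the first.
    print(allocate_patterns(100, [1, 2, 2]))
```

Because the mapping is fixed during learning, a split like this would be computed once before training starts.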
3.
Tamir Heyman Danny Geist Orna Grumberg Assaf Schuster 《Formal Methods in System Design》2002,21(3):317-338
This paper presents a scalable method for parallelizing symbolic reachability analysis on a distributed-memory environment of workstations. We have developed an adaptive partitioning algorithm that significantly reduces space requirements. The memory balance is maintained by dynamically repartitioning the state space throughout the computation. A compact BDD representation allows coordination by shipping BDDs from one machine to another. This representation allows for different variable orders in the sending and receiving processes. The algorithm uses a distributed termination protocol, with none of the memory modules preserving a complete image of the set of reachable states. No external storage is used on the disk. Rather, we make use of the network, which is much faster. We implemented our method on a standard, loosely-connected environment of workstations, using a high-performance model checker. Initial performance evaluation of several large circuits shows that our method can handle models too large to fit in the memory of a single node. The partitioning algorithm achieves reduction in space, which is linear in the number of workstations employed. A corresponding decrease in space requirements is measured throughout the reachability analysis. Our results show that the relatively slow network does not become a bottleneck, and that computation time is kept reasonably small.
4.
《Knowledge and Data Engineering, IEEE Transactions on》2009,21(3):384-400
We consider skyline computation when the underlying data set is horizontally partitioned onto geographically distant servers that are connected to the Internet. The existing solutions are not suitable for our problem, because they have at least one of the following drawbacks: (1) applicable only to distributed systems adopting vertical partitioning or restricted horizontal partitioning, (2) effective only when each server has limited computing and communication abilities, and (3) optimized only for skyline search in subspaces but inefficient in the full space. This paper proposes an algorithm, called feedback-based distributed skyline (FDS), to support arbitrary horizontal partitioning. FDS aims at minimizing the network bandwidth, measured in the number of tuples transmitted over the network. The core of FDS is a novel feedback-driven mechanism, where the coordinator iteratively transmits certain feedback to each participant. Participants can leverage such information to prune a large amount of local data, which otherwise would need to be sent to the coordinator. Extensive experimentation confirms that FDS significantly outperforms alternative approaches in both effectiveness and progressiveness.
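As a rough illustration of why feedback lets a participant prune local data, the sketch below shows the dominance test, a naive skyline, and pruning against tuples received from a coordinator. It assumes smaller values are better in every dimension. The abstract does not say how FDS chooses its feedback, so `prune_with_feedback` and its inputs are hypothetical and do not reproduce the published algorithm.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Naive O(n^2) skyline: keep points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def prune_with_feedback(local_points, feedback):
    """Drop local tuples dominated by any feedback tuple (hypothetical step).

    A pruned tuple cannot be in the global skyline, so it never needs to be
    transmitted to the coordinator.
    """
    return [p for p in local_points
            if not any(dominates(f, p) for f in feedback)]


if __name__ == "__main__":
    server_data = [(1, 5), (2, 2), (3, 3), (5, 1)]
    print(skyline(server_data))
    print(prune_with_feedback([(4, 4), (1, 6)], feedback=[(2, 2)]))
```

Each pruned tuple is one fewer tuple sent over the network, which is the cost measure the abstract names.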
5.
J. E. Boillat 《Concurrency and Computation》1990,2(4):289-313
We present a fully distributed dynamic load balancing algorithm for parallel MIMD architectures. The algorithm can be described as a system of identical parallel processes, each running on a processor of an arbitrary interconnected network of processors. We show that the algorithm can be interpreted as a Poisson (heat) equation in a graph. This equation is analysed using Markov chain techniques and is proved to converge in polynomial time, resulting in a global load balance. We also discuss some important parallel architectures and interconnection schemes such as linear processor arrays, tori, hypercubes, etc. Finally, we present two applications where the algorithm has been successfully embedded (process mapping and molecular dynamic simulation).
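The heat-equation interpretation can be illustrated with a generic first-order diffusion scheme, in which every node repeatedly moves a fraction of the load difference to or from each neighbour. This is a textbook sketch, not the exact exchange rule of the paper; the coefficient `alpha`, the step count, and the ring topology in the example are assumptions.

```python
def diffusion_step(load, neighbors, alpha):
    """One synchronous diffusion step on an undirected graph.

    load:      list of per-node loads
    neighbors: dict mapping node index -> list of neighbouring node indices
    alpha:     assumed exchange coefficient; alpha * max_degree should stay
               below 1 for the iteration to be stable
    """
    new_load = list(load)
    for i, nbrs in neighbors.items():
        # Each node gains load from heavier neighbours, loses to lighter ones.
        new_load[i] = load[i] + alpha * sum(load[j] - load[i] for j in nbrs)
    return new_load


def balance(load, neighbors, alpha=0.25, steps=200):
    """Repeat diffusion steps; total load is conserved on a symmetric graph."""
    for _ in range(steps):
        load = diffusion_step(load, neighbors, alpha)
    return load


if __name__ == "__main__":
    # Ring of four processors with all work initially on processor 0.
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(balance([8.0, 0.0, 0.0, 0.0], ring))
```

Each node only needs its neighbours' loads, so the scheme is fully distributed, and the loads approach the global average as the iteration proceeds.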
6.
《Simulation Practice and Theory》1997,5(1):83-99
Simulation of complex digital electronic systems requires powerful machines and algorithms. Distributed simulation could improve both the execution time and the availability of a large distributed memory for complex models. Model partitioning onto the available processors has a major impact on simulation efficiency. We report on how various partitioning algorithms affect timewarp-based distributed simulation of combinational and synchronous sequential logic circuits, and try to determine the relationship between circuit parameters (the number of gates, topological levels and the degree of activity in the circuit) and the structure of the partition having the fastest simulation on a heterogeneous network of Sun workstations.
7.
Jacob J.C. Soo-Young Lee 《Parallel and Distributed Systems, IEEE Transactions on》1999,10(10):1082-1101
In this paper, we describe how our computational model can be used for the problems of processor allocation and task mapping. The intended applications for this model include the dynamic mapping problems of shrinking or spreading an existing mapping when the available pool of processors changes during execution of the problem. The concept of problem edge class and other features of our model are developed to realistically and efficiently support task partitioning and merging for static and dynamic mapping. The model dictates realistic changes in the computation and communication characteristics of a problem when the problem partitioning is modified dynamically. This model forms the basis of our algorithms for shrinking and spreading, and yields realistic results for a variety of problems mapped onto real systems. An emulation program running on a network of workstations under PVM is used to measure execution times for the mapping solutions found by the algorithms. The results indicate that the problem edge class is a crucial consideration for processor allocation and task mapping.
8.
Imtiaz Ahmad 《Computers & Electrical Engineering》2003,29(2):327-356
To minimize the area of the combinational circuit required to realize a finite state machine (FSM), an efficient assignment of the FSM's states to a set of binary codes is required. Because finding an optimal state assignment is NP-hard, heuristic approaches have been taken. One approach generates an adjacency graph from the FSM model and then tries to embed the adjacency graph onto a hypercube with the objective of minimizing the cost of mapping. However, hypercube embedding itself is an NP-complete problem. In this paper we present a solution to the hypercube embedding problem by designing a new technique, designated HARD, that is a hybrid combination of a non-linear programming method and a local search. We have transformed our problem from discrete space to continuous space and have applied the logarithmic barrier function method, which in turn uses a gradient projection approach to minimize the objective function. Each iteration of the gradient projection method produces a valid solution. Local search is performed around the solution to improve its quality by using a Kernighan-Lin style algorithm. Two distributed algorithms for HARD have also been designed and implemented on a network of workstations under the message passing interface to speed up the search. We have carried out a large number of experiments to determine the efficiency of HARD in terms of solution quality over many other techniques, and have obtained very promising results.
9.
10.
Montserrat Abril Miguel A. Salido Federico Barber 《Journal of Intelligent Manufacturing》2010,21(1):101-110
Many real problems can be naturally modelled as constraint satisfaction problems (CSPs). However, some of these problems are of a distributed nature, which requires problems of this kind to be modelled as distributed constraint satisfaction problems (DCSPs). In this work, we present a distributed model for solving CSPs. Our technique partitions the constraint network using graph partitioning software; after partitioning, each sub-CSP is arranged into a DFS-tree CSP structure that is used as a hierarchy of communication by our distributed algorithm. We show that our distributed algorithm outperforms well-known centralized algorithms in solving partitionable CSPs.
11.
This paper deals with the organization of a distributed load-balancing policy for a multicomputer system which consists of a cluster of independent computers that are interconnected by a local area communication network. We introduce three algorithms necessary to maintain load balancing in this system: the local load algorithm, used by each processor to monitor its own load; the exchange algorithm, for exchanging load information between the processors; and the process migration algorithm, which uses this information to dynamically migrate processes from overloaded to underloaded processors. The policy that we present is distributed, i.e. each processor uses the same policy. It is both dynamic, responding to load changes without using a priori knowledge of the resources that each process requires, and stable, in that unnecessary overloading of a processor is minimized. We give the essential details of the implementation of the policy and initial results on its performance. Our results confirm the feasibility of building distributed systems that are based on network communication for uniform access, resource sharing and improved reliability, as well as the use of workstations without a secondary storage device.
12.
《Journal of Systems and Software》2004,73(3):551-561
Vertical partitioning is the process of generating fragments, each of which is composed of attributes with high affinity. The concept of vertical partitioning has been applied to many research areas, especially databases and distributed systems, in order to improve the performance of query execution and system throughput. However, most previous approaches have focused their attention on generating an optimal partitioning without regard to the number of fragments finally generated, which is called best-fit vertical partitioning in this paper. On the other hand, there are some cases in which a certain number of fragments must be generated by vertical partitioning, called n-way vertical partitioning in this paper. The n-way vertical partitioning problem has not been fully investigated. In this paper, we propose an adaptable vertical partitioning method that can support both best-fit and n-way vertical partitioning. In addition, we present several experimental results to clarify the validity of the proposed algorithm.
13.
Kumar V. Shekhar S. Amin M.B. 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(10):1073-1090
We present a new technique for mapping the backpropagation algorithm on hypercube and related architectures. A key component of this technique is a network partitioning scheme called checkerboarding. Checkerboarding allows us to replace the all-to-all broadcast operation performed by the commonly used vertical network partitioning scheme with operations that are much faster on hypercubes and related architectures. Checkerboarding can be combined with the pattern partitioning technique to form a hybrid scheme that performs better than either one of these schemes. Theoretical analysis and experimental results on nCUBE and CM5 show that our scheme performs better than the other schemes, for both uniform and nonuniform networks.
14.
Di Fatta G. Berthold M.R. 《Parallel and Distributed Systems, IEEE Transactions on》2006,17(8):773-785
In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute's HIV-screening data set, where we were able to show close-to-linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a nondedicated computational environment. These features make it suitable for large-scale, multidomain, heterogeneous environments, such as computational grids.
15.
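A fast parallel algorithm for recurrent neural networks (total citations: 6; self-citations: 0; citations by others: 6)
To address the slow convergence of the BP (back-propagation) learning algorithm for recurrent neural networks, a new fast parallel learning algorithm for recurrent neural networks is proposed. First, the recursive prediction error (RPE) learning algorithm is introduced and its stability is proved. Then, to overcome the drawback of centralized computation in the RPE algorithm, a fully parallel-structured algorithm is designed. The algorithm distributes the computation to every neuron in the network, which fully matches the parallel structure of neural networks and also facilitates hardware implementation. Simulation results show that the algorithm converges better than the traditional recurrent BP learning algorithm. Theoretical analysis and simulation experiments demonstrate that, compared with the centralized RPE algorithm, it greatly reduces computation time.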
16.
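A distributed diagnosis algorithm for analog circuit faults is proposed to address problems such as excessively large network size and overly long training time caused by fault sample sets with large data volumes. The algorithm uses a supervised Hebb learning rule and adds class labels during training, which avoids the loss of partial knowledge caused by data partitioning. The proposed distributed algorithm and the traditional BP algorithm were each used to diagnose faults in an example circuit. Experimental results show that the proposed distributed algorithm not only achieves diagnostic accuracy comparable to that of the BP algorithm but also effectively increases training speed.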
17.
18.
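To address the nonlinearity caused by temperature effects when an ultrasonic heat meter measures flow by the transit-time difference method, temperature compensation algorithms based on surface fitting and on a BP neural network, respectively, are proposed. Both algorithms compensate the flow measurement by establishing a nonlinear mapping between temperature and flow. Modeling and simulation show that the BP neural network compensation algorithm exhibits better data fusion and prediction capability. Verification experiments show that, compared with the existing lookup-table correction algorithm and the surface-fitting compensation algorithm, the BP neural network compensation algorithm compensates better: after compensation, the flow measurement error is within ±2.2% and the maximum variance of the absolute error is 0.68. The compensation effect is significant, and the algorithm has high engineering application value.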
19.
20.
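Research on neural network text classification combined with clustering (total citations: 1; self-citations: 0; citations by others: 1)
To address shortcomings such as the slow convergence of traditional neural-network-based text classification algorithms, and after analyzing the general model of a text classification system and extracting feature terms with a mutual-information feature extraction method, a radial basis function neural network text classification algorithm based on sample centers is proposed. The core idea of clustering algorithms is also introduced to remedy the slow convergence of the error back-propagation neural network classification algorithm. Experimental results show that, compared with the traditional BP neural network classification algorithm, the proposed improved algorithm has higher computing speed and stronger nonlinear mapping capability, and also achieves better classification results in terms of convergence speed and accuracy.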