Similar Documents
20 similar documents found (search time: 15 ms)
1.
We study the problem of the amount of information (advice) about a graph that must be given to its nodes in order to achieve fast distributed computations. The required size of the advice makes it possible to measure the information sensitivity of a network problem. A problem is information sensitive if little advice is enough to solve it rapidly (i.e., much faster than in the absence of any advice), whereas it is information insensitive if a lot of information must be given to the nodes to ensure fast computation of the solution. In this paper, we study the information sensitivity of distributed graph coloring. A preliminary version of this paper appeared in the proceedings of the 34th International Colloquium on Automata, Languages and Programming (ICALP), July 2007. Part of this work was done during the stay of David Ilcinkas at the Research Chair in Distributed Computing of the Université du Québec en Outaouais as a postdoctoral fellow. P. Fraigniaud received additional support from the ANR project ALADDIN. A. Pelc was supported in part by an NSERC discovery grant and by the Research Chair in Distributed Computing of the Université du Québec en Outaouais.

2.
3.
4.
The world-wide computing infrastructure emerging on top of growing computer network technology is a key technology for making a variety of information services accessible through the Internet to every user, from high-performance computing users to the many personal computing users. The important feature of such services is location transparency: information can be obtained irrespective of time or location, in a virtually shared manner. In this article, we give an overview of Ninf, an ongoing global network-wide computing infrastructure project which allows users to access computational resources, including hardware, software, and scientific data, distributed across a wide area network. Preliminary performance results measuring software and network overhead are shown, and they promise the future reality of world-wide network computing.

5.
Adapting scientific computing problems to clouds using MapReduce (total citations: 1, self-citations: 0, citations by others: 1)
Cloud computing, with its promise of virtually infinite resources, seems well suited to solving resource-greedy scientific computing problems. To study this, we established a scientific computing cloud (SciCloud) project and environment on our internal clusters. The main goal of the project is to study the scope of establishing private clouds at universities. With these clouds, students and researchers can efficiently use the already existing resources of university computer networks in solving computationally intensive scientific, mathematical, and academic problems. However, to be able to run scientific computing applications on the cloud infrastructure, the applications must be reduced to frameworks that can successfully exploit the cloud resources, like the MapReduce framework. This paper summarizes the challenges associated with reducing iterative algorithms to the MapReduce model. Algorithms used in scientific computing are divided into different classes by how they can be adapted to the MapReduce model; examples from each such class are reduced to the MapReduce model and their performance is measured and analyzed. The study mainly focuses on the Hadoop MapReduce framework but also compares it to an alternative MapReduce framework called Twister, which is specifically designed for iterative algorithms. The analysis shows that Hadoop MapReduce has significant trouble with iterative problems while being well suited to embarrassingly parallel problems, and that Twister can handle iterative problems much more efficiently. This work shows how to adapt algorithms from each class to the MapReduce model and what affects their efficiency and scalability, and, by mapping the advantages and disadvantages of the two frameworks, allows us to judge which framework is more efficient for each class. This study is of significant importance for scientific computing, which often uses complex iterative methods to solve critical problems; adapting such methods to cloud computing frameworks is not a trivial task.
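To make the iterative class concrete, below is a minimal, framework-free sketch of k-means written as repeated map/reduce rounds. It is our own illustration, not code from the paper; the comments note why this pattern is expensive in Hadoop and cheaper in Twister.

```python
from collections import defaultdict

def map_phase(points, centroids):
    """Map step: emit (nearest-centroid-index, point) pairs."""
    for p in points:
        idx = min(range(len(centroids)),
                  key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
        yield idx, p

def reduce_phase(pairs):
    """Reduce step: average the points assigned to each centroid index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    return {idx: tuple(sum(coord) / len(pts) for coord in zip(*pts))
            for idx, pts in groups.items()}

def kmeans_mapreduce(points, centroids, rounds=10):
    # Each iteration is a complete map/reduce job; in Hadoop this means
    # re-reading the input in every round -- the overhead that hurts
    # iterative algorithms and that Twister's long-lived workers avoid.
    for _ in range(rounds):
        new = reduce_phase(map_phase(points, centroids))
        centroids = [new.get(i, c) for i, c in enumerate(centroids)]
    return centroids

points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
print(kmeans_mapreduce(points, [(0.0, 0.0), (10.0, 10.0)]))
```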

6.
This paper deals with the problem of task allocation (i.e., to which processor each task of an application should be assigned) in heterogeneous distributed computing systems, with the goal of maximizing system reliability. The problem of finding an optimal task allocation is known to be NP-hard in the strong sense. We propose a new swarm intelligence technique based on the honeybee mating optimization (HBMO) algorithm for this problem. The HBMO-based approach combines the power of simulated annealing and genetic algorithms with a fast problem-specific local search heuristic to find the best possible solution within a reasonable computation time. We study the performance of the algorithm over a wide range of parameters such as the number of tasks, the number of processors, the ratio of average communication time to average computation time, and the task interaction density of applications. The effectiveness and efficiency of our algorithm are demonstrated by comparing it with recently proposed task allocation algorithms for maximizing system reliability available in the literature.
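As a hedged illustration of the objective such allocation work maximizes, the sketch below scores an assignment under a simplified exponential-failure reliability model: system reliability is the product of each processor's reliability over its execution load and each link's reliability over the traffic it carries. All names, rates, and times are invented; the HBMO search itself is not shown.

```python
import math

def system_reliability(assign, exec_time, comm, fail_proc, fail_link):
    """assign[t] = processor of task t; failure rates are per time unit."""
    hazard = 0.0  # accumulated negative log-reliability
    for t, p in enumerate(assign):
        hazard += fail_proc[p] * exec_time[t][p]
    for (t1, t2), volume in comm.items():
        p1, p2 = assign[t1], assign[t2]
        if p1 != p2:  # only inter-processor traffic stresses a link
            hazard += fail_link[(min(p1, p2), max(p1, p2))] * volume
    return math.exp(-hazard)

exec_time = [[2.0, 3.0], [4.0, 2.5], [1.0, 1.5]]   # task x processor times
comm = {(0, 1): 5.0, (1, 2): 2.0}                   # inter-task traffic volumes
fail_proc = [1e-4, 2e-4]                            # per-processor failure rates
fail_link = {(0, 1): 5e-4}                          # per-link failure rates
print(system_reliability([0, 1, 1], exec_time, comm, fail_proc, fail_link))
```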

7.
The stochastic Newton recursive algorithm is studied for system identification. The main advantage of this algorithm is that it has a general form and can achieve better performance through its flexible parameters. The primary problem is that the sample covariance matrix may become singular when the model has many parameters and/or the input signal is insufficiently general; such a situation hinders the identification process. The main contribution is thus to adopt multi-innovation to correct the parameter estimation. This simple approach has been proven to solve the problem effectively and improve the identification accuracy. Combined with multi-innovation theory, two improved stochastic Newton recursive algorithms are then proposed for time-invariant and time-varying systems. Expressions for the parameter estimation error bounds are derived via convergence analysis. The consistency and bounded-convergence conclusions of the corresponding algorithms are drawn in detail, and the effect of the innovation length and the forgetting factor on the convergence properties is explained. Final illustrative examples demonstrate the effectiveness and the convergence properties of the recursive algorithms.
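The multi-innovation idea can be sketched independently of the Newton recursion. The following simplified stand-in (our own notation, not the paper's algorithm) applies a stacked-innovation correction to a recursive least-squares-style update for a linear-in-parameters model y(k) = phi(k)^T theta + v(k): instead of correcting the estimate with the single latest innovation, the last p innovations are stacked.

```python
import numpy as np

def multi_innovation_update(theta, P, Phi, Y, lam=0.98):
    """Phi: (p, n) stacked regressors; Y: (p,) stacked outputs; lam: forgetting."""
    E = Y - Phi @ theta                       # stacked innovation vector
    S = lam * np.eye(len(Y)) + Phi @ P @ Phi.T
    K = P @ Phi.T @ np.linalg.inv(S)          # block gain
    theta = theta + K @ E                     # correct with p innovations at once
    P = (P - K @ Phi @ P) / lam
    return theta, P

rng = np.random.default_rng(0)
true_theta = np.array([1.5, -0.7])
theta, P, p = np.zeros(2), 100.0 * np.eye(2), 3
phis, ys = [], []
for k in range(200):
    phi = rng.normal(size=2)
    phis.append(phi)
    ys.append(phi @ true_theta + 0.01 * rng.normal())
    if len(phis) >= p:
        Phi, Y = np.array(phis[-p:]), np.array(ys[-p:])
        theta, P = multi_innovation_update(theta, P, Phi, Y)
print(theta)  # should approach [1.5, -0.7]
```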

8.
The Min-Min task scheduling algorithm always gives priority to short tasks with small execution times, so it cannot achieve an ideal makespan or resource load balance. To address this problem, an adaptive Min-Min algorithm based on resource grading is proposed. Before tasks are assigned, the available resources are first graded according to their attributes; each grade is then multiplied by the task's minimum completion time on that resource, and the task-resource pair with the smallest product is scheduled. During scheduling, an adaptive threshold is introduced to adjust the scheduling priority of long tasks, achieving the desired optimization. Simulation experiments show that the algorithm performs well in both makespan and load balancing.
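For reference, here is a sketch of the baseline Min-Min heuristic that the proposed algorithm refines; the resource-grading and adaptive-threshold refinements described above are omitted, and the timing data is invented.

```python
def min_min(exec_time):
    """exec_time[t][r] = run time of task t on resource r."""
    n_tasks, n_res = len(exec_time), len(exec_time[0])
    ready = [0.0] * n_res                  # when each resource becomes free
    unscheduled, schedule = set(range(n_tasks)), {}
    while unscheduled:
        # best resource (minimum completion time) for each remaining task
        best = {t: min(range(n_res), key=lambda r: ready[r] + exec_time[t][r])
                for t in unscheduled}
        # Min-Min: take the task with the smallest such completion time
        t = min(unscheduled, key=lambda t: ready[best[t]] + exec_time[t][best[t]])
        r = best[t]
        ready[r] += exec_time[t][r]
        schedule[t] = r
        unscheduled.remove(t)
    return schedule, max(ready)            # assignment and makespan

times = [[3, 5], [4, 2], [10, 9], [1, 6]]  # 4 tasks x 2 resources
print(min_min(times))
```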

9.
A power system is a highly interconnected network that requires intense computational effort and resources for centralized control. Distributed computing requires the system to be partitioned optimally into clusters. Network partitioning is an optimization problem whose objective is to minimize the number of nodes in a cluster and the tie lines between the clusters. The Harmony Search (HS) algorithm is a recently developed metaheuristic that can be applied to optimization problems. In this work, the HS algorithm is applied to the network partitioning problem, and power-flow-based equivalencing is done to represent the external system. Simulation is done on IEEE standard test systems. The algorithm is found to be very effective in partitioning the system hierarchically, and the equivalencing method gives accurate results in comparison to centralized control.
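A compact sketch of the core HS loop on a generic discrete objective follows, with a toy two-cluster partitioning cost. The HMCR/PAR values are typical textbook settings rather than the paper's tuned parameters, and pitch adjustment is simplified to a random re-draw for categorical variables.

```python
import random

def harmony_search(n_vars, domain, cost, iters=2000,
                   hms=10, hmcr=0.9, par=0.3, seed=1):
    rng = random.Random(seed)
    memory = [[rng.choice(domain) for _ in range(n_vars)] for _ in range(hms)]
    for _ in range(iters):
        new = []
        for i in range(n_vars):
            if rng.random() < hmcr:                 # memory consideration
                v = rng.choice(memory)[i]
                if rng.random() < par:              # simplified pitch adjustment
                    v = rng.choice(domain)
            else:                                   # random consideration
                v = rng.choice(domain)
            new.append(v)
        worst = max(range(hms), key=lambda j: cost(memory[j]))
        if cost(new) < cost(memory[worst]):         # replace worst harmony
            memory[worst] = new
    return min(memory, key=cost)

# Toy use: split 6 nodes into 2 clusters, penalizing cut edges and imbalance.
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (2, 3)]
def cut_cost(part):
    cut = sum(1 for a, b in edges if part[a] != part[b])
    return cut + abs(part.count(0) - part.count(1))
print(harmony_search(6, [0, 1], cut_cost))
```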

10.
As a shared-memory parallel programming standard, OpenMP has become one of the mainstream models for parallel programming thanks to its ease of use and its support for incremental parallelization. The OpenMP standard was designed for UMA shared-memory architectures, so its loop scheduling mechanisms consider only load balance and need not consider data distribution. In a cluster OpenMP system, however, data locality is the key factor affecting performance. To address the fact that the static scheduling policy of the OpenMP standard is ill-suited to cluster computing, this paper proposes an LBS scheduling algorithm that fully embodies the owner-computes rule and implements it, via extended directives, on a cluster OpenMP system (OpenMP/JIAJIA). Test results show that the LBS algorithm is effective for cluster OpenMP systems.
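The owner-computes idea behind such a policy can be illustrated with a toy iteration-partitioning function; the block size, node count, and round-robin ownership pattern below are made-up assumptions, not the OpenMP/JIAJIA implementation.

```python
def owner_computes_schedule(n_iters, n_nodes, block):
    """Return {node: [iterations]}, where iteration i belongs to the node
    that owns data block i // block (round-robin block ownership)."""
    sched = {node: [] for node in range(n_nodes)}
    for i in range(n_iters):
        owner = (i // block) % n_nodes      # owner of the data block
        sched[owner].append(i)
    return sched

# A plain OpenMP static schedule would instead hand node 0 iterations
# 0..7 and node 1 iterations 8..15, regardless of where the data lives.
print(owner_computes_schedule(n_iters=16, n_nodes=2, block=4))
```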

11.
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault-tolerance, different communication media, and randomization. The resource bounds refer to time, space, and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing. There is a strong emphasis in our presentation on explaining the wide variety of techniques that are used to obtain the results described. Received: September 2001; Accepted: February 2003.

12.
This paper investigates the problem of allocating parallel application tasks to processors in heterogeneous distributed computing systems with the goal of maximizing system reliability. The problem of finding an optimal task allocation for more than three processors is known to be NP-hard in the strong sense. To deal with this challenging problem, we propose a simple and effective iterative greedy algorithm to find the best possible solution within a reasonable amount of computation time. The algorithm first uses a constructive heuristic to obtain an initial assignment and then iteratively improves it in a greedy way. We study the performance of the proposed algorithm over a wide range of parameters, including problem size, the ratio of average communication time to average computation time, and task interaction density. The viability and effectiveness of our algorithm are demonstrated by comparing it with recently proposed task allocation algorithms for maximizing system reliability available in the literature.
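A skeletal version of the two-phase method, with a generic cost to be minimized standing in for the (maximized) reliability objective; it is purely illustrative and omits the paper's problem-specific details.

```python
def greedy_then_improve(n_tasks, n_procs, cost):
    """cost(assign) -> float to minimize; assign[t] = processor of task t."""
    # Phase 1: constructive greedy -- place each task where it hurts least.
    assign = []
    for t in range(n_tasks):
        assign.append(min(range(n_procs), key=lambda p: cost(assign + [p])))
    # Phase 2: single-task reassignments until no move strictly improves.
    improved = True
    while improved:
        improved = False
        for t in range(n_tasks):
            for p in range(n_procs):
                trial = assign[:t] + [p] + assign[t + 1:]
                if cost(trial) < cost(assign):
                    assign, improved = trial, True
    return assign

# Toy cost: balance five tasks of given sizes over two processors.
sizes = [5, 3, 8, 2, 7]
def imbalance(assign):
    loads = [sum(s for s, a in zip(sizes, assign) if a == p) for p in (0, 1)]
    return max(loads) - min(loads)
print(greedy_then_improve(5, 2, imbalance))
```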

13.
An important purpose of grid computing is to provide a unified description of geographically distributed, heterogeneous resources, to present users with a virtual unified resource interface, and to assign users' service requests transparently and dynamically to the most suitable resources for execution. In view of the current state of task-scheduling applications, this paper proposes a parallel clone genetic algorithm that both balances resource load and makes full use of system resources. This heuristic significantly reduces the computational complexity of optimal resource allocation, allowing it to meet real-time scheduling requirements. Experimental results show that the algorithm outperforms other scheduling algorithms.
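As a rough sketch of a clone-based genetic search of this kind (our invention for illustration, not the paper's algorithm): fitter assignments contribute more mutated clones to the next generation.

```python
import random

def clonal_ga(n_tasks, n_res, fitness, pop=20, gens=100, seed=0):
    rng = random.Random(seed)
    def mutate(ind):
        ind = ind[:]
        ind[rng.randrange(n_tasks)] = rng.randrange(n_res)
        return ind
    population = [[rng.randrange(n_res) for _ in range(n_tasks)]
                  for _ in range(pop)]
    for _ in range(gens):
        ranked = sorted(population, key=fitness, reverse=True)
        nxt = ranked[:2]                           # elitism: keep the best two
        for rank, ind in enumerate(ranked):
            n_clones = (pop - rank) // 4           # fitter -> more clones
            nxt += [mutate(ind) for _ in range(n_clones)]
        population = nxt[:pop]
    return max(population, key=fitness)

# Toy fitness: prefer balanced load over 3 resources for 9 unit tasks.
def balance(ind):
    loads = [ind.count(r) for r in range(3)]
    return -(max(loads) - min(loads))
print(clonal_ga(9, 3, balance))
```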

14.
15.
Discrete Element Modelling on a Cluster of Workstations (total citations: 1, self-citations: 0, citations by others: 1)
We describe a distributed computing system for discrete element modelling that has been designed for loosely coupled networks of workstations. The implementation is based on DM2, a state-of-the-art discrete element modelling technique for simulating the behaviour of energetic materials and modelling shock compaction phenomena. The underlying computational approach is derived from particle methods, where short-range interactions, both mechanical and thermochemical, determine individual particle movement and state. Using spatial decomposition, a client-server software architecture distributes the computations and, at the language level, Berkeley sockets enable communication between conventional Unix processes on workstations connected by an Ethernet. We evaluate the performance of the system in terms of overall execution time and efficiency, and develop a simple model of computational and communication costs that enables us to predict its performance in other contexts. We conclude that distributed implementations of short-range particle methods can be very effective, even on non-dedicated communication networks.
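The communication pattern implied by spatial decomposition can be illustrated in a few lines: each worker owns a strip of the domain, and only particles within one interaction radius of a strip boundary must be exchanged as ghost copies. The geometry and numbers below are invented for clarity.

```python
from collections import defaultdict

def decompose(particles, n_workers, width, radius):
    """Split [0, width) into strips; return per-worker owned and ghost lists."""
    strip = width / n_workers
    owned, ghosts = defaultdict(list), defaultdict(list)
    for x, y in particles:
        w = min(int(x // strip), n_workers - 1)
        owned[w].append((x, y))
        # Near a boundary? Neighbouring strips need a ghost copy so their
        # short-range force sums see this particle.
        if w > 0 and x - w * strip < radius:
            ghosts[w - 1].append((x, y))
        if w < n_workers - 1 and (w + 1) * strip - x < radius:
            ghosts[w + 1].append((x, y))
    return owned, ghosts

pts = [(0.5, 1.0), (2.9, 0.2), (3.1, 2.0), (5.5, 1.5)]
owned, ghosts = decompose(pts, n_workers=2, width=6.0, radius=0.3)
print(dict(owned))
print(dict(ghosts))  # communication volume scales with the ghost count
```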

16.
In-operation construction vibration monitoring records inevitably contain various anomalies caused by sensor faults, system errors, or environmental influences. An accurate and efficient anomaly detection technique is essential for vibration impact assessment. Identifying anomalies with visualization tools is computationally expensive, time-consuming, and labor-intensive. In this study, an unsupervised approach for detecting anomalies in construction vibration monitoring data is proposed, based on a temporal convolutional network and an autoencoder. Anomalies are detected autonomously on the basis of the reconstruction errors between the original and reconstructed signals. To address the false and missed detections caused by the great variability of vibration signals, an adaptive threshold method is applied to achieve the best identification performance. This method uses the log-likelihood of the reconstruction errors to search for an optimal coefficient for anomalies. A distributed training strategy is implemented on a cloud platform to speed up training and perform anomaly detection without significant time delay. Construction-induced accelerations measured by a real vibration monitoring system were used to evaluate the proposed method. Experimental results show that the proposed approach detects anomalies with high accuracy, and that distributed training remarkably reduces training time, thereby enabling anomaly detection for online monitoring systems with massive accumulated data.
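A small sketch of the thresholding step described above: reconstruction errors from an autoencoder tend to be right-skewed, so a Gaussian is fitted to their logarithm and points beyond mu + k*sigma are flagged. The likelihood-guided search over the coefficient k is simplified here to a fixed grid, and the TCN autoencoder itself is out of scope; all data is synthetic.

```python
import numpy as np

def adaptive_threshold(errors, ks=np.arange(1.0, 4.01, 0.25)):
    log_e = np.log(np.asarray(errors) + 1e-12)
    mu, sigma = log_e.mean(), log_e.std()
    best_k, best_ll = ks[0], -np.inf
    for k in ks:
        inliers = log_e[log_e <= mu + k * sigma]
        m, s = inliers.mean(), inliers.std() + 1e-12
        # average Gaussian log-likelihood of the retained (normal) points
        ll = -0.5 * np.mean(((inliers - m) / s) ** 2) - np.log(s)
        if ll > best_ll:
            best_k, best_ll = k, ll
    return np.exp(mu + best_k * sigma)        # back to the error scale

rng = np.random.default_rng(0)
errors = np.concatenate([rng.lognormal(0.0, 0.3, 990),   # normal signal
                         rng.lognormal(2.5, 0.3, 10)])   # injected anomalies
thr = adaptive_threshold(errors)
print(thr, int((errors > thr).sum()), "points flagged")
```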

17.
Executing large-scale applications in distributed computing infrastructures (DCIs), for example modern Cloud environments, involves optimizing several conflicting objectives such as makespan, reliability, energy, or economic cost. Despite this trend, scheduling in heterogeneous DCIs has traditionally been approached as a single- or bi-criteria optimization problem. In this paper, we propose a generic multi-objective optimization framework supported by a list scheduling heuristic for scientific workflows in heterogeneous DCIs. The algorithm approximates the optimal solution by considering user-specified constraints on objectives in a dual strategy: maximizing the distance to the user's constraints for dominant solutions and minimizing it otherwise. We instantiate the framework and algorithm for a four-objective case study comprising makespan, economic cost, energy consumption, and reliability as optimization goals. We implemented our method as part of the ASKALON environment (Fahringer et al., 2007) for Grid and Cloud computing, and we demonstrate through extensive real and synthetic simulation experiments that our algorithm outperforms related bi-criteria heuristics while meeting the user constraints most of the time.
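The dual constraint-distance strategy can be sketched on plain objective vectors (all to be minimized): feasible solutions are ranked by their slack inside the user's constraints, infeasible ones by total violation. Objective names and values below are our illustrative assumptions, not ASKALON code.

```python
def constraint_distance(obj, constraint):
    """obj, constraint: dicts over objectives such as makespan/cost/energy."""
    violation = sum(obj[k] - constraint[k] for k in obj if obj[k] > constraint[k])
    if violation > 0:              # infeasible: minimize total violation
        return False, violation
    # feasible: maximize slack, i.e. the distance to the constraints
    return True, sum(constraint[k] - obj[k] for k in obj)

def pick_best(candidates, constraint):
    scored = [(constraint_distance(c, constraint), c) for c in candidates]
    feasible = [(d, c) for (ok, d), c in scored if ok]
    if feasible:
        return max(feasible, key=lambda x: x[0])[1]   # most slack wins
    return min(scored, key=lambda x: x[0][1])[1]      # least violation wins

constraint = {"makespan": 100.0, "cost": 50.0, "energy": 30.0}
candidates = [{"makespan": 90.0, "cost": 40.0, "energy": 29.0},
              {"makespan": 120.0, "cost": 20.0, "energy": 25.0}]
print(pick_best(candidates, constraint))
```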

18.
Unpredictable fluctuations in resource availability often lead to rescheduling decisions that sacrifice the job-completion success rate in batch job scheduling. To overcome this limitation, we consider the problem of assigning a set of sequential batch jobs with demands to a set of resources with constraints such as heterogeneous rescheduling policies and capabilities. The ultimate goal is to find an optimal allocation such that performance benefits in terms of makespan and utilization are maximized according to the principle of Pareto optimality, while keeping the job failure rate close to an acceptably low bound. To this end, we formulate a multihybrid policy decision problem (MPDP) on the primary-backup fault tolerance model and theoretically show its NP-completeness. The main contribution is to prove that our multihybrid job scheduling (MJS) scheme confidently guarantees fault-tolerant performance by adaptively combining jobs and resources with different rescheduling policies in MPDP. Furthermore, we demonstrate that the proposed MJS scheme outperforms five rescheduling heuristics in solution quality, search adaptability, and time efficiency through a set of extensive simulations under various scheduling conditions.
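A tiny helper illustrating the Pareto criterion used above: keep allocations not dominated on (makespan down, utilization up) after discarding those whose estimated job-failure rate exceeds the bound. The candidate tuples are invented for illustration.

```python
def pareto_front(cands, max_fail=0.05):
    ok = [c for c in cands if c["fail"] <= max_fail]   # failure-rate bound
    front = []
    for a in ok:
        dominated = any(
            b["makespan"] <= a["makespan"] and b["util"] >= a["util"] and
            (b["makespan"] < a["makespan"] or b["util"] > a["util"])
            for b in ok)
        if not dominated:
            front.append(a)
    return front

cands = [{"makespan": 100, "util": 0.80, "fail": 0.02},
         {"makespan": 120, "util": 0.90, "fail": 0.04},
         {"makespan": 115, "util": 0.82, "fail": 0.03},  # dominated by the last
         {"makespan": 90,  "util": 0.70, "fail": 0.10},  # over the bound
         {"makespan": 110, "util": 0.85, "fail": 0.03}]
print(pareto_front(cands))
```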

19.
Lauri Forsman, AI & Society, 1998, 12(4): 328-345
Organisations have eagerly adopted the new opportunities provided by distributed computing technology. These opportunities have also created new dependencies on the technology and threats of technical problems. Information technology (IT) management has to choose its position towards these new technical risks: should problems be prevented proactively in advance or settled reactively afterwards? This paper draws conclusions from an action research case study of proactive versus reactive end-user support. Between 1994 and 1997, one of the business units in Nokia Telecommunications required a new approach for its distributed information systems (IS) to facilitate rapid organisational growth. The distributed IS and its end-user support were established and organised during a 30-month re-engineering process. The results provide a new view of the dependencies between business processes and IT. The new distributed IT has become, often insidiously, a necessity for vital business processes. Therefore, risk management should be adopted as a standard tool for IS management to identify such dependencies. Proactive actions should be aimed at those areas where IT-related business risks are identified. Proactivity should be supplemented by reactive support providing daily assistance to end-users.

20.
An onboard spacecraft computing system is a case of a functionally distributed system that requires continuous interaction among the nodes to control the operations at different nodes. A simple and reliable protocol is desired for such an application. This paper discusses a formal approach to specifying the computing system with respect to some important issues encountered in the design and development of a protocol for the onboard distributed system. The issues considered in this paper are the concurrency, exclusiveness, and sequencing relationships among the various processes at different nodes. A 6-tuple model is developed for the precise specification of the system. The model also enables us to check the consistency of the specification and to detect deadlocks caused by improper specification. An example is given to illustrate the use of the proposed methodology for a typical spacecraft configuration. Although the theory is motivated by a specific application, it may equally be applied to other distributed computing systems, such as those encountered in the process control industry, power plant control, and similar environments.
