Cloud computing is a major paradigm shift in computing. It provides high scalability and elasticity through a range of on-demand services, and a variety of distributed applications can run on a cloud's virtual machines (computing nodes). In a distributed application, virtual machine nodes need to communicate and coordinate with each other, and performance improves when inter-node latency is kept minimal. When nodes belong to different clusters of the same cloud, or to different clouds in a multi-cloud environment, network latency can be higher, which makes it harder to decide which node(s) to choose for executing the distributed application so that inter-node latency stays at a minimum. In this paper, we propose a solution to this problem: a model that groups nodes with respect to network latency, so that application scheduling can be done on the basis of network latency. The model is part of our proposed Cloud Scheduler module, which supports scheduling decisions based on several criteria; network latency, and the node grouping derived from it, is one of those criteria. The main contribution of the paper is that the proposed latency grouping algorithm incurs no additional network traffic overhead for its computation, works well with incomplete latency information, and performs intelligent grouping on the basis of latency. The paper thus addresses an important problem in cloud computing: locating communicating virtual machines so that latency between them is minimal, and grouping them with respect to inter-node latency.
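
The abstract does not give the grouping algorithm itself, so the following is only a minimal sketch of one way latency-based grouping with incomplete measurements could look. It assumes a fixed latency threshold and transitive (single-linkage) grouping via union-find; the function name `group_by_latency`, the `threshold_ms` parameter, and the dictionary layout are hypothetical and not taken from the paper.

```python
from itertools import combinations


def group_by_latency(nodes, latency_ms, threshold_ms):
    """Group nodes that are linked by known latencies <= threshold_ms.

    latency_ms maps frozenset({a, b}) -> measured latency in ms.
    Pairs missing from the map are treated as unknown (assumption):
    they neither join nor separate two nodes.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(nodes, 2):
        lat = latency_ms.get(frozenset((a, b)))
        if lat is not None and lat <= threshold_ms:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Sort for a deterministic result.
    return sorted(sorted(g) for g in groups.values())
```

In this sketch, two nodes with no direct measurement can still end up in the same group if a chain of low-latency measurements connects them, which is one possible way to cope with incomplete latency information.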

Inter‐iteration dependences in loops can hinder loop‐level parallelism. For some loops, existing thread‐level speculation techniques fail to expose their inherent loop‐level parallelism, because some inter‐iteration dependences are too costly to synchronize, predict, pre‐compute and isolate. This paper presents a compiler technique called loop recreation to change the nature of some dependences (by turning some inter‐iteration dependences into intra‐iteration ones and vice versa) in a loop so that the inter‐iteration dependences in the transformed loop are less costly to enforce at runtime than those in the original loop. We present an algorithm for finding an optimal loop recreation transformation with respect to a simple misspeculation cost model and demonstrate the performance advantages of loop recreation over two recent techniques for multicore systems running nine representative irregular applications. Copyright © 2009 John Wiley & Sons, Ltd.

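Multicore microprocessors consume more energy, which increases the number of hotspots and aggravates the imbalance in temperature distribution, and this in turn has a greater negative impact on performance. To address this problem, a temperature-aware thread scheduling algorithm for multicore microprocessors is proposed to reduce thermal emergencies and improve performance, and the algorithm is implemented on an Intel multicore microprocessor platform. Experimental results show that, under various workload combinations, the algorithm reduces the number of dynamic thermal management events by 9.6% to 78.5%. Compared with the standard Linux scheduling algorithm, throughput improves by 5.2% on average and by up to 9.7%.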

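Heterogeneous multicore processors usually consist of high-performance big cores and low-power small cores, and sensible thread scheduling on them can effectively improve resource utilization and save energy. The fairness scheduling on big and small cores proposed in earlier papers did not consider cores with different frequency/voltage states, yet processors that support DVFS are increasingly common, so the computation of fairness among threads needs to be extended and improved. This paper proposes a method for computing thread fairness on heterogeneous multicore processors in which each core has several DVFS states, improves an existing performance prediction model by using an adaptive algorithm to adjust the model's coefficients, and on this basis proposes a scheduling policy that keeps both inter-thread fairness and processor power within preset thresholds while selecting the most energy-efficient configuration, with the aim of reducing the energy consumed by running applications. Experimental results show that, compared with the proposed scheduling policy, the static, DVFS-only, and swap-only scheduling methods consume on average more than 20% more energy at almost the same total running time, and for some applications the figure reaches 50%.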

This paper presents Safe Self-Scheduling (SSS), a new scheduling scheme for parallel loops whose iterations have variable execution times that are not known at compile time. The scheme assumes a shared memory space. SSS combines static scheduling with dynamic scheduling and draws on the advantages of each. First, it reduces dynamic scheduling overhead by statically scheduling a major portion of the loop iterations. Second, it balances the workload with a simple and efficient self-scheduling scheme that applies a new measure, the smallest critical chore size. Experimental results comparing SSS with other scheduling schemes indicate that SSS outperforms them. In the experiment on Gauss-Jordan, an application well suited to static scheduling, SSS is the only self-scheduling scheme that outperforms the static scheduling scheme, which indicates that SSS achieves a balanced workload with very little overhead. This research has been supported in part by the National Science Foundation under Contract No. CCR-9210568.
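
The two-phase idea (a statically assigned bulk, then self-scheduled leftovers) can be sketched roughly as below. This is not the SSS algorithm itself: the abstract does not define how the static portion or the smallest critical chore size is computed, so `static_fraction` and `chunk_size` are hypothetical stand-in parameters chosen by the caller.

```python
def split_iterations(n_iters, n_procs, static_fraction, chunk_size):
    """Split loop iterations into a static phase and a dynamic phase.

    Returns (static_blocks, dynamic_chunks):
      static_blocks[p] is the contiguous range pre-assigned to processor p;
      dynamic_chunks is the list of ranges that idle processors would
      fetch from a shared queue at run time.
    static_fraction and chunk_size are assumptions of this sketch.
    """
    static_total = int(n_iters * static_fraction)
    base, extra = divmod(static_total, n_procs)

    static_blocks = []
    start = 0
    for p in range(n_procs):
        # Spread the remainder over the first `extra` processors.
        size = base + (1 if p < extra else 0)
        static_blocks.append(range(start, start + size))
        start += size

    # Whatever is left is cut into fixed-size chunks for self-scheduling.
    dynamic_chunks = [
        range(s, min(s + chunk_size, n_iters))
        for s in range(start, n_iters, chunk_size)
    ]
    return static_blocks, dynamic_chunks
```

For example, `split_iterations(100, 4, 0.75, 10)` pre-assigns 75 iterations across four processors and leaves three chunks for dynamic assignment. A larger static fraction lowers scheduling overhead but leaves less slack for balancing uneven iteration times.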

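Rising temperature in multicore processors affects chip stability and performance, and hardware-level DTM (Dynamic Thermal Management) methods reduce power consumption at the cost of processor performance. This paper proposes a software-level temperature-aware scheduling algorithm that reads processor performance counter values online in real time, computes the temperature of each execution core, and distributes processes across the cores according to their temperatures; a temperature-aware heuristic is also given. Simulations based on the ATMI thermal simulator show that the temperature-aware scheduling algorithm produces a more uniform power density map than a non-temperature-aware algorithm, and that the temperature-aware scheduling algorithm with the MST heuristic markedly reduces the number of process migrations.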
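
As a rough illustration of temperature-aware placement, the sketch below greedily puts each process on the currently coolest core. It is not the paper's algorithm or its MST heuristic: the per-core temperatures are taken as given inputs rather than derived from performance counters, and the additive per-process `heat` contribution is a deliberately crude assumption. All names are hypothetical.

```python
def place_processes(core_temps, process_heat):
    """Greedy placement: hottest process first, onto the coolest core.

    core_temps:   {core_id: current temperature estimate}
    process_heat: {pid: assumed temperature rise the process causes}
    The additive thermal model is an assumption of this sketch.
    """
    temps = dict(core_temps)  # work on a copy
    placement = {}
    # Place the most heat-intensive processes first.
    for pid, heat in sorted(process_heat.items(), key=lambda kv: -kv[1]):
        # Coolest core; ties broken by core id for determinism.
        coolest = min(temps, key=lambda c: (temps[c], c))
        placement[pid] = coolest
        temps[coolest] += heat
    return placement
```

A real scheduler would also have to weigh migration cost against the thermal benefit, which this sketch ignores.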

Broadcast is a fundamental operation in Wireless Sensor Networks (WSNs) and plays an important role in a communication protocol design. In duty-cycled scenarios, a sensor node can receive a message only in its active time slot, which makes it more difficult to design collision-free scheduling for broadcast operations. Recent studies in this area have focused on minimizing broadcast latency and guaranteeing that all nodes receive a broadcast message. This paper investigates the problem of Minimum Latency Broadcast Scheduling in Duty-Cycled (MLBSDC) WSNs. By using special geometric properties of independent sets of a broadcast tree, we reduce the number of transmissions, consequently reducing the possibility of collision. Allowing multiple transmissions in one working period, our proposed Latency Aware Broadcast Scheduling (LABS) scheme provides a latency-efficient broadcast schedule. Theoretical analysis proves that the scheme has the same approximation ratio and complexity as the previous best algorithm for the MLBSDC problem. Moreover, simulation shows that the new scheme achieves up to 34%, 37%, and 21% performance improvement over previous schemes, in terms of latency, number of transmissions, and energy consumption, respectively.

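Many existing scheduling algorithms suffer from excessive time complexity or a low scheduling success rate. A new scheduling algorithm, HRTSA, is proposed to improve the scheduling success rate of real-time tasks. HRTSA first initializes clusters using the METC strategy, which reduces the algorithm's time complexity; then, when placing tasks, it keeps processor load under control according to processor load balance; finally, it uses task duplication scheduling to raise the task scheduling success rate. Comparative experiments show that HRTSA has lower time complexity and a higher scheduling success rate than RTSDA.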

Looking back, processor architecture has advanced in a spiral, shifting from simple to complex and from complex to simple. Today we face another shift from complex to simple, and innovative architectures will emerge to make use of continuously increasing transistor budgets. The growing importance of wire delays, changing workloads, power consumption, and design/verification complexity will drive the forthcoming era of Chip Multiprocessors (CMPs). Typical CMP projects from both industry and academia are surveyed. By examining in depth some of the primary theoretical and implementation problems of CMPs, the major challenges and opportunities for future CMPs are presented and discussed. Finally, the Godson series of microprocessors designed in China is introduced.

Yang Guogui (阳国贵), Jiang Bo (姜波). Journal of Computer Applications (《计算机应用》), 2010, 30(8): 2052-2055
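In multi-chip multicore computer systems, the indirect overhead of thread switching is affected by the architecture, the load pattern, and the scheduling policy. To obtain more stable measurements, and building on an analysis of the Lmbench benchmark framework and its process-switching measurement principle, a new thread-switching latency benchmark, LTC, is designed and implemented for the thread measurement needs of multi-chip multicore systems by integrating multiple load patterns and scheduling policies. LTC provides an effective means of measuring and analyzing thread-switching latency on multicore systems.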

This paper describes our work to improve the performance of distributed applications. We target application characteristics such as balancing load, allowing separately written applications to work better together, and allowing a distributed application to adapt its behavior in more flexible ways. Our approach is to write application‐specific schedulers, which can access the global state of the application when making scheduling decisions. To achieve this goal, we extended our earlier work on CATAPULTS (Creating And Testing APplication‐specific User Level Thread Schedulers), a domain‐specific language for creating and testing application‐specific user‐level thread schedulers, to distributed applications by adding ‘master schedulers’ that deal with the distributed parts of applications. This paper presents the design and implementation of distributed CATAPULTS and our experimentation with it. It also presents several realistic examples to assess the feasibility of the approach: a website application, an embedded application, and load balancing. Each example has a scheduling goal for which we developed a customized scheduler, and we measured performance with and without it. The customized scheduler for each example was fairly straightforward to develop, and each achieved its scheduling goal. Copyright © 2011 John Wiley & Sons, Ltd.

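Volunteer distributed computing can supply enough computing power for research projects with enormous computational demands, in some cases more than a supercomputer, so it has attracted the attention of many researchers, and many different volunteer distributed computing architectures are in wide use. Many earlier architectures assume that volunteer hosts are PCs, or simply treat mobile devices as if they were PCs. Because mobile devices differ greatly from PCs in many characteristics, these architectures often cannot efficiently handle volunteer computing projects that have both PC and mobile-device volunteers. The two mainstream task scheduling methods on volunteer distributed computing systems, the iterative-computation task scheduling algorithm and the first-come-first-served scheduling algorithm FCFS, both fall short when handling volunteer computing on mobile devices. To improve the execution efficiency of volunteer distributed computing platforms, a temperature-aware task scheduling algorithm for mobile devices, TATSA, is proposed. Experimental results show that TATSA is clearly more efficient than the mainstream task scheduling algorithms ISA and FCFS for volunteer computing on mobile devices.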

A hybrid sliding level Taguchi-based particle swarm optimization (HSLTPSO) algorithm is proposed for solving multi-objective flowshop scheduling problems (FSPs). HSLTPSO integrates particle swarm optimization (PSO), a sliding level Taguchi-based crossover, and an elitist preservation strategy. Its novel contribution is threefold: PSO explores the optimal feasible region in macro-space, the systematic reasoning mechanism of the sliding level Taguchi-based crossover exploits better solutions in micro-space, and the elitist preservation strategy retains the best particles of the multi-objective population for the next iteration. The sliding level Taguchi-based crossover is embedded in the PSO to find the best solutions and thereby enhance the PSO. The systematic reasoning of the Taguchi-based crossover, which takes into account the influence of the tuning factors α, β and γ, is used to resolve the conflicting problem of non-feasible solutions and to find better particles. As a result, it yields a significant improvement in the Pareto best solutions of the FSP. In computational experiments on six test problems, HSLTPSO combines the advantages of exploration and exploitation and provides better results than existing methods reported in the literature for multi-objective FSPs. HSLTPSO is therefore an effective approach to solving multi-objective FSPs.

Task scheduling is a fundamental issue in achieving high efficiency in cloud computing. However, designing and implementing an efficient scheduling algorithm is a major challenge, because the general scheduling problem is NP‐complete. Most existing task‐scheduling methods for cloud computing consider only the CPU and memory requirements of tasks and ignore bandwidth requirements. To obtain better performance, this paper proposes a bandwidth‐aware algorithm for divisible task scheduling in cloud‐computing environments. A nonlinear programming model for the divisible task‐scheduling problem under the bounded multi‐port model is presented. Solving this model yields an optimized allocation scheme that determines the proper number of tasks to assign to each virtual resource node. On the basis of this allocation scheme, a heuristic algorithm for divisible load scheduling, called the bandwidth‐aware task‐scheduling (BATS) algorithm, is proposed. The performance of the algorithm is evaluated using the CloudSim toolkit. Experimental results show that BATS performs better than the fair‐based, bandwidth‐only, and computation‐only task‐scheduling algorithms. Copyright © 2012 John Wiley & Sons, Ltd.

Grid applications with stringent security requirements raise challenging concerns, because schedules devised by non‐security‐aware scheduling algorithms may perform poorly on tasks with security constraints. Security‐aware scheduling requires estimating and quantifying the security overhead. The proposed model quantifies security in the form of security levels, based on the cipher suite negotiated between the task and the grid node, and incorporates it into the existing heuristics MinMin and MaxMin to make them security‐aware: MinMin(SA) and MaxMin(SA). It also proposes SPMaxMin (Security Prioritized MaxMin) and compares it with three heuristics, MinMin(SA), MaxMin(SA), and SPMinMin, on heterogeneous grid/task environments. Extensive computer simulation results reveal that the performance of the heuristics varies with computational and security heterogeneity. Analysis over nine heterogeneous grid/task workload situations indicates that an algorithm that performs better for one workload degrades on another, and that for a given workload one algorithm may give a better makespan while another gives a better response time. Finally, a security‐aware scheduling model is proposed that adapts to the dynamic nature of the grid and picks the best‐suited algorithm among the four analyzed heuristics on the basis of job characteristics, grid characteristics, and the desired performance metric. Copyright © 2011 John Wiley & Sons, Ltd.
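
For readers unfamiliar with MinMin, the sketch below shows the classic heuristic with an optional additive security overhead per (task, machine) pair. How the paper derives overhead from negotiated cipher suites is not stated in the abstract, so the `sec_overhead` matrix is a hypothetical input supplied by the caller, and the function is not a reproduction of MinMin(SA).

```python
def min_min(exec_time, sec_overhead=None):
    """Classic MinMin list scheduling with optional security overhead.

    exec_time[t][m]:    run time of task t on machine m
    sec_overhead[t][m]: extra time for security (hypothetical input)
    Returns (schedule, makespan) where schedule maps task -> machine.
    """
    n_tasks = len(exec_time)
    n_machines = len(exec_time[0])
    ready = [0.0] * n_machines          # time each machine becomes free
    unscheduled = set(range(n_tasks))
    schedule = {}

    while unscheduled:
        best = None  # (finish_time, task, machine)
        for t in sorted(unscheduled):
            for m in range(n_machines):
                cost = exec_time[t][m]
                if sec_overhead is not None:
                    cost += sec_overhead[t][m]
                finish = ready[m] + cost
                # Overall minimum completion time across all pairs
                # is the task MinMin schedules next.
                if best is None or finish < best[0]:
                    best = (finish, t, m)
        finish, t, m = best
        schedule[t] = m
        ready[m] = finish
        unscheduled.remove(t)

    return schedule, max(ready)
```

MaxMin differs only in the selection step: among each task's best machine, it schedules the task whose minimum completion time is largest.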

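Parallel tasks in cloud computing environments are easily affected by resource failures and may fail to complete, and dynamically provisioned cloud resources have low reliability. To address these problems, a failure recovery mechanism is first introduced; because resource failure behavior changes dynamically when failures are recoverable, a two-parameter Weibull distribution is used to describe the local characteristics of the failure behavior of resource nodes and communication links in different time periods. Then, based on an analysis of the various interaction relationships among parallel tasks, a resource reliability evaluation model based on variable-parameter failure rules is proposed. Finally, the model is incorporated into particle swarm optimization to obtain R PSO, a reliability-aware resource scheduling algorithm based on particle swarm optimization with adaptive inertia weight, so that the reliability of candidate resources is fully considered when computing fitness. Simulation results show that, with suitable failure recovery parameters, the proposed R PSO algorithm substantially improves cloud service reliability while adding only a small amount of extra failure recovery overhead.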
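
The two-parameter Weibull distribution has the standard reliability (survival) function R(t) = exp(-(t/scale)^shape). The sketch below evaluates it and, as an assumption of this sketch rather than the paper's model, combines a node and a link by multiplying their reliabilities as if their failures were independent. The time-varying parameters and recovery mechanism described in the abstract are not modeled.

```python
import math


def weibull_reliability(t, shape, scale):
    """Probability that a component survives past time t.

    Standard two-parameter Weibull survival function:
    R(t) = exp(-(t / scale) ** shape), for t >= 0.
    """
    return math.exp(-((t / scale) ** shape))


def task_reliability(t, node_params, link_params):
    """Reliability of a task needing one node and one link for time t.

    node_params and link_params are (shape, scale) tuples.
    Independence of node and link failures is an assumption here.
    """
    return weibull_reliability(t, *node_params) * weibull_reliability(t, *link_params)
```

With shape equal to 1 the function reduces to the exponential case, which is a convenient sanity check when fitting parameters per time period.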

Supervisory control theory is a well-established theoretical framework for feedback control of discrete event systems whose behaviours are described by automata and formal languages. In this article, we propose a formal constructive method for optimal fault-tolerant scheduling of real-time multiprocessor systems based on supervisory control theory. In particular, we consider a fault-tolerant and schedulable language which is an achievable set of event sequences meeting given deadlines of accepted aperiodic tasks in the presence of processor faults. Such a language eventually provides information on whether a scheduler (i.e., supervisor) should accept or reject a newly arrived aperiodic task. Moreover, we present a systematic way of computing a largest fault-tolerant and schedulable language which is optimal in that it contains all achievable deadline-meeting sequences.

Multiple-context processors provide register resources that allow rapid context switching between several threads as a means of tolerating long communication and synchronization latencies. When scheduling threads on such a processor, we must first decide which threads should have their state loaded into the multiple contexts, and second, which loaded thread is to execute instructions at any given time. In this paper we show that both decisions are important, and that incorrect choices can lead to serious performance degradation. We propose thread prioritization as a means of guiding both levels of scheduling. Each thread has a priority that can change dynamically, and that the scheduler uses to allocate as many computation resources as possible to critical threads. We briefly describe its implementation, and we show simulation performance results for a number of simple benchmarks in which synchronization performance is critical.
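
The two scheduling decisions described above can be illustrated with a small priority-driven sketch: first pick which threads occupy the hardware contexts, then pick which loaded thread runs. The abstract does not describe the actual mechanism, so the policy here (highest priorities loaded, highest loaded priority runs) and the function name `choose_threads` are assumptions for illustration only.

```python
import heapq


def choose_threads(priorities, n_contexts):
    """Two-level choice driven by thread priority.

    priorities: {thread_id: priority}, larger means more critical.
    Level 1: load the n_contexts highest-priority threads.
    Level 2: run the highest-priority thread among those loaded.
    Ties are broken by thread id (assumption, for determinism).
    """
    loaded = heapq.nlargest(
        n_contexts, priorities, key=lambda t: (priorities[t], t)
    )
    running = loaded[0] if loaded else None
    return loaded, running
```

Because priorities can change dynamically, a scheduler built this way would re-run the selection whenever a priority is updated or a running thread stalls.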

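As task scheduling has been studied widely, many new methods, including genetic algorithms, have been introduced into the field. However, traditional genetic algorithms have two serious shortcomings: premature convergence and stagnation in the later stages of evolution. To overcome them, the MPLS algorithm is proposed. MPLS uses the idea of multiple populations evolving together to maintain population diversity, and it introduces the concept of level sets into task scheduling research to improve the speed of iterative convergence. The performance of MPLS is compared with that of the GTMS, MSGS, and NGS algorithms on a third-party test data set. The comparison shows that the schedule lengths obtained by MPLS are far better than those of GTMS and MSGS and slightly better than those of NGS, and that MPLS keeps population diversity at a very high level. MPLS outperforms the other algorithms in both schedule length and population diversity.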

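Datagram loss control is a basic problem in transmitting video over the Internet, and buffer management is the traditional network technique for preventing video datagram loss. To address datagram loss in video communication, a new packet scheduling method is proposed, a video communication system environment is built, and programming experiments are carried out. The packet scheduling method, which runs inside routers and can guarantee the datagram loss requirements both within a video stream and between video streams during video transmission, is simulated and analyzed. The results demonstrate the method's advantages in datagram loss control and efficiency.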

