1.

Fair scheduling of bag-of-tasks applications using distributed Lagrangian optimization

Rémi Bertin, Sascha Hunold, Arnaud Legrand, Corinne Touati 《Journal of Parallel and Distributed Computing》2014

Large scale distributed systems typically comprise hundreds to millions of entities (applications, users, companies, universities) that have only a partial view of resources (computers, communication links). How to fairly and efficiently share such resources between entities in a distributed way has thus become a critical question.

2.

An intelligent query processing for distributed ontologies

Jihyun Lee, Jun-Ki Min 《Journal of Systems and Software》2010,83(1):85-95

In this paper, we propose an intelligent distributed query processing method that takes into account the characteristics of a distributed ontology environment. We suggest more general models of the distributed ontology query, and of the semantic mapping among distributed ontologies, than those in previous work. Our approach uses the semantic mapping to rewrite a distributed ontology query into multiple distributed ontology queries, and obtains the integrated answer by executing these queries. Furthermore, we propose a distributed ontology query processing algorithm with several query optimization techniques: pruning rules to remove unnecessary queries, a cost model considering site load balancing and caching, and a heuristic strategy for scheduling plans to be executed at a local site. Finally, experimental results show that our optimization techniques are effective in reducing the response time.

3.

Research on a new method of query processing in distributed databases

张文东石小艳李明壮夏伟伟《计算机工程与设计》2007,28(19):4600-4602

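In distributed database systems, the distribution and redundancy of data add considerable new content and complexity to distributed query processing. After analyzing existing query processing techniques for distributed databases, a new query processing method is proposed based on practical application needs: it stores the results of frequently used queries locally to reduce the volume of data transferred at query time, thereby shortening the response time. Experiments show that the method is effective.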
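
The idea of keeping frequently used query results at the local site can be sketched as a small least-recently-used cache. This is only an illustration under stated assumptions, not the method from the paper: the class `LocalResultCache`, the `fetch_remote` callback, and the `fake_remote` stand-in are hypothetical names introduced here, and the eviction policy is an assumption.

```python
from collections import OrderedDict


class LocalResultCache:
    """Hypothetical local cache of query results with LRU eviction."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._store = OrderedDict()   # query text -> cached result
        self.remote_calls = 0         # how many times data crossed the network

    def run(self, query, fetch_remote):
        # Cache hit: answer locally and mark the entry as recently used.
        if query in self._store:
            self._store.move_to_end(query)
            return self._store[query]
        # Cache miss: fetch from the remote site, then remember the result.
        result = fetch_remote(query)
        self.remote_calls += 1
        self._store[query] = result
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # drop the least recently used entry
        return result


def fake_remote(query):
    """Stand-in for a remote site (assumption; no real database involved)."""
    return f"rows for {query}"


cache = LocalResultCache(capacity=2)
cache.run("q1", fake_remote)
cache.run("q1", fake_remote)   # served locally, no second transfer
```

A repeated query costs one remote transfer instead of two; the saving depends entirely on how often the workload repeats queries.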

4.

An adaptable distributed query processing architecture

Yongluan Zhou, Beng Chin Ooi, Kian-Lee Tan, Wee Hyong Tok 《Data & Knowledge Engineering》2005,53(3):1-309

Traditionally, distributed query optimization techniques generate static query plans at compile time. However, the optimality of these plans depends on many parameters (such as the selectivities of operations, the transmission speeds and workloads of servers) that are not only difficult to estimate but also often unpredictable and fluctuating at runtime. As the query processor cannot dynamically adjust the plans at runtime, the system performance is often less than satisfactory. In this paper, we introduce a new highly adaptive distributed query processing architecture. Our architecture can quickly detect fluctuations in selectivities of operations, as well as transmission speeds and workloads of servers, and accordingly change the operation order of a distributed query plan during execution. We have implemented a prototype based on the Telegraph system [Telegraph project. Available from >]. Our experimental study shows that our mechanism can adapt itself to changes in the environment and hence approach an optimal plan during execution.

5.

Multihybrid job scheduling for fault-tolerant distributed computing in policy-constrained resource networks

《Computer Networks》2015

Unpredictable fluctuations in resource availability often lead to rescheduling decisions that sacrifice the success rate of job completion in batch job scheduling. To overcome this limitation, we consider the problem of assigning a set of sequential batch jobs with demands to a set of resources with constraints such as heterogeneous rescheduling policies and capabilities. The ultimate goal is to find an optimal allocation such that performance benefits in terms of makespan and utilization are maximized according to the principle of Pareto optimality, while maintaining the job failure rate close to an acceptably low bound. To this end, we formulate a multihybrid policy decision problem (MPDP) on the primary-backup fault tolerance model and theoretically show its NP-completeness. The main contribution is to prove that our multihybrid job scheduling (MJS) scheme guarantees the fault-tolerant performance by adaptively combining jobs and resources with different rescheduling policies in MPDP. Furthermore, through a set of extensive simulations under various scheduling conditions, we demonstrate that the proposed MJS scheme outperforms five rescheduling heuristics in solution quality, searching adaptability and time efficiency.

6.

Multi-objective list scheduling of workflow applications in distributed computing infrastructures

Hamid Mohammadi Fard, Radu Prodan, Thomas Fahringer 《Journal of Parallel and Distributed Computing》2014

Executing large-scale applications in distributed computing infrastructures (DCI), for example modern Cloud environments, involves optimization of several conflicting objectives such as makespan, reliability, energy, or economic cost. Despite this trend, scheduling in heterogeneous DCIs has been traditionally approached as a single or bi-criteria optimization problem. In this paper, we propose a generic multi-objective optimization framework supported by a list scheduling heuristic for scientific workflows in heterogeneous DCIs. The algorithm approximates the optimal solution by considering user-specified constraints on objectives in a dual strategy: maximizing the distance to the user’s constraints for dominant solutions and minimizing it otherwise. We instantiate the framework and algorithm for a four-objective case study comprising makespan, economic cost, energy consumption, and reliability as optimization goals. We implemented our method as part of the ASKALON environment (Fahringer et al., 2007) for Grid and Cloud computing and demonstrate through extensive real and synthetic simulation experiments that our algorithm outperforms related bi-criteria heuristics while meeting the user constraints most of the time.

7.

Multiple query optimization in middleware using query teamwork

K. O'Gorman, A. El Abbadi, D. Agrawal 《Software》2005,35(4):361-391

Multiple concurrent queries occur in many database settings. This paper describes the use of middleware as an optimization tool for such queries. Since common subexpressions derive from common data and the data is usually greatest at the source, the middleware exploits the presence of sharable access patterns to underlying data, especially scans of large portions of tables or indexes, in environments where query queuing or batching is an acceptable approach. The results show that simultaneous queries with such sharable accesses have a tendency to form synchronous groups (teams) which benefit each other through the operation of the disk cache, in effect using it as an implicit pipeline. The middleware exploits this tendency by queuing and scheduling the queries with an algorithm designed to promote such teamwork. This is implemented as middleware for use with a commercial database engine. The results include tests using the query mix from the TPC Benchmark™ R, achieving a speed‐up of 2.34 over the default scheduling provided by one database. Other results show that the success depends on the details of the computing environment. Copyright © 2004 John Wiley & Sons, Ltd.

8.

Case for dynamic deployment in a grid-based distributed query processor

A. Mukherjee, P. Watson 《Future Generation Computer Systems》2012,28(1):171-183

Grid computing enables users to perform computationally expensive applications on distributed resources acquired dynamically. Users can combine structured data and analysis components from distributed sites into new applications. Distributed query processing offers an established way of structuring such computations, and well-known tools like OGSA-DAI and OGSA-DQP provide, respectively, a common interface to heterogeneous databases and a way of exploiting distributed resources. Such significant benefits are, however, often undermined by high communication costs due to the need to move data between distributed resources. This paper describes an approach that addresses this by dynamically deploying query processing engines, analysis services and databases within virtual machines, on an internet scale, so as to reduce communication costs. Results of internet-scale experiments are presented to demonstrate the performance benefits. Further, the use of dynamic deployment features based on requirements allows the creation of an ad-hoc runtime engine and thus opens up the possibility of creating a virtual marketplace for software and hardware resources.

9.

Query trimming optimization for semantic caching

李东叶友谢芳勇《计算机应用研究》2008,25(12):3605-3609

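Query processing is a key issue in semantic caching, but existing query processing algorithms are severely limited both in time and space efficiency and in the complexity of the trimming results, which to some extent restricts the practicality of semantic caching. To overcome these shortcomings, this paper optimizes the trimming process of the semantic cache, reducing futile accesses to the server, and presents a trimming algorithm that generates probe queries and remainder queries. Algorithm analysis proves the effectiveness of the optimization mechanism theoretically, and performance comparisons in simulation experiments also show that the optimized method is clearly superior to the unoptimized one in improving the time and space efficiency of query trimming and in reducing the complexity of remainder queries.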

10.

An efficient distributed rank-aware skyline query processing algorithm

王刚邓波曾玮琳《计算机工程与应用》2008,44(26):162-165

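This paper proposes a novel rank-aware skyline query algorithm for distributed environments (that is, an algorithm that finds the objects that are not "dominated" by any other object and that have high aggregate values). Existing algorithms consume a large amount of network bandwidth when the number of nodes m is large. The new algorithm, Distributed Rank-aware Skylining (DRS), completes in only 4 rounds of interaction on any dataset and reduces communication cost by pruning unnecessary objects. The efficiency of DRS is verified on simulated data. Experiments show that DRS outperforms existing algorithms when the number of nodes m is greater than 4.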
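
The notions of dominance and rank-aware skyline used in this abstract can be illustrated on a single machine. The sketch below is not DRS and involves no distribution or pruning; it assumes that larger attribute values are better and that the aggregate value is the sum of the attributes, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every attribute and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points):
    """Objects that no other object dominates (quadratic-time illustration)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def rank_aware_skyline(points, k, score=sum):
    """Top-k skyline objects ordered by an aggregate score (sum is an assumption)."""
    return sorted(skyline(points), key=score, reverse=True)[:k]


objects = [(6, 1), (4, 4), (1, 5), (3, 3), (2, 2)]
print(skyline(objects))                # [(6, 1), (4, 4), (1, 5)]
print(rank_aware_skyline(objects, 2))  # [(4, 4), (6, 1)]
```

Here (3, 3) and (2, 2) are dropped because (4, 4) dominates both; the remaining objects are then ranked by aggregate value.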

11.

Cloud-aware data intensive workflow scheduling on volunteer computing systems

《Future Generation Computer Systems》2015

Volunteer computing systems offer high computing power to scientific communities for running large data intensive scientific workflows. However, these computing environments provide only a best-effort infrastructure for executing high performance jobs. This work aims to schedule scientific and data intensive workflows on a hybrid of volunteer computing systems and Cloud resources, to enhance the utilization of these environments and increase the percentage of workflows that meet the deadline. The proposed workflow scheduling system partitions a workflow into sub-workflows so as to minimize data dependencies among the sub-workflows. These sub-workflows are then distributed over volunteer resources according to the proximity of resources and the load balancing policy. The execution time of each sub-workflow on the selected volunteer resources is estimated in this phase. If any sub-workflow misses its sub-deadline because of a long waiting time, we consider re-scheduling it onto public Cloud resources. This re-scheduling improves system performance by increasing the percentage of workflows that meet the deadline. The proposed Cloud-aware data intensive scheduling algorithm increases the percentage of workflows that meet the deadline by 75% on average compared with executing the workflows on volunteer resources alone.

12.

ParSA: High-throughput scientific data analysis framework with distributed file system

《Future Generation Computer Systems》2015

Scientific data analysis and visualization have become key components of today's large-scale simulations. Because of rapidly increasing data volumes and awkward I/O patterns among highly structured files, known serial methods and tools cannot scale well and usually perform poorly on traditional architectures. In this paper, we propose a new framework, ParSA (parallel scientific data analysis), for high-throughput and scalable scientific analysis on a distributed file system. ParSA presents optimization strategies for grouping and splitting logical units to exploit the distributed I/O of the file system, for scheduling the distribution of block replicas to reduce network reads, and for maximizing the overlap of data reading, processing, and transferring during computation. In addition, ParSA provides interfaces similar to those of the NetCDF Operator (NCO), which is used in most climate data diagnostic packages, making the framework easy to use. We use ParSA to accelerate well-known analysis methods for climate models on the Hadoop Distributed File System (HDFS). Experimental results demonstrate the high efficiency and scalability of ParSA, which reaches a maximum throughput of 1.3 GB/s on a six-node Hadoop cluster with five disks per node, whereas only 392 MB/s is reached on a RAID-6 storage node.

13.

Adaptive pre-task assignment scheduling strategy for heterogeneous distributed raytracing system

Kalim Qureshi, Paul Manuel 《Computers & Electrical Engineering》2007,33(1):70-78

One of the main obstacles to obtaining high performance from a heterogeneous distributed computing (HDC) system is the inevitable communication overhead. This occurs when tasks executing on different computing nodes exchange data or when the assigned sub-task size is very small. In this paper, we present an adaptive pre-task assignment (APA) strategy for a heterogeneous distributed raytracing system. In this strategy, the master assigns a pre-task to each node. The size of the sub-task for each node is proportional to the node’s performance. One of the main features of this strategy is that it reduces inter-process communication, the overhead of node idle time, and the load imbalance that normally occur in traditional runtime task scheduling (RTS) strategies. The performance of the RTS and APA strategies is evaluated on a manager/master and workers model of an HDC system. The experimental results show that the proposed APA strategy significantly improves performance over the RTS strategy.
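
The statement that each node's sub-task size is proportional to its performance can be made concrete with a small partitioning routine. This is a minimal sketch under assumptions, not the APA implementation: it assumes work is divided into indivisible units (for example image rows), that node performance is given as positive integers, and that leftover units go to the nodes with the largest remainders; `proportional_split` is a hypothetical name.

```python
def proportional_split(total_units, speeds):
    """Split total_units among nodes in proportion to integer speeds.

    Uses integer arithmetic plus the largest-remainder rule so that the
    sizes always add up to total_units.
    """
    total_speed = sum(speeds)
    sizes = [total_units * s // total_speed for s in speeds]
    remainders = [total_units * s % total_speed for s in speeds]
    leftover = total_units - sum(sizes)
    # Hand out the units lost to rounding, largest fractional share first.
    by_remainder = sorted(range(len(speeds)), key=lambda i: remainders[i], reverse=True)
    for i in by_remainder[:leftover]:
        sizes[i] += 1
    return sizes


# 600 image rows over three workers whose measured speeds are in ratio 1:2:3.
print(proportional_split(600, [1, 2, 3]))  # [100, 200, 300]
```

Assigning all work up front in this way trades adaptability for fewer master-worker messages, which is the trade-off the abstract describes against runtime task scheduling.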

14.

Multi-criteria and satisfaction oriented scheduling for hybrid distributed computing infrastructures

《Future Generation Computer Systems》2016

Assembling and simultaneously using different types of distributed computing infrastructures (DCI) like Grids and Clouds is an increasingly common situation. Because infrastructures are characterized by different attributes such as price, performance, trust, and greenness, the task scheduling problem becomes more complex and challenging. In this paper we present the design of a fault-tolerant and trust-aware scheduler, which allows Bag-of-Tasks applications to be executed on elastic and hybrid DCI, following user-defined scheduling strategies. Our approach, named Promethee scheduler, combines a pull-based scheduler with the multi-criteria Promethee decision making algorithm. Because multi-criteria scheduling multiplies the possible scheduling strategies, we propose SOFT, a methodology for finding the optimal scheduling strategies given a set of application requirements. The method is validated with a simulator that fully implements the Promethee scheduler and recreates a hybrid DCI environment including Internet Desktop Grid, Cloud and Best Effort Grid based on real failure traces. A set of experiments shows that the Promethee scheduler is able to maximize user satisfaction expressed according to three distinct criteria (price, expected completion time, and trust) while maximizing useful employment of the infrastructure from the resource owners' point of view. Finally, we present an optimization that bounds the computation time of the Promethee algorithm, making it realistic to integrate the scheduler into a wide range of resource management software.

15.

A distributed switch scheduling algorithm

Petar 《Performance Evaluation》2007,64(9-12):1053-1061

The maximum weight matching algorithm is a high-performance scheduling algorithm for cross-bar switches. It is known that it performs optimally under heavy loads. However, its centralized nature and high computational complexity limit the algorithm’s applicability. This paper presents a randomized algorithm for distributed switch scheduling that is capable of delivering high throughput.
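
For orientation, the centralized maximum weight matching baseline that this abstract contrasts with can be sketched by brute force. This is not the paper's randomized distributed algorithm; it assumes an n×n switch whose weights are queue lengths (entry `[i][j]` is the backlog from input i to output j), and it enumerates all n! matchings, which is only feasible for very small n.

```python
from itertools import permutations


def max_weight_matching(queue_lengths):
    """Return (matching, weight) where matching[i] is the output served by input i."""
    n = len(queue_lengths)

    def weight(perm):
        return sum(queue_lengths[i][perm[i]] for i in range(n))

    # Each permutation is one conflict-free input-to-output assignment.
    best = max(permutations(range(n)), key=weight)
    return list(best), weight(best)


backlog = [
    [1, 9, 0],
    [8, 2, 0],
    [0, 3, 4],
]
print(max_weight_matching(backlog))  # ([1, 0, 2], 21)
```

The factorial search makes the cost of centralization visible, which is the motivation the abstract gives for a distributed alternative.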

16.

Research on scan scheduling for MPP databases based on distributed file systems

郭凯龚才鑫龚奕利雷迎春《计算机工程与应用》2018,54(13):84-87

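MPP (massively parallel processing) databases built on distributed file systems are a current research hotspot. To improve the process of scheduling execution units to read data blocks before a query scan operation is executed, a node-load-based scheduling strategy, NLS, is proposed. The strategy combines data locality with node load: local-read assignment ensures that the scheduling result has good data locality, and the intermediate scheduling result is then reassigned and adjusted according to the real-time workload of the nodes, with the goal of reducing the completion time of data scan operations. Experimental results show that, compared with the continuity scheduling strategy FCS, NLS keeps data locality above 90% while improving completion time by up to 32%, with an average improvement of 25% over the 9 cases tested.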

17.

A distributed spatial query optimization method based on a query optimizer

林键刘仁义刘南张丰《计算机工程与应用》2012,48(22):161-165

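To achieve interoperability among distributed spatial databases, distributed queries need to be optimized; in such query processing, any data processing statement accesses data on every node rather than only on the node that initiated the query. An architecture for a query optimizer is proposed, and the optimization of such queries is discussed in detail, with emphasis on complex spatial queries involving spatial selections and joins. A case program for a typical spatial database is built, and analysis shows that the query optimizer with filtering and refinement has a clear advantage in time and space efficiency, yielding results of reference value.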

18.

Proactive scheduling in distributed computing—A reinforcement learning approach

Zhao Tong, Zheng Xiao, Kenli Li, Keqin Li 《Journal of Parallel and Distributed Computing》2014

In distributed computing such as grid computing, online users submit their tasks anytime and anywhere to dynamic resources. Task arrival and execution processes are stochastic. How to adapt to the consequent uncertainties, as well as to scheduling overhead and response time, is the main concern in dynamic scheduling. Based on decision theory, scheduling is formulated as a Markov decision process (MDP). To address this problem, an approach from machine learning is used to learn task arrival and execution patterns online. The proposed algorithm can automatically acquire such knowledge without any prior modeling, and proactively allocate tasks by taking into account the forthcoming tasks and their execution dynamics. Compared with four classic algorithms (Min–Min, Min–Max, Suffrage, and ECT), the proposed algorithm has much less scheduling overhead. Experiments in both synthetic and practical environments reveal that the proposed algorithm outperforms the other algorithms in terms of average response time. The smaller variance of the average response time further validates the robustness of our algorithm.

19.

Multiple continuous query optimization techniques for data streams

赵宗敏王洋吴海涛《计算机应用》2009,29(Z2)

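Based on the characteristics of data streams (continuous arrival, unbounded size, and strong real-time requirements), the basic concept of multiple continuous queries over data streams is introduced. According to the characteristics of multiple continuous queries and the needs of users, multiple continuous query optimization techniques are divided into single-stream multi-query and multi-stream multi-query. Filter-type multiple continuous query optimization for a single stream and sharing-based multiple continuous query optimization for multiple streams are discussed in detail; through a comprehensive and systematic analysis of the basic idea of each optimization algorithm, the advantages, disadvantages, and applicable scenarios of each query technique are identified.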

20.

Competitive distributed decision-making

Xiaotie Deng, C. H. Papadimitriou 《Algorithmica》1996,16(2):133-150

We study several natural problems in distributed decision-making from the standpoint of competitive analysis; in these problems incomplete information is a result of the distributed nature of the problem, as opposed to the on-line mode of decision making that was heretofore prevalent in this area. In several simple situations of distributed scheduling, the competitive ratio can be computed exactly, and the different ratios can be used as a measure of the value of information and communication between decision-makers. In a more general distributed scheduling situation, we give tight upper and lower bounds on the competitive ratio achievable in the deterministic case, and give an optimal randomized algorithm with a much better competitive ratio. The research of Xiaotie Deng was supported by an NSERC grant and that of C. H. Papadimitriou was supported by an NSF grant.