20 similar documents found (search time: 15 ms)
1.
2.
Extract-transform-load (ETL) workflows model the population of enterprise data warehouses with information gathered from a large variety of heterogeneous data sources. ETL workflows are complex design structures that run under strict performance requirements and their optimization is crucial for satisfying business objectives. In this paper, we deal with the problem of scheduling the execution of ETL activities (a.k.a. transformations, tasks, operations), with the goal of minimizing ETL execution time and allocated memory. We investigate the effects of four scheduling policies on different flow structures and configurations and experimentally show that the use of different scheduling policies may improve ETL performance in terms of memory consumption and execution time. First, we examine a simple, fair scheduling policy. Then, we study the pros and cons of two other policies: the first opts for emptying the largest input queue of the flow and the second for activating the operation (a.k.a. activity) with the maximum tuple consumption rate. Finally, we examine a fourth policy that combines the advantages of the latter two in synergy with flow parallelization.
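The two queue-driven policies named in this abstract can be illustrated with a minimal sketch. The `Operation` class, its constant `rate` field, and the function names are hypothetical and chosen only for illustration; this is not the paper's implementation, and the fair policy and the combined fourth policy are not modeled.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Operation:
    """Hypothetical ETL activity with an input queue of pending tuples."""
    name: str
    rate: int  # assumed constant: tuples consumed per activation
    queue: deque = field(default_factory=deque)


def pick_largest_queue(ops):
    """Policy sketch: activate the operation whose input queue is largest."""
    ready = [op for op in ops if op.queue]
    return max(ready, key=lambda op: len(op.queue)) if ready else None


def pick_max_rate(ops):
    """Policy sketch: activate the ready operation with the highest consumption rate."""
    ready = [op for op in ops if op.queue]
    return max(ready, key=lambda op: op.rate) if ready else None


def activate(op):
    """Consume up to op.rate tuples from the queue; return how many were consumed."""
    n = min(op.rate, len(op.queue))
    for _ in range(n):
        op.queue.popleft()
    return n
```

With a fast `filter` operation holding 3 tuples and a slow `join` operation holding 10, `pick_largest_queue` selects `join` while `pick_max_rate` selects `filter`, which shows how the two policies can disagree on the same flow state.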
3.
Scheduling multiprocessor tasks with genetic algorithms (cited by: 4; self-citations: 0; citations by others: 4)
Correa R.C. Ferreira A. Rebreyend P. 《Parallel and Distributed Systems, IEEE Transactions on》1999,10(8):825-837
In the multiprocessor scheduling problem, a given program is to be scheduled in a given multiprocessor system such that the program's execution time is minimized. This problem being very hard to solve exactly, many heuristic methods for finding a suboptimal schedule exist. We propose a new combined approach, where a genetic algorithm is improved with the introduction of some knowledge about the scheduling problem represented by the use of a list heuristic in the crossover and mutation genetic operations. This knowledge-augmented genetic approach is empirically compared with a “pure” genetic algorithm and with a “pure” list heuristic, both from the literature. Results of the experiments carried out with synthetic instances of the scheduling problem show that our knowledge-augmented algorithm produces much better results in terms of quality of solutions, although it is slower in terms of execution time.
4.
Ho-Fung Leung Hing-Fung Ting 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(5):538-543
In the literature, the problem of global termination detection in parallel systems is usually solved by message passing. In shared-memory systems, this problem can also be solved by using exclusively accessible variables with locking mechanisms. In this paper, we present an algorithm that solves the problem of global termination detection in shared-memory asynchronous multiprocessor systems without using locking. We assume a reasonable computation model in which concurrent reading does not require locking and concurrent writing of different values without locking results in an arbitrary one of the values being actually written. For a system of n processors, the algorithm allocates a working space of 2n+1 bits. The worst-case time complexity of the algorithm is n + 2√n + 1, which we prove is the lower bound under a reasonable model of computation.
5.
《The Journal of Logic Programming》1991,10(2):155-178
This paper presents the implementation and performance results of an and-parallel execution model of logic programs on a shared-memory multiprocessor. The execution model is meant for logic programs with “don't-know nondeterminism”, and handles binding conflicts by dynamically detecting dependencies among literals. The model also incorporates intelligent backtracking at the clause level. Our implementation of this model is based upon the Warren Abstract Machine (WAM); hence it retains most of the efficiency of the WAM for sequential segments of logic programs. Performance results on Sequent Balance 21000 show that on suitable programs, our parallel implementation can achieve linear speedup on dozens of processors. We also present an analysis of different overheads encountered in the implementation of the execution model.
6.
Tamás Kis 《Theoretical computer science》2009,410(47-49):4864-4873
7.
Nikolaos M. Missirlis 《Parallel Computing》1987,5(3):295-302
The paper describes the implementation of the Successive Overrelaxation (SOR) method on an asynchronous multiprocessor computer for solving large, linear systems. The parallel algorithm is derived by dividing the serial SOR method into noninterfering tasks which are then combined with an optimal schedule of a feasible number of processors. The important features of the algorithm are that it (i) achieves a speedup S_p of O(N/3) and an efficiency E_p of 2/3 using P = [N/2] processors, where N is the number of equations, (ii) contains a high level of inherent parallelism, while the convergence theory of the parallel SOR method is the same as that of its sequential counterpart, and (iii) may be modified to use block methods in order to minimise the overhead due to communication and synchronisation of the processors.
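The efficiency quoted in this abstract follows from the quoted speedup. As a short check, under the assumptions that [N/2] can be treated as N/2 for large N and that the speedup of order N/3 is read as roughly N/3:

```latex
E_P \;=\; \frac{S_P}{P} \;\approx\; \frac{N/3}{N/2} \;=\; \frac{2}{3}
```

So about two thirds of the processor time does useful work, independent of the problem size N.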
8.
《Performance Evaluation》1994,20(4):361-371
In the classical scheduling theory it is widely assumed that any task requires for its processing only one processor at a time. In this paper the problem of deterministic scheduling of tasks requiring for their processing more than one processor at a time, i.e., a constant set of dedicated processors, is analyzed. Schedule length is assumed to be a performance measure. Tasks are assumed to be preemptable and independent. Low order polynomial algorithms for simple cases of the problem are given. Then a method to solve the general version of the problem for a limited number of processors is presented, while the case of an arbitrary number of processors is known to be NP-hard. Finally, a version of the problem, where besides processors every task can also require additional resources, is considered.
9.
Real-time systems (RTS) are omnipresent in several domains. The trend is to use multiprocessor architectures to satisfy the timing constraints of such systems. Model-checking methods have proven useful for making the development process reliable at a high abstraction level. Based on this approach, the present paper proposes a new technique for scheduling analysis of a partitioned multiprocessor RTS. Starting from a model of the system built with dynamic-priority time Petri nets, we propose the generation of a reduced state graph. The schedulability is then checked through the properties of this graph. Our approach provides an implementation of a Partition Checker tool, which produces an affirmation of the schedulability or, in the case of a non-schedulable system, a counterexample, in order to reduce the SW/HW space exploration.
10.
《Information and Software Technology》2005,47(8):565-574
Various toolkits exist today for the distributed execution of computational algorithms on clusters of machines. These toolkits are often referred to by the terms ‘Grid Toolkits’, ‘Job Execution Environments’, and ‘Problem Solving Environments (PSEs)’. Here, we introduce iJob, an Internet-based job execution environment that sets out to meet many of the goals of PSEs, such as providing facilities and services to solve a class of problems. In addition, the iJob software allows execution of computational algorithms utilizing standard Internet technologies such as Java, XML, and asynchronous communication protocols. The goals of this project include: (1) deploying the toolkit easily to multiple platforms using the Java technologies; (2) running multiple types of algorithms and supporting multiple users simultaneously; (3) providing a web-based GUI for monitoring and controlling the status of jobs; and (4) providing security at both the user level and the network level. The toolkit has been tested using several simulation codes on pools of Windows 2000 and Solaris systems.
11.
N. S. Kovalenko 《Cybernetics and Systems Analysis》1998,34(5):759-765
Conclusion. The mathematical model of execution of asynchronous competing processes in a macropipelined MS considered in this article makes it possible to estimate the minimum overall execution time of given volumes of computation and to find the optimal balancing of transfer and computing, the ratio of the number of processors and channels in the MS. Moreover, the proposed mathematical model and the derived balancing conditions fully corroborate the basic principle of macropipelined computation advanced previously by Glushkov [6]. This principle states that when the work is allocated to processors, each processor is assigned in the current step a task that will keep it busy for the longest possible time without requiring interaction with other processors.
Further research of this model can proceed in several directions. First, it is very interesting to determine the total idle time of the processors due to busy channels, and also the “idle” time of transfer blocks. Second, it is relevant to calculate the efficiency of the macropipelined method of computation. A similar study of efficiency estimates has been previously conducted in [7]. Third, it is necessary to derive formulas for the total computing time and the corresponding balancing conditions for other classes of competing processes and various operating regimes of channels and processors.
Translated from Kibernetika i Sistemnyi Analiz, No. 5, pp. 150–158, September–October, 1998.
12.
We consider many-core processors with a task-graph oriented programming model, whereby scheduling constraints among tasks are decided offline, and are then enforced by the runtime system using dedicated hardware. Here, exposing and beneficially exploiting fine-grain data and control parallelism is increasingly important. Therefore, high expressive power for stating such constraints/directives, along with the ability to implement them in fast, simple hardware, is critical for success. In this paper, we focus on the relationship among different duplicable (multi-instance) tasks, which are used to express and exploit data parallelism. We extend the conventional Start-After-Complete (precedence) constraint to also be usable between replicas of different such tasks rather than only between entire tasks, thereby increasing the exposable parallelism. Additionally, we propose the parameterized Start-After-Start constraint, which can be used to control the degree of “lockstep” among multiple such tasks, e.g., in order to improve cache performance when the tasks work on the same data. Also, we briefly describe several additional interesting directives. Finally, we show that the directives can be supported efficiently in hardware. Hypercore, a very efficient CREW PRAM-like shared-cache architecture, which is very challenging because it has extremely fast dispatching for basic constraints, is used in the discussion. However, the new directives have broader applicability. The possibility of simple implementation and the indications of benefit motivate further exploration of these directives and their implementation in hardware, as well as their support by programming tools.
13.
14.
Scheduling multiprocessor job with resource and timing constraints using neural networks (cited by: 1; self-citations: 0; citations by others: 1)
Yueh-Min Huang Ruey-Maw Chen 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》1999,29(4):490-502
The Hopfield neural network is extensively applied to obtaining an optimal/feasible solution in many different applications such as the traveling salesman problem (TSP), a typical discrete combinatorial problem. Although it provides rapid convergence to a solution, the network frequently converges to a local minimum when applied to the TSP. Stochastic simulated annealing is a highly effective means of obtaining an optimal solution and is capable of avoiding local minima. This important feature is embedded into a Hopfield neural network to derive a new technique, i.e., mean field annealing. This work applies the Hopfield neural network and the normalized mean field annealing technique, respectively, to resolve a multiprocessor problem (known to be an NP-hard problem) with no process migration, constrained times (execution time and deadline) and limited resources. Simulation results demonstrate that the derived energy function works effectively for this class of problems.
15.
The use of parallel operations in automation, such as part fabrication and assembly, computation and control of industrial robots, etc., may yield a minimum production time and thereby increase productivity. However, the coupling between consecutive phases of operations results in series-parallel precedence constraints that may create unavoidable idle time intervals during the operations. To solve the problem, idle time intervals must be broken down and reduced. An algorithm that determines a minimum time-ordered schedule for the parallel operations is developed based on the Program Evaluation and Review Technique using the method of “variable” branch and bound.
16.
Growth in availability of data collection devices has allowed individual researchers to gain access to large quantities of data that needs to be analyzed. As a result, many labs and departments have acquired considerable compute resources. However, effective and efficient utilization of those resources remains a barrier for the individual researchers because the distributed computing environments are difficult to understand and control. We introduce a methodology and a tool that automatically manipulates and understands job submission parameters to realize a range of job execution alternatives across a distributed compute infrastructure. Generated alternatives are presented to a user at the time of job submission in the form of tradeoffs mapped onto two conflicting objectives, namely job cost and runtime. Such presentation of job execution alternatives allows a user to immediately and quantitatively observe viable options regarding their job execution, and thus allows the user to interact with the environment at a true service level. Generated job execution alternatives have been tested through simulation and on real-world resources and, in both cases, the average accuracy of the runtime of the generated and perceived job alternatives is within 5%.
17.
As the impact of the communication architecture on performance grows in a Multiprocessor System-on-Chip (MPSoC) design, the need for performance analysis in the early stage in order to consider various communication architectures is also increasing. While a simulation is commonly performed for performance evaluation of an MPSoC, it often suffers from a lengthy run time as well as poor performance coverage due to limited input stimuli or their ad hoc applications. In this paper, we propose a novel system-level performance analysis method to estimate the performance distribution of an MPSoC. Our approach consists of two techniques: (1) analytical model of on-chip crossbar-based communication architectures and (2) enumeration of task-level execution time variations for a target application. The execution time variation of tasks is efficiently captured by a memory access workload model. Thus, the proposed approach leads to better performance coverage for an MPSoC application in a reasonable computation time than the simulation-based approach. The experimental results validate the accuracy, efficiency, and practical usage of the proposed approach.
18.
A single server is assigned to M parallel queues with independent Poisson arrivals. Service times are constant, but the server has the opportunity to initiate service at a given queue only at times forming a Poisson process. Four related scheduling policies are investigated: a simple first-come, first-serve policy for which the stability region is determined; a policy with maximum throughput, but requiring the server to have advance knowledge of service opportunities; a policy of threshold type, which is shown to be optimal among nonlookahead policies with preemption; and an adaptive policy, which when M=2 is shown to provide stability for all arrival rate vectors for which stability is possible under any nonlookahead policy with preemption. The work is motivated by the problem of transmission scheduling for a packet-switched, low-altitude, multiple-satellite system.
19.
Scheduling with dynamic voltage/speed adjustment using slack reclamation in multiprocessor real-time systems (cited by: 1; self-citations: 0; citations by others: 1)
Zhu D. Melhem R. Childers B.R. 《Parallel and Distributed Systems, IEEE Transactions on》2003,14(7):686-700
The high power consumption of modern processors becomes a major concern because it leads to decreased mission duration (for battery-operated systems), increased heat dissipation, and decreased reliability. While many techniques have been proposed to reduce power consumption for uniprocessor systems, there has been considerably less work on multiprocessor systems. In this paper, based on the concept of slack sharing among processors, we propose two novel power-aware scheduling algorithms for task sets with and without precedence constraints executing on multiprocessor systems. These scheduling techniques reclaim the time unused by a task to reduce the execution speed of future tasks and, thus, reduce the total energy consumption of the system. We also study the effect of discrete voltage/speed levels on the energy savings for multiprocessor systems and propose a new scheme of slack reservation to incorporate voltage/speed adjustment overhead in the scheduling algorithms. Simulation and trace-based results indicate that our algorithms achieve substantial energy savings on systems with variable voltage processors. Moreover, processors with a few discrete voltage/speed levels obtain nearly the same energy savings as processors with continuous voltage/speed, and the effect of voltage/speed adjustment overhead on the energy savings is relatively small.
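The idea of reclaiming unused time to slow down future tasks can be sketched on a single processor. This is a deliberately simplified illustration, not the paper's algorithms: slack sharing among processors, precedence constraints, discrete speed levels, and adjustment overhead are all left out, and the function name, the `(wcet, actual)` task format, and the normalized speed scale (1.0 is full speed) are assumptions made here.

```python
def reclaim_slack(tasks):
    """Greedy slack reclamation for tasks run back to back on one processor.

    tasks: list of (wcet, actual) pairs, both measured at full speed,
           with 0 < actual <= wcet (hypothetical input format).
    Returns the normalized speed (0 < s <= 1.0) chosen for each task.
    """
    slack = 0.0
    speeds = []
    for wcet, actual in tasks:
        # The task may occupy its own worst-case slot plus inherited slack.
        budget = wcet + slack
        # Slowest speed that still fits the worst case into the budget.
        speed = min(1.0, wcet / budget)
        # Time actually spent at the reduced speed.
        used = actual / speed
        # Whatever is left over is passed on to the next task.
        slack = max(0.0, budget - used)
        speeds.append(speed)
    return speeds
```

For example, with tasks `[(4.0, 2.0), (4.0, 4.0), (2.0, 1.0)]` the first task runs at full speed and finishes 2 time units early, so the second task can run at two thirds of full speed and still meet its original finish time.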
20.
N. V. Kolesov M. V. Tolmacheva P. V. Yukhta 《Journal of Computer and Systems Sciences International》2012,51(5):636-647
An approach to scheduling computational processes in real-time distributed computing systems is considered. It is assumed that the task execution time is known inexactly; more precisely, it is assumed to belong to a certain time interval. The problem is formulated as the scheduling of jobs of which each is characterized by its priority and consists of a set of tasks (with respect to the number of processors) executing on different processors and associated by a hierarchical precedence relationship. The proposed approach is based on algorithms with low computational complexity for suboptimal scheduling of equal-priority tasks.