Simultaneous Scheduling of Replication and Computation for Data-Intensive Applications on the Grid |
| |
Authors: | Frédéric Desprez, Antoine Vernois |
| |
Affiliation: | (1) LIP Laboratory/GRAAL Project, UMR CNRS, ENS Lyon, INRIA, Univ. Claude Bernard Lyon 1, 46 Allée d'Italie, F-69364 Lyon Cedex 07, France |
| |
Abstract: | Managing large datasets has become a major application of Grids. Life science applications usually manage large databases that must be replicated for the applications to scale. The growing number of users and the ease of access to Internet-based applications have stressed Grid middleware. Such environments are thus asked to manage data and schedule computation tasks at the same time. These two important operations have to be tightly coupled. This paper presents an algorithm (Scheduling and Replication Algorithm, SRA) that combines data management and scheduling using a steady-state approach. Using a model of the platform, the number of requests as well as their distribution, and the number and size of databases, we define a linear program to satisfy all the constraints at every level of the platform in steady state. The solution of this linear program gives a placement for the databases on the servers and, for each kind of job, the server on which it should be executed. Our theoretical results are validated using simulation and logs from a large life science application. This work was supported in part by the ACI GRID and Grid5000 projects of the French Department of Research. |
| |
Keywords: | bioinformatics applications; data management; Grid computing; scheduling |
This article is indexed in SpringerLink and other databases.
|
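To make the idea in the abstract concrete, the sketch below couples database placement and steady-state throughput on a toy platform. It is not the SRA algorithm and it does not build the paper's linear program: it swaps in an exhaustive search over replica sets, which only works for a handful of servers and databases. All names (`max_throughput`, `best_placement`, `sizes`, `storage`, `speeds`, `work`) and all numbers are hypothetical, chosen for illustration. The model assumes each job needs exactly one database, a server can run a job only if it holds that database, and work can be split fractionally across servers.

```python
from itertools import combinations, product


def max_throughput(placement, speeds, work):
    """Largest total job rate sustainable in steady state (illustrative model).

    placement[d] is the set of servers holding database d, speeds[i] is the
    compute rate of server i, and work[d] is the work generated per unit of
    total throughput by jobs needing d (request share times cost per job,
    assumed positive). By a max-flow/min-cut argument, a rate T is feasible
    exactly when every group of databases fits within the combined speed of
    the servers holding at least one database of the group.
    """
    dbs = list(placement)
    best = float("inf")
    for r in range(1, len(dbs) + 1):
        for group in combinations(dbs, r):
            hosts = set().union(*(placement[d] for d in group))
            capacity = sum(speeds[i] for i in hosts)
            demand = sum(work[d] for d in group)
            best = min(best, capacity / demand)
    return best


def best_placement(sizes, storage, speeds, work):
    """Try every replica set per database; keep the feasible placement
    with the highest steady-state throughput."""
    servers = range(len(speeds))
    # Every non-empty subset of servers is a candidate replica set.
    options = [frozenset(c)
               for r in range(1, len(speeds) + 1)
               for c in combinations(servers, r)]
    best = (0.0, None)
    for choice in product(options, repeat=len(sizes)):
        placement = dict(zip(sizes, choice))
        # Storage constraint: replicas on a server must fit on its disk.
        used = [sum(sizes[d] for d in sizes if i in placement[d])
                for i in servers]
        if any(u > cap for u, cap in zip(used, storage)):
            continue
        rate = max_throughput(placement, speeds, work)
        if rate > best[0]:
            best = (rate, placement)
    return best


# Hypothetical toy instance: two servers, two databases.
sizes = {"A": 4, "B": 4}        # database sizes (storage units)
storage = [5, 8]                # disk capacity per server
speeds = [10, 20]               # compute rate per server
work = {"A": 1.0, "B": 3.0}     # request share x cost per job

rate, placement = best_placement(sizes, storage, speeds, work)
print(rate, {d: sorted(s) for d, s in placement.items()})
```

On this toy instance the search keeps database A on the larger server only and replicates B on both servers, for a total rate of 7.5 jobs per time unit. A linear-programming formulation, as described in the abstract, replaces the enumeration with constraints solved at once and is what makes the approach practical at scale.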