Simultaneous Scheduling of Replication and Computation for Data-Intensive Applications on the Grid
Authors: Frédéric Desprez, Antoine Vernois
Affiliation: (1) LIP Laboratory/GRAAL Project, UMR CNRS, ENS Lyon, INRIA, Univ. Claude Bernard Lyon 1, 46 Allée d'Italie, F-69364 Lyon Cedex 07, France
Abstract: Managing large datasets has become a major application of Grids. Life science applications usually manage large databases that must be replicated for the applications to scale. The growing number of users and the easy access to Internet-based applications have put Grid middleware under stress. Such environments are thus asked to manage data and schedule computation tasks at the same time, and these two operations have to be tightly coupled. This paper presents an algorithm (Scheduling and Replication Algorithm, SRA) that combines data management and scheduling using a steady-state approach. Using a model of the platform, the number of requests and their distribution, and the number and size of the databases, we define a linear program that satisfies all the constraints at every level of the platform in steady state. The solution of this linear program gives a placement of the databases on the servers and, for each kind of job, the server on which it should be executed. Our theoretical results are validated using simulation and logs from a large life science application. This work was supported in part by the ACI GRID and Grid5000 projects of the French Department of Research.
Keywords: bioinformatics applications, data management, Grid computing, scheduling
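
The abstract describes a linear program whose solution yields both a database placement and a per-job-kind server assignment. The paper's actual formulation is not reproduced on this page, so the following is only a minimal sketch of the kind of steady-state constraints such a program might enforce: storage capacity per server, compute capacity per server, and the rule that a job can only run where its database is replicated. The function name `check_steady_state`, all parameter names, and the sample numbers are assumptions made for illustration; they do not come from the paper.

```python
from typing import Dict, Set, Tuple


def check_steady_state(
    storage: Dict[str, float],             # server -> storage capacity
    compute: Dict[str, float],             # server -> compute units per second
    db_size: Dict[str, float],             # database -> size
    job_db: Dict[str, str],                # job kind -> database it needs
    job_cost: Dict[str, float],            # job kind -> compute units per job
    placement: Dict[str, Set[str]],        # server -> databases replicated on it
    rates: Dict[Tuple[str, str], float],   # (job kind, server) -> jobs per second
) -> bool:
    """Return True if a candidate placement and schedule respect all constraints.

    Hypothetical constraint set, assumed for illustration only.
    """
    # Storage constraint: replicas on a server must fit in its storage.
    for server, dbs in placement.items():
        if sum(db_size[d] for d in dbs) > storage[server]:
            return False

    # Accumulate the compute load each server carries per second.
    load = {server: 0.0 for server in compute}
    for (kind, server), rate in rates.items():
        if rate < 0:
            return False
        # Coupling constraint: a job runs only where its database is present.
        if rate > 0 and job_db[kind] not in placement.get(server, set()):
            return False
        load[server] += rate * job_cost[kind]

    # Compute constraint: steady-state load must not exceed server power.
    return all(load[s] <= compute[s] for s in compute)


if __name__ == "__main__":
    # Made-up platform: two servers, two databases, two job kinds.
    ok = check_steady_state(
        storage={"s1": 10, "s2": 5},
        compute={"s1": 4, "s2": 2},
        db_size={"A": 6, "B": 4},
        job_db={"blastA": "A", "blastB": "B"},
        job_cost={"blastA": 2.0, "blastB": 1.0},
        placement={"s1": {"A", "B"}, "s2": {"B"}},
        rates={("blastA", "s1"): 1.5, ("blastB", "s1"): 1.0, ("blastB", "s2"): 2.0},
    )
    print(ok)
```

A linear-programming solver would search over `placement` and `rates` to maximize throughput; this sketch only verifies a given candidate, which keeps it free of external dependencies.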
This article is indexed in SpringerLink and other databases.