Searching for Software on the EGEE Infrastructure |
| |
Authors: | George Pallis, Asterios Katsifodimos, Marios D. Dikaiakos |
| |
Affiliation: | (1) School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland |
| |
Abstract: | Several large-scale Grid infrastructures are currently in operation around the world, federating an impressive collection of computational resources, a wide variety of application software, and hundreds of user communities. To better serve current and prospective users of Grid infrastructures, it is important to develop advanced software retrieval services that help users locate software components suited to their needs. In this paper, we present the design and implementation of Minersoft, a distributed, multi-threaded harvester for application software located in large-scale Grid infrastructures. Minersoft crawls the sites of a Grid infrastructure, discovers installed software resources, annotates them with keyword-rich metadata, and creates inverted indexes that can be used to support full-text software retrieval. We present insights derived from the implementation and deployment of Minersoft on EGEE, one of the largest Grid production services currently in operation. Experimental results show that Minersoft achieves high performance in crawling EGEE sites and discovering software-related files, and high efficiency in supporting software retrieval. |
| |
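The abstract mentions annotating software files with keyword-rich metadata and building inverted indexes for full-text retrieval. The following is a minimal sketch of that general idea only, not Minersoft's actual implementation: the function names, the file paths, and the metadata strings are all hypothetical, and the query semantics (every term must match) are an assumption made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each term to the set of file paths whose metadata contains it.

    `documents` maps a file path to its keyword metadata string
    (hypothetical input format, not taken from the paper).
    """
    index = defaultdict(set)
    for path, metadata in documents.items():
        for term in tokenize(metadata):
            index[term].add(path)
    return index


def search(index, query):
    """Return the paths whose metadata contains every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical software files and metadata, for illustration only.
software_files = {
    "/opt/example/bin/fft_solver": "FFT solver binary numerical library",
    "/opt/example/lib/libmatrix.so": "matrix algebra shared library",
    "/opt/example/doc/README": "documentation for the matrix solver",
}

index = build_inverted_index(software_files)
matches = search(index, "matrix library")
```

In this sketch a query is answered by intersecting the posting sets of its terms, so lookup cost depends on the number of matching files rather than on the total number of files crawled.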
This article is indexed in SpringerLink and other databases. |
|