Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
A Grid environment can be viewed as a virtual computing architecture that provides higher-throughput computing by taking advantage of many geographically dispersed computers connected by a network. Bioinformatics applications stand to gain in such a distributed environment in terms of increased availability, reliability and efficiency of computational resources. Considerable research is already in progress on applying parallel computing techniques to bioinformatics methods such as multiple sequence alignment, gene expression analysis and phylogenetic studies. To cope with the dimensionality issue, most machine learning methods either focus on specific groups of proteins or reduce the size of the original data set and/or the number of attributes involved. Grid computing could potentially provide an alternative solution to this problem by combining multiple approaches in a seamless way. In this paper we introduce a unifying methodology coupling the strengths of the Grid with the specific needs and constraints of the major bioinformatics approaches. We also present a tool that implements this process and allows researchers to assess the computational needs for a specific task and optimize the allocation of available resources for its efficient completion.

2.
Improvements in the performance of processors and networks have made it feasible to treat collections of workstations, servers, clusters and supercomputers as integrated computing resources, or Grids. However, the very heterogeneity that is the strength of computational and data Grids can also make application development for such an environment extremely difficult. Application development in a Grid computing environment faces significant challenges in the form of problem granularity, latency and bandwidth issues, and job scheduling. Existing Grid technologies limit the development of Grid applications to certain classes, namely embarrassingly parallel, hierarchically parallel, workflow and database applications. Of these classes, embarrassingly parallel applications are the easiest to develop in a Grid computing framework. The work presented here deals with creating a Grid‐enabled, high‐throughput, standalone version of a bioinformatics application, BLAST, using Globus as the Grid middleware. BLAST is a sequence alignment and search technique that is embarrassingly parallel in nature and thus amenable to adaptation to a Grid environment. A detailed methodology for creating the Grid‐enabled application is presented, which can be used as a template for the development of similar applications. The application has been tested on a ‘mini‐Grid’ testbed, and the results presented here show that for large problem sizes a distributed, Grid‐enabled version can significantly reduce execution times. Copyright © 2005 John Wiley & Sons, Ltd.
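
The embarrassingly parallel structure mentioned in this abstract can be illustrated with a small sketch. It is not the paper's Globus-based implementation. It only shows, under the assumption that queries arrive as FASTA text, how a query set might be cut into independent chunks that separate Grid jobs could each search. The function name `split_fasta` and the round-robin policy are hypothetical.

```python
def split_fasta(text, n_chunks):
    """Split FASTA text into at most n_chunks independent chunks (round-robin)."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    records, current = [], []
    for line in text.splitlines():
        # A header line (">...") starts a new record; flush the previous one.
        if line.startswith(">") and current:
            records.append("\n".join(current) + "\n")
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    # Deal records out in turn so chunks hold similar numbers of sequences.
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return ["".join(chunk) for chunk in chunks if chunk]
```

Each returned chunk is itself valid FASTA text, so every chunk could be searched against the same database with no communication between jobs, and the per-chunk outputs concatenated afterwards.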

3.
We have developed the SYNSEIS (SYNthetic SEISmogram) tool within the GEON (GEOscience Network) project to enable efficient computation of synthetic seismic waveforms for research and education. SYNSEIS is built as a distributed system to support the calculation of synthetic seismograms in 2D/3D media. The underlying simulation software is a finite difference code, E3D, developed by LLNL (S. Larsen). This code is embedded within the SYNSEIS environment and used by our SYNSEIS tool to simulate seismic waveforms of either earthquakes or explosions at regional distances (<1000 km). The SYNSEIS architecture is based on a Web service model. In particular, the computing Web services access Grid computing resources seamlessly by hiding the complexity of grid technologies. Even though Grid computing is well established in many computing communities, its use among domain scientists is still not trivial because of the multiple levels of complexity encountered. We have also developed a grid-enabled version of the E3D application code that takes inputs in our own XML dialect, including geological models that are accessible through standard Web services. The XML inputs for this application code also contain structural geometries, source parameters, seismic velocity, density, attenuation values, the number of time steps to compute, and the number of stations. In this paper, we emphasize the development of a state-of-the-art web-based scientific computational environment. Our system can be used to promote an efficient and effective modeling environment that helps scientists as well as educators in their daily activities and speeds up the scientific discovery process.

4.
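GOS-based integrated environment of the China National Grid and development of an application example   Total citations: 3, self-citations: 0, citations by others: 3
As the principal grid in China, the China National Grid provides users with good computing services. This paper discusses in detail the functions and structure of the grid middleware system software GOS and the GOS-based integrated environment of the China National Grid, presents the integration of the bioinformatics software MEME into the National Grid environment, and proposes directions for future work.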

5.
Grid users often face challenges when employing Grid resources, such as the need for a customized computing environment and QoS support. In this paper, we propose a new methodology for Grid computing: to use virtual machines as computing resources and provide Virtual Distributed Environments (VDE) for Grid users. We argue that employing virtual environments for Grid computing can bring various advantages, for instance computing environment customization, QoS guarantees and easy management. A lightweight Grid middleware, the Grid Virtualization Engine, is developed accordingly to provide functions for building virtual environments for Grids. We also present a typical use case, building a virtual e-Science infrastructure on demand, to justify the methodology.

6.
The Data Grid provides massive aggregated computing resources and distributed storage space to deal with data-intensive applications. Because available resources in the grid are limited and large volumes of data are produced, efficient use of Grid resources becomes an important challenge. Data replication is a key optimization technique for reducing access latency and managing large data by storing data in a wise manner. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. In this paper two strategies are proposed. The first is a novel job scheduling strategy called Weighted Scheduling Strategy (WSS) that uses hierarchical scheduling to reduce the search time for an appropriate computing node. It considers the number of jobs waiting in a queue, the location of the data required by the job, and the computing capacity of the sites. The second is a dynamic data replication strategy called Enhanced Dynamic Hierarchical Replication (EDHR) that improves file access time. This strategy is an enhanced version of the Dynamic Hierarchical Replication strategy. It uses an economic model for file deletion when there is not enough space for the replica. The economic model is based on the future value of a data file. Best replica placement plays an important role in obtaining maximum benefit from replication as well as in reducing storage cost and mean job execution time, so it is also considered in this paper. The proposed strategies are implemented in OptorSim, the European Data Grid simulator. Experimental results show that the proposed strategies achieve better performance by minimizing data access time and avoiding unnecessary replication.
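
The abstract names three inputs to WSS (queue length, location of the required data, and site computing capacity) but not how they are combined. The following is a minimal sketch assuming a simple linear weighted score. The `Site` fields, the weights and the function names are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    queued_jobs: int        # jobs currently waiting at the site
    local_files: frozenset  # data files already stored at the site
    capacity: float         # relative computing capacity (higher is faster)


def site_score(site, required_files, w_queue=1.0, w_data=1.0, w_cpu=1.0):
    """Lower is better: penalise waiting jobs and missing files, reward capacity."""
    missing = len(set(required_files) - site.local_files)
    return w_queue * site.queued_jobs + w_data * missing - w_cpu * site.capacity


def pick_site(sites, required_files, **weights):
    """Return the site with the lowest weighted score for this job."""
    return min(sites, key=lambda s: site_score(s, required_files, **weights))
```

Raising `w_data` makes the scheduler favour sites that already hold the job's files, which corresponds to the abstract's goal of reducing data transferred among nodes.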

7.
Grid computing promises access to large amounts of computing power, but so far adoption of Grid computing has been limited to highly specialized experts for three reasons. First, users are used to batch systems, and interfaces to Grid software are often complex and different to those in batch systems. Second, users are used to having transparent file access, which Grid software does not conveniently provide. Third, efforts to achieve wide‐spread coordination of computers while solving the first two problems are hampered when clusters are on private networks. Here we bring together a variety of software that allows users to use Grid resources almost transparently, as if they were local resources, while providing transparent access to files even when private networks intervene. As a motivating example, the BaBar Monte Carlo production system is deployed on a truly distributed environment, the European DataGrid, without any modification to the application itself. Copyright © 2005 John Wiley & Sons, Ltd.

8.
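Research on cluster system job management in a grid environment   Total citations: 2, self-citations: 4, citations by others: 2
Grid computing has gradually become an important new field. Compared with traditional distributed computing, its distinguishing feature is the ability to share all kinds of resources on the network, including geographically distributed computing resources. PBS is a job management system widely used on parallel computers; it allocates system resources to each job relatively fairly according to user-defined configuration parameters. However, managing cluster systems across a grid environment remains an open research topic. Using grid system software and cluster management software, this work implements a method for managing cluster system jobs in a grid environment.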

9.
QoS guided Min-Min heuristic for grid task scheduling   Total citations: 75, self-citations: 1, citations by others: 74
Task scheduling is an integrated component of computing. With the emergence of Grid and ubiquitous computing, new challenges appear in task scheduling based on properties such as security, quality of service, and lack of central control within distributed administrative domains. A Grid task scheduling framework must be able to deal with these issues. One of the goals of Grid task scheduling is to achieve high system throughput while matching applications with the available computing resources. This matching of resources in a non-deterministically shared heterogeneous environment leads to concerns over Quality of Service (QoS). In this paper a novel QoS guided task scheduling algorithm for Grid computing is introduced. The proposed algorithm is based on a general adaptive scheduling heuristic that includes QoS guidance. The algorithm is evaluated within a simulated Grid environment. The experimental results show that the new QoS guided Min-Min heuristic can lead to significant performance gains for a variety of applications. The approach is compared with others based on the quality of the prediction formulated from inaccurate information.
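
To make the idea of a QoS guided Min-Min heuristic concrete, here is a minimal sketch. It assumes a single binary QoS attribute: tasks that need it are scheduled first, with plain Min-Min restricted to hosts that offer it, and the remaining tasks are then scheduled on all hosts. The data layout (`etc`, `needs_qos`, `offers_qos`) and function names are illustrative assumptions, not the paper's notation.

```python
def min_min(tasks, hosts, etc, ready):
    """Plain Min-Min. etc[(task, host)] is the expected execution time;
    ready[host] is the time the host becomes free (updated in place)."""
    if tasks and not hosts:
        raise ValueError("no eligible host for the given tasks")
    schedule = {}
    pending = list(tasks)
    while pending:
        # Find the (task, host) pair with the smallest completion time.
        best = None
        for task in pending:
            for host in hosts:
                completion = ready[host] + etc[(task, host)]
                if best is None or completion < best[0]:
                    best = (completion, task, host)
        completion, task, host = best
        schedule[task] = host
        ready[host] = completion
        pending.remove(task)
    return schedule


def qos_guided_min_min(tasks, hosts, etc, needs_qos, offers_qos):
    """Schedule QoS-demanding tasks first on QoS-capable hosts, then the rest."""
    ready = {host: 0.0 for host in hosts}
    high = [t for t in tasks if needs_qos[t]]
    low = [t for t in tasks if not needs_qos[t]]
    qos_hosts = [h for h in hosts if offers_qos[h]]
    schedule = min_min(high, qos_hosts, etc, ready)
    schedule.update(min_min(low, hosts, etc, ready))
    return schedule, max(ready.values())  # assignment and makespan
```

Scheduling the QoS-demanding tasks first keeps low-QoS tasks from occupying the few hosts that can serve them, which is the motivation for QoS guidance in this sketch.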

10.
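How to aggregate distributed, heterogeneous computing resources on the network to solve large-scale scientific computing problems, and how to reduce the complexity of parallel program design, have long been among the difficult problems of grid computing research. This paper proposes a new approach to computational grids based on CORBA component technology and constructs a computational grid model (CCGM). The model makes full use of the composability and manageability that component technology brings to form grid computations. By defining problems abstractly and using the ParIDL tool to map the problem definition onto CCGM, it simplifies the development of computational grid applications. Tests and analysis of the CCG (Component-based Computational Grid) system show that it achieves good speedup.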

11.
In recent years, network bandwidth and quality have improved drastically, even faster than computer performance. The various communication and computing tasks in fields such as telecommunication, multimedia, information technology, and construction simulation can now be integrated and applied in a distributed computing environment. As the demand for computing resources in many research fields gradually grows, Grid Computing, which integrates a distributed computing environment with the Internet, has gained more attention. Grid Computing utilizes idle computing resources (nodes) on the network to facilitate the execution of complicated tasks that require large-scale computing. In other words, the composition of Grid resources is dynamic and varies with time. Thus, when selecting nodes for executing a task, the dynamics of the nodes in the Grid must be considered, and to exploit the resources effectively they have to be selected according to the properties of the task. This study proposes a hybrid load balancing policy that integrates static and dynamic load balancing technologies to assist in the selection of effective nodes. In addition, if any selected node can no longer provide resources, it can be promptly identified and replaced with a substitute node to maintain the execution performance and the load balance of the system.

12.
Biologists, medical experts, biochemical engineers and researchers working on DNA microarray experiments are increasingly turning to Grid computing with the aim of leveraging the Grid’s computing power, immense storage resources, and quality of service for the expedient processing of a wide range of datasets. In this paper we present the combined experience of grid application experts and bioinformatics scientists in deploying a pilot service enabling computationally efficient processing and analysis of data stemming from microarray experiments. This pilot service is accessible over the Hellenic portion of the EGEE grid and has been demonstrated at several public events. We highlight the process of grid application enablement, grid deployment challenges, and lessons learnt from a bi-annual effort to port and deploy a MATLAB DNA microarray application on a production grid. In addition to describing the parallelization of the application, we also emphasize the development of a distributed federated database for storing and post-processing the results of the microarray experiments. Overall, we believe that our experience could prove valuable not only to microarray data scientists but also to other Grid users who intend to Grid-enable and deploy their applications.

13.
The grid design strongly depends not only on a network infrastructure but also on a superstructure, that is, a social structure of virtual organizations where people trust each other, share resources and work together. Open Bioinformatics Grid (OBIGrid) is a grid aimed at building a cooperative bioinformatics environment for computer scientists and biologists. In October 2003, OBIGrid consisted of 293 nodes with 492 CPUs provided by 27 sites at universities, laboratories and other enterprises, connected by a virtual private network over the Internet. So many organizations have participated because OBIGrid has been conscious of constructing a superstructure on a grid as well as a grid infrastructure. For the benefit of OBIGrid participants, we have developed a series of life science application services: an open bioinformatics environment (OBIEnv), a scalable genome database (OBISgd), a genome annotation system (OBITco), and a biochemical network simulator (OBIYagns), to name a few. Akihiko Konagaya, Dr.Eng.: He is Project Director of the Bioinformatics Group, RIKEN Genomic Sciences Center. He received his B.S. and M.S. in Informatics Science from Tokyo Institute of Technology in 1978 and 1980, and joined NEC Corporation in 1980, Japan Advanced Institute of Science and Technology in 1997, and RIKEN GSC in 2003. His research covers a wide area from computer architectures to bioinformatics. He has been much involved in the Open Bioinformatics Grid project since 2002. Fumikazu Konishi, Dr.Eng.: He has been a researcher at the Bioinformatics Group, RIKEN Genomic Sciences Center since 2000. He received his M.S. (1996) and Ph.D. (2001) from Tokyo Metropolitan Institute of Technology. He has served as an assistant in the Department of Production and Information Systems Engineering, Tokyo Metropolitan Institute of Technology since 2000. He has also worked in the Structurome Research Group, RIKEN Harima Institute since 2001. His research interests include concurrent engineering, bioinformatics and the Grid. He has deeply influenced the design of OBIGrid. Mariko Hatakeyama, Ph.D.: She received her Ph.D. degree from Tokyo University of Agriculture and Technology. She is a Research Scientist at the Bioinformatics Group, RIKEN Genomic Sciences Center. Her research topics are microbiology, enzymology and signal transduction of mammalian cells. She is now working on computational simulation of signal transduction systems and on a thermophilic bacteria project. Kenji Satou, Ph.D.: He is Associate Professor of the School of Knowledge Science at Japan Advanced Institute of Science and Technology. He received B.S., M.E. and Ph.D. degrees from Kyushu University in 1987, 1989 and 1995 respectively. For each degree, he majored in computer engineering. His research interests have progressed from deductive database applications through data mining to Grid computing and natural language processing. His current field of research is bioinformatics. He prefers a set-oriented manner of thinking, and usually wonders how he can construct an intelligent-looking system based on large amounts of heterogeneous data and computer resources.

14.
Structural bioinformatics applies computational methods to analyze and model three-dimensional molecular structures. A huge number of applications are available for working with structural data on a large scale. Using these tools on distributed computing infrastructures (DCIs), however, is often complicated by a lack of suitable interfaces. The MoSGrid (Molecular Simulation Grid) science gateway provides an intuitive user interface to several widely-used applications for structural bioinformatics, molecular modeling, and quantum chemistry. It ensures the confidentiality, integrity, and availability of data via a granular security concept, which covers all layers of the infrastructure. The security concept applies SAML (Security Assertion Markup Language) and allows trust delegation from the user interface layer across the high-level middleware layer and the Grid middleware layer down to the HPC facilities. SAML assertions had to be integrated into the MoSGrid infrastructure in several places: the workflow-enabled Grid portal WS-PGRADE (Web Services Parallel Grid Runtime and Developer Environment), the gUSE (Grid User Support Environment) DCI services, and the cloud file system XtreemFS. The presented security infrastructure allows a single sign-on process to all involved DCI components and, therefore, lowers the hurdle for users to utilize large HPC infrastructures for structural bioinformatics.

15.
To achieve high-performance distributed data access and computing in a Grid environment, monitoring of resource and network performance is vital. Our proposed Grid network monitoring architecture is modeled by the Grid scheduler. The proposed Grid network monitoring retrieves network metrics using sensors as network monitoring tools. Mobile agents are migrated from the Resource Broker to start the sensors that measure the network metrics in all Grid resources. The raw data provided by the monitoring tools is used to produce a high-level view of the Grid through a set of internal cost functions. The network cost function is formed by combining various network metrics such as bandwidth, round-trip time, jitter and packet loss to measure network performance. This paper presents a Grid resource brokering strategy that analyzes the network metrics along with the resource metrics to select the Grid resource to which a job is submitted; the proposed approach is integrated with the CARE Resource Broker (CRB) for job submission. The experimental results demonstrate a reduction in job completion time for submitted jobs. The simulation results also show that more jobs are completed with the proposed strategy, which leads to better utilization of the Grid resources.
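
The abstract states that the network cost function combines bandwidth, round-trip time, jitter and packet loss, but does not give its form. As a rough sketch only, one could normalise each metric against a reference value and take a weighted sum. The weights, reference values and function names below are assumptions made for illustration and do not come from the paper.

```python
# Hypothetical weights for (bandwidth, RTT, jitter, loss); they sum to 1.
DEFAULT_WEIGHTS = (0.4, 0.3, 0.15, 0.15)
# Hypothetical reference values: 1000 Mbit/s, 200 ms, 50 ms, 5 % loss.
DEFAULT_REFS = (1000.0, 200.0, 50.0, 5.0)


def _clamp(x):
    """Limit a normalised metric to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def network_cost(bandwidth_mbps, rtt_ms, jitter_ms, loss_pct,
                 weights=DEFAULT_WEIGHTS, refs=DEFAULT_REFS):
    """Return a cost in [0, 1]; lower means a better network path."""
    w_bw, w_rtt, w_jit, w_loss = weights
    ref_bw, ref_rtt, ref_jit, ref_loss = refs
    return (w_bw * (1.0 - _clamp(bandwidth_mbps / ref_bw))  # more bandwidth, less cost
            + w_rtt * _clamp(rtt_ms / ref_rtt)
            + w_jit * _clamp(jitter_ms / ref_jit)
            + w_loss * _clamp(loss_pct / ref_loss))


def best_resource(metrics):
    """metrics maps a resource name to (bandwidth, rtt, jitter, loss)."""
    return min(metrics, key=lambda name: network_cost(*metrics[name]))
```

A broker in the spirit of the abstract would combine such a network cost with resource metrics such as load or free CPUs before choosing where to submit a job. That combination is left out here because the abstract does not describe it.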

16.
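Grids are heterogeneous, dynamic and multi-domain, which poses new challenges for grid security research. The Grid Security Infrastructure (GSI) addresses secure authentication and secure communication in grid environments but does not pay sufficient attention to access control. Traditional access control methods address security only from the perspective of the resources being accessed. The diversity of the ways in which subjects operate and the heterogeneity of users' computing environments make the grid environment dynamic and uncertain. When this dynamism affects the accessing subject, the access control method must be improved so that the access control system can adapt dynamically to changes in the security state of the grid environment. To address this problem, this paper proposes adding a security evaluation model (SEMFG) before access control. The model comprehensively evaluates the access environment and the accessing subject, monitors the grid environment and the subject's behavior, and uses the evaluation results to guide access control dynamically.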

17.
Grid programming: some indications where we are headed   Total citations: 2, self-citations: 0, citations by others: 2
D. Laforenza, Parallel Computing, 2002, 28(12): 1733-1752
Grid computing enables the development of large scientific applications on an unprecedented scale. Grid-aware applications, also called meta-applications or multi-disciplinary applications, make use of coupled computational resources that are not available at a single site. In this light, Grids let scientists solve larger or new problems by pooling together resources that could not be coupled easily before. It is well known that designing and implementing efficient distributed/parallel applications on high-performance computers is still usually a very time-consuming task for programmers. Grid computing makes the situation worse. Consequently, the development of Grid programming environments that would enable programmers to exploit this technology efficiently is an important and active research issue. After an introduction to the main Grid programming issues, this paper reviews the most important approaches and projects conducted in this field worldwide.


18.
Fault-tolerant Grid computing is of vital importance as the Grid and mobile computing worlds converge into the Mobile Grid computing paradigm. We present an efficient scheme based on task replication, which uses the Weibull reliability function for the Grid resources to estimate the number of replicas that must be scheduled in order to guarantee a specific fault tolerance level for the Grid environment. The additional workload produced by the replication is handled by a resource management scheme that is based on the knapsack formulation and aims to maximize the utilization and profit of the Grid infrastructure. The proposed model has been evaluated through simulation and has been shown to be efficient enough for use in a middleware approach in future mobile Grid environments.
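
The replica estimate described in this abstract can be sketched as follows. The two-parameter Weibull reliability function is standard: R(t) = exp(-(t/scale)^shape). How the paper turns it into a replica count is not stated in the abstract, so this sketch assumes that replicas fail independently and that all resources share the same parameters. The function names and the `max_replicas` cap are hypothetical.

```python
import math


def weibull_reliability(t, scale, shape):
    """Probability that a resource survives until time t (two-parameter Weibull)."""
    return math.exp(-((t / scale) ** shape))


def replicas_needed(t, scale, shape, target, max_replicas=64):
    """Smallest n such that at least one of n independent replicas
    finishes a task of duration t with probability >= target."""
    r = weibull_reliability(t, scale, shape)
    all_fail = 1.0
    for n in range(1, max_replicas + 1):
        all_fail *= 1.0 - r  # probability that all n replicas fail
        if 1.0 - all_fail >= target:
            return n
    raise ValueError("target not reachable within max_replicas")
```

For example, with a per-replica reliability of about 0.5 over the task duration, a fault tolerance target of 0.9 needs four replicas under these assumptions, since 1 - 0.5^3 = 0.875 falls short and 1 - 0.5^4 = 0.9375 does not.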

19.
A Data Grid integrates geographically distributed resources for solving data-intensive scientific applications. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Scheduling is a traditional problem in parallel and distributed systems. However, due to the special issues and goals of the Grid, traditional approaches are no longer effective in this environment. Therefore, it is necessary to propose methods specialized for this kind of parallel and distributed system. Another solution is to use a data replication strategy to create multiple copies of files and store them in convenient locations to shorten file access times. To utilize these two concepts, in this paper we develop a job scheduling policy, called hierarchical job scheduling strategy (HJSS), and a dynamic data replication strategy, called advanced dynamic hierarchical replication strategy (ADHRS), to improve data access efficiency in a hierarchical Data Grid. HJSS uses hierarchical scheduling to reduce the search time for an appropriate computing node. It considers network characteristics, the number of jobs waiting in the queue, file locations, and the disk read speed of the storage drives at data sources. Moreover, due to limited storage capacity, a good replica replacement algorithm is needed. We present a novel replacement strategy that deletes files in two steps when free space is not enough for the new replica. First, it deletes the files with the minimum transfer time. Second, if space is still insufficient, it considers the last time the replica was requested, the number of accesses, the size of the replica and the file transfer time. The simulation results show that our proposed algorithm performs better than other algorithms in terms of job execution time, number of intercommunications, number of replications, hit ratio, computing resource usage and storage usage.

20.
Grid computing has now become a practical computing paradigm and solution for distributed systems and applications. An increasing number of resources are involved in Grid environments, and a large number of applications are running on computational Grids. Unfortunately, Grid computing technologies are still far from the reach of inexperienced application users, e.g., computational scientists and engineers. A software layer is required to provide end users with an easy interface to Grids. To meet this requirement, the HEAVEN (Hosting European Application Virtual ENvironment) upperware is proposed, to be built on top of Grid middleware. This paper presents the HEAVEN philosophy of virtual computing for Grids, an idea combining simulation and emulation approaches. The concept of a Virtual Private Computing Environment (VPCE) is then proposed and defined. The design and current implementation of the HEAVEN upperware are discussed in detail. A use case of the Ag2D application justifies the philosophy of the HEAVEN virtual computing methodology and the design and implementation of the HEAVEN upperware.
